dicee.eval_static_funcs
Static evaluation functions for KGE models.
This module provides backward compatibility by re-exporting from the new dicee.evaluation module.
Deprecated: use the dicee.evaluation submodules instead. This module will be
removed in a future version.
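Example (a minimal migration sketch; evaluate_literal_prediction is one of the re-exported functions):

>>> # deprecated path, kept only for backward compatibility
>>> from dicee.eval_static_funcs import evaluate_literal_prediction
>>> # preferred path going forward
>>> from dicee.evaluation import evaluate_literal_prediction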
Functions
evaluate_link_prediction_performance – Evaluate link prediction performance with head and tail prediction.
evaluate_link_prediction_performance_with_reciprocals – Evaluate link prediction with reciprocal relations.
evaluate_link_prediction_performance_with_bpe – Evaluate link prediction with BPE encoding (head and tail).
evaluate_link_prediction_performance_with_bpe_reciprocals – Evaluate link prediction with BPE encoding and reciprocals.
evaluate_lp_bpe_k_vs_all – Evaluate BPE link prediction with KvsAll scoring.
evaluate_literal_prediction – Evaluate trained literal prediction model on a test file.
evaluate_ensemble_link_prediction_performance – Evaluate link prediction performance of an ensemble of KGE models.
Module Contents
- dicee.eval_static_funcs.evaluate_link_prediction_performance(model, triples, er_vocab: Dict[Tuple, List], re_vocab: Dict[Tuple, List]) → Dict[str, float]
Evaluate link prediction performance with head and tail prediction.
Performs filtered evaluation where known correct answers are filtered out before computing ranks.
- Parameters:
model – KGE model wrapper with entity/relation mappings.
triples – Test triples as list of (head, relation, tail) strings.
er_vocab – Mapping (entity, relation) -> list of valid tail entities.
re_vocab – Mapping (relation, entity) -> list of valid head entities.
- Returns:
Dictionary with H@1, H@3, H@10, and MRR metrics.
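Example (a minimal sketch; the model path, toy triples, and filter vocabularies below are illustrative placeholders, and the vocabularies would normally be built from the full dataset):

>>> from dicee import KGE
>>> from dicee.eval_static_funcs import evaluate_link_prediction_performance
>>> model = KGE(path="pretrained_model")  # hypothetical model directory
>>> triples = [("alice", "knows", "bob")]
>>> er_vocab = {("alice", "knows"): ["bob"]}
>>> re_vocab = {("knows", "bob"): ["alice"]}
>>> results = evaluate_link_prediction_performance(
...     model, triples, er_vocab=er_vocab, re_vocab=re_vocab
... )
>>> print(results["MRR"])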
- dicee.eval_static_funcs.evaluate_link_prediction_performance_with_reciprocals(model, triples, er_vocab: Dict[Tuple, List]) → Dict[str, float]
Evaluate link prediction with reciprocal relations.
Optimized for models trained with reciprocal triples where only tail prediction is needed.
- Parameters:
model – KGE model wrapper.
triples – Test triples as list of (head, relation, tail) strings.
er_vocab – Mapping (entity, relation) -> list of valid tail entities.
- Returns:
Dictionary with H@1, H@3, H@10, and MRR metrics.
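Example (a minimal sketch; assumes a model trained with reciprocal triples and reuses the placeholder model, triples, and er_vocab from the previous example):

>>> from dicee.eval_static_funcs import (
...     evaluate_link_prediction_performance_with_reciprocals
... )
>>> results = evaluate_link_prediction_performance_with_reciprocals(
...     model, triples, er_vocab=er_vocab
... )
>>> print(results["H@10"])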
- dicee.eval_static_funcs.evaluate_link_prediction_performance_with_bpe(model, within_entities: List[str], triples: List[Tuple[str]], er_vocab: Dict[Tuple, List], re_vocab: Dict[Tuple, List]) → Dict[str, float]
Evaluate link prediction with BPE encoding (head and tail).
- Parameters:
model – KGE model wrapper with BPE support.
within_entities – List of entities to evaluate within.
triples – Test triples as list of (head, relation, tail) tuples.
er_vocab – Mapping (entity, relation) -> list of valid tail entities.
re_vocab – Mapping (relation, entity) -> list of valid head entities.
- Returns:
Dictionary with H@1, H@3, H@10, and MRR metrics.
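Example (a minimal sketch; the candidate entity list and toy triples are illustrative placeholders, and the model is assumed to use byte-pair-encoded entities):

>>> from dicee.eval_static_funcs import evaluate_link_prediction_performance_with_bpe
>>> within_entities = ["alice", "bob", "carol"]  # entities to rank over
>>> triples = [("alice", "knows", "bob")]
>>> results = evaluate_link_prediction_performance_with_bpe(
...     model, within_entities, triples, er_vocab=er_vocab, re_vocab=re_vocab
... )
>>> print(results["MRR"])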
- dicee.eval_static_funcs.evaluate_link_prediction_performance_with_bpe_reciprocals(model, within_entities: List[str], triples: List[List[str]], er_vocab: Dict[Tuple, List]) → Dict[str, float]
Evaluate link prediction with BPE encoding and reciprocals.
- Parameters:
model – KGE model wrapper with BPE support.
within_entities – List of entities to evaluate within.
triples – Test triples as list of [head, relation, tail] strings.
er_vocab – Mapping (entity, relation) -> list of valid tail entities.
- Returns:
Dictionary with H@1, H@3, H@10, and MRR metrics.
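Example (a minimal sketch; note that triples are passed as [head, relation, tail] lists here, and the placeholder model, within_entities, and er_vocab follow the earlier examples):

>>> from dicee.eval_static_funcs import (
...     evaluate_link_prediction_performance_with_bpe_reciprocals
... )
>>> triples = [["alice", "knows", "bob"]]
>>> results = evaluate_link_prediction_performance_with_bpe_reciprocals(
...     model, within_entities, triples, er_vocab=er_vocab
... )
>>> print(results["MRR"])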
- dicee.eval_static_funcs.evaluate_lp_bpe_k_vs_all(model, triples: List[List[str]], er_vocab: Dict = None, batch_size: int = None, func_triple_to_bpe_representation: Callable = None, str_to_bpe_entity_to_idx: Dict = None) → Dict[str, float]
Evaluate BPE link prediction with KvsAll scoring.
- Parameters:
model – The KGE model wrapper.
triples – List of string triples.
er_vocab – Entity-relation vocabulary for filtering.
batch_size – Batch size for processing.
func_triple_to_bpe_representation – Function to convert triples to BPE.
str_to_bpe_entity_to_idx – Mapping from string entities to BPE indices.
- Returns:
Dictionary with H@1, H@3, H@10, and MRR metrics.
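Example (a minimal sketch; func_to_bpe and bpe_entity_to_idx stand in for the model's own BPE conversion helper and entity-to-index mapping, which this sketch does not construct):

>>> from dicee.eval_static_funcs import evaluate_lp_bpe_k_vs_all
>>> triples = [["alice", "knows", "bob"]]
>>> results = evaluate_lp_bpe_k_vs_all(
...     model, triples, er_vocab=er_vocab, batch_size=512,
...     func_triple_to_bpe_representation=func_to_bpe,  # placeholder helper
...     str_to_bpe_entity_to_idx=bpe_entity_to_idx,     # placeholder mapping
... )
>>> print(results["H@1"])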
- dicee.eval_static_funcs.evaluate_literal_prediction(kge_model, eval_file_path: str = None, store_lit_preds: bool = True, eval_literals: bool = True, loader_backend: str = 'pandas', return_attr_error_metrics: bool = False) → pandas.DataFrame | None
Evaluate trained literal prediction model on a test file.
Evaluates the literal prediction capabilities of a KGE model by computing MAE and RMSE metrics for each attribute.
- Parameters:
kge_model – Trained KGE model with literal prediction capability.
eval_file_path – Path to the evaluation file containing test literals.
store_lit_preds – If True, stores predictions to CSV file.
eval_literals – If True, evaluates and prints error metrics.
loader_backend – Backend for loading dataset (‘pandas’ or ‘rdflib’).
return_attr_error_metrics – If True, returns the metrics DataFrame.
- Returns:
DataFrame with per-attribute MAE and RMSE if return_attr_error_metrics is True, otherwise None.
- Raises:
RuntimeError – If the KGE model doesn’t have a trained literal model.
AssertionError – If model is invalid or test set has no valid data.
Example
>>> from dicee import KGE
>>> from dicee.evaluation import evaluate_literal_prediction
>>> model = KGE(path="pretrained_model")
>>> metrics = evaluate_literal_prediction(
...     model,
...     eval_file_path="test_literals.csv",
...     return_attr_error_metrics=True
... )
>>> print(metrics)
- dicee.eval_static_funcs.evaluate_ensemble_link_prediction_performance(models: List, triples, er_vocab: Dict[Tuple, List], weights: List[float] | None = None, batch_size: int = 512, weighted_averaging: bool = True, normalize_scores: bool = True) → Dict[str, float]
Evaluate link prediction performance of an ensemble of KGE models.
Combines predictions from multiple models using weighted or simple averaging, with optional score normalization.
- Parameters:
models – List of KGE models (e.g., snapshots from training).
triples – Test triples as numpy array or list, shape (N, 3), with integer indices (head, relation, tail).
er_vocab – Mapping (head_idx, rel_idx) -> list of tail indices for filtered evaluation.
weights – Weights for model averaging. Required if weighted_averaging is True. Must sum to 1 for proper averaging.
batch_size – Batch size for processing triples.
weighted_averaging – If True, use weighted averaging of predictions. If False, use simple mean.
normalize_scores – If True, normalize scores to [0, 1] range per sample before averaging.
- Returns:
Dictionary with H@1, H@3, H@10, and MRR metrics.
- Raises:
AssertionError – If weighted_averaging is True but weights are not provided or have wrong length.
Example
>>> from dicee.evaluation import evaluate_ensemble_link_prediction_performance
>>> models = [model1, model2, model3]
>>> weights = [0.5, 0.3, 0.2]
>>> results = evaluate_ensemble_link_prediction_performance(
...     models, test_triples, er_vocab,
...     weights=weights, weighted_averaging=True
... )
>>> print(f"MRR: {results['MRR']:.4f}")