dicee.eval_static_funcs

Static evaluation functions for KGE models.

This module provides backward compatibility by re-exporting from the new dicee.evaluation module.

Deprecated: use the dicee.evaluation submodules instead. This module will be removed in a future version.
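A minimal migration sketch, assuming the same functions are re-exported from the dicee.evaluation package (as the examples further below suggest):

>>> # Deprecated import path, kept for backward compatibility:
>>> from dicee.eval_static_funcs import evaluate_link_prediction_performance
>>> # Preferred import path going forward:
>>> from dicee.evaluation import evaluate_link_prediction_performance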

Functions

evaluate_link_prediction_performance(→ Dict[str, float])

Evaluate link prediction performance with head and tail prediction.

evaluate_link_prediction_performance_with_reciprocals(...)

Evaluate link prediction with reciprocal relations.

evaluate_link_prediction_performance_with_bpe(...)

Evaluate link prediction with BPE encoding (head and tail).

evaluate_link_prediction_performance_with_bpe_reciprocals(...)

Evaluate link prediction with BPE encoding and reciprocals.

evaluate_lp_bpe_k_vs_all(→ Dict[str, float])

Evaluate BPE link prediction with KvsAll scoring.

evaluate_literal_prediction(→ Optional[pandas.DataFrame])

Evaluate trained literal prediction model on a test file.

evaluate_ensemble_link_prediction_performance(...)

Evaluate link prediction performance of an ensemble of KGE models.

Module Contents

dicee.eval_static_funcs.evaluate_link_prediction_performance(...) → Dict[str, float]

Evaluate link prediction performance with head and tail prediction.

Performs filtered evaluation where known correct answers are filtered out before computing ranks.

Parameters:
  • model – KGE model wrapper with entity/relation mappings.

  • triples – Test triples as list of (head, relation, tail) strings.

  • er_vocab – Mapping (entity, relation) -> list of valid tail entities.

  • re_vocab – Mapping (relation, entity) -> list of valid head entities.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.
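
A minimal usage sketch, assuming a pretrained model loaded via the KGE wrapper; the model path, entity names, and relation names below are placeholders and must exist in the model's vocabulary. The filter vocabularies are built here directly from the test triples for brevity (filtered evaluation normally builds them from train, validation, and test triples combined):

>>> from collections import defaultdict
>>> from dicee import KGE
>>> from dicee.eval_static_funcs import evaluate_link_prediction_performance
>>> model = KGE(path="pretrained_model")
>>> triples = [("london", "locatedIn", "uk"), ("paris", "locatedIn", "france")]
>>> er_vocab, re_vocab = defaultdict(list), defaultdict(list)
>>> for h, r, t in triples:
...     er_vocab[(h, r)].append(t)  # (entity, relation) -> valid tails
...     re_vocab[(r, t)].append(h)  # (relation, entity) -> valid heads
>>> results = evaluate_link_prediction_performance(model, triples, er_vocab=er_vocab, re_vocab=re_vocab)
>>> print(results["MRR"])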

dicee.eval_static_funcs.evaluate_link_prediction_performance_with_reciprocals(...) → Dict[str, float]

Evaluate link prediction with reciprocal relations.

Optimized for models trained with reciprocal triples where only tail prediction is needed.

Parameters:
  • model – KGE model wrapper.

  • triples – Test triples as list of (head, relation, tail) strings.

  • er_vocab – Mapping (entity, relation) -> list of valid tail entities.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.
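
For models trained with reciprocal (inverse) relations, only the tail-side filter vocabulary is needed. A brief sketch, reusing the model, triples, and er_vocab from the previous example:

>>> from dicee.eval_static_funcs import evaluate_link_prediction_performance_with_reciprocals
>>> results = evaluate_link_prediction_performance_with_reciprocals(model, triples, er_vocab=er_vocab)
>>> print(results["MRR"])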

dicee.eval_static_funcs.evaluate_link_prediction_performance_with_bpe(...) → Dict[str, float]

Evaluate link prediction with BPE encoding (head and tail).

Parameters:
  • model – KGE model wrapper with BPE support.

  • within_entities – List of entities to evaluate within.

  • triples – Test triples as list of (head, relation, tail) tuples.

  • er_vocab – Mapping (entity, relation) -> list of valid tail entities.

  • re_vocab – Mapping (relation, entity) -> list of valid head entities.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.

dicee.eval_static_funcs.evaluate_link_prediction_performance_with_bpe_reciprocals(...) → Dict[str, float]

Evaluate link prediction with BPE encoding and reciprocals.

Parameters:
  • model – KGE model wrapper with BPE support.

  • within_entities – List of entities to evaluate within.

  • triples – Test triples as list of [head, relation, tail] strings.

  • er_vocab – Mapping (entity, relation) -> list of valid tail entities.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.

dicee.eval_static_funcs.evaluate_lp_bpe_k_vs_all(model, triples: List[List[str]], er_vocab: Dict = None, batch_size: int = None, func_triple_to_bpe_representation: Callable = None, str_to_bpe_entity_to_idx: Dict = None) → Dict[str, float]

Evaluate BPE link prediction with KvsAll scoring.

Parameters:
  • model – The KGE model wrapper.

  • triples – List of string triples.

  • er_vocab – Entity-relation vocabulary for filtering.

  • batch_size – Batch size for processing.

  • func_triple_to_bpe_representation – Function to convert triples to BPE.

  • str_to_bpe_entity_to_idx – Mapping from string entities to BPE indices.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.
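
A hedged call sketch: model and er_vocab are assumed to come from the surrounding pipeline, and to_bpe and bpe_entity_to_idx below are hypothetical placeholders for the BPE tokenization callback and the BPE-entity index mapping that the caller must supply:

>>> from dicee.eval_static_funcs import evaluate_lp_bpe_k_vs_all
>>> triples = [["london", "locatedIn", "uk"]]  # string triples
>>> results = evaluate_lp_bpe_k_vs_all(
...     model, triples,
...     er_vocab=er_vocab,
...     batch_size=512,
...     func_triple_to_bpe_representation=to_bpe,
...     str_to_bpe_entity_to_idx=bpe_entity_to_idx,
... )
>>> print(results["MRR"])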

dicee.eval_static_funcs.evaluate_literal_prediction(kge_model, eval_file_path: str = None, store_lit_preds: bool = True, eval_literals: bool = True, loader_backend: str = 'pandas', return_attr_error_metrics: bool = False) → pandas.DataFrame | None

Evaluate trained literal prediction model on a test file.

Evaluates the literal prediction capabilities of a KGE model by computing MAE and RMSE metrics for each attribute.

Parameters:
  • kge_model – Trained KGE model with literal prediction capability.

  • eval_file_path – Path to the evaluation file containing test literals.

  • store_lit_preds – If True, stores predictions to CSV file.

  • eval_literals – If True, evaluates and prints error metrics.

  • loader_backend – Backend for loading dataset (‘pandas’ or ‘rdflib’).

  • return_attr_error_metrics – If True, returns the metrics DataFrame.

Returns:

DataFrame with per-attribute MAE and RMSE if return_attr_error_metrics is True, otherwise None.

Raises:
  • RuntimeError – If the KGE model doesn’t have a trained literal model.

  • AssertionError – If model is invalid or test set has no valid data.

Example

>>> from dicee import KGE
>>> from dicee.evaluation import evaluate_literal_prediction
>>> model = KGE(path="pretrained_model")
>>> metrics = evaluate_literal_prediction(
...     model,
...     eval_file_path="test_literals.csv",
...     return_attr_error_metrics=True
... )
>>> print(metrics)

dicee.eval_static_funcs.evaluate_ensemble_link_prediction_performance(...) → Dict[str, float]

Evaluate link prediction performance of an ensemble of KGE models.

Combines predictions from multiple models using weighted or simple averaging, with optional score normalization.

Parameters:
  • models – List of KGE models (e.g., snapshots from training).

  • triples – Test triples as numpy array or list, shape (N, 3), with integer indices (head, relation, tail).

  • er_vocab – Mapping (head_idx, rel_idx) -> list of tail indices for filtered evaluation.

  • weights – Weights for model averaging. Required if weighted_averaging is True. Must sum to 1 for proper averaging.

  • batch_size – Batch size for processing triples.

  • weighted_averaging – If True, use weighted averaging of predictions. If False, use simple mean.

  • normalize_scores – If True, normalize scores to [0, 1] range per sample before averaging.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.

Raises:

AssertionError – If weighted_averaging is True but weights are not provided or have wrong length.

Example

>>> from dicee.evaluation import evaluate_ensemble_link_prediction_performance
>>> models = [model1, model2, model3]
>>> weights = [0.5, 0.3, 0.2]
>>> results = evaluate_ensemble_link_prediction_performance(
...     models, test_triples, er_vocab,
...     weights=weights, weighted_averaging=True
... )
>>> print(f"MRR: {results['MRR']:.4f}")