dicee.eval_static_funcs

Static evaluation functions for KGE models.

This module provides backward compatibility by re-exporting from the new dicee.evaluation module.

Deprecated: Use the dicee.evaluation submodules instead. This module will be removed in a future version.
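
For example, existing imports keep working through the re-export, but the dicee.evaluation path is preferred:

>>> # Deprecated import path (still works via re-export)
>>> from dicee.eval_static_funcs import evaluate_link_prediction_performance
>>> # Preferred import path
>>> from dicee.evaluation import evaluate_link_prediction_performance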

Functions

evaluate_ensemble_link_prediction_performance(...)

Evaluate link prediction performance of an ensemble of KGE models.

evaluate_link_prediction_performance(...) → Dict[str, float]

Evaluate link prediction performance with head and tail prediction.

evaluate_link_prediction_performance_with_bpe(...)

Evaluate link prediction with BPE encoding (head and tail).

evaluate_link_prediction_performance_with_bpe_reciprocals(...)

Evaluate link prediction with BPE encoding and reciprocals.

evaluate_link_prediction_performance_with_reciprocals(...)

Evaluate link prediction with reciprocal relations.

evaluate_lp_bpe_k_vs_all(...) → Dict[str, float]

Evaluate BPE link prediction with KvsAll scoring.

evaluate_literal_prediction(...) → Optional[pandas.DataFrame]

Evaluate a trained literal prediction model on a test file.

Module Contents

dicee.eval_static_funcs.evaluate_ensemble_link_prediction_performance(...)

Evaluate link prediction performance of an ensemble of KGE models.

Combines predictions from multiple models using weighted or simple averaging, with optional score normalization.

Parameters:
  • models – List of KGE models (e.g., snapshots from training).

  • triples – Test triples as numpy array or list, shape (N, 3), with integer indices (head, relation, tail).

  • er_vocab – Mapping (head_idx, rel_idx) -> list of tail indices for filtered evaluation.

  • weights – Weights for model averaging. Required if weighted_averaging is True. Must sum to 1 for proper averaging.

  • batch_size – Batch size for processing triples.

  • weighted_averaging – If True, use weighted averaging of predictions. If False, use simple mean.

  • normalize_scores – If True, normalize scores to [0, 1] range per sample before averaging.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.

Raises:

AssertionError – If weighted_averaging is True but weights is not provided or has the wrong length.

Example

>>> from dicee.evaluation import evaluate_ensemble_link_prediction_performance
>>> models = [model1, model2, model3]
>>> weights = [0.5, 0.3, 0.2]
>>> results = evaluate_ensemble_link_prediction_performance(
...     models, test_triples, er_vocab,
...     weights=weights, weighted_averaging=True
... )
>>> print(f"MRR: {results['MRR']:.4f}")

dicee.eval_static_funcs.evaluate_link_prediction_performance(...) → Dict[str, float]

Evaluate link prediction performance with head and tail prediction.

Performs filtered evaluation where known correct answers are filtered out before computing ranks.

Parameters:
  • model – KGE model wrapper with entity/relation mappings.

  • triples – Test triples as list of (head, relation, tail) strings.

  • er_vocab – Mapping (entity, relation) -> list of valid tail entities.

  • re_vocab – Mapping (relation, entity) -> list of valid head entities.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.
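
Example

A minimal sketch of a filtered evaluation call. Building er_vocab and re_vocab with defaultdict is an assumption about their expected layout, inferred from the parameter descriptions above; model, all_known_triples, and test_triples stand for objects prepared elsewhere.

>>> from collections import defaultdict
>>> from dicee.evaluation import evaluate_link_prediction_performance
>>> er_vocab, re_vocab = defaultdict(list), defaultdict(list)
>>> for h, r, t in all_known_triples:  # string triples from train/valid/test
...     er_vocab[(h, r)].append(t)  # valid tails for (head, relation)
...     re_vocab[(r, t)].append(h)  # valid heads for (relation, tail)
>>> results = evaluate_link_prediction_performance(
...     model, test_triples, er_vocab=er_vocab, re_vocab=re_vocab
... )
>>> print(f"MRR: {results['MRR']:.4f}")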

dicee.eval_static_funcs.evaluate_link_prediction_performance_with_bpe(...)

Evaluate link prediction with BPE encoding (head and tail).

Parameters:
  • model – KGE model wrapper with BPE support.

  • within_entities – List of entities forming the candidate set; ranking is restricted to these entities.

  • triples – Test triples as list of (head, relation, tail) tuples.

  • er_vocab – Mapping (entity, relation) -> list of valid tail entities.

  • re_vocab – Mapping (relation, entity) -> list of valid head entities.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.
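
Example

A minimal sketch, assuming model is a BPE-capable KGE wrapper and that the positional argument order follows the parameter list above; within_entities, er_vocab, and re_vocab are prepared as described earlier.

>>> from dicee.evaluation import evaluate_link_prediction_performance_with_bpe
>>> results = evaluate_link_prediction_performance_with_bpe(
...     model, within_entities, test_triples,
...     er_vocab=er_vocab, re_vocab=re_vocab
... )
>>> print(f"MRR: {results['MRR']:.4f}")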

dicee.eval_static_funcs.evaluate_link_prediction_performance_with_bpe_reciprocals(...)

Evaluate link prediction with BPE encoding and reciprocals.

Parameters:
  • model – KGE model wrapper with BPE support.

  • within_entities – List of entities forming the candidate set; ranking is restricted to these entities.

  • triples – Test triples as list of [head, relation, tail] strings.

  • er_vocab – Mapping (entity, relation) -> list of valid tail entities.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.
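
Example

A minimal sketch along the same lines; only er_vocab is documented here, consistent with reciprocal models needing only tail prediction (see the reciprocal variant below). Argument order is assumed to follow the parameter list above.

>>> from dicee.evaluation import evaluate_link_prediction_performance_with_bpe_reciprocals
>>> results = evaluate_link_prediction_performance_with_bpe_reciprocals(
...     model, within_entities, test_triples, er_vocab=er_vocab
... )
>>> print(f"MRR: {results['MRR']:.4f}")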

dicee.eval_static_funcs.evaluate_link_prediction_performance_with_reciprocals(...)

Evaluate link prediction with reciprocal relations.

Optimized for models trained with reciprocal triples where only tail prediction is needed.

Parameters:
  • model – KGE model wrapper.

  • triples – Test triples as list of (head, relation, tail) strings.

  • er_vocab – Mapping (entity, relation) -> list of valid tail entities.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.
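
Example

A minimal sketch for a model trained with reciprocal triples; model, test_triples, and er_vocab are assumed to be prepared as above.

>>> from dicee.evaluation import evaluate_link_prediction_performance_with_reciprocals
>>> results = evaluate_link_prediction_performance_with_reciprocals(
...     model, test_triples, er_vocab=er_vocab
... )
>>> print(f"MRR: {results['MRR']:.4f}")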

dicee.eval_static_funcs.evaluate_lp_bpe_k_vs_all(model, triples: List[List[str]], er_vocab: Dict | None = None, batch_size: int | None = None, func_triple_to_bpe_representation: Callable | None = None, str_to_bpe_entity_to_idx: Dict | None = None) → Dict[str, float]

Evaluate BPE link prediction with KvsAll scoring.

Parameters:
  • model – The KGE model wrapper.

  • triples – List of string triples.

  • er_vocab – Entity-relation vocabulary for filtering.

  • batch_size – Batch size for processing.

  • func_triple_to_bpe_representation – Function to convert triples to BPE.

  • str_to_bpe_entity_to_idx – Mapping from string entities to BPE indices.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.

Raises:

ValueError – If batch_size is not provided.
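
Example

A minimal sketch; batch_size must be supplied explicitly, since omitting it raises ValueError. The remaining optional arguments are left at their defaults here, and model, test_triples, and er_vocab are assumed to be prepared as above.

>>> from dicee.evaluation import evaluate_lp_bpe_k_vs_all
>>> results = evaluate_lp_bpe_k_vs_all(
...     model, test_triples, er_vocab=er_vocab, batch_size=512
... )
>>> print(f"MRR: {results['MRR']:.4f}")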

dicee.eval_static_funcs.evaluate_literal_prediction(kge_model, eval_file_path: str = None, store_lit_preds: bool = True, eval_literals: bool = True, loader_backend: str = 'pandas', return_attr_error_metrics: bool = False) → pandas.DataFrame | None

Evaluate a trained literal prediction model on a test file.

Evaluates the literal prediction capabilities of a KGE model by computing MAE and RMSE metrics for each attribute.

Parameters:
  • kge_model – Trained KGE model with literal prediction capability.

  • eval_file_path – Path to the evaluation file containing test literals.

  • store_lit_preds – If True, stores predictions to CSV file.

  • eval_literals – If True, evaluates and prints error metrics.

  • loader_backend – Backend for loading the dataset ('pandas' or 'rdflib').

  • return_attr_error_metrics – If True, returns the metrics DataFrame.

Returns:

DataFrame with per-attribute MAE and RMSE if return_attr_error_metrics is True, otherwise None.

Raises:
  • RuntimeError – If the KGE model doesn’t have a trained literal model.

  • AssertionError – If model is invalid or test set has no valid data.

Example

>>> from dicee import KGE
>>> from dicee.evaluation import evaluate_literal_prediction
>>> model = KGE(path="pretrained_model")
>>> metrics = evaluate_literal_prediction(
...     model,
...     eval_file_path="test_literals.csv",
...     return_attr_error_metrics=True
... )
>>> print(metrics)