dicee.evaluator

Evaluator module for knowledge graph embedding models.

This module provides backward compatibility by re-exporting from the new dicee.evaluation module.

Deprecated: use dicee.evaluation.Evaluator instead. This module will be removed in a future version.

Classes

Evaluator

Evaluator class for KGE models in various downstream tasks.

Module Contents

class dicee.evaluator.Evaluator(args, is_continual_training: bool = False)

Evaluator class for KGE models in various downstream tasks.

Orchestrates link prediction evaluation with different scoring techniques, including standard evaluation and byte-pair-encoding (BPE) based evaluation.

er_vocab

Entity-relation to tail vocabulary for filtered ranking.

re_vocab

Relation and tail entity to head vocabulary for filtered ranking.

ee_vocab

Entity-entity to relation vocabulary.

num_entities

Total number of entities in the knowledge graph.

num_relations

Total number of relations in the knowledge graph.

args

Configuration arguments.

report

Dictionary storing evaluation results.

during_training

Whether evaluation is happening during training.
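
The three vocabularies are plain dictionaries keyed by index pairs; a minimal sketch of their assumed shapes (the exact key and value types are an assumption based on the descriptions above, not guaranteed by this page):

>>> er_vocab[(0, 2)]   # (head idx, relation idx) -> known tail indices
[5, 7]
>>> re_vocab[(2, 5)]   # (relation idx, tail idx) -> known head indices
[0]
>>> ee_vocab[(0, 5)]   # (head idx, tail idx) -> known relation indices
[2]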

Example

>>> from dicee.evaluation import Evaluator
>>> evaluator = Evaluator(args)
>>> results = evaluator.eval(dataset, model, 'EntityPrediction')
>>> print(f"Test MRR: {results['Test']['MRR']:.4f}")
re_vocab: Dict | None = None
er_vocab: Dict | None = None
ee_vocab: Dict | None = None
func_triple_to_bpe_representation = None
is_continual_training = False
num_entities: int | None = None
num_relations: int | None = None
domain_constraints_per_rel = None
range_constraints_per_rel = None
args
report: Dict
during_training = False
vocab_preparation(dataset) → None

Prepare vocabularies from the dataset for evaluation.

Resolves any future objects and saves vocabularies to disk.

Parameters:

dataset – Knowledge graph dataset with vocabulary attributes.
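
A minimal usage sketch (assumes dataset is a dicee KG instance whose vocabulary attributes may still be pending futures):

>>> evaluator.vocab_preparation(dataset)  # resolves futures, saves vocabularies to disk
>>> evaluator.er_vocab is not None
True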

eval(dataset, trained_model, form_of_labelling: str, during_training: bool = False) → Dict | None

Evaluate the trained model on the dataset.

Parameters:
  • dataset – Knowledge graph dataset (KG instance).

  • trained_model – The trained KGE model.

  • form_of_labelling – Type of labelling (‘EntityPrediction’ or ‘RelationPrediction’).

  • during_training – Whether evaluation is during training.

Returns:

Dictionary of evaluation metrics, or None if evaluation is skipped.
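
A hedged sketch of both call paths (argument values are illustrative):

>>> # periodic evaluation inside the training loop
>>> _ = evaluator.eval(dataset, model, 'EntityPrediction', during_training=True)
>>> # final evaluation; returns the metrics report
>>> report = evaluator.eval(dataset, model, 'EntityPrediction')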

eval_rank_of_head_and_tail_entity(*, train_set, valid_set=None, test_set=None, trained_model) → None

Evaluate with negative sampling scoring.

eval_rank_of_head_and_tail_byte_pair_encoded_entity(*, train_set=None, valid_set=None, test_set=None, ordered_bpe_entities, trained_model) → None

Evaluate with BPE-encoded entities and negative sampling.

eval_with_byte(*, raw_train_set, raw_valid_set=None, raw_test_set=None, trained_model, form_of_labelling) → None

Evaluate a BytE model with text generation.

eval_with_bpe_vs_all(*, raw_train_set, raw_valid_set=None, raw_test_set=None, trained_model, form_of_labelling) → None

Evaluate with BPE and KvsAll scoring.

eval_with_vs_all(*, train_set, valid_set=None, test_set=None, trained_model, form_of_labelling) → None

Evaluate with KvsAll or 1vsAll scoring.
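
eval() is expected to route to one of the methods above based on the configured scoring technique; a hypothetical sketch of that dispatch (the technique names and branch choices are assumptions drawn from the method names, not dicee's exact code):

if args.scoring_technique in ('KvsAll', '1vsAll'):
    evaluator.eval_with_vs_all(train_set=train, valid_set=valid, test_set=test,
                               trained_model=model, form_of_labelling='EntityPrediction')
else:  # e.g. a negative-sampling technique
    evaluator.eval_rank_of_head_and_tail_entity(train_set=train, valid_set=valid,
                                                test_set=test, trained_model=model)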

evaluate_lp_k_vs_all(model, triple_idx, info: str = None, form_of_labelling: str = None) → Dict[str, float]

Filtered link prediction evaluation with KvsAll scoring.

Parameters:
  • model – The trained model to evaluate.

  • triple_idx – Integer-indexed test triples.

  • info – Description to print.

  • form_of_labelling – ‘EntityPrediction’ or ‘RelationPrediction’.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.
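
For orientation, a minimal generic sketch of filtered tail ranking under KvsAll-style scoring (an illustration of the technique, not dicee's exact implementation; it assumes the model maps a batch of (head, relation) index pairs to a score per candidate tail entity):

import torch

def filtered_tail_ranks(model, triples, er_vocab):
    # triples: iterable of (head, relation, tail) integer triples.
    # er_vocab: (head idx, relation idx) -> list of all known true tail indices.
    ranks = []
    for h, r, t in triples:
        scores = model(torch.tensor([[h, r]]))[0]        # scores over every entity
        target = scores[t].clone()
        scores[er_vocab[(h, r)]] = float('-inf')         # mask other known true tails
        scores[t] = target                               # keep the test tail's score
        ranks.append(int((scores > target).sum()) + 1)   # 1-based filtered rank
    return ranks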

evaluate_lp_with_byte(model, triples: List[List[str]], info: str = None) → Dict[str, float]

Evaluate BytE model with text generation.

Parameters:
  • model – BytE model.

  • triples – String triples.

  • info – Description to print.

Returns:

Dictionary with placeholder metrics (-1 values).

evaluate_lp_bpe_k_vs_all(model, triples: List[List[str]], info: str = None, form_of_labelling: str = None) → Dict[str, float]

Evaluate BPE model with KvsAll scoring.

Parameters:
  • model – BPE-enabled model.

  • triples – String triples.

  • info – Description to print.

  • form_of_labelling – Type of labelling.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.

evaluate_lp(model, triple_idx, info: str) → Dict[str, float]

Evaluate link prediction with negative sampling.

Parameters:
  • model – The model to evaluate.

  • triple_idx – Integer-indexed triples.

  • info – Description to print.

Returns:

Dictionary with H@1, H@3, H@10, and MRR metrics.
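
All evaluate_lp* variants report the same four metrics. A small generic helper showing how they follow from a list of filtered ranks (illustrative, not dicee's exact code):

def ranks_to_metrics(ranks):
    n = len(ranks)
    return {'H@1': sum(r <= 1 for r in ranks) / n,
            'H@3': sum(r <= 3 for r in ranks) / n,
            'H@10': sum(r <= 10 for r in ranks) / n,
            'MRR': sum(1.0 / r for r in ranks) / n}

>>> ranks_to_metrics([1, 2, 5, 20])
{'H@1': 0.25, 'H@3': 0.5, 'H@10': 0.75, 'MRR': 0.4375}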

dummy_eval(trained_model, form_of_labelling: str) → None

Run evaluation from saved data (for continual training).

Parameters:
  • trained_model – The trained model.

  • form_of_labelling – Type of labelling.

eval_with_data(dataset, trained_model, triple_idx: numpy.ndarray, form_of_labelling: str) → Dict[str, float]

Evaluate a trained model on a given dataset.

Parameters:
  • dataset – Knowledge graph dataset.

  • trained_model – The trained model.

  • triple_idx – Integer-indexed triples to evaluate.

  • form_of_labelling – Type of labelling.

Returns:

Dictionary with evaluation metrics.

Raises:

ValueError – If scoring technique is invalid.
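
A hedged usage sketch for evaluating a saved split directly (the (n, 3) head/relation/tail layout of triple_idx is an assumption):

>>> import numpy as np
>>> test_idx = np.array([[0, 2, 5], [1, 0, 3]])  # one (h, r, t) index row per triple
>>> metrics = evaluator.eval_with_data(dataset, trained_model, test_idx, 'EntityPrediction')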