dicee.evaluation.evaluator
Main Evaluator class for KGE model evaluation.
This module provides the Evaluator class which orchestrates evaluation of knowledge graph embedding models across different datasets and scoring techniques.
Attributes
- VALID_SCORING_TECHNIQUES
Classes
- Evaluator: Evaluator class for KGE models in various downstream tasks.
Module Contents
- dicee.evaluation.evaluator.VALID_SCORING_TECHNIQUES
- class dicee.evaluation.evaluator.Evaluator(args, is_continual_training: bool = False)[source]
Evaluator class for KGE models in various downstream tasks.
Orchestrates link prediction evaluation with different scoring techniques including standard evaluation and byte-pair encoding based evaluation.
- er_vocab
Entity-relation to tail vocabulary for filtered ranking.
- re_vocab
Relation-entity (tail) to head vocabulary.
- ee_vocab
Entity-entity to relation vocabulary.
- num_entities
Total number of entities in the knowledge graph.
- num_relations
Total number of relations in the knowledge graph.
- args
Configuration arguments.
- report
Dictionary storing evaluation results.
- during_training
Whether evaluation is happening during training.
Example
>>> from dicee.evaluation import Evaluator
>>> evaluator = Evaluator(args)
>>> results = evaluator.eval(dataset, model, 'EntityPrediction')
>>> print(f"Test MRR: {results['Test']['MRR']:.4f}")
- re_vocab: Dict | None = None
- er_vocab: Dict | None = None
- ee_vocab: Dict | None = None
- func_triple_to_bpe_representation = None
- is_continual_training = False
- num_entities: int | None = None
- num_relations: int | None = None
- domain_constraints_per_rel = None
- range_constraints_per_rel = None
- args
- report: Dict
- during_training = False
- vocab_preparation(dataset) → None[source]
Prepare vocabularies from the dataset for evaluation.
Resolves any pending future objects (vocabularies built asynchronously) and saves the resulting vocabularies to disk.
- Parameters:
dataset – Knowledge graph dataset with vocabulary attributes.
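The three vocabularies have a simple multi-map layout. Below is a minimal, illustrative sketch of how such mappings are typically built from integer-indexed triples; the actual construction happens inside dicee's dataset/KG machinery, and the names here mirror the attributes documented above:

from collections import defaultdict
import numpy as np

# Toy integer-indexed triples: (head, relation, tail).
triples = np.array([[0, 0, 1], [0, 0, 2], [2, 1, 0]])

er_vocab = defaultdict(list)  # (head, relation) -> all known tails
re_vocab = defaultdict(list)  # (relation, tail) -> all known heads
ee_vocab = defaultdict(list)  # (head, tail)     -> all known relations
for h, r, t in triples:
    er_vocab[(h, r)].append(t)
    re_vocab[(r, t)].append(h)
    ee_vocab[(h, t)].append(r)

assert er_vocab[(0, 0)] == [1, 2]  # both tails get filtered during ranking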
- eval(dataset, trained_model, form_of_labelling: str, during_training: bool = False) → Dict | None[source]
Evaluate the trained model on the dataset.
- Parameters:
dataset – Knowledge graph dataset (KG instance).
trained_model – The trained KGE model.
form_of_labelling – Type of labelling (‘EntityPrediction’ or ‘RelationPrediction’).
during_training – Whether evaluation is during training.
- Returns:
Dictionary of evaluation metrics, or None if evaluation is skipped.
- eval_rank_of_head_and_tail_entity(*, train_set, valid_set=None, test_set=None, trained_model) → None[source]
Evaluate with negative sampling scoring.
- eval_rank_of_head_and_tail_byte_pair_encoded_entity(*, train_set=None, valid_set=None, test_set=None, ordered_bpe_entities, trained_model) → None[source]
Evaluate with BPE-encoded entities and negative sampling.
- eval_with_byte(*, raw_train_set, raw_valid_set=None, raw_test_set=None, trained_model, form_of_labelling) → None[source]
Evaluate BytE model with generation.
- eval_with_bpe_vs_all(*, raw_train_set, raw_valid_set=None, raw_test_set=None, trained_model, form_of_labelling) → None[source]
Evaluate with BPE and KvsAll scoring.
- eval_with_vs_all(*, train_set, valid_set=None, test_set=None, trained_model, form_of_labelling) → None[source]
Evaluate with KvsAll or 1vsAll scoring.
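Under KvsAll/1vsAll, each (head, relation) query is scored against every entity in a single forward pass. A minimal sketch, assuming a DistMult-style bilinear scorer (dicee models implement their own K-vs-All forward passes; the tensors below are toy placeholders):

import torch

num_entities, num_relations, dim = 5, 3, 8
E = torch.randn(num_entities, dim)   # entity embeddings (toy)
R = torch.randn(num_relations, dim)  # relation embeddings (toy)

h, r = 0, 1
scores = (E[h] * R[r]) @ E.T         # one score per candidate tail
print(scores.shape)                  # torch.Size([5])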
- evaluate_lp_k_vs_all(model, triple_idx, info: str | None = None, form_of_labelling: str | None = None) → Dict[str, float][source]
Filtered link prediction evaluation with KvsAll scoring.
- Parameters:
model – The trained model to evaluate.
triple_idx – Integer-indexed test triples.
info – Description to print.
form_of_labelling – ‘EntityPrediction’ or ‘RelationPrediction’.
- Returns:
Dictionary with H@1, H@3, H@10, and MRR metrics.
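The filtering step is what distinguishes this from raw ranking: every tail known to be true for (head, relation) is masked out except the one being evaluated. A minimal numpy sketch for a single test triple (the real implementation batches this over the whole test set):

import numpy as np

scores = np.array([0.1, 0.9, 0.7, 0.3, 0.8])  # one score per candidate tail
t = 2                                          # gold tail index
known_tails = [1, 2, 4]                        # er_vocab[(h, r)]

filt = scores.copy()
filt[known_tails] = -np.inf                    # mask every known true tail...
filt[t] = scores[t]                            # ...except the one being ranked
rank = int((filt > filt[t]).sum()) + 1

print(rank, 1.0 / rank, rank <= 10)            # 1 1.0 True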
- evaluate_lp_with_byte(model, triples: List[List[str]], info: str | None = None) → Dict[str, float][source]
Evaluate BytE model with text generation.
- Parameters:
model – BytE model.
triples – String triples.
info – Description to print.
- Returns:
Dictionary with placeholder metrics (-1 values).
- evaluate_lp_bpe_k_vs_all(model, triples: List[List[str]], info: str | None = None, form_of_labelling: str | None = None) → Dict[str, float][source]
Evaluate BPE model with KvsAll scoring.
- Parameters:
model – BPE-enabled model.
triples – String triples.
info – Description to print.
form_of_labelling – Type of labelling.
- Returns:
Dictionary with H@1, H@3, H@10, and MRR metrics.
- evaluate_lp(model, triple_idx, info: str) → Dict[str, float][source]
Evaluate link prediction with negative sampling.
- Parameters:
model – The model to evaluate.
triple_idx – Integer-indexed triples.
info – Description to print.
- Returns:
Dictionary with H@1, H@3, H@10, and MRR metrics.
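Whatever the scoring technique, the reported numbers reduce to simple aggregates over the collected ranks of the gold entities. A self-contained sketch with hypothetical ranks:

import numpy as np

ranks = np.array([1, 3, 2, 11, 1, 5])  # hypothetical ranks for six queries

mrr = float((1.0 / ranks).mean())
hits = {f"H@{k}": float((ranks <= k).mean()) for k in (1, 3, 10)}
print({"MRR": round(mrr, 4), **hits})
# {'MRR': 0.5207, 'H@1': 0.3333..., 'H@3': 0.6666..., 'H@10': 0.8333...}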
- dummy_eval(trained_model, form_of_labelling: str) → None[source]
Run evaluation from saved data (for continual training).
- Parameters:
trained_model – The trained model.
form_of_labelling – Type of labelling.
- eval_with_data(dataset, trained_model, triple_idx: numpy.ndarray, form_of_labelling: str) → Dict[str, float][source]
Evaluate a trained model on a given dataset.
- Parameters:
dataset – Knowledge graph dataset.
trained_model – The trained model.
triple_idx – Integer-indexed triples to evaluate.
form_of_labelling – Type of labelling.
- Returns:
Dictionary with evaluation metrics.
- Raises:
ValueError – If scoring technique is invalid.
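A hedged sketch of the validation pattern implied by the ValueError above: the configured scoring technique selects an evaluation routine, and anything outside VALID_SCORING_TECHNIQUES is rejected up front. The set shown is an illustrative subset, not the library's actual constant:

VALID_SCORING_TECHNIQUES = {"KvsAll", "1vsAll", "NegSample"}  # illustrative subset

def dispatch(scoring_technique: str) -> str:
    if scoring_technique not in VALID_SCORING_TECHNIQUES:
        raise ValueError(f"Invalid scoring technique: {scoring_technique!r}")
    return f"running {scoring_technique} evaluation"

print(dispatch("KvsAll"))  # running KvsAll evaluation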