dicee.evaluation.evaluator
==========================

.. py:module:: dicee.evaluation.evaluator

.. autoapi-nested-parse::

   Main Evaluator class for KGE model evaluation.

   This module provides the Evaluator class, which orchestrates the evaluation
   of knowledge graph embedding models across different datasets and scoring
   techniques.

Classes
-------

.. autoapisummary::

   dicee.evaluation.evaluator.Evaluator

Module Contents
---------------

.. py:class:: Evaluator(args, is_continual_training: bool = False)

   Evaluator class for KGE models in various downstream tasks.

   Orchestrates link prediction evaluation with different scoring techniques,
   including standard evaluation and byte-pair-encoding-based evaluation.

   .. attribute:: er_vocab

      Entity-relation to tail vocabulary for filtered ranking.

   .. attribute:: re_vocab

      Relation-entity (tail) to head vocabulary.

   .. attribute:: ee_vocab

      Entity-entity to relation vocabulary.

   .. attribute:: num_entities

      Total number of entities in the knowledge graph.

   .. attribute:: num_relations

      Total number of relations in the knowledge graph.

   .. attribute:: args

      Configuration arguments.

   .. attribute:: report

      Dictionary storing evaluation results.

   .. attribute:: during_training

      Whether evaluation is happening during training.

   .. rubric:: Example

   >>> from dicee.evaluation import Evaluator
   >>> evaluator = Evaluator(args)
   >>> results = evaluator.eval(dataset, model, 'EntityPrediction')
   >>> print(f"Test MRR: {results['Test']['MRR']:.4f}")

   .. py:attribute:: re_vocab
      :type: Optional[Dict]
      :value: None

   .. py:attribute:: er_vocab
      :type: Optional[Dict]
      :value: None

   .. py:attribute:: ee_vocab
      :type: Optional[Dict]
      :value: None

   .. py:attribute:: func_triple_to_bpe_representation
      :value: None

   .. py:attribute:: is_continual_training
      :value: False

   .. py:attribute:: num_entities
      :type: Optional[int]
      :value: None

   .. py:attribute:: num_relations
      :type: Optional[int]
      :value: None

   .. py:attribute:: domain_constraints_per_rel
      :value: None

   .. py:attribute:: range_constraints_per_rel
      :value: None

   .. py:attribute:: args

   .. py:attribute:: report
      :type: Dict

   .. py:attribute:: during_training
      :value: False

   .. py:method:: vocab_preparation(dataset) -> None

      Prepare vocabularies from the dataset for evaluation.

      Resolves any future objects and saves the vocabularies to disk.

      :param dataset: Knowledge graph dataset with vocabulary attributes.

   .. py:method:: eval(dataset, trained_model, form_of_labelling: str, during_training: bool = False) -> Optional[Dict]

      Evaluate the trained model on the dataset.

      :param dataset: Knowledge graph dataset (KG instance).
      :param trained_model: The trained KGE model.
      :param form_of_labelling: Type of labelling ('EntityPrediction' or 'RelationPrediction').
      :param during_training: Whether evaluation is during training.
      :returns: Dictionary of evaluation metrics, or None if evaluation is skipped.
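   Filtered ranking relies on the ``er_vocab``, ``re_vocab``, and ``ee_vocab``
   mappings prepared by ``vocab_preparation``. As a rough guide to what these
   mappings contain, here is a minimal sketch of building one such vocabulary
   from integer-indexed triples; the helper name ``build_er_vocab`` is
   illustrative and not part of the dicee API:

   .. code-block:: python

      from collections import defaultdict
      from typing import Dict, List, Tuple

      def build_er_vocab(triples: List[Tuple[int, int, int]]) -> Dict[Tuple[int, int], List[int]]:
          """Map each (head, relation) pair to every tail observed in the data."""
          er_vocab = defaultdict(list)
          for h, r, t in triples:
              er_vocab[(h, r)].append(t)
          return dict(er_vocab)

   Under the standard filtered protocol, the scores of all known tails other
   than the test tail are masked out before ranking, so each test triple
   competes only against candidates that do not form another true triple.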
   .. py:method:: eval_rank_of_head_and_tail_entity(*, train_set, valid_set=None, test_set=None, trained_model) -> None

      Evaluate with negative sampling scoring.

   .. py:method:: eval_rank_of_head_and_tail_byte_pair_encoded_entity(*, train_set=None, valid_set=None, test_set=None, ordered_bpe_entities, trained_model) -> None

      Evaluate with BPE-encoded entities and negative sampling.

   .. py:method:: eval_with_byte(*, raw_train_set, raw_valid_set=None, raw_test_set=None, trained_model, form_of_labelling) -> None

      Evaluate a BytE model with generation.

   .. py:method:: eval_with_bpe_vs_all(*, raw_train_set, raw_valid_set=None, raw_test_set=None, trained_model, form_of_labelling) -> None

      Evaluate with BPE and KvsAll scoring.

   .. py:method:: eval_with_vs_all(*, train_set, valid_set=None, test_set=None, trained_model, form_of_labelling) -> None

      Evaluate with KvsAll or 1vsAll scoring.

   .. py:method:: evaluate_lp_k_vs_all(model, triple_idx, info: str = None, form_of_labelling: str = None) -> Dict[str, float]

      Filtered link prediction evaluation with KvsAll scoring.

      :param model: The trained model to evaluate.
      :param triple_idx: Integer-indexed test triples.
      :param info: Description to print.
      :param form_of_labelling: 'EntityPrediction' or 'RelationPrediction'.
      :returns: Dictionary with H@1, H@3, H@10, and MRR metrics.

   .. py:method:: evaluate_lp_with_byte(model, triples: List[List[str]], info: str = None) -> Dict[str, float]

      Evaluate a BytE model with text generation.

      :param model: BytE model.
      :param triples: String triples.
      :param info: Description to print.
      :returns: Dictionary with placeholder metrics (-1 values).

   .. py:method:: evaluate_lp_bpe_k_vs_all(model, triples: List[List[str]], info: str = None, form_of_labelling: str = None) -> Dict[str, float]

      Evaluate a BPE model with KvsAll scoring.

      :param model: BPE-enabled model.
      :param triples: String triples.
      :param info: Description to print.
      :param form_of_labelling: Type of labelling.
      :returns: Dictionary with H@1, H@3, H@10, and MRR metrics.

   .. py:method:: evaluate_lp(model, triple_idx, info: str) -> Dict[str, float]

      Evaluate link prediction with negative sampling.

      :param model: The model to evaluate.
      :param triple_idx: Integer-indexed triples.
      :param info: Description to print.
      :returns: Dictionary with H@1, H@3, H@10, and MRR metrics.

   .. py:method:: dummy_eval(trained_model, form_of_labelling: str) -> None

      Run evaluation from saved data (for continual training).

      :param trained_model: The trained model.
      :param form_of_labelling: Type of labelling.

   .. py:method:: eval_with_data(dataset, trained_model, triple_idx: numpy.ndarray, form_of_labelling: str) -> Dict[str, float]

      Evaluate a trained model on a given dataset.

      :param dataset: Knowledge graph dataset.
      :param trained_model: The trained model.
      :param triple_idx: Integer-indexed triples to evaluate.
      :param form_of_labelling: Type of labelling.
      :returns: Dictionary with evaluation metrics.
      :raises ValueError: If the scoring technique is invalid.
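   The ``evaluate_lp*`` methods all report the same rank-based metrics. As a
   reference for how these metrics are defined, here is a minimal sketch that
   computes Hits@k and mean reciprocal rank from 1-based filtered ranks; it
   illustrates the metric definitions rather than dicee's exact implementation:

   .. code-block:: python

      from typing import Dict, List

      def rank_metrics(ranks: List[int]) -> Dict[str, float]:
          """Compute Hits@k and MRR from the 1-based ranks of the true entities."""
          n = len(ranks)
          return {
              'H@1': sum(r <= 1 for r in ranks) / n,
              'H@3': sum(r <= 3 for r in ranks) / n,
              'H@10': sum(r <= 10 for r in ranks) / n,
              'MRR': sum(1.0 / r for r in ranks) / n,
          }

      # Example: four test triples whose true entities were ranked 1, 2, 5, 14.
      print(rank_metrics([1, 2, 5, 14]))
      # -> H@1=0.25, H@3=0.5, H@10=0.75, MRR~0.443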