dicee.evaluation.ensemble
=========================

.. py:module:: dicee.evaluation.ensemble

.. autoapi-nested-parse::

   Ensemble evaluation functions.

   This module provides functions for evaluating ensemble models,
   including weighted averaging and score normalization.


Functions
---------

.. autoapisummary::

   dicee.evaluation.ensemble.evaluate_ensemble_link_prediction_performance


Module Contents
---------------

.. py:function:: evaluate_ensemble_link_prediction_performance(models: List, triples, er_vocab: Dict[Tuple, List], weights: Optional[List[float]] = None, batch_size: int = 512, weighted_averaging: bool = True, normalize_scores: bool = True) -> Dict[str, float]

   Evaluate link prediction performance of an ensemble of KGE models.

   Combines predictions from multiple models using weighted or simple
   averaging, with optional per-sample score normalization.

   :param models: List of KGE models (e.g., snapshots from training).
   :param triples: Test triples as a numpy array or list, shape (N, 3),
                   with integer indices (head, relation, tail).
   :param er_vocab: Mapping (head_idx, rel_idx) -> list of tail indices
                    for filtered evaluation.
   :param weights: Weights for model averaging. Required if
                   ``weighted_averaging`` is True. Must sum to 1 for
                   proper averaging.
   :param batch_size: Batch size for processing triples.
   :param weighted_averaging: If True, use weighted averaging of
                              predictions. If False, use a simple mean.
   :param normalize_scores: If True, normalize scores to the [0, 1] range
                            per sample before averaging.
   :returns: Dictionary with H@1, H@3, H@10, and MRR metrics.
   :raises AssertionError: If ``weighted_averaging`` is True but weights
                           are not provided or have the wrong length.

   .. rubric:: Example

   >>> from dicee.evaluation import evaluate_ensemble_link_prediction_performance
   >>> models = [model1, model2, model3]
   >>> weights = [0.5, 0.3, 0.2]
   >>> results = evaluate_ensemble_link_prediction_performance(
   ...     models, test_triples, er_vocab,
   ...     weights=weights, weighted_averaging=True
   ... )
   >>> print(f"MRR: {results['MRR']:.4f}")
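   The combination step can be pictured with a minimal NumPy sketch of
   per-sample min-max normalization followed by weighted averaging. This
   is illustrative only, not the dicee implementation; the helper name
   ``combine_scores`` and the ``(batch, num_entities)`` score shape are
   assumptions made for the example.

   .. code-block:: python

      import numpy as np

      def combine_scores(score_list, weights=None, normalize=True):
          """Combine per-model score matrices of shape (batch, num_entities).

          Illustrative sketch only -- not the dicee implementation.
          """
          combined = np.zeros_like(score_list[0], dtype=np.float64)
          n = len(score_list)
          for i, scores in enumerate(score_list):
              s = scores.astype(np.float64)
              if normalize:
                  # Min-max normalize each row (sample) to [0, 1] so models
                  # with different score scales contribute comparably.
                  row_min = s.min(axis=1, keepdims=True)
                  row_max = s.max(axis=1, keepdims=True)
                  s = (s - row_min) / (row_max - row_min + 1e-12)
              # Weighted averaging if weights are given; simple mean otherwise.
              w = weights[i] if weights is not None else 1.0 / n
              combined += w * s
          return combined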
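   Once the ensemble scores are combined, the filtered metrics follow the
   usual link prediction protocol: for each test triple, all other known
   true tails for ``(head, relation)`` are masked out via ``er_vocab``
   before the true tail is ranked. A hedged sketch, again assuming
   ``(batch, num_entities)`` scores and optimistic tie-breaking; the
   helper name ``filtered_metrics`` is hypothetical:

   .. code-block:: python

      import numpy as np

      def filtered_metrics(combined, triples, er_vocab):
          """Compute filtered H@1, H@3, H@10, and MRR.

          Illustrative sketch only -- not the dicee implementation.
          """
          hits1 = hits3 = hits10 = rr = 0.0
          for i, (h, r, t) in enumerate(triples):
              scores = combined[i].copy()
              true_score = scores[t]
              # Filtered setting: suppress all known true tails for (h, r),
              # then restore the score of the tail being ranked.
              scores[er_vocab[(h, r)]] = -np.inf
              scores[t] = true_score
              # Optimistic rank: count strictly higher-scoring entities.
              rank = int((scores > true_score).sum()) + 1
              rr += 1.0 / rank
              hits1 += rank <= 1
              hits3 += rank <= 3
              hits10 += rank <= 10
          n = len(triples)
          return {'H@1': hits1 / n, 'H@3': hits3 / n,
                  'H@10': hits10 / n, 'MRR': rr / n}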