dicee.evaluation.ensemble
=========================

.. py:module:: dicee.evaluation.ensemble

.. autoapi-nested-parse::

   Ensemble evaluation functions.

   This module provides functions for evaluating ensemble models,
   including weighted averaging and score normalization.


Functions
---------

.. autoapisummary::

   dicee.evaluation.ensemble.evaluate_ensemble_link_prediction_performance


Module Contents
---------------

.. py:function:: evaluate_ensemble_link_prediction_performance(models: List, triples, er_vocab: Dict[Tuple, List], weights: Optional[List[float]] = None, batch_size: int = 512, weighted_averaging: bool = True, normalize_scores: bool = True) -> Dict[str, float]

   Evaluate link prediction performance of an ensemble of KGE models.

   Combines predictions from multiple models using weighted or simple
   averaging, with optional per-sample score normalization.

   :param models: List of KGE models (e.g., snapshots from training).
   :param triples: Test triples as a numpy array or list, shape (N, 3),
                   with integer indices (head, relation, tail).
   :param er_vocab: Mapping (head_idx, rel_idx) -> list of tail indices
                    for filtered evaluation.
   :param weights: Weights for model averaging. Required if
                   ``weighted_averaging`` is True. Must sum to 1 for
                   proper averaging.
   :param batch_size: Batch size for processing triples.
   :param weighted_averaging: If True, use weighted averaging of
                              predictions. If False, use a simple mean.
   :param normalize_scores: If True, normalize scores to the [0, 1] range
                            per sample before averaging.
   :returns: Dictionary with H@1, H@3, H@10, and MRR metrics.
   :raises AssertionError: If ``weighted_averaging`` is True but weights
                           are not provided or have the wrong length.

   .. rubric:: Example

   >>> from dicee.evaluation import evaluate_ensemble_link_prediction_performance
   >>> models = [model1, model2, model3]
   >>> weights = [0.5, 0.3, 0.2]
   >>> results = evaluate_ensemble_link_prediction_performance(
   ...     models, test_triples, er_vocab,
   ...     weights=weights, weighted_averaging=True
   ... )
   >>> print(f"MRR: {results['MRR']:.4f}")
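   The combination step can be pictured with a minimal NumPy sketch of
   per-sample min-max normalization followed by weighted averaging. This
   is illustrative only, not the dicee implementation; the helper name
   ``combine_scores`` and the ``(batch, num_entities)`` score shape are
   assumptions made for the example.

   .. code-block:: python

      import numpy as np

      def combine_scores(score_list, weights=None, normalize=True):
          """Combine per-model score matrices of shape (batch, num_entities).

          Illustrative sketch only -- not the dicee implementation.
          """
          combined = np.zeros_like(score_list[0], dtype=np.float64)
          n = len(score_list)
          for i, scores in enumerate(score_list):
              s = scores.astype(np.float64)
              if normalize:
                  # Min-max normalize each row (sample) to [0, 1] so models
                  # with different score scales contribute comparably.
                  row_min = s.min(axis=1, keepdims=True)
                  row_max = s.max(axis=1, keepdims=True)
                  s = (s - row_min) / (row_max - row_min + 1e-12)
              # Weighted averaging if weights are given; simple mean otherwise.
              w = weights[i] if weights is not None else 1.0 / n
              combined += w * s
          return combined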
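   Once the ensemble scores are combined, the filtered metrics follow the
   usual link prediction protocol: for each test triple, all other known
   true tails for ``(head, relation)`` are masked out via ``er_vocab``
   before the true tail is ranked. A hedged sketch, again assuming
   ``(batch, num_entities)`` scores and optimistic tie-breaking; the
   helper name ``filtered_metrics`` is hypothetical:

   .. code-block:: python

      import numpy as np

      def filtered_metrics(combined, triples, er_vocab):
          """Compute filtered H@1, H@3, H@10, and MRR.

          Illustrative sketch only -- not the dicee implementation.
          """
          hits1 = hits3 = hits10 = rr = 0.0
          for i, (h, r, t) in enumerate(triples):
              scores = combined[i].copy()
              true_score = scores[t]
              # Filtered setting: suppress all known true tails for (h, r),
              # then restore the score of the tail being ranked.
              scores[er_vocab[(h, r)]] = -np.inf
              scores[t] = true_score
              # Optimistic rank: count strictly higher-scoring entities.
              rank = int((scores > true_score).sum()) + 1
              rr += 1.0 / rank
              hits1 += rank <= 1
              hits3 += rank <= 3
              hits10 += rank <= 10
          n = len(triples)
          return {'H@1': hits1 / n, 'H@3': hits3 / n,
                  'H@10': hits10 / n, 'MRR': rr / n}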