dicee.static_funcs
Static utility functions for DICE embeddings.
This module provides utility functions for model initialization, data loading, serialization, and various helper operations.
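Attributes
- MODEL_REGISTRY
Functions
- create_recipriocal_triples: Add inverse triples to a DataFrame.
- get_er_vocab: Build entity-relation to tail vocabulary.
- get_re_vocab: Build relation-entity (tail) to head vocabulary.
- get_ee_vocab: Build entity-entity to relation vocabulary.
- timeit: Decorator to measure and print execution time and memory usage.
- save_pickle: Save data to a pickle file.
- load_pickle: Load data from a pickle file.
- load_term_mapping: Load term-to-index mapping from pickle or CSV file.
- select_model
- load_model: Load weights and initialize a PyTorch module from namespace arguments.
- load_model_ensemble: Construct an ensemble of weights and initialize a PyTorch module from namespace arguments.
- save_numpy_ndarray
- numpy_data_type_changer: Detect the most efficient data type for the given triples.
- save_checkpoint_model: Store a PyTorch model on disk.
- store
- add_noisy_triples: Add randomly constructed triples.
- read_or_load_kg
- intialize_model: Initialize a knowledge graph embedding model.
- load_json: Load JSON file into a dictionary.
- save_embeddings: Save embeddings to a CSV file.
- vocab_to_parquet
- create_experiment_folder: Create a timestamped experiment folder.
- continual_training_setup_executor
- exponential_function
- load_numpy
- evaluate: Evaluate multi-hop query answering on different query types.
- download_file
- download_files_from_url
- download_pretrained_model
- write_csv_from_model_parallel: Create
- from_pretrained_model_write_embeddings_into_csv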
Module Contents
- dicee.static_funcs.MODEL_REGISTRY: Dict[str, Tuple[Type, str]]
- dicee.static_funcs.create_recipriocal_triples(df: pandas.DataFrame) -> pandas.DataFrame
Add inverse triples to a DataFrame.
For each triple (s, p, o), creates an inverse triple (o, p_inverse, s).
- Parameters:
df – DataFrame with ‘subject’, ‘relation’, and ‘object’ columns.
- Returns:
DataFrame with original and inverse triples concatenated.
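The inverse-triple construction described above can be sketched as follows, assuming pandas is available. The function name `add_inverse_triples_sketch` and the `_inverse` relation suffix are assumptions made for illustration; the library's own naming of inverse relations may differ.

```python
import pandas as pd


def add_inverse_triples_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Return df followed by one inverse triple (o, p_inverse, s) per row."""
    inverse = pd.DataFrame({
        "subject": df["object"].to_numpy(),
        # Assumption: inverse relations are marked with an "_inverse" suffix.
        "relation": (df["relation"].astype(str) + "_inverse").to_numpy(),
        "object": df["subject"].to_numpy(),
    })
    # Original triples first, inverse triples appended afterwards.
    return pd.concat([df, inverse], ignore_index=True)


triples = pd.DataFrame(
    {"subject": ["a"], "relation": ["knows"], "object": ["b"]}
)
augmented = add_inverse_triples_sketch(triples)
```

With one input triple, `augmented` holds two rows: the original and its inverse with subject and object swapped.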
- dicee.static_funcs.get_er_vocab(data: numpy.ndarray, file_path: str | None = None) -> Dict[Tuple[int, int], List[int]]
Build entity-relation to tail vocabulary.
- Parameters:
data – Array of triples with shape (n, 3) where columns are (head, relation, tail).
file_path – Optional path to save the vocabulary as pickle.
- Returns:
Dictionary mapping (head, relation) pairs to list of tail entities.
- dicee.static_funcs.get_re_vocab(data: numpy.ndarray, file_path: str | None = None) -> Dict[Tuple[int, int], List[int]]
Build relation-entity (tail) to head vocabulary.
- Parameters:
data – Array of triples with shape (n, 3) where columns are (head, relation, tail).
file_path – Optional path to save the vocabulary as pickle.
- Returns:
Dictionary mapping (relation, tail) pairs to list of head entities.
- dicee.static_funcs.get_ee_vocab(data: numpy.ndarray, file_path: str | None = None) -> Dict[Tuple[int, int], List[int]]
Build entity-entity to relation vocabulary.
- Parameters:
data – Array of triples with shape (n, 3) where columns are (head, relation, tail).
file_path – Optional path to save the vocabulary as pickle.
- Returns:
Dictionary mapping (head, tail) pairs to list of relations.
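The three vocabularies above differ only in which pair of columns forms the key. A minimal sketch that builds all three in a single pass is shown below; `build_vocabs_sketch` is a hypothetical helper, not part of the module, and it accepts any iterable of (head, relation, tail) rows, such as the rows of an (n, 3) array. Saving to `file_path` is omitted.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[int, int, int]
Vocab = Dict[Tuple[int, int], List[int]]


def build_vocabs_sketch(triples: Iterable[Triple]) -> Tuple[Vocab, Vocab, Vocab]:
    """Build (head, relation), (relation, tail) and (head, tail) lookups."""
    er_vocab = defaultdict(list)  # (head, relation) -> tails
    re_vocab = defaultdict(list)  # (relation, tail) -> heads
    ee_vocab = defaultdict(list)  # (head, tail) -> relations
    for head, relation, tail in triples:
        er_vocab[(head, relation)].append(tail)
        re_vocab[(relation, tail)].append(head)
        ee_vocab[(head, tail)].append(relation)
    # Convert to plain dicts so missing keys raise KeyError instead of
    # silently creating empty lists.
    return dict(er_vocab), dict(re_vocab), dict(ee_vocab)


data = [(0, 0, 1), (0, 0, 2), (3, 0, 1)]
er_vocab, re_vocab, ee_vocab = build_vocabs_sketch(data)
```

Here `er_vocab[(0, 0)]` collects both tails seen for head 0 under relation 0, and `re_vocab[(0, 1)]` collects both heads that reach tail 1.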
- dicee.static_funcs.timeit(func: Callable) -> Callable
Decorator to measure and print execution time and memory usage.
- Parameters:
func – Function to be timed.
- Returns:
Wrapped function that prints timing information.
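A decorator of this kind can be sketched with the standard library alone. The version below is an illustration, not the module's implementation: it assumes `tracemalloc` as the memory probe, which only tracks Python-level allocations, and the printed message format is made up.

```python
import functools
import time
import tracemalloc
from typing import Callable


def timeit_sketch(func: Callable) -> Callable:
    """Print wall-clock time and peak traced memory of each call."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        tracemalloc.start()
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            _, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
            print(
                f"{func.__name__} took {elapsed:.4f} s, "
                f"peak traced memory {peak / 1024:.1f} KiB"
            )

    return wrapper


@timeit_sketch
def total(n: int) -> int:
    return sum(range(n))


result = total(1000)
```

The `try`/`finally` ensures the timing line is printed even when the wrapped function raises.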
- dicee.static_funcs.save_pickle(*, data: object | None = None, file_path: str) -> None
Save data to a pickle file.
Note: Consider using more portable formats (JSON, Parquet) for new code.
- Parameters:
data – Object to serialize. If None, nothing is saved.
file_path – Path where the pickle file will be saved.
- dicee.static_funcs.load_pickle(file_path: str) -> object
Load data from a pickle file.
Note: Consider using more portable formats (JSON, Parquet) for new code.
- Parameters:
file_path – Path to the pickle file.
- Returns:
Deserialized object from the pickle file.
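A round trip through the two pickle helpers might look like the sketch below. The `*_sketch` functions mirror the documented signatures (keyword-only arguments for saving, no write when `data` is None) but are stand-ins written for illustration; the file names are made up. Note that unpickling executes code embedded in the file, so only trusted files should be loaded.

```python
import os
import pickle
import tempfile
from typing import Optional


def save_pickle_sketch(*, data: Optional[object] = None, file_path: str) -> None:
    """Serialize data to file_path; do nothing when data is None."""
    if data is None:
        return
    with open(file_path, "wb") as f:
        pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle_sketch(file_path: str) -> object:
    """Deserialize and return the object stored at file_path."""
    with open(file_path, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "entity_to_idx.p")
    save_pickle_sketch(data={"berlin": 0, "paris": 1}, file_path=path)
    restored = load_pickle_sketch(path)

    # With data=None no file is written.
    skipped_path = os.path.join(tmp, "nothing.p")
    save_pickle_sketch(data=None, file_path=skipped_path)
    skipped_exists = os.path.exists(skipped_path)
```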
- dicee.static_funcs.load_term_mapping(file_path: str) -> dict | polars.DataFrame
Load term-to-index mapping from pickle or CSV file.
Attempts to load from pickle first, falls back to CSV if not found.
- Parameters:
file_path – Base path without extension.
- Returns:
Dictionary or Polars DataFrame containing the mapping.
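The pickle-first, CSV-fallback behaviour can be illustrated with the sketch below. Several details are assumptions: the `.p` and `.csv` extensions appended to the base path, the two-column `term,index` CSV layout without a header, and returning a plain dict in the CSV case (the documented function returns a Polars DataFrame there).

```python
import csv
import os
import pickle
import tempfile
from typing import Dict


def load_term_mapping_sketch(file_path: str) -> Dict[str, int]:
    """Load a term-to-index mapping, preferring pickle over CSV."""
    pickle_path = file_path + ".p"  # assumed extension
    csv_path = file_path + ".csv"   # assumed extension
    if os.path.exists(pickle_path):
        with open(pickle_path, "rb") as f:
            return pickle.load(f)
    # Fallback: assumed layout is one "term,index" pair per row, no header.
    with open(csv_path, newline="", encoding="utf-8") as f:
        return {term: int(index) for term, index in csv.reader(f)}


with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "entity_to_idx")
    # Only the CSV exists, so the fallback branch is taken.
    with open(base + ".csv", "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows([["berlin", 0], ["paris", 1]])
    mapping = load_term_mapping_sketch(base)
```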
- dicee.static_funcs.select_model(args: dict, is_continual_training: bool = None, storage_path: str = None)
- dicee.static_funcs.load_model(path_of_experiment_folder: str, model_name='model.pt', verbose=0) -> Tuple[object, Tuple[dict, dict]]
Load weights and initialize a PyTorch module from namespace arguments.
- dicee.static_funcs.load_model_ensemble(path_of_experiment_folder: str) -> Tuple[dicee.models.base_model.BaseKGE, Tuple[pandas.DataFrame, pandas.DataFrame]]
Construct an ensemble of weights and initialize a PyTorch module from namespace arguments.
(1) Detect models under the given path.
(2) Accumulate the parameters of the detected models.
(3) Normalize the parameters.
(4) Insert (3) into the model.
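The four steps listed above amount to averaging parameters across checkpoints. The sketch below illustrates steps 2 and 3 on plain Python lists so that it runs without PyTorch; it assumes that "normalize" means dividing the accumulated sums by the number of models, and `average_state_dicts_sketch` is a hypothetical helper, not the module's code.

```python
from typing import Dict, List

StateDict = Dict[str, List[float]]


def average_state_dicts_sketch(state_dicts: List[StateDict]) -> StateDict:
    """Element-wise mean of parameters that share the same name."""
    if not state_dicts:
        raise ValueError("no models found to ensemble")
    # Step 2: accumulate the parameters of all detected models.
    accumulated = {name: list(values) for name, values in state_dicts[0].items()}
    for state in state_dicts[1:]:
        for name, values in state.items():
            accumulated[name] = [a + v for a, v in zip(accumulated[name], values)]
    # Step 3: normalize (assumed here: divide by the number of models).
    n = len(state_dicts)
    return {name: [a / n for a in values] for name, values in accumulated.items()}


# Step 1 stand-in: pretend two checkpoints were detected under the path.
detected = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
averaged = average_state_dicts_sketch(detected)
# Step 4: with PyTorch tensors as values, the averaged dictionary would be
# inserted into the model, e.g. via model.load_state_dict(averaged).
```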
- dicee.static_funcs.save_numpy_ndarray(*, data: numpy.ndarray, file_path: str)
- dicee.static_funcs.numpy_data_type_changer(train_set: numpy.ndarray, num: int) -> numpy.ndarray
Detect the most efficient data type for the given triples.
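One way to pick a compact data type is to choose the narrowest signed integer type whose range covers the largest index. The sketch below assumes NumPy is available and that `num` is the largest value the array must be able to hold; neither that interpretation nor the helper name `smallest_int_dtype_sketch` comes from the source.

```python
import numpy as np


def smallest_int_dtype_sketch(train_set: np.ndarray, num: int) -> np.ndarray:
    """Cast train_set to the narrowest signed integer dtype that can hold num."""
    for dtype in (np.int8, np.int16, np.int32, np.int64):
        if num <= np.iinfo(dtype).max:
            return train_set.astype(dtype)
    raise TypeError(f"{num} does not fit into a signed 64-bit integer")


triples = np.array([[0, 0, 1], [2, 1, 300]], dtype=np.int64)
compact = smallest_int_dtype_sketch(triples, num=300)
```

Since 300 exceeds the int8 maximum of 127 but fits into int16, `compact` uses a quarter of the memory of the int64 input.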
- dicee.static_funcs.save_checkpoint_model(model, path: str) -> None
Store a PyTorch model on disk.
- dicee.static_funcs.store(trained_model, model_name: str = 'model', full_storage_path: str = None, save_embeddings_as_csv=False) -> None
- dicee.static_funcs.add_noisy_triples(train_set: pandas.DataFrame, add_noise_rate: float) -> pandas.DataFrame
Add randomly constructed triples.
- dicee.static_funcs.read_or_load_kg(args, cls)
- dicee.static_funcs.intialize_model(args: Dict, verbose: int = 0) -> Tuple[dicee.models.base_model.BaseKGE, str]
Initialize a knowledge graph embedding model.
- Parameters:
args – Dictionary containing model configuration including ‘model’ key.
verbose – Verbosity level. If > 0, prints initialization message.
- Returns:
Tuple of (initialized model, form of labelling string).
- Raises:
ValueError – If the model name is not recognized.
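Given that `MODEL_REGISTRY` is typed as `Dict[str, Tuple[Type, str]]`, initialization plausibly reduces to a dictionary lookup on `args['model']`. The sketch below illustrates that pattern under stated assumptions: the toy classes, the `SKETCH_REGISTRY` contents, the labelling strings, and the messages are all hypothetical and do not reflect the module's actual registry.

```python
from typing import Dict, Tuple, Type


class ToyModelA:
    """Hypothetical stand-in for an embedding model class."""

    def __init__(self, args: Dict):
        self.args = args


class ToyModelB(ToyModelA):
    """Second hypothetical model, to show multiple registry entries."""


# Maps a model name to (model class, form of labelling).
SKETCH_REGISTRY: Dict[str, Tuple[Type, str]] = {
    "ToyModelA": (ToyModelA, "EntityPrediction"),
    "ToyModelB": (ToyModelB, "EntityPrediction"),
}


def initialize_model_sketch(args: Dict, verbose: int = 0) -> Tuple[object, str]:
    """Look up args['model'] in the registry and instantiate it."""
    name = args["model"]
    if name not in SKETCH_REGISTRY:
        raise ValueError(f"Unknown model name: {name!r}")
    model_cls, form_of_labelling = SKETCH_REGISTRY[name]
    if verbose > 0:
        print(f"Initializing {name}...")
    return model_cls(args), form_of_labelling


model, labelling = initialize_model_sketch({"model": "ToyModelB"})
```

A registry keeps the name-to-class mapping in one place, so supporting a new model means adding one entry rather than another branch.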
- dicee.static_funcs.load_json(path: str) -> Dict
Load JSON file into a dictionary.
- Parameters:
path – Path to the JSON file.
- Returns:
Dictionary containing the JSON data.
- Raises:
FileNotFoundError – If the file does not exist.
json.JSONDecodeError – If the file contains invalid JSON.
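Both documented exceptions arise naturally from the standard library calls, as the following sketch suggests. `load_json_sketch` and the file name are illustrative stand-ins; the UTF-8 encoding is an assumption.

```python
import json
import os
import tempfile
from typing import Dict


def load_json_sketch(path: str) -> Dict:
    """Read a JSON file into a dictionary.

    open() raises FileNotFoundError for a missing file and
    json.load() raises json.JSONDecodeError for invalid JSON.
    """
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "configuration.json")
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump({"model": "Keci", "embedding_dim": 128}, f)
    config = load_json_sketch(config_path)
```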
- dicee.static_funcs.save_embeddings(embeddings: numpy.ndarray, indexes: List, path: str) -> None
Save embeddings to a CSV file.
- Parameters:
embeddings – NumPy array of embeddings with shape (n_items, embedding_dim).
indexes – List of index labels (entity/relation names).
path – Output file path.
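As an illustration of the resulting file layout, the sketch below writes one row per item, with the index label first and the vector components after it. It uses the standard `csv` module rather than whatever writer the module uses, writes no header row, and accepts any 2D sequence (nested lists here, but rows of a NumPy array would also work); all of these are assumptions.

```python
import csv
import os
import tempfile
from typing import List, Sequence


def save_embeddings_sketch(
    embeddings: Sequence[Sequence[float]], indexes: List[str], path: str
) -> None:
    """Write one CSV row per item: label followed by its vector."""
    if len(embeddings) != len(indexes):
        raise ValueError("embeddings and indexes must have the same length")
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for label, vector in zip(indexes, embeddings):
            writer.writerow([label, *vector])


with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, "entity_embeddings.csv")
    save_embeddings_sketch([[0.1, 0.2], [0.3, 0.4]], ["berlin", "paris"], out_path)
    with open(out_path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
```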
- dicee.static_funcs.vocab_to_parquet(vocab_to_idx, name, path_for_serialization, print_into)
- dicee.static_funcs.create_experiment_folder(folder_name: str = 'Experiments') -> str
Create a timestamped experiment folder.
- Parameters:
folder_name – Base directory name for experiments.
- Returns:
Full path to the created experiment folder.
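A timestamped folder can be created as in the sketch below. The timestamp format and the helper name `create_experiment_folder_sketch` are assumptions; the module may name its folders differently.

```python
import datetime
import os
import tempfile


def create_experiment_folder_sketch(folder_name: str = "Experiments") -> str:
    """Create folder_name/<timestamp> and return its absolute path."""
    # Assumed format; microseconds reduce the chance of name collisions.
    timestamp = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S.%f")
    path = os.path.abspath(os.path.join(folder_name, timestamp))
    # exist_ok=False makes a collision fail loudly instead of reusing a folder.
    os.makedirs(path, exist_ok=False)
    return path


with tempfile.TemporaryDirectory() as base:
    created = create_experiment_folder_sketch(os.path.join(base, "Experiments"))
    created_exists = os.path.isdir(created)
    parent_name = os.path.basename(os.path.dirname(created))
```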
- dicee.static_funcs.continual_training_setup_executor(executor) -> None
- dicee.static_funcs.exponential_function(x: numpy.ndarray, lam: float, ascending_order=True) -> torch.FloatTensor
- dicee.static_funcs.load_numpy(path) -> numpy.ndarray
- dicee.static_funcs.evaluate(entity_to_idx, scores, easy_answers, hard_answers)
Evaluate multi-hop query answering on different query types.
# @TODO: CD: Renamed this function
- dicee.static_funcs.download_file(url, destination_folder='.')
- dicee.static_funcs.download_files_from_url(base_url: str, destination_folder='.') -> None
- Parameters:
base_url – e.g. “https://files.dice-research.org/projects/DiceEmbeddings/KINSHIP-Keci-dim128-epoch256-KvsAll”
destination_folder – e.g. "KINSHIP-Keci-dim128-epoch256-KvsAll"
- dicee.static_funcs.download_pretrained_model(url: str) -> str
- dicee.static_funcs.write_csv_from_model_parallel(path: str)
Create
- dicee.static_funcs.from_pretrained_model_write_embeddings_into_csv(path: str) -> None