dicee.models

Submodules

Classes

ADOPT

ADOPT Optimizer.

BaseKGELightning

Thin PyTorch Lightning wrapper shared by all KGE models.

BaseKGE

Base class for all Knowledge Graph Embedding models.

IdentityClass

No-op normalisation / dropout placeholder.

Block

Base class for all neural network modules.

BaseKGE

Base class for all Knowledge Graph Embedding models.

DistMult

DistMult: bilinear diagonal knowledge graph embedding.

TransE

TransE: translation-based knowledge graph embedding.

Shallom

Shallom: shallow neural model for relation prediction.

Pyke

Pyke: Physical Embedding Model for Knowledge Graphs.

CoKEConfig

Configuration for the CoKE (Contextualized Knowledge Graph Embedding) model.

CoKE

Contextualized Knowledge Graph Embedding (CoKE) model.

BaseKGE

Base class for all Knowledge Graph Embedding models.

ConEx

Convolutional ComplEx Knowledge Graph Embeddings

AConEx

Additive Convolutional ComplEx Knowledge Graph Embeddings

ComplEx

ComplEx: complex-valued bilinear knowledge graph embedding.

BaseKGE

Base class for all Knowledge Graph Embedding models.

IdentityClass

No-op normalisation / dropout placeholder.

QMult

QMult: quaternion multiplication knowledge graph embeddings.

ConvQ

Convolutional Quaternion Knowledge Graph Embeddings

AConvQ

Additive Convolutional Quaternion Knowledge Graph Embeddings

BaseKGE

Base class for all Knowledge Graph Embedding models.

IdentityClass

No-op normalisation / dropout placeholder.

OMult

OMult: octonion multiplication knowledge graph embeddings.

ConvO

Convolutional Octonion Knowledge Graph Embeddings

AConvO

Additive Convolutional Octonion Knowledge Graph Embeddings

Keci

Keci: Knowledge Graph Embedding via Clifford Algebra.

CKeci

Keci without learnable dimension scaling.

DeCaL

DeCaL: Knowledge Graph Embedding with Degenerate Clifford Algebras.

KeciTransformer

Keci with Transformer architecture.

BaseKGE

Base class for all Knowledge Graph Embedding models.

PykeenKGE

A class for using knowledge graph embedding models implemented in Pykeen

BaseKGE

Base class for all Knowledge Graph Embedding models.

FMult

Learning Knowledge Neural Graphs

GFMult

Learning Knowledge Neural Graphs

FMult2

Learning Knowledge Neural Graphs

LFMult1

Embedding with trigonometric functions. We represent all entities and relations in the complex number space.

LFMult

Embedding with polynomial functions. We represent all entities and relations in the polynomial space.

DualE

Dual Quaternion Knowledge Graph Embeddings (https://ojs.aaai.org/index.php/AAAI/article/download/16850/16657)

Functions

quaternion_mul(…) → Tuple[torch.Tensor, torch.Tensor, …]

Perform quaternion multiplication

quaternion_mul_with_unit_norm(*, Q_1, Q_2)

octonion_mul(*, O_1, O_2)

octonion_mul_norm(*, O_1, O_2)

Package Contents

class dicee.models.ADOPT(params: torch.optim.optimizer.ParamsT, lr: float | torch.Tensor = 0.001, betas: Tuple[float, float] = (0.9, 0.9999), eps: float = 1e-06, clip_lambda: Callable[[int], float] | None = lambda step: ..., weight_decay: float = 0.0, decouple: bool = False, *, foreach: bool | None = None, maximize: bool = False, capturable: bool = False, differentiable: bool = False, fused: bool | None = None)[source]

Bases: torch.optim.optimizer.Optimizer

ADOPT Optimizer.

ADOPT is an adaptive learning rate optimization algorithm that combines momentum-based updates with adaptive per-parameter learning rates. It uses exponential moving averages of gradients and squared gradients, with gradient clipping for stability.

The algorithm performs the following key operations:

  1. Normalizes gradients by the square root of the second moment estimate

  2. Applies optional gradient clipping based on the training step

  3. Updates parameters using momentum-smoothed normalized gradients

  4. Supports decoupled weight decay (AdamW-style) or L2 regularization

Mathematical formulation:

m_t = β₁ * m_{t-1} + (1 - β₁) * clip(g_t / √(v_{t-1}))

v_t = β₂ * v_{t-1} + (1 - β₂) * g_t²

θ_t = θ_{t-1} - α * m_t

where:
  • θ_t: parameter at step t

  • g_t: gradient at step t

  • m_t: first moment estimate (momentum)

  • v_t: second moment estimate (variance)

  • α: learning rate

  • β₁, β₂: exponential decay rates

  • clip(): optional gradient clipping function
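The update equations above can be sketched in plain PyTorch. This is a simplified single-tensor illustration under the stated formulation, not the optimizer's actual vectorised implementation; `adopt_update` and `clip_fn` are names introduced here, with `clip_fn` mirroring the default `clip_lambda`:

```python
import torch

def adopt_update(p, g, m, v, step, lr=1e-3, beta1=0.9, beta2=0.9999, eps=1e-6,
                 clip_fn=lambda t: t ** 0.25):
    """One simplified ADOPT step for a single parameter tensor."""
    # Normalise the gradient by the previous second-moment estimate.
    normed = g / torch.sqrt(v).clamp_min(eps)
    # Optional step-dependent clipping for early-training stability.
    if clip_fn is not None:
        c = clip_fn(step)
        normed = normed.clamp(-c, c)
    # First moment (momentum) over the clipped, normalised gradient.
    m = beta1 * m + (1 - beta1) * normed
    # Second moment over the raw squared gradient.
    v = beta2 * v + (1 - beta2) * g * g
    # Parameter update.
    p = p - lr * m
    return p, m, v

p = torch.zeros(3)
m, v = torch.zeros(3), torch.ones(3)
p, m, v = adopt_update(p, torch.ones(3), m, v, step=1)
```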

Reference:

Original implementation: https://github.com/iShohei220/adopt

Parameters:
  • params (ParamsT) – Iterable of parameters to optimize or dicts defining parameter groups.

  • lr (float or Tensor, optional) – Learning rate. Can be a float or 1-element Tensor. Default: 1e-3

  • betas (Tuple[float, float], optional) – Coefficients (β₁, β₂) for computing running averages of gradient and its square. β₁ controls momentum, β₂ controls variance. Default: (0.9, 0.9999)

  • eps (float, optional) – Term added to denominator to improve numerical stability. Default: 1e-6

  • clip_lambda (Callable[[int], float], optional) – Function that takes the step number and returns the gradient clipping threshold. Common choices:

    • lambda step: step**0.25 (default, gradually increases the clipping threshold)

    • lambda step: 1.0 (constant clipping)

    • None (no clipping)

    Default: lambda step: step**0.25

  • weight_decay (float, optional) – Weight decay coefficient (L2 penalty). Default: 0.0

  • decouple (bool, optional) – If True, uses decoupled weight decay (AdamW-style), applying weight decay directly to parameters. If False, adds weight decay to gradients (L2 regularization). Default: False

  • foreach (bool, optional) – If True, uses the faster foreach implementation for multi-tensor operations. Default: None (auto-select)

  • maximize (bool, optional) – If True, maximizes parameters instead of minimizing. Useful for reinforcement learning. Default: False

  • capturable (bool, optional) – If True, the optimizer is safe to capture in a CUDA graph. Requires learning rate as Tensor. Default: False

  • differentiable (bool, optional) – If True, the optimization step can be differentiated. Useful for meta-learning. Default: False

  • fused (bool, optional) – If True, uses fused kernel implementation (currently not supported). Default: None

Raises:
  • ValueError – If learning rate, epsilon, betas, or weight_decay are invalid.

  • RuntimeError – If fused is enabled (not currently supported).

  • RuntimeError – If lr is a Tensor with foreach=True and capturable=False.

Example

>>> # Basic usage
>>> optimizer = ADOPT(model.parameters(), lr=0.001)
>>> optimizer.zero_grad()
>>> loss.backward()
>>> optimizer.step()
>>> # With decoupled weight decay
>>> optimizer = ADOPT(model.parameters(), lr=0.001, weight_decay=0.01, decouple=True)
>>> # Custom gradient clipping
>>> optimizer = ADOPT(model.parameters(), clip_lambda=lambda step: max(1.0, step**0.5))

Note

  • For most use cases, the default hyperparameters work well

  • Consider using decouple=True for better generalization (similar to AdamW)

  • The clip_lambda function helps stabilize training in early steps

clip_lambda
__setstate__(state)[source]

Restore optimizer state from a checkpoint.

This method handles backward compatibility when loading optimizer state from older versions. It ensures all required fields are present with default values and properly converts step counters to tensors if needed.

Key responsibilities:

  1. Set default values for newly added hyperparameters

  2. Convert old-style scalar step counters to tensor format

  3. Place step tensors on appropriate devices based on capturable/fused modes

Parameters:

state (dict) – Optimizer state dictionary (typically from torch.load()).

Note

  • This enables loading checkpoints saved with older ADOPT versions

  • Step counters are converted to appropriate device/dtype for compatibility

  • Capturable and fused modes require step tensors on parameter devices

step(closure=None)[source]

Perform a single optimization step.

This method executes one iteration of the ADOPT optimization algorithm across all parameter groups. It orchestrates the following workflow:

  1. Optionally evaluates a closure to recompute the loss (useful for algorithms like LBFGS or when loss needs multiple evaluations)

  2. For each parameter group:

     • Collects parameters with gradients and their associated state

     • Extracts hyperparameters (betas, learning rate, etc.)

     • Calls the functional adopt() API to perform the actual update

  3. Returns the loss value if a closure was provided

The functional API (adopt()) handles three execution modes:

  • Single-tensor: updates one parameter at a time (default, JIT-compatible)

  • Multi-tensor (foreach): batches operations for better performance

  • Fused: uses fused CUDA kernels (not yet implemented)

Gradient scaling support: This method is compatible with automatic mixed precision (AMP) training. It can access grad_scale and found_inf attributes for gradient unscaling and inf/nan detection when used with GradScaler.

Parameters:

closure (Callable, optional) – A callable that reevaluates the model and returns the loss. The closure should:

  • enable gradients (torch.enable_grad())

  • compute the forward pass

  • compute the loss

  • compute the backward pass

  • return the loss value

Example: lambda: (loss := model(x), loss.backward(), loss)[-1]. Default: None

Returns:

The loss value returned by the closure, or None if no closure was provided.

Return type:

Optional[Tensor]

Example

>>> # Standard usage
>>> loss = criterion(model(input), target)
>>> loss.backward()
>>> optimizer.step()
>>> # With closure (e.g., for line search)
>>> def closure():
...     optimizer.zero_grad()
...     output = model(input)
...     loss = criterion(output, target)
...     loss.backward()
...     return loss
>>> loss = optimizer.step(closure)

Note

  • Call zero_grad() before computing gradients for the next step

  • CUDA graph capture is checked for safety when capturable=True

  • The method is thread-safe for different parameter groups

class dicee.models.BaseKGELightning(*args, **kwargs)[source]

Bases: lightning.LightningModule

Thin PyTorch Lightning wrapper shared by all KGE models.

Provides the standard Lightning training loop hooks (training_step, on_train_epoch_end, configure_optimizers) as well as a helper for reporting model size. All concrete KGE models should extend BaseKGE rather than this class directly.

training_step_outputs = []
mem_of_model() Dict[source]

Return the size of the model in MB and its number of parameters.

training_step(batch, batch_idx=None)[source]

Execute one optimisation step for the given mini-batch.

Handles two- and three-element batches produced by the different dataset classes (KvsAll / NegSample vs. KvsSample).

Parameters:
  • batch (tuple) – (x, y) for standard scoring, or (x, y_select, y) for sample-based labelling.

  • batch_idx (int, optional) – Index of the current batch (unused, kept for Lightning API compat).

Returns:

Scalar loss value for this batch.

Return type:

torch.FloatTensor
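The two- versus three-element batch dispatch described above can be sketched as follows. This is a minimal stand-in, not the library's code: `ToyKGE` and its zero-logit scorer are hypothetical, and the real method delegates to the model's configured forward and loss:

```python
import torch

class ToyKGE(torch.nn.Module):
    """Sketch of the training_step batch dispatch (illustrative only)."""
    def __init__(self, num_entities=5):
        super().__init__()
        self.num_entities = num_entities
        self.loss_fn = torch.nn.BCEWithLogitsLoss()

    def forward(self, x, y_select=None):
        # Stand-in scorer producing logits of the expected shape.
        if y_select is None:                      # KvsAll-style scoring
            return torch.zeros(x.shape[0], self.num_entities)
        return torch.zeros(x.shape[0], y_select.shape[1])  # KvsSample scoring

    def training_step(self, batch, batch_idx=None):
        if len(batch) == 2:                       # (x, y): KvsAll / NegSample
            x, y = batch
            yhat = self(x)
        else:                                     # (x, y_select, y): KvsSample
            x, y_select, y = batch
            yhat = self(x, y_select)
        return self.loss_fn(yhat, y)

m = ToyKGE()
x = torch.zeros(4, 2, dtype=torch.long)
loss = m.training_step((x, torch.zeros(4, 5)))
```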

loss_function(yhat_batch: torch.FloatTensor, y_batch: torch.FloatTensor) torch.FloatTensor[source]

Compute the loss between model predictions and targets.

Delegates to self.loss which is configured in BaseKGE.__init__ based on the scoring technique (BCEWithLogitsLoss for entity/relation prediction, CrossEntropyLoss for classification).

Parameters:
  • yhat_batch (torch.FloatTensor) – Model output scores, shape (batch_size, *).

  • y_batch (torch.FloatTensor) – Ground-truth labels of the same shape as yhat_batch.

Returns:

Scalar loss value.

Return type:

torch.FloatTensor
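As an illustration of the BCE-with-logits case: for entity prediction the targets are multi-hot vectors over all entities. The shapes below are hypothetical, not tied to a specific dataset:

```python
import torch

loss_fn = torch.nn.BCEWithLogitsLoss()
batch_size, num_entities = 4, 10
yhat = torch.randn(batch_size, num_entities)   # raw scores, e.g. from forward_k_vs_all
y = torch.zeros(batch_size, num_entities)
y[:, 0] = 1.0                                  # mark one true tail entity per example
loss = loss_fn(yhat, y)                        # scalar loss tensor
```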

on_train_epoch_end(*args, **kwargs)[source]

Called in the training loop at the very end of the epoch.

To access all batch outputs at the end of the epoch, you can cache step outputs as an attribute of the LightningModule and access them in this hook:

class MyLightningModule(L.LightningModule):
    def __init__(self):
        super().__init__()
        self.training_step_outputs = []

    def training_step(self):
        loss = ...
        self.training_step_outputs.append(loss)
        return loss

    def on_train_epoch_end(self):
        # do something with all training_step outputs, for example:
        epoch_mean = torch.stack(self.training_step_outputs).mean()
        self.log("training_epoch_mean", epoch_mean)
        # free up the memory
        self.training_step_outputs.clear()
test_epoch_end(outputs: List[Any])[source]
test_dataloader() None[source]

An iterable or collection of iterables specifying test samples.

For more information about multiple dataloaders, see the Lightning documentation.

For data processing use the following pattern:

  • download in prepare_data()

  • process and split in setup()

However, the above are only necessary for distributed processing.

Warning

Do not assign state in prepare_data().

Related hooks:

  • test()

  • prepare_data()

  • setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a test dataset and a test_step(), you don’t need to implement this method.

val_dataloader() None[source]

An iterable or collection of iterables specifying validation samples.

For more information about multiple dataloaders, see the Lightning documentation.

The dataloader you return will not be reloaded unless you set Trainer.reload_dataloaders_every_n_epochs to a positive integer.

It’s recommended that all data downloads and preparation happen in prepare_data().

Related hooks:

  • fit()

  • validate()

  • prepare_data()

  • setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note

If you don’t need a validation dataset and a validation_step(), you don’t need to implement this method.

predict_dataloader() None[source]

An iterable or collection of iterables specifying prediction samples.

For more information about multiple dataloaders, see the Lightning documentation.

It’s recommended that all data downloads and preparation happen in prepare_data().

Related hooks:

  • predict()

  • prepare_data()

  • setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Returns:

A torch.utils.data.DataLoader or a sequence of them specifying prediction samples.

train_dataloader() None[source]

An iterable or collection of iterables specifying training samples.

For more information about multiple dataloaders, see the Lightning documentation.

The dataloader you return will not be reloaded unless you set Trainer.reload_dataloaders_every_n_epochs to a positive integer.

For data processing use the following pattern:

  • download in prepare_data()

  • process and split in setup()

However, the above are only necessary for distributed processing.

Warning

Do not assign state in prepare_data().

Related hooks:

  • fit()

  • prepare_data()

  • setup()

Note

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

configure_optimizers(parameters=None)[source]

Instantiate and return the optimiser for training.

The optimiser type is taken from self.optimizer_name which is set in BaseKGE.init_params_with_sanity_checking() from the --optim argument. Supported values: 'SGD', 'Adam', 'Adopt', 'AdamW', 'NAdam', 'Adagrad', 'ASGD', 'Muon'.

Parameters:

parameters (iterable, optional) – Model parameters to optimise. Defaults to self.parameters() when None.

Returns:

The configured optimiser instance.

Return type:

torch.optim.Optimizer
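The name-to-class dispatch can be pictured as a simple lookup. This is a sketch covering only the built-in torch optimizers; the actual configure_optimizers also supports the dicee-specific 'Adopt' and 'Muon' optimizers, omitted here, and `build_optimizer` is a name introduced for illustration:

```python
import torch

def build_optimizer(name: str, params, lr: float, weight_decay: float = 0.0):
    """Map an --optim string to a torch.optim instance (built-in names only)."""
    registry = {
        "SGD": torch.optim.SGD,
        "Adam": torch.optim.Adam,
        "AdamW": torch.optim.AdamW,
        "NAdam": torch.optim.NAdam,
        "Adagrad": torch.optim.Adagrad,
        "ASGD": torch.optim.ASGD,
    }
    try:
        cls = registry[name]
    except KeyError:
        raise ValueError(f"Unsupported optimizer: {name}")
    return cls(params, lr=lr, weight_decay=weight_decay)

opt = build_optimizer("AdamW", [torch.nn.Parameter(torch.zeros(2))], lr=1e-3)
```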

class dicee.models.BaseKGE(args: dict)[source]

Bases: BaseKGELightning

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum their own scoring logic: forward_k_vs_all() and forward_k_vs_sample() raise by default and must be overridden when the corresponding scoring technique is used.

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.
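A minimal args dict with the required keys might look like this. The values are hypothetical; any key omitted falls back to the defaults set in init_params_with_sanity_checking():

```python
# Illustrative configuration dict, as produced by vars(argparse.Namespace).
args = {
    "embedding_dim": 32,
    "num_entities": 100,
    "num_relations": 20,
    "learning_rate": 0.01,        # or "lr"
    "optim": "Adam",
    "scoring_technique": "KvsAll",
}
# model = DistMult(args)  # any BaseKGE subclass accepts such a dict
```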

args
embedding_dim = None
num_entities = None
num_relations = None
num_tokens = None
learning_rate = None
apply_unit_norm = None
input_dropout_rate = None
hidden_dropout_rate = None
optimizer_name = None
feature_map_dropout_rate = None
kernel_size = None
num_of_output_channels = None
weight_decay = None
loss
selected_optimizer = None
normalizer_class = None
normalize_head_entity_embeddings
normalize_relation_embeddings
normalize_tail_entity_embeddings
hidden_normalizer
param_init
input_dp_ent_real
input_dp_rel_real
hidden_dropout
loss_history = []
byte_pair_encoding
max_length_subword_tokens
block_size
forward_byte_pair_encoded_k_vs_all(x: torch.LongTensor) torch.FloatTensor[source]

KvsAll scoring for BPE-encoded head entities and relations.

Retrieves subword-unit embeddings for the head entity and relation, reduces them to fixed-size vectors via a linear projection, then scores against all BPE entity embeddings.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) BPE token indices where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

Shape (batch_size, num_bpe_entities) score matrix.

Return type:

torch.FloatTensor

forward_byte_pair_encoded_triple(x: Tuple[torch.LongTensor, torch.LongTensor]) torch.FloatTensor[source]

NegSample scoring for BPE-encoded (head, relation, tail) triples.

Retrieves subword-unit embeddings for all three elements and reduces them to fixed-size vectors via a linear projection before computing the triple score.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) BPE token indices.

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

init_params_with_sanity_checking() None[source]

Populate model hyper-parameters from self.args with safe defaults.

Reads embedding dimension, learning rate, dropout rates, normalisation strategy, optimizer name, and parameter initialisation scheme from the args dict. Falls back to sensible defaults for any missing key so that minimal args dicts (e.g. for unit tests) are still valid.
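The fallback behaviour amounts to dictionary lookups with defaults. The default values below are illustrative, not the library's exact ones:

```python
args = {"embedding_dim": 64}   # deliberately minimal args dict

# Hypothetical defaults for illustration only.
embedding_dim = args.get("embedding_dim", 32)    # present: uses 64
learning_rate = args.get("learning_rate", 0.1)   # missing: falls back to default
optimizer_name = args.get("optim", "Adam")       # missing: falls back to default
```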

forward(x: torch.LongTensor | Tuple[torch.LongTensor, torch.LongTensor], y_idx: torch.LongTensor = None) torch.FloatTensor[source]

Route the forward pass to the appropriate scoring method.

Inspects the shape and type of x to decide which low-level scorer to call: a (batch_size, 3) index tensor routes to forward_triples(), a (batch_size, 2) tensor to forward_k_vs_all(), and a (triple_idx, target_idx) tuple to forward_k_vs_sample().

Parameters:
  • x (torch.LongTensor or Tuple[torch.LongTensor, torch.LongTensor]) – Either a plain index tensor or a (triple_idx, target_idx) tuple for sample-based labelling.

  • y_idx (torch.LongTensor, optional) – Target entity indices used by forward_k_vs_sample(). Ignored when x is a plain tensor.

Returns:

Score tensor whose shape depends on the selected scorer.

Return type:

torch.FloatTensor

forward_triples(x: torch.LongTensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_all(*args, **kwargs)[source]

Score a (head, relation) batch against every entity.

Sub-classes must override this method. The default implementation raises ValueError to make missing overrides obvious at runtime.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_sample(*args, **kwargs)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets. Sub-classes that support sample-based labelling must override this method.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

get_triple_representation(idx_hrt) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for a triple index batch.

Parameters:

idx_hrt (torch.LongTensor) – Shape (batch_size, 3) integer tensor with columns [head_idx, relation_idx, tail_idx].

Returns:

head_ent_emb, rel_ent_emb, tail_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_head_relation_representation(indexed_triple) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for head entities and relations.

Parameters:

indexed_triple (torch.LongTensor) – Shape (batch_size, 2) integer tensor with columns [head_idx, relation_idx].

Returns:

head_ent_emb, rel_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_sentence_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve BPE subword-unit embeddings for a batch of triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) where T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb, tail_emb – Each has shape (batch_size, T, embedding_dim).

Return type:

torch.FloatTensor

get_bpe_head_and_relation_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve unit-normalised BPE embeddings for head entities and relations.

Each entity/relation is represented as a sequence of T subword tokens. Their token embeddings are L2-normalised across the sequence dimension so that the resulting matrix has unit Frobenius norm.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb – Each has shape (batch_size, T, embedding_dim), L2-normalised over the (T, D) dimensions.

Return type:

torch.FloatTensor
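The unit-Frobenius-norm normalisation described above can be reproduced in a few lines. The shapes are illustrative:

```python
import torch
import torch.nn.functional as F

B, T, D = 4, 5, 8
emb = torch.randn(B, T, D)                     # subword token embeddings
# L2-normalise each example over its flattened (T, D) block.
emb = F.normalize(emb.reshape(B, -1), p=2, dim=1).reshape(B, T, D)
# Each (T, D) matrix now has unit Frobenius norm.
norms = emb.reshape(B, -1).norm(dim=1)
```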

get_embeddings() Tuple[numpy.ndarray, numpy.ndarray][source]

Return the entity and relation embedding matrices as numpy arrays.

Returns:

  • entity_embeddings (numpy.ndarray) – Shape (num_entities, embedding_dim).

  • relation_embeddings (numpy.ndarray) – Shape (num_relations, embedding_dim).

class dicee.models.IdentityClass(args=None)[source]

Bases: torch.nn.Module

No-op normalisation / dropout placeholder.

Used whenever no normalisation layer is requested (--normalization None). All inputs are returned unchanged so that the rest of the model code does not need conditional checks around normalisation calls.
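Functionally, the class behaves like this minimal pass-through module (a sketch of the same idea, not the class itself):

```python
import torch

class Identity(torch.nn.Module):
    """Pass-through stand-in for a normalisation or dropout layer."""
    def forward(self, x):
        return x   # input returned unchanged

layer = Identity()
x = torch.randn(2, 3)
y = layer(x)       # same tensor, no copy
```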

args = None
__call__(x)[source]
static forward(x)[source]
class dicee.models.Block(config)[source]

Bases: torch.nn.Module

Base class for all neural network modules.

Your models should also subclass this class.

Modules can also contain other Modules, allowing them to be nested in a tree structure. You can assign the submodules as regular attributes:

import torch.nn as nn
import torch.nn.functional as F


class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

Submodules assigned in this way will be registered, and will also have their parameters converted when you call to(), etc.

Note

As per the example above, an __init__() call to the parent class must be made before assignment on the child.

Variables:

training (bool) – Boolean represents whether this module is in training or evaluation mode.

ln_1
attn
ln_2
mlp
forward(x)[source]
class dicee.models.DistMult(args)[source]

Bases: dicee.models.base_model.BaseKGE

DistMult: bilinear diagonal knowledge graph embedding.

Scores a triple (h, r, t) as the element-wise product of the head, relation, and tail embeddings summed over the embedding dimension:

f(h, r, t) = Σ_i h_i · r_i · t_i

Simple yet effective baseline; incapable of modelling asymmetric relations.
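Both the triple score and the KvsAll score can be written in a few lines of torch. This is illustrative, without the dropout and normalisation layers the class applies, and the symmetry of the element-wise product makes the inability to model asymmetric relations visible:

```python
import torch

B, D, E = 4, 8, 100
h = torch.randn(B, D)        # head embeddings
r = torch.randn(B, D)        # relation embeddings
t = torch.randn(B, D)        # tail embeddings
all_e = torch.randn(E, D)    # full entity embedding table

triple_scores = (h * r * t).sum(dim=1)   # shape (B,)
kvsall_scores = (h * r) @ all_e.T        # shape (B, E)
# Swapping head and tail gives the same score: DistMult is symmetric.
swapped = (t * r * h).sum(dim=1)
```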

References

Yang et al., Embedding Entities and Relations for Learning and Inference in Knowledge Bases, ICLR 2015. https://arxiv.org/abs/1412.6575

name = 'DistMult'
k_vs_all_score(emb_h: torch.FloatTensor, emb_r: torch.FloatTensor, emb_E: torch.FloatTensor) torch.FloatTensor[source]

Score a head/relation batch against all entity embeddings.

Computes (h * r) @ E^T after applying hidden dropout and normalisation to the element-wise product.

Parameters:
  • emb_h (torch.FloatTensor) – Head entity embeddings, shape (batch_size, embedding_dim).

  • emb_r (torch.FloatTensor) – Relation embeddings, shape (batch_size, embedding_dim).

  • emb_E (torch.FloatTensor) – All entity embeddings, shape (num_entities, embedding_dim).

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_all(x: torch.LongTensor) torch.FloatTensor[source]

KvsAll forward pass: score head/relation against all entities.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2) integer tensor [head_idx, relation_idx].

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_sample(x: torch.LongTensor, target_entity_idx: torch.LongTensor) torch.FloatTensor[source]

KvsSample forward pass: score head/relation against a sampled entity subset.

Parameters:
  • x (torch.LongTensor) – Shape (batch_size, 2) integer tensor [head_idx, relation_idx].

  • target_entity_idx (torch.LongTensor) – Shape (batch_size, k) indices of the k target entities per sample.

Returns:

Shape (batch_size, k) score matrix.

Return type:

torch.FloatTensor

score(h: torch.FloatTensor, r: torch.FloatTensor, t: torch.FloatTensor) torch.FloatTensor[source]

Score a batch of (head, relation, tail) embedding triples.

Parameters:
  • h (torch.FloatTensor) – Each has shape (batch_size, embedding_dim).

  • r (torch.FloatTensor) – Each has shape (batch_size, embedding_dim).

  • t (torch.FloatTensor) – Each has shape (batch_size, embedding_dim).

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

class dicee.models.TransE(args)[source]

Bases: dicee.models.base_model.BaseKGE

TransE: translation-based knowledge graph embedding.

Models a relation r as a translation in embedding space such that h + r ≈ t for a true triple (h, r, t). The score function is defined as:

f(h, r, t) = margin - ||h + r - t||_2

TransE is effective for 1-to-1 relations but struggles with reflexive, one-to-many, and many-to-one patterns.

References

Bordes et al., Translating Embeddings for Modeling Multi-relational Data, NeurIPS 2013. https://proceedings.neurips.cc/paper/2013/file/1cecc7a77928ca8133fa24680a88d2f9-Paper.pdf
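The margin-distance score can be sketched as follows. This is a NumPy illustration of the formula above, not dicee's torch implementation; the class above uses margin = 4 by default.

```python
import numpy as np

# Sketch of the TransE score: margin - ||h + r - t||_2.
def transe_score(h, r, t, margin=4.0):
    return margin - np.linalg.norm(h + r - t, axis=-1)

def transe_k_vs_all(h, r, E, margin=4.0):
    # distance from each translated head (h + r) to every entity embedding
    return margin - np.linalg.norm((h + r)[:, None, :] - E[None, :, :], axis=-1)

h = np.array([[1.0, 0.0]])
r = np.array([[0.0, 1.0]])
t = np.array([[1.0, 1.0]])
# h + r == t exactly, so the distance term vanishes and the score equals the margin
print(transe_score(h, r, t))
```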

name = 'TransE'
margin = 4
score(head_ent_emb: torch.FloatTensor, rel_ent_emb: torch.FloatTensor, tail_ent_emb: torch.FloatTensor) torch.FloatTensor[source]

Score a batch of triples using the TransE margin-distance formula.

Parameters:
  • head_ent_emb (torch.FloatTensor) – Each has shape (batch_size, embedding_dim).

  • rel_ent_emb (torch.FloatTensor) – Each has shape (batch_size, embedding_dim).

  • tail_ent_emb (torch.FloatTensor) – Each has shape (batch_size, embedding_dim).

Returns:

Shape (batch_size,) scores equal to margin - ||h + r - t||_2.

Return type:

torch.FloatTensor

forward_k_vs_all(x: torch.Tensor) torch.FloatTensor[source]

KvsAll forward pass: score head/relation against all entities.

Computes margin - ||h + r - e||_2 for every entity embedding e.

Parameters:

x (torch.Tensor) – Shape (batch_size, 2) integer tensor [head_idx, relation_idx].

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

class dicee.models.Shallom(args)[source]

Bases: dicee.models.base_model.BaseKGE

Shallom: shallow neural model for relation prediction.

Represents each triple as the concatenation of head and tail entity embeddings and feeds it through a two-layer MLP to predict the relation. Designed for the RelationPrediction labelling form.

References

Demir et al., A Shallow Neural Model for Relation Prediction, ISWC 2021. https://arxiv.org/abs/2101.09090
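The concatenate-then-MLP idea can be sketched as below. The layer sizes and weights are hypothetical, chosen only to illustrate the data flow; dicee's actual Shallom uses its own configured widths and normalisation.

```python
import numpy as np

# Minimal sketch of Shallom: represent a pair (h, t) as the concatenation
# of the two entity embeddings and map it through a two-layer MLP to one
# score per relation. Sizes and weights are illustrative.
rng = np.random.default_rng(1)
dim, num_relations, hidden = 8, 5, 16

W1 = rng.normal(size=(2 * dim, hidden))
W2 = rng.normal(size=(hidden, num_relations))

def shallom_forward(h_emb, t_emb):
    x = np.concatenate([h_emb, t_emb], axis=-1)   # (batch, 2*dim)
    hdn = np.maximum(x @ W1, 0.0)                 # ReLU hidden layer
    return hdn @ W2                               # (batch, num_relations)

scores = shallom_forward(rng.normal(size=(4, dim)),
                         rng.normal(size=(4, dim)))
```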

name = 'Shallom'
shallom
get_embeddings() Tuple[numpy.ndarray, None][source]

Return the entity and relation embedding matrices as numpy arrays.

Returns:

  • entity_embeddings (numpy.ndarray) – Shape (num_entities, embedding_dim).

  • relation_embeddings (None) – Shallom learns no relation embedding matrix, so the second element is None.

forward_k_vs_all(x) torch.FloatTensor[source]

Score a (head, tail) entity batch against every relation.

Concatenates the head and tail embeddings and feeds them through the two-layer MLP to produce one score per relation.

Returns:

Shape (batch_size, num_relations) score matrix.

Return type:

torch.FloatTensor

forward_triples(x) torch.FloatTensor[source]

Score a batch of triples by looking up relation scores from forward_k_vs_all.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

class dicee.models.Pyke(args)[source]

Bases: dicee.models.base_model.BaseKGE

Pyke: Physical Embedding Model for Knowledge Graphs.

Scores a triple (h, r, t) based on the average pairwise distance between head-to-relation and relation-to-tail in embedding space:

f(h, r, t) = margin - (||h - r||_2 + ||r - t||_2) / 2

The model encodes geometric proximity between entities and the relations that connect them.
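The distance-based score above can be sketched directly. This is a NumPy illustration of the formula, with margin = 1.0 as in the class attribute below.

```python
import numpy as np

# Sketch of the Pyke score: margin - (||h - r||_2 + ||r - t||_2) / 2.
def pyke_score(h, r, t, margin=1.0):
    d_hr = np.linalg.norm(h - r, axis=-1)
    d_rt = np.linalg.norm(r - t, axis=-1)
    return margin - (d_hr + d_rt) / 2.0

# Coincident points give zero distance, so the score equals the margin.
h = r = t = np.ones((2, 4))
print(pyke_score(h, r, t))
```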

name = 'Pyke'
dist_func
margin = 1.0
forward_triples(x: torch.LongTensor) torch.FloatTensor[source]

Score a batch of triples using the Pyke distance formula.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

class dicee.models.CoKEConfig[source]

Configuration for the CoKE (Contextualized Knowledge Graph Embedding) model.

block_size

Sequence length for transformer (3 for triples: head, relation, tail)

vocab_size

Total vocabulary size (num_entities + num_relations)

n_layer

Number of transformer layers

n_head

Number of attention heads per layer

n_embd

Embedding dimension (set to match model embedding_dim)

dropout

Dropout rate applied throughout the model

bias

Whether to use bias in linear layers

causal

Whether to use causal masking (False for bidirectional attention)

block_size: int = 3
vocab_size: int = None
n_layer: int = 6
n_head: int = 8
n_embd: int = None
dropout: float = 0.3
bias: bool = True
causal: bool = False
class dicee.models.CoKE(args, config: CoKEConfig = CoKEConfig())[source]

Bases: dicee.models.base_model.BaseKGE

Contextualized Knowledge Graph Embedding (CoKE) model. Based on: https://arxiv.org/pdf/1911.02168.

CoKE uses a transformer encoder to learn contextualized representations of entities and relations. For link prediction, it predicts masked elements in (head, relation, tail) triples using bidirectional attention, similar to BERT’s masked language modeling approach.

The model creates a sequence [head_emb, relation_emb, mask_emb], adds positional embeddings, and processes it through transformer layers to predict the tail entity.
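The sequence construction described above can be sketched as follows. The transformer blocks themselves are elided; the shapes and variable names are illustrative, not dicee's internals.

```python
import numpy as np

# Sketch of CoKE's input construction for tail prediction: build
# [head_emb, relation_emb, mask_emb], add positional embeddings, then
# (in the real model) encode the sequence with transformer blocks.
rng = np.random.default_rng(2)
batch, n_embd, block_size = 4, 16, 3

head_emb = rng.normal(size=(batch, n_embd))
rel_emb = rng.normal(size=(batch, n_embd))
mask_emb = rng.normal(size=(1, n_embd))            # learned [MASK] token
pos_emb = rng.normal(size=(block_size, n_embd))    # one vector per position

seq = np.stack([head_emb, rel_emb,
                np.broadcast_to(mask_emb, (batch, n_embd))], axis=1)
seq = seq + pos_emb                                # (batch, 3, n_embd)
```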

name = 'CoKE'
config
pos_emb
mask_emb
blocks
ln_f
coke_dropout
forward_k_vs_all(x: torch.Tensor)[source]

Score a (head, relation) batch against every entity.

Builds the sequence [head_emb, relation_emb, mask_emb], encodes it with the transformer, and scores the masked position against every entity embedding.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

score(emb_h, emb_r, emb_t)[source]
forward_k_vs_sample(x: torch.LongTensor, target_entity_idx: torch.LongTensor)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

class dicee.models.BaseKGE(args: dict)[source]

Bases: BaseKGELightning

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum:

  • forward_triples() — score a batch of (h, r, t) triples.

  • forward_k_vs_all() — score a (h, r) batch against every entity.

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.

args
embedding_dim = None
num_entities = None
num_relations = None
num_tokens = None
learning_rate = None
apply_unit_norm = None
input_dropout_rate = None
hidden_dropout_rate = None
optimizer_name = None
feature_map_dropout_rate = None
kernel_size = None
num_of_output_channels = None
weight_decay = None
loss
selected_optimizer = None
normalizer_class = None
normalize_head_entity_embeddings
normalize_relation_embeddings
normalize_tail_entity_embeddings
hidden_normalizer
param_init
input_dp_ent_real
input_dp_rel_real
hidden_dropout
loss_history = []
byte_pair_encoding
max_length_subword_tokens
block_size
forward_byte_pair_encoded_k_vs_all(x: torch.LongTensor) torch.FloatTensor[source]

KvsAll scoring for BPE-encoded head entities and relations.

Retrieves subword-unit embeddings for the head entity and relation, reduces them to fixed-size vectors via a linear projection, then scores against all BPE entity embeddings.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) BPE token indices where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

Shape (batch_size, num_bpe_entities) score matrix.

Return type:

torch.FloatTensor

forward_byte_pair_encoded_triple(x: Tuple[torch.LongTensor, torch.LongTensor]) torch.FloatTensor[source]

NegSample scoring for BPE-encoded (head, relation, tail) triples.

Retrieves subword-unit embeddings for all three elements and reduces them to fixed-size vectors via a linear projection before computing the triple score.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) BPE token indices.

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

init_params_with_sanity_checking() None[source]

Populate model hyper-parameters from self.args with safe defaults.

Reads embedding dimension, learning rate, dropout rates, normalisation strategy, optimizer name, and parameter initialisation scheme from the args dict. Falls back to sensible defaults for any missing key so that minimal args dicts (e.g. for unit tests) are still valid.

forward(x: torch.LongTensor | Tuple[torch.LongTensor, torch.LongTensor], y_idx: torch.LongTensor = None) torch.FloatTensor[source]

Route the forward pass to the appropriate scoring method.

Inspects the shape and type of x to decide which low-level scorer to call:

Parameters:
  • x (torch.LongTensor or Tuple[torch.LongTensor, torch.LongTensor]) – Either a plain index tensor or a (triple_idx, target_idx) tuple for sample-based labelling.

  • y_idx (torch.LongTensor, optional) – Target entity indices used by forward_k_vs_sample(). Ignored when x is a plain tensor.

Returns:

Score tensor whose shape depends on the selected scorer.

Return type:

torch.FloatTensor

forward_triples(x: torch.LongTensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_all(*args, **kwargs)[source]

Score a (head, relation) batch against every entity.

Sub-classes must override this method. The default implementation raises ValueError to make missing overrides obvious at runtime.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_sample(*args, **kwargs)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets. Sub-classes that support sample-based labelling must override this method.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

get_triple_representation(idx_hrt) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for a triple index batch.

Parameters:

idx_hrt (torch.LongTensor) – Shape (batch_size, 3) integer tensor with columns [head_idx, relation_idx, tail_idx].

Returns:

head_ent_emb, rel_ent_emb, tail_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_head_relation_representation(indexed_triple) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for head entities and relations.

Parameters:

indexed_triple (torch.LongTensor) – Shape (batch_size, 2) integer tensor with columns [head_idx, relation_idx].

Returns:

head_ent_emb, rel_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_sentence_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve BPE subword-unit embeddings for a batch of triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) where T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb, tail_emb – Each has shape (batch_size, T, embedding_dim).

Return type:

torch.FloatTensor

get_bpe_head_and_relation_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve unit-normalised BPE embeddings for head entities and relations.

Each entity/relation is represented as a sequence of T subword tokens. Their token embeddings are L2-normalised across the sequence dimension so that the resulting matrix has unit Frobenius norm.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb – Each has shape (batch_size, T, embedding_dim), L2-normalised over the (T, D) dimensions.

Return type:

torch.FloatTensor

get_embeddings() Tuple[numpy.ndarray, numpy.ndarray][source]

Return the entity and relation embedding matrices as numpy arrays.

Returns:

  • entity_embeddings (numpy.ndarray) – Shape (num_entities, embedding_dim).

  • relation_embeddings (numpy.ndarray) – Shape (num_relations, embedding_dim).

class dicee.models.ConEx(args)[source]

Bases: dicee.models.base_model.BaseKGE

Convolutional ComplEx Knowledge Graph Embeddings

name = 'ConEx'
conv2d
fc_num_input
fc1
norm_fc1
bn_conv2d
feature_map_dropout
residual_convolution(C_1: Tuple[torch.Tensor, torch.Tensor], C_2: Tuple[torch.Tensor, torch.Tensor]) torch.FloatTensor[source]

Compute the residual convolution of two complex-valued embeddings.

Parameters:
  • C_1 (Tuple[torch.Tensor, torch.Tensor]) – Real and imaginary parts of a complex-valued embedding.

  • C_2 (Tuple[torch.Tensor, torch.Tensor]) – Real and imaginary parts of a complex-valued embedding.

Return type:

torch.FloatTensor

forward_k_vs_all(x: torch.Tensor) torch.FloatTensor[source]

Score a (head, relation) batch against every entity.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_triples(x: torch.Tensor) torch.FloatTensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_sample(x: torch.Tensor, target_entity_idx: torch.Tensor)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

class dicee.models.AConEx(args)[source]

Bases: dicee.models.base_model.BaseKGE

Additive Convolutional ComplEx Knowledge Graph Embeddings

name = 'AConEx'
conv2d
fc_num_input
fc1
norm_fc1
bn_conv2d
feature_map_dropout
residual_convolution(C_1: Tuple[torch.Tensor, torch.Tensor], C_2: Tuple[torch.Tensor, torch.Tensor]) torch.FloatTensor[source]

Compute the residual convolution of two complex-valued embeddings.

Parameters:
  • C_1 (Tuple[torch.Tensor, torch.Tensor]) – Real and imaginary parts of a complex-valued embedding.

  • C_2 (Tuple[torch.Tensor, torch.Tensor]) – Real and imaginary parts of a complex-valued embedding.

Return type:

torch.FloatTensor

forward_k_vs_all(x: torch.Tensor) torch.FloatTensor[source]

Score a (head, relation) batch against every entity.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_triples(x: torch.Tensor) torch.FloatTensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_sample(x: torch.Tensor, target_entity_idx: torch.Tensor)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

class dicee.models.ComplEx(args)[source]

Bases: dicee.models.base_model.BaseKGE

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum:

  • forward_triples() — score a batch of (h, r, t) triples.

  • forward_k_vs_all() — score a (h, r) batch against every entity.

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.

name = 'ComplEx'
static score(head_ent_emb: torch.FloatTensor, rel_ent_emb: torch.FloatTensor, tail_ent_emb: torch.FloatTensor)[source]
static k_vs_all_score(emb_h: torch.FloatTensor, emb_r: torch.FloatTensor, emb_E: torch.FloatTensor)[source]
Parameters:
  • emb_h (torch.FloatTensor) – Head entity embeddings, shape (batch_size, embedding_dim).

  • emb_r (torch.FloatTensor) – Relation embeddings, shape (batch_size, embedding_dim).

  • emb_E (torch.FloatTensor) – All entity embeddings, shape (num_entities, embedding_dim).

forward_k_vs_all(x: torch.LongTensor) torch.FloatTensor[source]

Score a (head, relation) batch against every entity.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_sample(x: torch.LongTensor, target_entity_idx: torch.LongTensor)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor
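ComplEx scores a triple as the real part of a trilinear product over complex-valued embeddings, Re(<h, r, conj(t)>). A NumPy sketch of that math, using split real/imaginary parts (an illustration, not dicee's implementation):

```python
import numpy as np

# Sketch of the ComplEx score Re(<h, r, conj(t)>) with embeddings split
# into real and imaginary parts.
def complex_score(h_re, h_im, r_re, r_im, t_re, t_im):
    return np.sum(h_re * r_re * t_re
                  + h_re * r_im * t_im
                  + h_im * r_re * t_im
                  - h_im * r_im * t_re, axis=-1)

rng = np.random.default_rng(3)
h_re, h_im, r_re, r_im, t_re, t_im = rng.normal(size=(6, 4, 8))

fwd = complex_score(h_re, h_im, r_re, r_im, t_re, t_im)
# Unlike DistMult, ComplEx may score (h, r, t) and (t, r, h) differently,
# which is what lets it model asymmetric relations.
bwd = complex_score(t_re, t_im, r_re, r_im, h_re, h_im)
```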

class dicee.models.BaseKGE(args: dict)[source]

Bases: BaseKGELightning

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum:

  • forward_triples() — score a batch of (h, r, t) triples.

  • forward_k_vs_all() — score a (h, r) batch against every entity.

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.

args
embedding_dim = None
num_entities = None
num_relations = None
num_tokens = None
learning_rate = None
apply_unit_norm = None
input_dropout_rate = None
hidden_dropout_rate = None
optimizer_name = None
feature_map_dropout_rate = None
kernel_size = None
num_of_output_channels = None
weight_decay = None
loss
selected_optimizer = None
normalizer_class = None
normalize_head_entity_embeddings
normalize_relation_embeddings
normalize_tail_entity_embeddings
hidden_normalizer
param_init
input_dp_ent_real
input_dp_rel_real
hidden_dropout
loss_history = []
byte_pair_encoding
max_length_subword_tokens
block_size
forward_byte_pair_encoded_k_vs_all(x: torch.LongTensor) torch.FloatTensor[source]

KvsAll scoring for BPE-encoded head entities and relations.

Retrieves subword-unit embeddings for the head entity and relation, reduces them to fixed-size vectors via a linear projection, then scores against all BPE entity embeddings.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) BPE token indices where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

Shape (batch_size, num_bpe_entities) score matrix.

Return type:

torch.FloatTensor

forward_byte_pair_encoded_triple(x: Tuple[torch.LongTensor, torch.LongTensor]) torch.FloatTensor[source]

NegSample scoring for BPE-encoded (head, relation, tail) triples.

Retrieves subword-unit embeddings for all three elements and reduces them to fixed-size vectors via a linear projection before computing the triple score.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) BPE token indices.

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

init_params_with_sanity_checking() None[source]

Populate model hyper-parameters from self.args with safe defaults.

Reads embedding dimension, learning rate, dropout rates, normalisation strategy, optimizer name, and parameter initialisation scheme from the args dict. Falls back to sensible defaults for any missing key so that minimal args dicts (e.g. for unit tests) are still valid.

forward(x: torch.LongTensor | Tuple[torch.LongTensor, torch.LongTensor], y_idx: torch.LongTensor = None) torch.FloatTensor[source]

Route the forward pass to the appropriate scoring method.

Inspects the shape and type of x to decide which low-level scorer to call:

Parameters:
  • x (torch.LongTensor or Tuple[torch.LongTensor, torch.LongTensor]) – Either a plain index tensor or a (triple_idx, target_idx) tuple for sample-based labelling.

  • y_idx (torch.LongTensor, optional) – Target entity indices used by forward_k_vs_sample(). Ignored when x is a plain tensor.

Returns:

Score tensor whose shape depends on the selected scorer.

Return type:

torch.FloatTensor

forward_triples(x: torch.LongTensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_all(*args, **kwargs)[source]

Score a (head, relation) batch against every entity.

Sub-classes must override this method. The default implementation raises ValueError to make missing overrides obvious at runtime.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_sample(*args, **kwargs)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets. Sub-classes that support sample-based labelling must override this method.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

get_triple_representation(idx_hrt) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for a triple index batch.

Parameters:

idx_hrt (torch.LongTensor) – Shape (batch_size, 3) integer tensor with columns [head_idx, relation_idx, tail_idx].

Returns:

head_ent_emb, rel_ent_emb, tail_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_head_relation_representation(indexed_triple) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for head entities and relations.

Parameters:

indexed_triple (torch.LongTensor) – Shape (batch_size, 2) integer tensor with columns [head_idx, relation_idx].

Returns:

head_ent_emb, rel_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_sentence_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve BPE subword-unit embeddings for a batch of triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) where T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb, tail_emb – Each has shape (batch_size, T, embedding_dim).

Return type:

torch.FloatTensor

get_bpe_head_and_relation_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve unit-normalised BPE embeddings for head entities and relations.

Each entity/relation is represented as a sequence of T subword tokens. Their token embeddings are L2-normalised across the sequence dimension so that the resulting matrix has unit Frobenius norm.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb – Each has shape (batch_size, T, embedding_dim), L2-normalised over the (T, D) dimensions.

Return type:

torch.FloatTensor

get_embeddings() Tuple[numpy.ndarray, numpy.ndarray][source]

Return the entity and relation embedding matrices as numpy arrays.

Returns:

  • entity_embeddings (numpy.ndarray) – Shape (num_entities, embedding_dim).

  • relation_embeddings (numpy.ndarray) – Shape (num_relations, embedding_dim).

class dicee.models.IdentityClass(args=None)[source]

Bases: torch.nn.Module

No-op normalisation / dropout placeholder.

Used whenever no normalisation layer is requested (--normalization None). All inputs are returned unchanged so that the rest of the model code does not need conditional checks around normalisation calls.

args = None
__call__(x)[source]
static forward(x)[source]
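The no-op placeholder pattern can be shown in a few lines. This is a plain-Python stand-in for dicee's torch.nn.Module version, illustrating why downstream code needs no conditional guards around normalisation calls.

```python
import numpy as np

# A callable that returns its input unchanged, mirroring the role of
# IdentityClass when --normalization None is requested.
class Identity:
    def __init__(self, args=None):
        self.args = args

    def __call__(self, x):
        return x

norm = Identity()
x = np.arange(6.0).reshape(2, 3)
assert norm(x) is x   # the input passes through untouched
```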
dicee.models.quaternion_mul(*, Q_1, Q_2) Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]

Perform quaternion (Hamilton) multiplication.

Parameters:
  • Q_1 – Four tensors holding the real and three imaginary components of the first quaternion-valued embedding.

  • Q_2 – Four tensors holding the real and three imaginary components of the second quaternion-valued embedding.

Return type:

Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]
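The Hamilton product underlying quaternion_mul can be sketched as follows, with each quaternion-valued vector given as a 4-tuple (real, i, j, k) of arrays. This is a NumPy illustration of the operation, not dicee's torch code.

```python
import numpy as np

# Element-wise Hamilton product of two quaternion-valued vectors.
def quaternion_mul(Q_1, Q_2):
    a1, b1, c1, d1 = Q_1
    a2, b2, c2, d2 = Q_2
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # real part
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # i part
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # j part
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)   # k part

# Sanity check against the quaternion identities i * j = k and j * i = -k.
one, zero = np.ones(1), np.zeros(1)
i = (zero, one, zero, zero)
j = (zero, zero, one, zero)
print(quaternion_mul(i, j))   # k: (0, 0, 0, 1)
```

The non-commutativity visible in j * i = -k is exactly what gives quaternion models like QMult their ability to capture asymmetric relations.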

dicee.models.quaternion_mul_with_unit_norm(*, Q_1, Q_2)[source]
class dicee.models.QMult(args)[source]

Bases: dicee.models.base_model.BaseKGE

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum:

  • forward_triples() — score a batch of (h, r, t) triples.

  • forward_k_vs_all() — score a (h, r) batch against every entity.

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.

name = 'QMult'
explicit = True
quaternion_multiplication_followed_by_inner_product(h, r, t)[source]
Parameters:
  • h – shape: (*batch_dims, dim) The head representations.

  • r – shape: (*batch_dims, dim) The relation representations.

  • t – shape: (*batch_dims, dim) The tail representations.

Returns:

Triple scores.

static quaternion_normalizer(x: torch.FloatTensor) torch.FloatTensor[source]

Normalize the length of relation vectors, if the forward constraint has not been applied yet.

Absolute value of a quaternion

\[|a + bi + cj + dk| = \sqrt{a^2 + b^2 + c^2 + d^2}\]

L2 norm of quaternion vector:

\[\|x\|^2 = \sum_{i=1}^d |x_i|^2 = \sum_{i=1}^d (x_i.re^2 + x_i.im_1^2 + x_i.im_2^2 + x_i.im_3^2)\]
Parameters:

x – The vector.

Returns:

The normalized vector.
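The normalisation above divides each of the four quaternion components by the element-wise modulus. A NumPy sketch of the same math (dicee's version operates on one torch tensor whose last dimension is split into four equal parts):

```python
import numpy as np

# Normalise quaternion-valued vectors: divide every component by the
# element-wise quaternion modulus sqrt(a^2 + b^2 + c^2 + d^2).
def quaternion_normalize(a, b, c, d, eps=1e-12):
    norm = np.sqrt(a**2 + b**2 + c**2 + d**2) + eps
    return a / norm, b / norm, c / norm, d / norm

rng = np.random.default_rng(4)
a, b, c, d = rng.normal(size=(4, 5))
na, nb, nc, nd = quaternion_normalize(a, b, c, d)

# Every normalised quaternion now has (numerically) unit modulus.
assert np.allclose(na**2 + nb**2 + nc**2 + nd**2, 1.0)
```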

score(head_ent_emb: torch.FloatTensor, rel_ent_emb: torch.FloatTensor, tail_ent_emb: torch.FloatTensor)[source]
k_vs_all_score(bpe_head_ent_emb, bpe_rel_ent_emb, E)[source]
Parameters:
  • bpe_head_ent_emb (torch.FloatTensor) – Head entity embeddings, shape (batch_size, embedding_dim).

  • bpe_rel_ent_emb (torch.FloatTensor) – Relation embeddings, shape (batch_size, embedding_dim).

  • E (torch.FloatTensor) – All entity embeddings, shape (num_entities, embedding_dim).

forward_k_vs_all(x)[source]
Parameters:

x

forward_k_vs_sample(x, target_entity_idx)[source]

Score a (head, relation) batch against a sampled subset of entities, producing a (batch_size, k) score matrix for the k target entities per sample.

class dicee.models.ConvQ(args)[source]

Bases: dicee.models.base_model.BaseKGE

Convolutional Quaternion Knowledge Graph Embeddings

name = 'ConvQ'
entity_embeddings
relation_embeddings
conv2d
fc_num_input
fc1
bn_conv1
bn_conv2
feature_map_dropout
residual_convolution(Q_1, Q_2)[source]
forward_triples(indexed_triple: torch.Tensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_all(x: torch.Tensor)[source]

Given a head entity and a relation (h, r), compute scores for all entities, i.e. [score(h, r, x) | x in Entities]. For a batch of head entities and relations the result has shape (batch_size, num_entities).

class dicee.models.AConvQ(args)[source]

Bases: dicee.models.base_model.BaseKGE

Additive Convolutional Quaternion Knowledge Graph Embeddings

name = 'AConvQ'
entity_embeddings
relation_embeddings
conv2d
fc_num_input
fc1
bn_conv1
bn_conv2
feature_map_dropout
residual_convolution(Q_1, Q_2)[source]
forward_triples(indexed_triple: torch.Tensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_all(x: torch.Tensor)[source]

Given a head entity and a relation (h, r), compute scores for all entities, i.e. [score(h, r, x) | x in Entities]. For a batch of head entities and relations the result has shape (batch_size, num_entities).

class dicee.models.BaseKGE(args: dict)[source]

Bases: BaseKGELightning

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum:

  • forward_triples() — score a batch of (h, r, t) triples.

  • forward_k_vs_all() — score a (h, r) batch against every entity.

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.

args
embedding_dim = None
num_entities = None
num_relations = None
num_tokens = None
learning_rate = None
apply_unit_norm = None
input_dropout_rate = None
hidden_dropout_rate = None
optimizer_name = None
feature_map_dropout_rate = None
kernel_size = None
num_of_output_channels = None
weight_decay = None
loss
selected_optimizer = None
normalizer_class = None
normalize_head_entity_embeddings
normalize_relation_embeddings
normalize_tail_entity_embeddings
hidden_normalizer
param_init
input_dp_ent_real
input_dp_rel_real
hidden_dropout
loss_history = []
byte_pair_encoding
max_length_subword_tokens
block_size
forward_byte_pair_encoded_k_vs_all(x: torch.LongTensor) torch.FloatTensor[source]

KvsAll scoring for BPE-encoded head entities and relations.

Retrieves subword-unit embeddings for the head entity and relation, reduces them to fixed-size vectors via a linear projection, then scores against all BPE entity embeddings.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) BPE token indices where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

Shape (batch_size, num_bpe_entities) score matrix.

Return type:

torch.FloatTensor

forward_byte_pair_encoded_triple(x: Tuple[torch.LongTensor, torch.LongTensor]) torch.FloatTensor[source]

NegSample scoring for BPE-encoded (head, relation, tail) triples.

Retrieves subword-unit embeddings for all three elements and reduces them to fixed-size vectors via a linear projection before computing the triple score.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) BPE token indices.

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

init_params_with_sanity_checking() None[source]

Populate model hyper-parameters from self.args with safe defaults.

Reads embedding dimension, learning rate, dropout rates, normalisation strategy, optimizer name, and parameter initialisation scheme from the args dict. Falls back to sensible defaults for any missing key so that minimal args dicts (e.g. for unit tests) are still valid.

forward(x: torch.LongTensor | Tuple[torch.LongTensor, torch.LongTensor], y_idx: torch.LongTensor = None) torch.FloatTensor[source]

Route the forward pass to the appropriate scoring method.

Inspects the shape and type of x to decide which low-level scorer to call:

Parameters:
  • x (torch.LongTensor or Tuple[torch.LongTensor, torch.LongTensor]) – Either a plain index tensor or a (triple_idx, target_idx) tuple for sample-based labelling.

  • y_idx (torch.LongTensor, optional) – Target entity indices used by forward_k_vs_sample(). Ignored when x is a plain tensor.

Returns:

Score tensor whose shape depends on the selected scorer.

Return type:

torch.FloatTensor
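The routing logic can be sketched as shape/type-based dispatch. The method names mirror the documented API, but the exact dispatch conditions are assumptions inferred from the documented tensor shapes:

```python
# Sketch of forward() routing: a tuple input means sample-based labelling,
# a 3-column batch means triple scoring, a 2-column batch means KvsAll.
def forward(x, y_idx=None):
    if isinstance(x, tuple):                 # (triple_idx, target_idx) pair
        triple_idx, target_idx = x
        return ("forward_k_vs_sample", triple_idx, target_idx)
    n_cols = len(x[0])                       # x is a batch of index rows
    if n_cols == 3:                          # [head, relation, tail]
        return ("forward_triples", x)
    if n_cols == 2:                          # [head, relation] -> score vs all entities
        return ("forward_k_vs_all", x)
    raise ValueError(f"Unexpected input with {n_cols} columns")
```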

forward_triples(x: torch.LongTensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_all(*args, **kwargs)[source]

Score a (head, relation) batch against every entity.

Sub-classes must override this method. The default implementation raises ValueError to make missing overrides obvious at runtime.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_sample(*args, **kwargs)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets. Sub-classes that support sample-based labelling must override this method.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

get_triple_representation(idx_hrt) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for a triple index batch.

Parameters:

idx_hrt (torch.LongTensor) – Shape (batch_size, 3) integer tensor with columns [head_idx, relation_idx, tail_idx].

Returns:

head_ent_emb, rel_ent_emb, tail_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_head_relation_representation(indexed_triple) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for head entities and relations.

Parameters:

indexed_triple (torch.LongTensor) – Shape (batch_size, 2) integer tensor with columns [head_idx, relation_idx].

Returns:

head_ent_emb, rel_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_sentence_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve BPE subword-unit embeddings for a batch of triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) where T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb, tail_emb – Each has shape (batch_size, T, embedding_dim).

Return type:

torch.FloatTensor

get_bpe_head_and_relation_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve unit-normalised BPE embeddings for head entities and relations.

Each entity/relation is represented as a sequence of T subword tokens. Their token embeddings are L2-normalised across the sequence dimension so that the resulting matrix has unit Frobenius norm.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb – Each has shape (batch_size, T, embedding_dim), L2-normalised over the (T, D) dimensions.

Return type:

torch.FloatTensor
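The unit-Frobenius-norm normalisation described above can be sketched with plain lists standing in for a T x D token-embedding matrix:

```python
import math

# Sketch: scale a T x D matrix by its Frobenius norm so the result has
# unit Frobenius norm (plain lists instead of torch tensors).
def unit_frobenius(matrix):
    norm = math.sqrt(sum(v * v for row in matrix for v in row))
    return [[v / norm for v in row] for row in matrix]

m = unit_frobenius([[3.0, 0.0], [0.0, 4.0]])   # Frobenius norm 5 -> scaled by 1/5
```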

get_embeddings() Tuple[numpy.ndarray, numpy.ndarray][source]

Return the entity and relation embedding matrices as numpy arrays.

Returns:

  • entity_embeddings (numpy.ndarray) – Shape (num_entities, embedding_dim).

  • relation_embeddings (numpy.ndarray) – Shape (num_relations, embedding_dim).

class dicee.models.IdentityClass(args=None)[source]

Bases: torch.nn.Module

No-op normalisation / dropout placeholder.

Used whenever no normalisation layer is requested (--normalization None). All inputs are returned unchanged so that the rest of the model code does not need conditional checks around normalisation calls.

args = None
__call__(x)[source]
static forward(x)[source]
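The pass-through behaviour can be sketched in plain Python (the real class subclasses torch.nn.Module; this stand-in only illustrates the no-op contract):

```python
# Minimal sketch of a no-op normalisation/dropout placeholder.
class Identity:
    def __init__(self, args=None):
        self.args = args

    def __call__(self, x):
        return x            # inputs pass through unchanged

    @staticmethod
    def forward(x):
        return x

norm = Identity()
```

Because the placeholder is callable like any other layer, model code can apply it unconditionally instead of branching on whether normalisation was requested.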
dicee.models.octonion_mul(*, O_1, O_2)[source]
dicee.models.octonion_mul_norm(*, O_1, O_2)[source]
class dicee.models.OMult(args)[source]

Bases: dicee.models.base_model.BaseKGE

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum:

  • forward_triples() — score a batch of (h, r, t) triples.

  • forward_k_vs_all() — score a (h, r) batch against every entity.

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.

name = 'OMult'
static octonion_normalizer(emb_rel_e0, emb_rel_e1, emb_rel_e2, emb_rel_e3, emb_rel_e4, emb_rel_e5, emb_rel_e6, emb_rel_e7)[source]
score(head_ent_emb: torch.FloatTensor, rel_ent_emb: torch.FloatTensor, tail_ent_emb: torch.FloatTensor)[source]
k_vs_all_score(bpe_head_ent_emb, bpe_rel_ent_emb, E)[source]
forward_k_vs_all(x)[source]

KvsAll scoring: given a head entity and a relation (h, r), compute scores for all candidate tail entities, i.e. [score(h, r, x) | x in Entities], with shape (1, |Entities|) for a single pair. Given a batch of head entities and relations, the output has shape (batch_size, |Entities|).

class dicee.models.ConvO(args: dict)[source]

Bases: dicee.models.base_model.BaseKGE

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum:

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.

name = 'ConvO'
conv2d
fc_num_input
fc1
bn_conv2d
norm_fc1
feature_map_dropout
static octonion_normalizer(emb_rel_e0, emb_rel_e1, emb_rel_e2, emb_rel_e3, emb_rel_e4, emb_rel_e5, emb_rel_e6, emb_rel_e7)[source]
residual_convolution(O_1, O_2)[source]
forward_triples(x: torch.Tensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_all(x: torch.Tensor)[source]

Given a head entity and a relation (h, r), compute scores for all entities, i.e. [score(h, r, x) | x in Entities], with shape (1, |Entities|) for a single pair. Given a batch of head entities and relations, the output has shape (batch_size, |Entities|).

class dicee.models.AConvO(args: dict)[source]

Bases: dicee.models.base_model.BaseKGE

Additive Convolutional Octonion Knowledge Graph Embeddings

name = 'AConvO'
conv2d
fc_num_input
fc1
bn_conv2d
norm_fc1
feature_map_dropout
static octonion_normalizer(emb_rel_e0, emb_rel_e1, emb_rel_e2, emb_rel_e3, emb_rel_e4, emb_rel_e5, emb_rel_e6, emb_rel_e7)[source]
residual_convolution(O_1, O_2)[source]
forward_triples(x: torch.Tensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_all(x: torch.Tensor)[source]

Given a head entity and a relation (h, r), compute scores for all entities, i.e. [score(h, r, x) | x in Entities], with shape (1, |Entities|) for a single pair. Given a batch of head entities and relations, the output has shape (batch_size, |Entities|).

class dicee.models.Keci(args)[source]

Bases: dicee.models.base_model.BaseKGE

Keci: Knowledge Graph Embedding via Clifford Algebra.

Embeds entities and relations as multi-vectors in the Clifford algebra Cl_{p,q}(R^d) and scores triples via the Clifford product. The algebra is parameterised by two non-negative integers p and q:

  • p = 0, q = 0 — reduces to a standard bilinear (DistMult-like) model.

  • p = 0, q = 1 — equivalent to ComplEx.

  • Larger p and q capture higher-order geometric interactions.

The embedding dimension must satisfy embedding_dim % (p + q + 1) == 0; the resulting quotient is stored as self.r.

Parameters:

args (dict) – Configuration dictionary. Recognised keys (beyond those in BaseKGE): p (int, default 0) and q (int, default 0).

References

Demir et al., Clifford Embeddings — A Generalized Approach for Embedding in Normed Algebras, ECML 2023.
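The divisibility constraint on the embedding dimension can be sketched as follows (the function name is illustrative; only the arithmetic reflects the documented constraint):

```python
# Sketch of Keci's dimension bookkeeping: embedding_dim must be divisible
# by p + q + 1, and the quotient is the per-component block size r.
def scalar_block_size(embedding_dim: int, p: int, q: int) -> int:
    n_components = p + q + 1
    if embedding_dim % n_components != 0:
        raise ValueError(
            f"embedding_dim={embedding_dim} must be divisible by p+q+1={n_components}"
        )
    return embedding_dim // n_components

r = scalar_block_size(30, p=1, q=1)   # 30 / 3 -> block size 10
```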

name = 'Keci'
p
q
r
requires_grad_for_interactions = True
compute_sigma_pp(hp, rp)[source]

Compute sigma_{pp} = sum_{i=1}^{p-1} sum_{k=i+1}^p (h_i r_k - h_k r_i) e_i e_k

sigma_{pp} captures the pairwise interactions among the p bases. For instance, with p = 3 bases e_1, e_2, e_3, we compute the interactions e_1 e_2, e_1 e_3, and e_2 e_3. A reference implementation with two nested loops:

results = []
for i in range(p - 1):
    for k in range(i + 1, p):
        results.append(hp[:, :, i] * rp[:, :, k] - hp[:, :, k] * rp[:, :, i])
sigma_pp = torch.stack(results, dim=2)
assert sigma_pp.shape == (b, r, int((p * (p - 1)) / 2))

This computation is inefficient. Instead, we compute all p x p interactions at once, e.g. e1e1, e1e2, e1e3, e2e1, e2e2, e2e3, e3e1, e3e2, e3e3, and then select the upper triangle without the diagonal: e1e2, e1e3, e2e3.
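The equivalence between the nested-loop reference and the triangular-selection trick can be checked with a pure-Python sketch (lists stand in for the batched tensors):

```python
from itertools import combinations

# Nested-loop reference: only pairs (i, k) with i < k.
def sigma_pp_loops(h, r):
    p = len(h)
    return [h[i] * r[k] - h[k] * r[i] for i, k in combinations(range(p), 2)]

# Vectorised-style trick: build the full p x p interaction table
# (e1e1, e1e2, ..., e3e3), then keep the strictly upper triangle.
def sigma_pp_triangular(h, r):
    p = len(h)
    table = [[h[i] * r[k] - h[k] * r[i] for k in range(p)] for i in range(p)]
    return [table[i][k] for i in range(p) for k in range(i + 1, p)]

h, r = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
```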

compute_sigma_qq(hq, rq)[source]

Compute sigma_{qq} = sum_{j=p+1}^{p+q-1} sum_{k=j+1}^{p+q} (h_j r_k - h_k r_j) e_j e_k.

sigma_{qq} captures the pairwise interactions among the q bases. For instance, with q = 3 bases e_1, e_2, e_3, we compute the interactions e_1 e_2, e_1 e_3, and e_2 e_3. A reference implementation with two nested loops:

results = []
for j in range(q - 1):
    for k in range(j + 1, q):
        results.append(hq[:, :, j] * rq[:, :, k] - hq[:, :, k] * rq[:, :, j])
sigma_qq = torch.stack(results, dim=2)
assert sigma_qq.shape == (b, r, int((q * (q - 1)) / 2))

This computation is inefficient. Instead, we compute all q x q interactions at once and then select the upper triangle without the diagonal: e1e2, e1e3, e2e3.

compute_sigma_pq(*, hp, hq, rp, rq)[source]

Compute sigma_{pq} = sum_{i=1}^{p} sum_{j=p+1}^{p+q} (h_i r_j - h_j r_i) e_i e_j.

A reference implementation with explicit loops:

sigma_pq = torch.zeros(b, r, p, q)
for i in range(p):
    for j in range(q):
        sigma_pq[:, :, i, j] = hp[:, :, i] * rq[:, :, j] - hq[:, :, j] * rp[:, :, i]

apply_coefficients(hp, hq, rp, rq)[source]

Multiply each base vector by its scalar coefficient.

clifford_multiplication(h0, hp, hq, r0, rp, rq)[source]

Compute the Clifford (CL) multiplication of h and r, where

h = h_0 + sum_{i=1}^p h_i e_i + sum_{j=p+1}^{p+q} h_j e_j
r = r_0 + sum_{i=1}^p r_i e_i + sum_{j=p+1}^{p+q} r_j e_j

with basis relations e_i^2 = +1 for 1 <= i <= p, e_j^2 = -1 for p < j <= p+q, and e_i e_j = -e_j e_i for i != j.

h r = sigma_0 + sigma_p + sigma_q + sigma_{pp} + sigma_{qq} + sigma_{pq}, where

  1. sigma_0 = h_0 r_0 + sum_{i=1}^p h_i r_i - sum_{j=p+1}^{p+q} h_j r_j

  2. sigma_p = sum_{i=1}^p (h_0 r_i + h_i r_0) e_i

  3. sigma_q = sum_{j=p+1}^{p+q} (h_0 r_j + h_j r_0) e_j

  4. sigma_{pp} = sum_{i=1}^{p-1} sum_{k=i+1}^p (h_i r_k - h_k r_i) e_i e_k

  5. sigma_{qq} = sum_{j=p+1}^{p+q-1} sum_{k=j+1}^{p+q} (h_j r_k - h_k r_j) e_j e_k

  6. sigma_{pq} = sum_{i=1}^{p} sum_{j=p+1}^{p+q} (h_i r_j - h_j r_i) e_i e_j

construct_cl_multivector(x: torch.FloatTensor, r: int, p: int, q: int) tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Split a flat embedding vector into the three Clifford components.

Given an embedding x of dimension d = r + r*p + r*q, returns the scalar part a0, the p-blade part ap, and the q-blade part aq.

Parameters:
  • x (torch.FloatTensor) – Shape (batch_size, d).

  • r (int) – Scalar block size (embedding_dim // (p + q + 1)).

  • p (int) – Number of positive-signature basis elements.

  • q (int) – Number of negative-signature basis elements.

Returns:

  • a0 (torch.FloatTensor) – Shape (batch_size, r) — scalar (grade-0) part.

  • ap (torch.FloatTensor) – Shape (batch_size, r, p) — positive-blade part.

  • aq (torch.FloatTensor) – Shape (batch_size, r, q) — negative-blade part.
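The split of a flat d = r + r*p + r*q embedding into its three Clifford components can be sketched with plain lists (the real method returns tensors shaped (batch_size, r), (batch_size, r, p), (batch_size, r, q); here each blade is a list of r-sized blocks):

```python
# Sketch of splitting a flat embedding of length d = r + r*p + r*q into
# scalar, p-blade, and q-blade parts.
def construct_cl_multivector(x, r, p, q):
    assert len(x) == r + r * p + r * q
    a0 = x[:r]                                                  # scalar part
    ap = [x[r + i * r: r + (i + 1) * r] for i in range(p)]      # p blocks of r
    offset = r + r * p
    aq = [x[offset + j * r: offset + (j + 1) * r] for j in range(q)]
    return a0, ap, aq

a0, ap, aq = construct_cl_multivector(list(range(12)), r=4, p=1, q=1)
```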

forward_k_vs_with_explicit(x: torch.Tensor) torch.FloatTensor[source]

KvsAll scoring using an explicit loop over sigma_pp/qq/pq terms.

Functionally equivalent to forward_k_vs_all() but computes the higher-order interaction terms (sigma_pp, sigma_qq, sigma_pq) with explicit nested loops rather than einsum contractions. Kept for reference and correctness verification.

Parameters:

x (torch.Tensor) – Shape (batch_size, 2) integer tensor [head_idx, relation_idx].

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

k_vs_all_score(bpe_head_ent_emb: torch.FloatTensor, bpe_rel_ent_emb: torch.FloatTensor, E: torch.FloatTensor) torch.FloatTensor[source]

Compute Clifford-product scores for a head/relation batch vs. all entities.

Decomposes the head-entity and relation embeddings into Clifford multi-vectors, performs the Cl_{p,q} product, and inner-products the result against the entity embedding matrix E.

Parameters:
  • bpe_head_ent_emb (torch.FloatTensor) – Head-entity embeddings, shape (batch_size, embedding_dim).

  • bpe_rel_ent_emb (torch.FloatTensor) – Relation embeddings, shape (batch_size, embedding_dim).

  • E (torch.FloatTensor) – All entity embeddings, shape (num_entities, embedding_dim).

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_all(x: torch.Tensor) torch.FloatTensor[source]

Kvsall training

  1. Retrieve real-valued embedding vectors for heads and relations in \(\mathbb{R}^d\).

  2. Construct head entity and relation embeddings according to \(Cl_{p,q}(\mathbb{R}^d)\).

  3. Perform the Clifford multiplication.

  4. Take the inner product of (3) with all entity embeddings.

This function and forward_k_vs_with_explicit() are functionally identical.

Parameters:

x (torch.LongTensor) – Shape (n, 2).

Return type:

torch.FloatTensor with shape (n, |E|).

construct_batch_selected_cl_multivector(x: torch.FloatTensor, r: int, p: int, q: int) tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Split a batched, k-selected embedding tensor into Clifford components.

A variant of construct_cl_multivector() for tensors that have an extra k dimension (e.g. when scoring against k sampled targets).

Parameters:
  • x (torch.FloatTensor) – Shape (batch_size, k, d).

  • r (int) – Scalar block size.

  • p (int) – Number of positive-signature basis elements.

  • q (int) – Number of negative-signature basis elements.

Returns:

  • a0 (torch.FloatTensor) – Shape (batch_size, k, r).

  • ap (torch.FloatTensor) – Shape (batch_size, k, r, p).

  • aq (torch.FloatTensor) – Shape (batch_size, k, r, q).

forward_k_vs_sample(x: torch.LongTensor, target_entity_idx: torch.LongTensor) torch.FloatTensor[source]

Parameters:
  • x (torch.LongTensor) – Shape (n, 2).

  • target_entity_idx (torch.LongTensor) – Shape (n, k), where k denotes the number of selected target entities.

Return type:

torch.FloatTensor with shape (n, k).

score(h, r, t)[source]
forward_triples(x: torch.Tensor) torch.FloatTensor[source]

Parameters:

x (torch.LongTensor) – Shape (n, 3).

Return type:

torch.FloatTensor with shape (n,).

class dicee.models.CKeci(args)[source]

Bases: Keci

Without learning dimension scaling

name = 'CKeci'
requires_grad_for_interactions = False
class dicee.models.DeCaL(args)[source]

Bases: dicee.models.base_model.BaseKGE

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum:

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.

name = 'DeCaL'
entity_embeddings
relation_embeddings
p
q
r
re
forward_triples(x: torch.Tensor) torch.FloatTensor[source]

Parameters:

x (torch.LongTensor) – Shape (n, 3).

Return type:

torch.FloatTensor with shape (n,).

cl_pqr(a: torch.tensor) torch.tensor[source]

Split a (batch_size, emb_dim) tensor into 1 + p + q + r components, each of shape (batch_size, emb_dim / (1 + p + q + r)).

1) Takes a tensor of shape (batch_size, emb_dim) and splits it into 1 + p + q + r components; hence 1 + p + q + r must divide emb_dim. 2) Returns a list of the 1 + p + q + r component vectors, each a tensor of shape (batch_size, emb_dim / (1 + p + q + r)).
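The split can be sketched in plain Python (lists standing in for batched tensors; the function mirrors the documented contract, not dicee's exact implementation):

```python
# Sketch of cl_pqr: split a d-dimensional vector into 1 + p + q + r equal
# chunks of size d // (1 + p + q + r); 1 + p + q + r must divide d.
def cl_pqr(x, p, q, r):
    n = 1 + p + q + r
    d = len(x)
    if d % n != 0:
        raise ValueError(f"1+p+q+r={n} must divide emb_dim={d}")
    size = d // n
    return [x[i * size:(i + 1) * size] for i in range(n)]

parts = cl_pqr(list(range(8)), p=1, q=1, r=1)   # 4 chunks of length 2
```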

compute_sigmas_single(list_h_emb, list_r_emb, list_t_emb)[source]

Compute all the sums that involve no inter-vector interactions, each taken with the scalar product with t:

\[s_0 = h_0 r_0 t_0\]

\[s_1 = \sum_{i=1}^{p} h_i r_i t_0\]

\[s_2 = \sum_{j=p+1}^{p+q} h_j r_j t_0\]

\[s_3 = \sum_{i=1}^{p}(h_0 r_i + h_i r_0) t_i\]

\[s_4 = \sum_{i=p+1}^{p+q}(h_0 r_i + h_i r_0) t_i\]

\[s_5 = \sum_{i=p+q+1}^{p+q+r}(h_0 r_i + h_i r_0) t_i\]

and return

\[\sigma_{0t} = \sigma_0 \cdot t_0 = s_0 + s_1 - s_2\]

together with s_3, s_4 and s_5.
compute_sigmas_multivect(list_h_emb, list_r_emb)[source]

Compute and return all the sums with inter-vector interactions, for both same-base and different-base pairs.

For same-base vector interactions we have

\[\sigma_{pp} = \sum_{i=1}^{p-1}\sum_{i'=i+1}^{p}(h_i r_{i'}-h_{i'}r_i) \quad (\text{interactions between } e_i \text{ and } e_{i'} \text{ for } 1 \le i < i' \le p)\]

\[\sigma_{qq} = \sum_{j=p+1}^{p+q-1}\sum_{j'=j+1}^{p+q}(h_j r_{j'}-h_{j'}r_j) \quad (\text{interactions between } e_j \text{ and } e_{j'} \text{ for } p+1 \le j < j' \le p+q)\]

\[\sigma_{rr} = \sum_{k=p+q+1}^{p+q+r-1}\sum_{k'=k+1}^{p+q+r}(h_k r_{k'}-h_{k'}r_k) \quad (\text{interactions between } e_k \text{ and } e_{k'} \text{ for } p+q+1 \le k < k' \le p+q+r)\]

For different-base vector interactions, we have

\[\sigma_{pq} = \sum_{i=1}^{p}\sum_{j=p+1}^{p+q}(h_i r_j - h_j r_i) \quad (\text{interactions between } e_i \text{ and } e_j \text{ for } 1 \le i \le p \text{ and } p+1 \le j \le p+q)\]

\[\sigma_{pr} = \sum_{i=1}^{p}\sum_{k=p+q+1}^{p+q+r}(h_i r_k - h_k r_i) \quad (\text{interactions between } e_i \text{ and } e_k \text{ for } 1 \le i \le p \text{ and } p+q+1 \le k \le p+q+r)\]

\[\sigma_{qr} = \sum_{j=p+1}^{p+q}\sum_{k=p+q+1}^{p+q+r}(h_j r_k - h_k r_j) \quad (\text{interactions between } e_j \text{ and } e_k \text{ for } p+1 \le j \le p+q \text{ and } p+q+1 \le k \le p+q+r)\]
forward_k_vs_all(x: torch.Tensor) torch.FloatTensor[source]

Kvsall training

  1. Retrieve real-valued embedding vectors for heads and relations

  2. Construct head entity and relation embeddings according to \(Cl_{p,q,r}(\mathbb{R}^d)\).

  3. Perform the Clifford multiplication.

  4. Take the inner product of (3) with all entity embeddings.

This function and forward_k_vs_with_explicit() are functionally identical.

Parameters:

x (torch.LongTensor) – Shape (n, 2).

Return type:

torch.FloatTensor with shape (n, |E|).

apply_coefficients(h0, hp, hq, hk, r0, rp, rq, rk)[source]

Multiply each base vector by its scalar coefficient.

construct_cl_multivector(x: torch.FloatTensor, re: int, p: int, q: int, r: int) tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Construct a batch of multivectors Cl_{p,q,r}(mathbb{R}^d)

Parameters:

x (torch.FloatTensor) – Shape (n, d).

Returns:

  • a0 (torch.FloatTensor) – Scalar (grade-0) part.

  • ap (torch.FloatTensor) – p-blade part.

  • aq (torch.FloatTensor) – q-blade part.

  • ar (torch.FloatTensor) – r-blade part.

compute_sigma_pp(hp, rp)[source]

Compute

\[\sigma_{p,p}^* = \sum_{i=1}^{p-1}\sum_{i'=i+1}^{p}(x_iy_{i'}-x_{i'}y_i)\]

sigma_{pp} captures the pairwise interactions among the p bases. For instance, with p = 3 bases e_1, e_2, e_3, we compute the interactions e_1 e_2, e_1 e_3, and e_2 e_3. A reference implementation with two nested loops:

results = []
for i in range(p - 1):
    for k in range(i + 1, p):
        results.append(hp[:, :, i] * rp[:, :, k] - hp[:, :, k] * rp[:, :, i])
sigma_pp = torch.stack(results, dim=2)
assert sigma_pp.shape == (b, r, int((p * (p - 1)) / 2))

This computation is inefficient. Instead, we compute all p x p interactions at once and then select the upper triangle without the diagonal: e1e2, e1e3, e2e3.

compute_sigma_qq(hq, rq)[source]

Compute

\[\sigma_{q,q}^* = \sum_{j=p+1}^{p+q-1}\sum_{j'=j+1}^{p+q}(x_jy_{j'}-x_{j'}y_j) Eq. 16\]

sigma_{qq} captures the pairwise interactions among the q bases. For instance, with q = 3 bases e_1, e_2, e_3, we compute the interactions e_1 e_2, e_1 e_3, and e_2 e_3. A reference implementation with two nested loops:

results = []
for j in range(q - 1):
    for k in range(j + 1, q):
        results.append(hq[:, :, j] * rq[:, :, k] - hq[:, :, k] * rq[:, :, j])
sigma_qq = torch.stack(results, dim=2)
assert sigma_qq.shape == (b, r, int((q * (q - 1)) / 2))

This computation is inefficient. Instead, we compute all q x q interactions at once and then select the upper triangle without the diagonal: e1e2, e1e3, e2e3.

compute_sigma_rr(hk, rk)[source]
\[\sigma_{r,r}^* = \sum_{k=p+q+1}^{p+q+r-1}\sum_{k'=k+1}^{p+q+r}(x_ky_{k'}-x_{k'}y_k)\]
compute_sigma_pq(*, hp, hq, rp, rq)[source]

Compute

\[\sum_{i=1}^{p} \sum_{j=p+1}^{p+q} (h_i r_j - h_j r_i) e_i e_j\]

A reference implementation with explicit loops:

sigma_pq = torch.zeros(b, r, p, q)
for i in range(p):
    for j in range(q):
        sigma_pq[:, :, i, j] = hp[:, :, i] * rq[:, :, j] - hq[:, :, j] * rp[:, :, i]

compute_sigma_pr(*, hp, hk, rp, rk)[source]

Compute

\[\sum_{i=1}^{p} \sum_{k=p+q+1}^{p+q+r} (h_i r_k - h_k r_i) e_i e_k\]

A reference implementation with explicit loops (analogous to compute_sigma_pq):

sigma_pr = torch.zeros(b, re, p, r)
for i in range(p):
    for k in range(r):
        sigma_pr[:, :, i, k] = hp[:, :, i] * rk[:, :, k] - hk[:, :, k] * rp[:, :, i]

compute_sigma_qr(*, hq, hk, rq, rk)[source]
\[\sum_{j=p+1}^{p+q} \sum_{k=p+q+1}^{p+q+r} (h_j r_k - h_k r_j) e_j e_k\]

A reference implementation with explicit loops (analogous to compute_sigma_pq):

sigma_qr = torch.zeros(b, re, q, r)
for j in range(q):
    for k in range(r):
        sigma_qr[:, :, j, k] = hq[:, :, j] * rk[:, :, k] - hk[:, :, k] * rq[:, :, j]

class dicee.models.KeciTransformer(args)[source]

Bases: Keci

Keci with Transformer architecture.

Concatenates h0, hp, hq, r0, rp, rq into a single embedding vector and processes through transformer.

name = 'KeciTransformer'
use_clifford_mul
seq_len = 1
transformer
lm_head
forward_k_vs_all(x: torch.Tensor) torch.FloatTensor[source]

Kvsall training

Parameters:

x (torch.LongTensor) – Shape (n, 2).

Return type:

torch.FloatTensor with shape (n, |E|).

class dicee.models.BaseKGE(args: dict)[source]

Bases: BaseKGELightning

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum:

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.

args
embedding_dim = None
num_entities = None
num_relations = None
num_tokens = None
learning_rate = None
apply_unit_norm = None
input_dropout_rate = None
hidden_dropout_rate = None
optimizer_name = None
feature_map_dropout_rate = None
kernel_size = None
num_of_output_channels = None
weight_decay = None
loss
selected_optimizer = None
normalizer_class = None
normalize_head_entity_embeddings
normalize_relation_embeddings
normalize_tail_entity_embeddings
hidden_normalizer
param_init
input_dp_ent_real
input_dp_rel_real
hidden_dropout
loss_history = []
byte_pair_encoding
max_length_subword_tokens
block_size
forward_byte_pair_encoded_k_vs_all(x: torch.LongTensor) torch.FloatTensor[source]

KvsAll scoring for BPE-encoded head entities and relations.

Retrieves subword-unit embeddings for the head entity and relation, reduces them to fixed-size vectors via a linear projection, then scores against all BPE entity embeddings.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) BPE token indices where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

Shape (batch_size, num_bpe_entities) score matrix.

Return type:

torch.FloatTensor

forward_byte_pair_encoded_triple(x: Tuple[torch.LongTensor, torch.LongTensor]) torch.FloatTensor[source]

NegSample scoring for BPE-encoded (head, relation, tail) triples.

Retrieves subword-unit embeddings for all three elements and reduces them to fixed-size vectors via a linear projection before computing the triple score.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) BPE token indices.

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

init_params_with_sanity_checking() None[source]

Populate model hyper-parameters from self.args with safe defaults.

Reads embedding dimension, learning rate, dropout rates, normalisation strategy, optimizer name, and parameter initialisation scheme from the args dict. Falls back to sensible defaults for any missing key so that minimal args dicts (e.g. for unit tests) are still valid.

forward(x: torch.LongTensor | Tuple[torch.LongTensor, torch.LongTensor], y_idx: torch.LongTensor = None) torch.FloatTensor[source]

Route the forward pass to the appropriate scoring method.

Inspects the shape and type of x to decide which low-level scorer to call:

Parameters:
  • x (torch.LongTensor or Tuple[torch.LongTensor, torch.LongTensor]) – Either a plain index tensor or a (triple_idx, target_idx) tuple for sample-based labelling.

  • y_idx (torch.LongTensor, optional) – Target entity indices used by forward_k_vs_sample(). Ignored when x is a plain tensor.

Returns:

Score tensor whose shape depends on the selected scorer.

Return type:

torch.FloatTensor

forward_triples(x: torch.LongTensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

forward_k_vs_all(*args, **kwargs)[source]

Score a (head, relation) batch against every entity.

Sub-classes must override this method. The default implementation raises ValueError to make missing overrides obvious at runtime.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_sample(*args, **kwargs)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets. Sub-classes that support sample-based labelling must override this method.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

get_triple_representation(idx_hrt) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for a triple index batch.

Parameters:

idx_hrt (torch.LongTensor) – Shape (batch_size, 3) integer tensor with columns [head_idx, relation_idx, tail_idx].

Returns:

head_ent_emb, rel_ent_emb, tail_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_head_relation_representation(indexed_triple) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for head entities and relations.

Parameters:

indexed_triple (torch.LongTensor) – Shape (batch_size, 2) integer tensor with columns [head_idx, relation_idx].

Returns:

head_ent_emb, rel_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_sentence_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve BPE subword-unit embeddings for a batch of triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) where T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb, tail_emb – Each has shape (batch_size, T, embedding_dim).

Return type:

torch.FloatTensor

get_bpe_head_and_relation_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve unit-normalised BPE embeddings for head entities and relations.

Each entity/relation is represented as a sequence of T subword tokens. Their token embeddings are L2-normalised across the sequence dimension so that the resulting matrix has unit Frobenius norm.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb – Each has shape (batch_size, T, embedding_dim), L2-normalised over the (T, D) dimensions.

Return type:

torch.FloatTensor

get_embeddings() Tuple[numpy.ndarray, numpy.ndarray][source]

Return the entity and relation embedding matrices as numpy arrays.

Returns:

  • entity_embeddings (numpy.ndarray) – Shape (num_entities, embedding_dim).

  • relation_embeddings (numpy.ndarray) – Shape (num_relations, embedding_dim).

class dicee.models.PykeenKGE(args: dict)[source]

Bases: dicee.models.base_model.BaseKGE

A class for using knowledge graph embedding models implemented in Pykeen

Notes: supported wrappers include Pykeen_DistMult, Pykeen_ComplEx, Pykeen_QuatE, Pykeen_MuRE, Pykeen_CP, Pykeen_HolE, Pykeen_TransD, Pykeen_TransE, Pykeen_TransF, Pykeen_TransH, and Pykeen_TransR.

model_kwargs
name
model
loss_history = []
args
entity_embeddings = None
relation_embeddings = None
forward_k_vs_all(x: torch.LongTensor)[source]

Explicit version so that batch normalisation and dropout can be applied:

# (1) Retrieve embeddings of heads and relations; apply dropout & normalisation if configured.
h, r = self.get_head_relation_representation(x)
# (2) Reshape (1).
if self.last_dim > 0:
    h = h.reshape(len(x), self.embedding_dim, self.last_dim)
    r = r.reshape(len(x), self.embedding_dim, self.last_dim)
# (3) Reshape all entities.
if self.last_dim > 0:
    t = self.entity_embeddings.weight.reshape(self.num_entities, self.embedding_dim, self.last_dim)
else:
    t = self.entity_embeddings.weight
# (4) Call score_t from the interaction to generate triple scores.
return self.interaction.score_t(h=h, r=r, all_entities=t, slice_size=1)

forward_triples(x: torch.LongTensor) torch.FloatTensor[source]

Explicit version so that batch normalisation and dropout can be applied:

# (1) Retrieve embeddings of heads, relations and tails; apply dropout & normalisation if configured.
h, r, t = self.get_triple_representation(x)
# (2) Reshape (1).
if self.last_dim > 0:
    h = h.reshape(len(x), self.embedding_dim, self.last_dim)
    r = r.reshape(len(x), self.embedding_dim, self.last_dim)
    t = t.reshape(len(x), self.embedding_dim, self.last_dim)
# (3) Compute the triple score.
return self.interaction.score(h=h, r=r, t=t, slice_size=None, slice_dim=0)

abstractmethod forward_k_vs_sample(x: torch.LongTensor, target_entity_idx)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets. Sub-classes that support sample-based labelling must override this method.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

class dicee.models.BaseKGE(args: dict)[source]

Bases: BaseKGELightning

Base class for all Knowledge Graph Embedding models.

Inherits the Lightning training loop from BaseKGELightning and adds the embedding tables, normalisation / dropout layers, and the routing logic that dispatches forward() calls to the appropriate scoring method.

Sub-classes must implement at minimum the low-level scoring methods listed below (e.g. forward_triples and forward_k_vs_all).

Parameters:

args (dict) – Flat configuration dictionary produced by vars(argparse.Namespace). Required keys: embedding_dim, num_entities, num_relations, learning_rate (or lr), optim, scoring_technique.
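
The required keys can be gathered in a plain dict, e.g. via vars(argparse.Namespace). A hypothetical minimal sketch (the key names follow the documentation above; the values are illustrative, not library defaults):

```python
# Hypothetical minimal args dict for BaseKGE; values are illustrative only.
args = {
    "embedding_dim": 32,
    "num_entities": 100,
    "num_relations": 20,
    "learning_rate": 0.01,
    "optim": "Adam",
    "scoring_technique": "KvsAll",
}

# Per init_params_with_sanity_checking(), any other hyper-parameter
# missing from this dict falls back to a safe default.
required = {"embedding_dim", "num_entities", "num_relations",
            "learning_rate", "optim", "scoring_technique"}
```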

args
embedding_dim = None
num_entities = None
num_relations = None
num_tokens = None
learning_rate = None
apply_unit_norm = None
input_dropout_rate = None
hidden_dropout_rate = None
optimizer_name = None
feature_map_dropout_rate = None
kernel_size = None
num_of_output_channels = None
weight_decay = None
loss
selected_optimizer = None
normalizer_class = None
normalize_head_entity_embeddings
normalize_relation_embeddings
normalize_tail_entity_embeddings
hidden_normalizer
param_init
input_dp_ent_real
input_dp_rel_real
hidden_dropout
loss_history = []
byte_pair_encoding
max_length_subword_tokens
block_size
forward_byte_pair_encoded_k_vs_all(x: torch.LongTensor) torch.FloatTensor[source]

KvsAll scoring for BPE-encoded head entities and relations.

Retrieves subword-unit embeddings for the head entity and relation, reduces them to fixed-size vectors via a linear projection, then scores against all BPE entity embeddings.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) BPE token indices where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

Shape (batch_size, num_bpe_entities) score matrix.

Return type:

torch.FloatTensor

forward_byte_pair_encoded_triple(x: Tuple[torch.LongTensor, torch.LongTensor]) torch.FloatTensor[source]

NegSample scoring for BPE-encoded (head, relation, tail) triples.

Retrieves subword-unit embeddings for all three elements and reduces them to fixed-size vectors via a linear projection before computing the triple score.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) BPE token indices.

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

init_params_with_sanity_checking() None[source]

Populate model hyper-parameters from self.args with safe defaults.

Reads embedding dimension, learning rate, dropout rates, normalisation strategy, optimizer name, and parameter initialisation scheme from the args dict. Falls back to sensible defaults for any missing key so that minimal args dicts (e.g. for unit tests) are still valid.

forward(x: torch.LongTensor | Tuple[torch.LongTensor, torch.LongTensor], y_idx: torch.LongTensor = None) torch.FloatTensor[source]

Route the forward pass to the appropriate scoring method.

Inspects the shape and type of x to decide which low-level scorer to call.

Parameters:
  • x (torch.LongTensor or Tuple[torch.LongTensor, torch.LongTensor]) – Either a plain index tensor or a (triple_idx, target_idx) tuple for sample-based labelling.

  • y_idx (torch.LongTensor, optional) – Target entity indices used by forward_k_vs_sample(). Ignored when x is a plain tensor.

Returns:

Score tensor whose shape depends on the selected scorer.

Return type:

torch.FloatTensor
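
The routing described above can be sketched in plain Python (the real dispatch lives in BaseKGE.forward(); these names and branch conditions are illustrative):

```python
from types import SimpleNamespace

# Illustrative sketch of the forward() dispatch: a tuple input signals
# sample-based labelling, otherwise the second dimension of the index
# tensor selects the scorer.
def route(x, y_idx=None):
    if isinstance(x, tuple):      # (triple_idx, target_idx) pair
        return "forward_k_vs_sample"
    if x.shape[1] == 3:           # (batch_size, 3) index triples
        return "forward_triples"
    return "forward_k_vs_all"     # (batch_size, 2) head-relation pairs

batch = SimpleNamespace(shape=(8, 3))  # stand-in for a torch.LongTensor
```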

forward_triples(x: torch.LongTensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor
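
As a concrete instance of triple scoring, DistMult (listed above) sums the elementwise product of the three embeddings. A minimal pure-Python sketch, with plain lists standing in for embedding rows:

```python
# DistMult-style triple score: sum_i h_i * r_i * t_i over the
# embedding dimension.
def distmult_score(h, r, t):
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

score = distmult_score([1.0, 2.0], [0.5, 1.0], [2.0, 3.0])  # 1.0 + 6.0 = 7.0
```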

forward_k_vs_all(*args, **kwargs)[source]

Score a (head, relation) batch against every entity.

Sub-classes must override this method. The default implementation raises ValueError to make missing overrides obvious at runtime.

Returns:

Shape (batch_size, num_entities) score matrix.

Return type:

torch.FloatTensor

forward_k_vs_sample(*args, **kwargs)[source]

Score a (head, relation) batch against a sampled subset of entities.

Used by KvsSample and 1vsSample datasets. Sub-classes that support sample-based labelling must override this method.

Returns:

Shape (batch_size, k) score matrix where k is the number of sampled target entities.

Return type:

torch.FloatTensor

get_triple_representation(idx_hrt) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for a triple index batch.

Parameters:

idx_hrt (torch.LongTensor) – Shape (batch_size, 3) integer tensor with columns [head_idx, relation_idx, tail_idx].

Returns:

head_ent_emb, rel_ent_emb, tail_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_head_relation_representation(indexed_triple) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve and normalise embedding vectors for head entities and relations.

Parameters:

indexed_triple (torch.LongTensor) – Shape (batch_size, 2) integer tensor with columns [head_idx, relation_idx].

Returns:

head_ent_emb, rel_ent_emb – Each has shape (batch_size, embedding_dim) after applying the configured dropout and normalisation.

Return type:

torch.FloatTensor

get_sentence_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor, torch.FloatTensor][source]

Retrieve BPE subword-unit embeddings for a batch of triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3, T) where T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb, tail_emb – Each has shape (batch_size, T, embedding_dim).

Return type:

torch.FloatTensor

get_bpe_head_and_relation_representation(x: torch.LongTensor) Tuple[torch.FloatTensor, torch.FloatTensor][source]

Retrieve unit-normalised BPE embeddings for head entities and relations.

Each entity/relation is represented as a sequence of T subword tokens. Their token embeddings are L2-normalised across the sequence dimension so that the resulting matrix has unit Frobenius norm.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 2, T) where dim 1 indexes [head, relation] and T is max_length_subword_tokens.

Returns:

head_ent_emb, rel_emb – Each has shape (batch_size, T, embedding_dim), L2-normalised over the (T, D) dimensions.

Return type:

torch.FloatTensor
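
The unit-Frobenius normalisation described above can be sketched for a single (T, D) token-embedding matrix (pure Python; the function name is ours):

```python
import math

# Scale a (T, D) matrix so its Frobenius norm is 1, mirroring the
# L2-normalisation over the (T, D) dimensions described above.
def unit_frobenius(mat):
    norm = math.sqrt(sum(v * v for row in mat for v in row))
    return [[v / norm for v in row] for row in mat]

emb = unit_frobenius([[3.0, 0.0], [0.0, 4.0]])  # Frobenius norm 5 -> scaled by 1/5
```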

get_embeddings() Tuple[numpy.ndarray, numpy.ndarray][source]

Return the entity and relation embedding matrices as numpy arrays.

Returns:

  • entity_embeddings (numpy.ndarray) – Shape (num_entities, embedding_dim).

  • relation_embeddings (numpy.ndarray) – Shape (num_relations, embedding_dim).

class dicee.models.FMult(args)[source]

Bases: dicee.models.base_model.BaseKGE

Learning Knowledge Neural Graphs

name = 'FMult'
entity_embeddings
relation_embeddings
k
num_sample = 50
gamma
roots
weights
compute_func(weights: torch.FloatTensor, x) torch.FloatTensor[source]
chain_func(weights, x: torch.FloatTensor)[source]
forward_triples(idx_triple: torch.Tensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

class dicee.models.GFMult(args)[source]

Bases: dicee.models.base_model.BaseKGE

Learning Knowledge Neural Graphs

name = 'GFMult'
entity_embeddings
relation_embeddings
k
num_sample = 250
roots
weights
compute_func(weights: torch.FloatTensor, x) torch.FloatTensor[source]
chain_func(weights, x: torch.FloatTensor)[source]
forward_triples(idx_triple: torch.Tensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

class dicee.models.FMult2(args)[source]

Bases: dicee.models.base_model.BaseKGE

Learning Knowledge Neural Graphs

name = 'FMult2'
n_layers = 3
k
n = 50
score_func = 'compositional'
discrete_points
entity_embeddings
relation_embeddings
build_func(Vec)[source]
build_chain_funcs(list_Vec)[source]
compute_func(W, b, x) torch.FloatTensor[source]
function(list_W, list_b)[source]
trapezoid(list_W, list_b)[source]
forward_triples(idx_triple: torch.Tensor) torch.Tensor[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

class dicee.models.LFMult1(args)[source]

Bases: dicee.models.base_model.BaseKGE

Embedding with trigonometric functions. We represent all entities and relations in the complex number space as f(x) = sum_{k=0}^{d-1} w_k e^{ikx}, and use the three different scoring functions from the paper to evaluate the score.

name = 'LFMult1'
entity_embeddings
relation_embeddings
forward_triples(idx_triple)[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

tri_score(h, r, t)[source]
vtp_score(h, r, t)[source]
class dicee.models.LFMult(args)[source]

Bases: dicee.models.base_model.BaseKGE

Embedding with polynomial functions. We represent all entities and relations in the polynomial space as f(x) = sum_{i=0}^{d-1} a_i x^{i mod d}, and use the three different scoring functions from the paper to evaluate the score. We also consider combining with neural networks.

name = 'LFMult'
entity_embeddings
relation_embeddings
degree
m
x_values
forward_triples(idx_triple)[source]

Score a batch of (head, relation, tail) index triples.

Parameters:

x (torch.LongTensor) – Shape (batch_size, 3) integer tensor where each row is [head_idx, relation_idx, tail_idx].

Returns:

Shape (batch_size,) triple scores.

Return type:

torch.FloatTensor

construct_multi_coeff(x)[source]
poly_NN(x, coefh, coefr, coeft)[source]

Construct a 2-layer NN to represent the embeddings: h = sigma(w_h^T x + b_h), r = sigma(w_r^T x + b_r), t = sigma(w_t^T x + b_t).

linear(x, w, b)[source]
scalar_batch_NN(a, b, c)[source]

Element-wise multiplication between a, b and c.

Inputs: a, b, c – torch.Tensor of size (batch_size, m, d). Output: a tensor of size (batch_size, d).

tri_score(coeff_h, coeff_r, coeff_t)[source]

Implements the trilinear scoring technique:

score(h, r, t) = int_0^1 h(x) r(x) t(x) dx = sum_{i,j,k=0}^{d-1} a_i b_j c_k / (1 + (i+j+k) mod d)

  1. Generate the ranges for i, j and k over [0, d-1].

  2. Compute a_i b_j c_k / (1 + (i+j+k) mod d) in parallel for every batch.

  3. Take the sum over each batch.
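
The sum above can be checked with a direct (unvectorised) pure-Python version; a, b and c stand for the coefficient vectors of h, r and t:

```python
# Hedged sketch of the trilinear score: a triple loop over the stated sum
# (the library version vectorises this over the batch).
def tri_score(a, b, c):
    d = len(a)
    return sum(
        a[i] * b[j] * c[k] / (1 + (i + j + k) % d)
        for i in range(d) for j in range(d) for k in range(d)
    )

score = tri_score([2.0], [3.0], [4.0])  # d=1: 2*3*4 / 1 = 24.0
```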

vtp_score(h, r, t)[source]

Implements the vector triple product scoring technique:

score(h, r, t) = sum_{i,j,k=0}^{d-1} (a_i c_j b_k - b_i c_j a_k) / ((1 + (i+j) mod d)(1 + k))

  1. Generate the ranges for i, j and k over [0, d-1].

  2. Compute the first and second terms of the sum.

  3. Divide by the denominator.

  4. Take the sum over each batch.

comp_func(h, r, t)[source]

Implements the function composition scoring technique, i.e. score = <h ∘ r, t>.

polynomial(coeff, x, degree)[source]

Evaluates a batch of polynomials. Takes a matrix tensor of coefficients (coeff), a tensor of points x and the integer range [0, 1, …, d], and returns the tensor

(coeff[0][0] + coeff[0][1] x + … + coeff[0][d] x^d,
coeff[1][0] + coeff[1][1] x + … + coeff[1][d] x^d, …)
pop(coeff, x, degree)[source]

Evaluates the composition of two polynomials without for-loops. Takes a matrix tensor of coefficients (coeff), a matrix tensor of points x and the integer range [0, 1, …, d], and returns the tensor

(coeff[0][0] + coeff[0][1] x + … + coeff[0][d] x^d,
coeff[1][0] + coeff[1][1] x + … + coeff[1][d] x^d, …)
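
A minimal unbatched sketch of the polynomial evaluation described above (the function name mirrors the method; the library version operates on tensors):

```python
# Evaluate one coefficient row per embedding at a scalar point x:
# row -> row[0] + row[1]*x + ... + row[degree]*x^degree.
def polynomial(coeff, x, degree):
    return [sum(row[p] * x ** p for p in range(degree + 1)) for row in coeff]

values = polynomial([[1.0, 2.0], [0.0, 3.0]], 2.0, 1)  # [1+2*2, 0+3*2] = [5.0, 6.0]
```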
class dicee.models.DualE(args)[source]

Bases: dicee.models.base_model.BaseKGE

Dual Quaternion Knowledge Graph Embeddings (https://ojs.aaai.org/index.php/AAAI/article/download/16850/16657)

name = 'DualE'
entity_embeddings
relation_embeddings
num_ent = None
kvsall_score(e_1_h, e_2_h, e_3_h, e_4_h, e_5_h, e_6_h, e_7_h, e_8_h, e_1_t, e_2_t, e_3_t, e_4_t, e_5_t, e_6_t, e_7_t, e_8_t, r_1, r_2, r_3, r_4, r_5, r_6, r_7, r_8) torch.tensor[source]

KvsAll scoring function.

Input: the eight dual quaternion components of the head entities (e_1_h … e_8_h), the tail entities (e_1_t … e_8_t) and the relations (r_1 … r_8).

Output: torch.FloatTensor with (n,) shape.

forward_triples(idx_triple: torch.tensor) torch.tensor[source]

Negative sampling forward pass.

Input: x – torch.LongTensor with (n, 3) shape.

Output: torch.FloatTensor with (n,) shape.

forward_k_vs_all(x)[source]

KvsAll forward pass.

Input: x – torch.LongTensor with (n, 2) shape.

Output: torch.FloatTensor with (n, num_entities) shape.

T(x: torch.tensor) torch.tensor[source]

Transpose function.

Input: tensor with shape (n, m). Output: tensor with shape (m, n).