dicee.knowledge_graph_embeddings
Classes
Knowledge Graph Embedding Class for interactive usage of pre-trained models
Module Contents
- class dicee.knowledge_graph_embeddings.KGE(path=None, url=None, construct_ensemble=False, model_name=None)[source]
Bases:
dicee.abstracts.BaseInteractiveKGE, dicee.abstracts.InteractiveQueryDecomposition, dicee.abstracts.BaseInteractiveTrainKGE
Knowledge Graph Embedding Class for interactive usage of pre-trained models
- get_transductive_entity_embeddings(indices: torch.LongTensor | List[str], as_pytorch=False, as_numpy=False, as_list=True) torch.FloatTensor | numpy.ndarray | List[float][source]
- create_vector_database(collection_name: str, distance: str, location: str = 'localhost', port: int = 6333)[source]
- predict_missing_head_entity(relation: List[str] | str, tail_entity: List[str] | str, within=None, batch_size=2, topk=1, return_indices=False) Tuple[source]
Given a relation and a tail entity, return the top-k ranked head entities.
argmax_{e in E } f(e,r,t), where r in R, t in E.
Parameter
relation: Union[List[str], str]
String representation of selected relations.
tail_entity: Union[List[str], str]
String representation of selected entities.
topk: int
Number of highest-ranked head entities to return.
Returns: Tuple
Highest topk scores and the corresponding entities
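The argmax formulation above can be illustrated with a minimal sketch that does not use dicee at all. The embeddings, the DistMult-style `score` function, and the helper `rank_heads` are all hypothetical stand-ins for the trained model's f(e, r, t); the library's own implementation may differ.

```python
from typing import Dict, List, Tuple

# Toy 2-dimensional embeddings (hypothetical values, for illustration only).
ENTITY_EMB: Dict[str, List[float]] = {
    "Mongolia": [0.9, 0.1],
    "France": [0.1, 0.8],
    "Asia": [1.0, 0.0],
}
RELATION_EMB: Dict[str, List[float]] = {"isLocatedIn": [1.0, 1.0]}


def score(h: str, r: str, t: str) -> float:
    """DistMult-style trilinear score; stands in for the model's f(h, r, t)."""
    return sum(
        a * b * c
        for a, b, c in zip(ENTITY_EMB[h], RELATION_EMB[r], ENTITY_EMB[t])
    )


def rank_heads(r: str, t: str, topk: int = 1) -> List[Tuple[str, float]]:
    """Score every entity as a candidate head and keep the topk best."""
    scored = [(e, score(e, r, t)) for e in ENTITY_EMB]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topk]


best = rank_heads("isLocatedIn", "Asia", topk=2)
```

The same pattern applies to missing relations and missing tails: only the slot that is enumerated changes.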
- predict_missing_relations(head_entity: List[str] | str, tail_entity: List[str] | str, within=None, batch_size=2, topk=1, return_indices=False) Tuple[source]
Given a head entity and a tail entity, return the top-k ranked relations.
argmax_{r in R } f(h,r,t), where h, t in E.
Parameter
head_entity: List[str]
String representation of selected entities.
tail_entity: List[str]
String representation of selected entities.
topk: int
Number of highest-ranked relations to return.
Returns: Tuple
Highest topk scores and the corresponding relations
- predict_missing_tail_entity(head_entity: List[str] | str, relation: List[str] | str, within: List[str] = None, batch_size=2, topk=1, return_indices=False) torch.FloatTensor[source]
Given a head entity and a relation, return the top-k ranked tail entities.
argmax_{e in E } f(h,r,e), where h in E and r in R.
Parameter
head_entity: List[str]
String representation of selected entities.
relation: List[str]
String representation of selected relations.
Returns: Tuple
scores
- predict(*, h: List[str] | str | None = None, r: List[str] | str | None = None, t: List[str] | str | None = None, within: List[str] | None = None, logits: bool = True) torch.FloatTensor[source]
Predict scores for triples or missing triple elements.
- Parameters:
h – Head entity/entities. None to predict heads.
r – Relation/relations. None to predict relations.
t – Tail entity/entities. None to predict tails.
within – Optional list of entities to restrict predictions to.
logits – If True, return raw scores. If False, return sigmoid scores (0-1).
- Returns:
Tensor of scores whose shape depends on the query type.
Single triple (h, r, t): scalar score
Missing element: vector of all possible scores
- Return type:
torch.FloatTensor
- Raises:
AssertionError – If inputs are not strings or lists of strings.
Examples
>>> # Score a specific triple
>>> model.predict(h="Mongolia", r="isLocatedIn", t="Asia", logits=False)
tensor(0.9523)
>>> # Get scores for all possible tail entities
>>> model.predict(h="Mongolia", r="isLocatedIn", t=None)
tensor([0.21, 0.95, 0.03, ...])  # One score per entity
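The effect of the `logits` flag can be sketched independently of the library. The snippet below assumes only what the parameter description states, namely that `logits=False` maps raw scores through a sigmoid into the range (0, 1); the raw score values are made up.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping a raw score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


raw_scores: List[float] = [-1.3, 3.0, 0.0]  # hypothetical raw scores (logits)
probabilities = [sigmoid(s) for s in raw_scores]

# The sigmoid is monotonic, so the ranking of candidates is unchanged.
best_raw = max(range(len(raw_scores)), key=raw_scores.__getitem__)
best_prob = max(range(len(probabilities)), key=probabilities.__getitem__)
```

Because the ranking is preserved, the choice between logits and sigmoid scores matters mainly when scores are compared against a threshold or combined across queries.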
- predict_topk(*, h: str | List[str] | None = None, r: str | List[str] | None = None, t: str | List[str] | None = None, topk: int = 10, within: List[str] | None = None, batch_size: int = 1024) List[Tuple[str, float]] | List[List[Tuple[str, float]]][source]
Predict top-k missing items in a given triple pattern.
- Parameters:
h – Head entity/entities. None to predict heads.
r – Relation/relations. None to predict relations.
t – Tail entity/entities. None to predict tails.
topk – Number of top predictions to return.
within – Optional list of entities to restrict predictions to.
batch_size – Batch size for processing multiple queries.
- Returns:
For a single query: List[(item, score), …] of length topk. For a batch query: list of such lists, one per query.
- Return type:
List[Tuple[str, float]] | List[List[Tuple[str, float]]]
- Raises:
AssertionError – If more than one of h, r, t is None.
AssertionError – If the required arguments for a query type are None.
Examples
>>> model.predict_topk(h=["Mongolia"], r=["isLocatedIn"], topk=3)
[('Asia', 0.99), ('Europe', 0.02), ...]
>>> model.predict_topk(r=["isLocatedIn"], t=["Asia"], topk=5)
[('Mongolia', 0.85), ('China', 0.82), ...]
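The triple-pattern check and the top-k selection described above can be sketched in plain Python. The helpers `missing_slot` and `top_k` are hypothetical, and the sketch assumes exactly one of h, r, t is left open; the candidate scores are invented.

```python
import heapq
from typing import Dict, List, Optional, Tuple


def missing_slot(h: Optional[str], r: Optional[str], t: Optional[str]) -> str:
    """Return which slot ('h', 'r' or 't') is left open in the pattern."""
    slots = (("h", h), ("r", r), ("t", t))
    open_slots = [name for name, value in slots if value is None]
    # Assumption for this sketch: exactly one slot is open.
    assert len(open_slots) == 1, "exactly one of h, r, t must be None"
    return open_slots[0]


def top_k(scores: Dict[str, float], k: int) -> List[Tuple[str, float]]:
    """Keep the k best-scoring candidates, highest score first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


slot = missing_slot(h="Mongolia", r="isLocatedIn", t=None)
candidates = {"Asia": 0.99, "Europe": 0.02, "Africa": 0.01}  # hypothetical
ranked = top_k(candidates, k=2)
```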
- triple_score(h: List[str] | str = None, r: List[str] | str = None, t: List[str] | str = None, logits=False) torch.FloatTensor[source]
Predict triple score
Parameter
h: List[str] | str
String representation of selected head entities.
r: List[str] | str
String representation of selected relations.
t: List[str] | str
String representation of selected tail entities.
logits: bool
If True, the unnormalized score is returned.
Returns: torch.FloatTensor
PyTorch tensor of triple scores
- single_hop_query_answering(query: tuple, only_scores: bool = True, k: int = None, use_logits: bool = True)[source]
- answer_multi_hop_query(query_type: str | None = None, query: Tuple[str | Tuple[str, str], Ellipsis] | None = None, queries: List[Tuple[str | Tuple[str, str], Ellipsis]] | None = None, tnorm: str = 'prod', neg_norm: str = 'standard', lambda_: float = 0.0, k: int = 10, only_scores: bool = False, use_logits: bool = True) List[Tuple[str, torch.Tensor]] | List[List[Tuple[str, torch.Tensor]]][source]
Answer multi-hop EPFO (Existential Positive First-Order) queries.
Supports 9 query types: 1p, 2p, 3p, 2i, 3i, ip, pi, 2u, up. See docs/guides/multi_hop_queries.md for detailed query patterns.
- Parameters:
query_type – Query pattern name. One of:
  - 1p: (e, (r,)) # One-hop
  - 2p: (e, (r1, r2)) # Two-hop
  - 3p: (e, (r1, r2, r3)) # Three-hop
  - 2i: ((e1, (r1,)), (e2, (r2,))) # Two-way intersection
  - 3i: ((e1, (r1,)), (e2, (r2,)), (e3, (r3,))) # Three-way intersection
  - ip: (((e1, (r1,)), (e2, (r2,))), (r3,)) # Intersection + projection
  - pi: ((e, (r1, r2)), (r3,)) # Projection + intersection (2i meets 2p)
  - 2u: ((e1, (r1,)), (e2, (r2,))) # Two-way union
  - up: ((e, (r1, r2)), (e, (r3,))) # Union + projection
query – Single query tuple matching the query_type pattern.
queries – Batch of queries. If provided, query must be None.
tnorm – T-norm for intersection/union. Options: “prod”, “min”.
neg_norm – Negation norm. Options: “standard”, “sugeno”, “yager”.
lambda_ – Parameter for sugeno and yager negation (0.0-1.0).
k – Number of top answer entities to return.
only_scores – If True, return only scores tensor. If False, return (entity, score) tuples.
use_logits – If True, use raw model logits. If False, use sigmoid probabilities.
- Returns:
For a single query: List[(entity, score), …] of top-k answers. For batch queries: list of such lists, one per query.
- Return type:
List[Tuple[str, torch.Tensor]] | List[List[Tuple[str, torch.Tensor]]]
- Raises:
ValueError – If query_type is not in {1p, 2p, 3p, 2i, 3i, ip, pi, 2u, up}.
AssertionError – If query structure doesn’t match query_type pattern.
Examples
>>> # 1p: Find entities located in Asia
>>> model.answer_multi_hop_query(
...     query_type="1p",
...     query=("Asia", ("isLocatedIn",)),
...     k=5
... )
[("Mongolia", 0.92), ("China", 0.89), ...]
>>> # 2p: Two-hop query (e.g., "capital of countries in Europe")
>>> model.answer_multi_hop_query(
...     query_type="2p",
...     query=("Europe", ("isLocatedIn", "hasCapital")),
...     k=3
... )
[("Paris", 0.85), ("Berlin", 0.82), ...]
>>> # 2i: Intersection query
>>> model.answer_multi_hop_query(
...     query_type="2i",
...     query=(("Asia", ("isLocatedIn",)), ("Mountains", ("hasGeography",))),
...     k=5
... )
[("Nepal", 0.78), ("Tibet", 0.65), ...]
See also
docs/guides/multi_hop_queries.md: Complete guide with all query patterns
tests/test_answer_multi_hop_query.py: Usage examples
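How the `tnorm` option combines branch scores in intersection and union queries can be sketched with the textbook fuzzy-logic definitions. This is an illustration only: the functions `t_norm`, `t_conorm`, and `combine` are hypothetical, the per-entity scores are invented, and the library's internal handling (including which t-conorm it pairs with each t-norm) is assumed rather than confirmed.

```python
from typing import Callable, Dict


def t_norm(a: float, b: float, kind: str = "prod") -> float:
    """Fuzzy conjunction of two scores in [0, 1]."""
    if kind == "prod":
        return a * b
    if kind == "min":
        return min(a, b)
    raise ValueError(f"unknown t-norm: {kind}")


def t_conorm(a: float, b: float, kind: str = "prod") -> float:
    """Fuzzy disjunction dual to the chosen t-norm."""
    if kind == "prod":
        return a + b - a * b  # probabilistic sum
    if kind == "min":
        return max(a, b)
    raise ValueError(f"unknown t-norm: {kind}")


def combine(
    left: Dict[str, float],
    right: Dict[str, float],
    op: Callable[[float, float, str], float],
    kind: str,
) -> Dict[str, float]:
    """Apply op entity-wise to two branch score tables."""
    return {e: op(left[e], right[e], kind) for e in left}


# Hypothetical sigmoid scores of each candidate answer for two query branches.
branch1 = {"Nepal": 0.9, "Tibet": 0.8, "France": 0.1}
branch2 = {"Nepal": 0.8, "Tibet": 0.5, "France": 0.6}

inter_prod = combine(branch1, branch2, t_norm, "prod")   # 2i with tnorm="prod"
inter_min = combine(branch1, branch2, t_norm, "min")     # 2i with tnorm="min"
union_prod = combine(branch1, branch2, t_conorm, "prod")  # 2u-style union
```

These operators are defined on values in [0, 1], so a sketch like this corresponds to working with sigmoid scores rather than raw logits.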
- find_missing_triples(confidence: float, entities: List[str] = None, relations: List[str] = None, topk: int = 10, at_most: int = sys.maxsize) Set[source]
Find missing triples
Iterate over a set of entities E and a set of relations R:
\forall e \in E and \forall r \in R, compute f(e,r,x).
Return every (e,r,x) with (e,r,x) \notin G and f(e,r,x) > confidence.
confidence: float
A threshold for an output of a sigmoid function given a triple.
topk: int
Number of highest-ranked items considered when selecting triples with f(e,r,x) > confidence.
at_most: int
Stop after finding at_most missing triples
{(e,r,x) | f(e,r,x) > confidence \land (e,r,x) \notin G}
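The set definition above can be sketched as a plain loop over a toy graph. Everything here is hypothetical: the function `missing_triples`, the lookup-table scorer, and the data. The sketch also omits the `topk` pre-selection step and enumerates every candidate x.

```python
import sys
from itertools import product
from typing import Callable, Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]


def missing_triples(
    entities: Iterable[str],
    relations: Iterable[str],
    known: Set[Triple],
    score: Callable[[str, str, str], float],
    confidence: float,
    at_most: int = sys.maxsize,
) -> Set[Triple]:
    """Collect (e, r, x) not in the graph whose score exceeds confidence."""
    entities = list(entities)
    found: Set[Triple] = set()
    for e, r in product(entities, relations):
        for x in entities:
            candidate = (e, r, x)
            if candidate not in known and score(e, r, x) > confidence:
                found.add(candidate)
                if len(found) >= at_most:  # stop early once enough are found
                    return found
    return found


# Hypothetical sigmoid scores; unseen triples default to 0.0.
TABLE: Dict[Triple, float] = {
    ("Mongolia", "isLocatedIn", "Asia"): 0.97,
    ("China", "isLocatedIn", "Asia"): 0.91,
    ("Asia", "isLocatedIn", "China"): 0.05,
}
graph = {("Mongolia", "isLocatedIn", "Asia")}  # G: triples already known

result = missing_triples(
    entities=["Mongolia", "China", "Asia"],
    relations=["isLocatedIn"],
    known=graph,
    score=lambda h, r, t: TABLE.get((h, r, t), 0.0),
    confidence=0.5,
)
```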
- predict_literals(entity: List[str] | str = None, attribute: List[str] | str = None, denormalize_preds: bool = True) numpy.ndarray[source]
Predicts literal values for given entities and attributes.
- Parameters:
entity (Union[List[str], str]) – Entity or list of entities to predict literals for.
attribute (Union[List[str], str]) – Attribute or list of attributes to predict literals for.
denormalize_preds (bool) – If True, denormalizes the predictions.
- Returns:
Predictions for the given entities and attributes.
- Return type:
numpy.ndarray