dicee.read_preprocess_save_load_kg

Submodules

Classes

PreprocessKG

Preprocess the data in memory

LoadSaveToDisk

ReadFromDisk

Read the data from disk into memory

Package Contents

class dicee.read_preprocess_save_load_kg.PreprocessKG(kg)

Preprocess the data in memory

kg
start() → None

Preprocess the train, validation and test datasets stored in the knowledge graph instance.

Parameters: none

Return type: None

preprocess_with_byte_pair_encoding()
preprocess_with_byte_pair_encoding_with_padding() → None
preprocess_with_pandas() → None

Preprocess the train, validation and test datasets stored in the knowledge graph instance using pandas:

  1. Add reciprocal or noisy triples

  2. Construct vocabulary

  3. Index datasets

Parameters: none

Return type: None
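The three steps above can be illustrated with a minimal pandas sketch. This is not the dicee implementation: the column names (`subject`, `relation`, `object`), the `_inverse` suffix for reciprocal relations, and the dictionary-based vocabularies are all assumptions made for illustration, and the noisy-triple variant of step 1 is left out.

```python
import pandas as pd

# Toy training split; the column names are an assumption made for this sketch.
train = pd.DataFrame(
    [("alice", "knows", "bob"), ("bob", "likes", "carol")],
    columns=["subject", "relation", "object"],
)

# Step 1: add a reciprocal triple (o, r_inverse, s) for every (s, r, o).
# The "_inverse" suffix is a hypothetical naming choice.
reciprocal = pd.DataFrame(
    {
        "subject": train["object"],
        "relation": train["relation"] + "_inverse",
        "object": train["subject"],
    }
)
train = pd.concat([train, reciprocal], ignore_index=True)

# Step 2: construct the vocabularies as label -> integer id mappings,
# numbering labels in order of first appearance.
entities = pd.unique(train[["subject", "object"]].to_numpy().ravel())
relations = pd.unique(train["relation"])
entity_to_idx = {label: idx for idx, label in enumerate(entities)}
relation_to_idx = {label: idx for idx, label in enumerate(relations)}

# Step 3: index the dataset by replacing every label with its integer id.
indexed = pd.DataFrame(
    {
        "subject": train["subject"].map(entity_to_idx),
        "relation": train["relation"].map(relation_to_idx),
        "object": train["object"].map(entity_to_idx),
    }
)
print(indexed)
```

The same mappings would then be applied to the validation and test splits so that all three share one id space.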

preprocess_with_polars() → None
sequential_vocabulary_construction() → None
  1. Read input data into memory

  2. Remove triples with a condition

  3. Serialize vocabularies in a pandas DataFrame whose index is an integer and whose single column is a string (e.g. a URI)
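As a rough illustration of the layout described in step 3, the sketch below builds a vocabulary DataFrame with an integer index and one string column, then round-trips it through CSV in memory. The column name `entity`, the example URIs, and the choice of CSV are assumptions for this sketch; the storage format dicee actually uses is not stated here.

```python
import io

import pandas as pd

# Hypothetical entity labels; in practice these would come from the triples.
entity_labels = ["http://example.org/alice", "http://example.org/bob"]

# The default RangeIndex supplies the integer id; the single column holds the label.
entity_vocab = pd.DataFrame({"entity": entity_labels})

# Serialize and restore in memory (a file path could be used instead of the buffer).
buffer = io.StringIO()
entity_vocab.to_csv(buffer)
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)

# Looking up a label by its integer id.
print(restored.loc[1, "entity"])
```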

remove_triples_from_train_with_condition()
class dicee.read_preprocess_save_load_kg.LoadSaveToDisk(kg)
kg
save()
load()
class dicee.read_preprocess_save_load_kg.ReadFromDisk(kg)

Read the data from disk into memory

kg
start() → None

Read a knowledge graph from disk into memory

Data will be available at the train_set, test_set, valid_set attributes.

Parameters: none

Return type: None
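To make the idea of loading splits into `train_set`, `valid_set` and `test_set` concrete, here is a minimal sketch that does not use dicee at all. It assumes tab-separated triples with one triple per line, reads from in-memory strings instead of files, and uses a `SimpleNamespace` as a hypothetical stand-in for the knowledge graph instance.

```python
import io
from types import SimpleNamespace

import pandas as pd

# In-memory stand-ins for the three split files (tab-separated triples are an assumption).
splits = {
    "train_set": "alice\tknows\tbob\nbob\tlikes\tcarol\n",
    "valid_set": "alice\tlikes\tcarol\n",
    "test_set": "carol\tknows\talice\n",
}

# Hypothetical stand-in for the knowledge graph object that holds the splits.
kg = SimpleNamespace()

for name, text in splits.items():
    frame = pd.read_csv(
        io.StringIO(text),
        sep="\t",
        header=None,
        names=["subject", "relation", "object"],
        dtype=str,
    )
    # Expose each split under the attribute name used in the description above.
    setattr(kg, name, frame)

print(kg.train_set)
```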

add_noisy_triples_into_training()