dicee.read_preprocess_save_load_kg
Submodules
Classes
Preprocess the data in memory
Read the data from disk into memory
Package Contents
- class dicee.read_preprocess_save_load_kg.PreprocessKG(kg)
Preprocess the data in memory
- kg
- start() → None
Preprocess the train, validation, and test datasets stored in the knowledge graph instance.
- Return type:
None
- preprocess_with_byte_pair_encoding()
- preprocess_with_byte_pair_encoding_with_padding() → None
Preprocess with byte pair encoding and add padding
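The padding step can be illustrated with a minimal sketch. It does not reproduce the library's implementation: the `pad_sequences` helper, the `PAD_ID` value, and the token ids are hypothetical, and the byte pair encoding itself is assumed to have already turned each entity or relation name into a list of sub-word ids.

```python
from typing import List

PAD_ID = 0  # hypothetical padding id; the real value depends on the tokenizer


def pad_sequences(sequences: List[List[int]], pad_id: int = PAD_ID) -> List[List[int]]:
    """Right-pad every token-id sequence to the length of the longest one."""
    max_len = max((len(seq) for seq in sequences), default=0)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]


# Hypothetical sub-word ids for three names of different lengths.
encoded = [[17, 4], [9], [3, 8, 21]]
padded = pad_sequences(encoded)
```

After padding, every sequence has the same length, so the encoded names can be stacked into a single rectangular array.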
- preprocess_with_pandas() → None
Preprocess with pandas: add reciprocal triples, construct vocabulary, and index datasets
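The three steps named above (reciprocal triples, vocabulary construction, indexing) can be sketched on a toy dataframe. This is an illustration under stated assumptions, not the method's actual code: the column names, the `_inverse` suffix, and the `add_reciprocal_triples` helper are chosen for this example.

```python
import pandas as pd

# Toy training split; the column names are assumptions for this sketch.
train = pd.DataFrame(
    [("alice", "knows", "bob"), ("bob", "worksAt", "acme")],
    columns=["subject", "relation", "object"],
)


def add_reciprocal_triples(df: pd.DataFrame, suffix: str = "_inverse") -> pd.DataFrame:
    """Append (o, r + suffix, s) for every (s, r, o) in df."""
    inverse = pd.DataFrame(
        {
            "subject": df["object"],
            "relation": df["relation"] + suffix,
            "object": df["subject"],
        }
    )
    return pd.concat([df, inverse], ignore_index=True)


train = add_reciprocal_triples(train)

# Vocabulary: map each distinct string to an integer id, in order of first appearance.
entities = pd.concat([train["subject"], train["object"]]).unique()
entity_to_idx = {name: i for i, name in enumerate(entities)}
relation_to_idx = {name: i for i, name in enumerate(train["relation"].unique())}

# Indexing: replace every string by its integer id.
indexed = pd.DataFrame(
    {
        "subject": train["subject"].map(entity_to_idx),
        "relation": train["relation"].map(relation_to_idx),
        "object": train["object"].map(entity_to_idx),
    }
)
```

Adding reciprocal triples doubles the number of rows and the number of relations, which is why the relation vocabulary is built after that step.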
- preprocess_with_polars() → None
Preprocess with polars: add reciprocal triples and create indexed datasets
- sequential_vocabulary_construction() → None
- Read input data into memory
- Remove triples with a condition
- Serialize vocabularies in a pandas dataframe where the index is an integer and the single column is a string (e.g. a URI)
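The vocabulary layout described here, an integer index with a single string column, can be sketched as follows. The filtering condition (dropping self-loops), the column names, the example URIs, and the CSV round trip are all assumptions made for this illustration; the source does not state which condition or storage format the method uses.

```python
import io

import pandas as pd

# Step 1: input data in memory (toy triples with hypothetical URIs).
triples = pd.DataFrame(
    [
        ("http://example.org/a", "http://example.org/r", "http://example.org/b"),
        ("http://example.org/b", "http://example.org/r", "http://example.org/c"),
        ("http://example.org/c", "http://example.org/r", "http://example.org/c"),
    ],
    columns=["subject", "relation", "object"],
)

# Step 2: remove triples with a condition (here, hypothetically, self-loops).
filtered = triples[triples["subject"] != triples["object"]]

# Step 3: vocabulary as a dataframe with an integer index and one string column.
names = pd.concat([filtered["subject"], filtered["object"]]).unique()
entity_vocab = pd.DataFrame({"entity": names})  # default integer index 0..n-1

# Serialize and restore the vocabulary; an in-memory buffer stands in for a file.
buffer = io.StringIO()
entity_vocab.to_csv(buffer, index_label="idx")
restored = pd.read_csv(io.StringIO(buffer.getvalue()), index_col="idx")
```

With this layout, looking up the string for an id is a plain index access such as `entity_vocab.loc[2, "entity"]`.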