API Documentation
Olympus
- 
class olympus.Olympus(*args, **kwargs)
Master class of the olympus package. Creates an empty object and loads defaults.
- Parameters
 me (str) – arbitrary name to identify the object
indent (int) – number of spaces used in string representation
- 
add(prop, attr)
Dynamically adds a property and its attribute to the object.
- Parameters
 prop (any) – property to associate with the attribute
attr (any) – value stored under the property
- 
benchmark(dataset='alkox', planners='all', database=<Database (name=olympus_ce09b6d1, kind=sqlite)>, num_ind_runs=5, num_iter=3)
- Parameters
 dataset (str) – the dataset to use
- 
from_dict(info_dict)
Returns the object representation of a given dictionary.
- Parameters
 info_dict (dict) – dictionary to be represented
- Returns
 Object representation of dictionary
- Return type
 Object
- 
get(prop)
Returns the attribute associated with a given property.
- Parameters
 prop (any) – valid property
- Returns
 attribute associated with property
- Return type
 any
- 
to_dict(exclude=[])
Returns a dictionary representation of the object.
- Parameters
 exclude (list of any) – properties to be excluded
- Returns
 representation of presented object
- Return type
 dict
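The `add`, `get`, `to_dict`, and `from_dict` methods above describe a simple property store. The sketch below illustrates that behavior with a hypothetical stand-in class, `SketchObject`. It is not the Olympus implementation: the internal dictionary, the use of a classmethod for `from_dict`, and the example property values are all assumptions made for illustration.

```python
class SketchObject:
    """Hypothetical stand-in mimicking the documented add/get/to_dict/from_dict behavior."""

    def __init__(self):
        # Assumption: properties are kept in a plain dictionary.
        self._props = {}

    def add(self, prop, attr):
        # Store the attribute under the given property.
        self._props[prop] = attr

    def get(self, prop):
        # Return the attribute associated with the property.
        return self._props[prop]

    def to_dict(self, exclude=()):
        # Build a dictionary of all properties except the excluded ones.
        return {k: v for k, v in self._props.items() if k not in exclude}

    @classmethod
    def from_dict(cls, info_dict):
        # Assumption: from_dict builds a new object from the dictionary entries.
        obj = cls()
        for prop, attr in info_dict.items():
            obj.add(prop, attr)
        return obj


obj = SketchObject()
obj.add("me", "my_campaign")
obj.add("indent", 2)

as_dict = obj.to_dict(exclude=["indent"])        # drops the "indent" property
restored = SketchObject.from_dict(obj.to_dict())  # round trip through a dict
```

The sketch uses an immutable tuple as the default for `exclude` so that the default is never shared mutable state between calls; the documented signature shows `exclude=[]`.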
Datasets
- 
class olympus.datasets.Dataset(kind=None, data=None, columns=None, target_ids=None, test_frac=0.2, num_folds=5, random_seed=None)
A Dataset object stores the data of a dataset by wrapping a pandas.DataFrame in its data attribute, provides additional information on the dataset, and provides convenience methods to access features and targets as well as to generate training/validation/test splits.
- Parameters
 kind (str) – kind of the Olympus dataset to load.
data (array) – custom dataset. Same input as for pandas.DataFrame.
columns (list) – column names. Same input as for pandas.DataFrame.
target_ids (list) – list of column indices, or names if provided, that identify the targets for the predictions.
test_frac (float) – fraction of the data to be used as test set.
num_folds (int) – number of cross validation folds the training set will be split into.
random_seed (int) – random seed for numpy. Setting a seed makes the random splits reproducible.
- 
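create_train_validate_test_splits(test_frac=0.2, num_folds=5, test_indices=None)
- Parameters
 test_frac (float) – fraction of the data to be used as test set.
num_folds (int) – number of cross-validation folds the training set will be split into.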
test_indices (array) – Array with the indices of the samples to be used as test set.
- 
get_cv_fold(fold)
Get the data for a specific cross-validation fold.
- Parameters
 fold (int) – fold id.
- Returns
 data for the chosen fold.
- Return type
 data (DataFrame)
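The split parameters documented above (`test_frac`, `num_folds`, `test_indices`, and a random seed) can be illustrated with a minimal, standard-library sketch. The function `sketch_splits` is hypothetical and works on sample indices only. It is not how Olympus performs the split; the rounding rule and the round-robin fold assignment are assumptions.

```python
import random


def sketch_splits(num_samples, test_frac=0.2, num_folds=5, test_indices=None, seed=None):
    """Hypothetical index-based train/validation/test split."""
    rng = random.Random(seed)  # a fixed seed makes the shuffling reproducible
    indices = list(range(num_samples))

    if test_indices is None:
        # Hold out a random fraction of the samples as the test set.
        rng.shuffle(indices)
        num_test = int(round(test_frac * num_samples))
        test = sorted(indices[:num_test])
        remaining = indices[num_test:]
    else:
        # Use the caller-supplied test samples and shuffle the rest.
        test = sorted(test_indices)
        held_out = set(test)
        remaining = [i for i in indices if i not in held_out]
        rng.shuffle(remaining)

    # Split the remaining (training) samples into cross-validation folds.
    folds = []
    for k in range(num_folds):
        validate = sorted(remaining[k::num_folds])
        in_validate = set(validate)
        train = sorted(i for i in remaining if i not in in_validate)
        folds.append((train, validate))
    return test, folds


test_idx, folds = sketch_splits(20, test_frac=0.2, num_folds=4, seed=0)
train_idx, validate_idx = folds[0]  # analogous in spirit to selecting one fold
```

Every sample ends up in exactly one of the train, validation, or test sets for a given fold, and repeating the call with the same seed reproduces the same split.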
- 
infer_param_space()
Guess the parameter space from the dataset. The range of each parameter is defined by the minimum and maximum values of the corresponding variable in the dataset. All variables are assumed to be non-periodic.
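The min/max inference described above can be sketched in a few lines of plain Python. The function `sketch_infer_param_space`, the dictionary layout of its result, and the column names `temperature` and `residence_time` are hypothetical and chosen for illustration only; the actual return type of `infer_param_space` is not stated here.

```python
def sketch_infer_param_space(columns):
    """Hypothetical helper: derive one range per variable from its observed values."""
    space = []
    for name, values in columns.items():
        space.append({
            "name": name,
            "low": min(values),    # smallest value observed for this variable
            "high": max(values),   # largest value observed for this variable
            "periodic": False,     # all variables are assumed non-periodic
        })
    return space


# Made-up feature columns, not taken from any Olympus dataset.
features = {
    "temperature": [30.0, 45.5, 60.0],
    "residence_time": [0.5, 2.0, 1.25],
}
space = sketch_infer_param_space(features)
```

Because the bounds come only from observed values, a range inferred this way can be narrower than the true experimental domain.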