API Documentation

Olympus

class olympus.Olympus(*args, **kwargs)

Master class of the olympus package.

Creates an empty object and loads the defaults.

Parameters

me (str) – arbitrary name to identify the object
indent (int) – number of spaces used in string representation

add(prop, attr)

Dynamically adds a property and its attribute (the property's value) to the object.

Parameters

prop (any) – property the attribute is associated with
attr (any) – value of the property

benchmark(dataset='alkox', planners='all', database=<Database (name=olympus_ce09b6d1, kind=sqlite)>, num_ind_runs=5, num_iter=3)

Parameters: dataset (str) – the dataset to use

from_dict(info_dict)

Returns an object representation of the given dictionary.

Parameters: info_dict (dict) – dictionary to be represented
Returns: Object representation of dictionary
Return type: Object

get(prop)

Returns the attribute associated with the given property.

Parameters: prop (any) – valid property
Returns: attribute associated with property
Return type: any

to_dict(exclude=[])

Returns a dictionary representation of the object.

Parameters: exclude (list of any) – properties to be excluded
Returns: dictionary representation of the object
Return type: dict
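
The add, get, to_dict and from_dict methods above describe an object whose properties are attached at runtime. The following is a minimal sketch of that pattern only. DynamicObject and its internal _props list are hypothetical stand-ins, not the olympus implementation, and the sketch assumes property names are strings.

```python
class DynamicObject:
    """Hypothetical stand-in illustrating the pattern; not the olympus class."""

    def __init__(self, me="", indent=0):
        self.me = me          # arbitrary name identifying the object
        self.indent = indent  # spaces used in a string representation
        self._props = []      # names registered through add()

    def add(self, prop, attr):
        # Attach attr to the object under the name prop and remember the name.
        setattr(self, prop, attr)
        if prop not in self._props:
            self._props.append(prop)

    def get(self, prop):
        # Return the attribute stored under prop.
        return getattr(self, prop)

    def to_dict(self, exclude=()):
        # Collect every registered property except the excluded ones.
        return {p: getattr(self, p) for p in self._props if p not in exclude}

    @classmethod
    def from_dict(cls, info_dict):
        # Build an object whose properties mirror the dictionary entries.
        obj = cls()
        for prop, attr in info_dict.items():
            obj.add(prop, attr)
        return obj


obj = DynamicObject.from_dict({"dataset": "alkox", "num_iter": 3})
obj.add("planners", "all")
summary = obj.to_dict(exclude=["num_iter"])
```

In this sketch, to_dict only reports properties registered through add, so internal bookkeeping such as me and indent stays out of the dictionary.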

Datasets

class olympus.datasets.Dataset(kind=None, data=None, columns=None, target_ids=None, test_frac=0.2, num_folds=5, random_seed=None)

A Dataset object stores the data of a dataset by wrapping a pandas.DataFrame in its data attribute, provides additional information on the dataset, and provides convenience methods to access features and targets as well as to generate training/validation/test splits.

Parameters

kind (str) – kind of the Olympus dataset to load.
data (array) – custom dataset. Same input as for pandas.DataFrame.
columns (list) – column names. Same input as for pandas.DataFrame.
target_ids (list) – list of column indices, or names if provided, that identify the targets for the predictions.
test_frac (float) – fraction of the data to be used as test set.
num_folds (int) – number of cross validation folds the training set will be split into.
random_seed (int) – random seed for numpy. Setting a seed makes the random splits reproducible.

create_train_validate_test_splits(test_frac=0.2, num_folds=5, test_indices=None)

Parameters

test_frac (float) – fraction of the data to be used as test set.
num_folds (int) – number of cross validation folds the training set will be split into.
test_indices (array) – array with the indices of the samples to be used as test set.
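
A rough illustration of what a train/validate/test split with these parameters can look like is given below. It is a sketch under stated assumptions, not the Olympus code: split_indices is a hypothetical helper, it works on plain index lists rather than a DataFrame, and it uses the standard library random module where the Dataset class is documented to seed numpy.

```python
import random


def split_indices(num_samples, test_frac=0.2, num_folds=5,
                  test_indices=None, random_seed=None):
    """Hypothetical helper: return (test, folds) as lists of sample indices."""
    rng = random.Random(random_seed)  # a fixed seed makes the split reproducible
    indices = list(range(num_samples))

    if test_indices is None:
        # Draw a random test set holding test_frac of the samples.
        rng.shuffle(indices)
        num_test = int(round(test_frac * num_samples))
        test = sorted(indices[:num_test])
        rest = indices[num_test:]
    else:
        # Use the caller's test set and shuffle only the remaining samples.
        test = sorted(test_indices)
        held_out = set(test)
        rest = [i for i in indices if i not in held_out]
        rng.shuffle(rest)

    folds = []
    for k in range(num_folds):
        # Every num_folds-th remaining sample forms the validation set of fold k.
        validate = sorted(rest[k::num_folds])
        in_validate = set(validate)
        train = sorted(i for i in rest if i not in in_validate)
        folds.append({"train": train, "validate": validate})
    return test, folds


test_idx, folds = split_indices(20, test_frac=0.2, num_folds=4, random_seed=0)
```

Each fold in this sketch pairs one validation slice with the rest of the non-test samples as training data, which is the kind of per-fold data a method such as get_cv_fold would hand back.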

dataset_info()

Provide summary info about the dataset.

get_cv_fold(fold)

Get the data for a specific cross-validation fold.

Parameters: fold (int) – fold id.
Returns: data for the chosen fold.
Return type: data (DataFrame)

infer_param_space()

Guess the parameter space from the dataset. The range of each parameter is defined by the minimum and maximum values of that variable in the dataset. All variables are assumed to be non-periodic.
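
The min/max rule described for infer_param_space can be sketched as follows. This is an illustration only: infer_ranges, the row dictionaries, the column names and the low/high/periodic keys are all hypothetical and do not reflect the structure of the ParameterSpace object.

```python
def infer_ranges(rows, feature_names):
    """Hypothetical helper: derive a range per feature from observed values."""
    ranges = {}
    for name in feature_names:
        values = [row[name] for row in rows]
        ranges[name] = {
            "low": min(values),    # smallest value seen in the dataset
            "high": max(values),   # largest value seen in the dataset
            "periodic": False,     # every variable is assumed non-periodic
        }
    return ranges


# Made-up example data: two features and one target column.
rows = [
    {"temperature": 20.0, "ph": 6.5, "yield": 0.31},
    {"temperature": 80.0, "ph": 7.2, "yield": 0.74},
    {"temperature": 45.0, "ph": 5.9, "yield": 0.52},
]
ranges = infer_ranges(rows, ["temperature", "ph"])
```

A range inferred this way only covers the region the dataset happens to sample, so a parameter space set explicitly with set_param_space may be preferable when the true bounds are known.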

set_param_space(param_space)

Define the parameter space of the dataset.

Parameters: param_space (ParameterSpace) – ParameterSpace object with information about all variables in the dataset.

to_disk(folder='custom_dataset')

Save the dataset to disk in the format expected by Olympus for its own datasets. This can be useful if you plan to upload the dataset to the community datasets available online.

Parameters: folder (str) – Folder in which to save the dataset files.
