API Documentation

Olympus

class olympus.Olympus(*args, **kwargs)

Master class of the olympus package

creates empty object and loads defaults

Parameters
  • me (str) – arbitrary name to identify the object

  • indent (int) – number of spaces used in string representation

add(prop, attr)

dynamically adds property and attribute to object

Parameters
  • prop (any) – property associated with attribute

  • attr (any) – property value

benchmark(dataset='alkox', planners='all', database=<Database (name=olympus_ce09b6d1, kind=sqlite)>, num_ind_runs=5, num_iter=3)
Parameters

dataset (str) – the dataset to use

from_dict(info_dict)

returns object representation of given dictionary

Parameters

info_dict (dict) – dictionary to be represented

Returns

Object representation of dictionary

Return type

Object

get(prop)

returns attribute associated with given property

Parameters

prop (any) – valid property

Returns

attribute associated with property

Return type

any

to_dict(exclude=[])

returns dictionary representation of the object

Parameters

exclude (list of any) – properties to be excluded

Returns

dictionary representation of the object

Return type

dict
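
The accessor methods documented above (add, get, to_dict, from_dict) can be pictured with a small stand-in class. This is a minimal sketch and not the Olympus implementation: the class name Record and its dictionary-based storage are hypothetical, and it assumes that from_dict builds a fresh object from the dictionary entries.

```python
class Record:
    """Hypothetical stand-in used for illustration; not an Olympus class."""

    def __init__(self):
        # property -> attribute mapping (assumed storage, for illustration only)
        self._data = {}

    def add(self, prop, attr):
        # dynamically register a property together with its value
        self._data[prop] = attr

    def get(self, prop):
        # return the attribute stored under the given property
        return self._data[prop]

    def to_dict(self, exclude=()):
        # export every property that is not listed in `exclude`
        return {p: a for p, a in self._data.items() if p not in exclude}

    @classmethod
    def from_dict(cls, info_dict):
        # assumption: each dictionary entry becomes one property of a new object
        obj = cls()
        for prop, attr in info_dict.items():
            obj.add(prop, attr)
        return obj


rec = Record.from_dict({"name": "demo", "indent": 2})
rec.add("goal", "minimize")
value = rec.get("goal")
exported = rec.to_dict(exclude=["indent"])
```

The sketch uses an empty tuple as the default for exclude. The documented signature uses a list default (exclude=[]), which in Python is shared across calls and is only safe as long as the method never mutates it.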

Datasets

class olympus.datasets.Dataset(kind=None, data=None, columns=None, target_ids=None, test_frac=0.2, num_folds=5, random_seed=None)

A Dataset object stores the data of a dataset by wrapping a pandas.DataFrame in its data attribute, provides additional information on the dataset, and provides convenience methods to access features and targets as well as to generate training/validation/test splits.

Parameters
  • kind (str) – kind of the Olympus dataset to load.

  • data (array) – custom dataset. Same input as for pandas.DataFrame.

  • columns (list) – column names. Same input as for pandas.DataFrame.

  • target_ids (list) – list of column indices, or names if provided, that identify the targets for the predictions.

  • test_frac (float) – fraction of the data to be used as test set.

  • num_folds (int) – number of cross validation folds the training set will be split into.

  • random_seed (int) – random seed for numpy. Setting a seed makes the random splits reproducible.
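
The target_ids parameter accepts either column indices or column names. The sketch below shows one way such a specification could be resolved into feature and target columns. The helper split_features_targets is hypothetical and is not part of Olympus; it only illustrates the index-or-name convention described above.

```python
def split_features_targets(columns, target_ids):
    """Hypothetical helper: resolve target_ids (indices or names) against columns."""
    # integer entries are treated as positions, anything else as a column name
    target_names = [columns[t] if isinstance(t, int) else t for t in target_ids]
    unknown = [t for t in target_names if t not in columns]
    if unknown:
        raise ValueError(f"unknown target columns: {unknown}")
    # every column that is not a target is treated as a feature
    feature_names = [c for c in columns if c not in target_names]
    return feature_names, target_names


columns = ["temperature", "loading", "yield"]
by_index = split_features_targets(columns, [2])
by_name = split_features_targets(columns, ["yield"])
```

Both calls select the same target column, once by position and once by name.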

create_train_validate_test_splits(test_frac=0.2, num_folds=5, test_indices=None)
Parameters
  • test_frac (float) – fraction of the data to be used as test set.

  • num_folds (int) – number of cross validation folds the training set will be split into.

  • test_indices (array) – array with the indices of the samples to be used as test set.
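
The splitting scheme can be sketched with the standard library alone: hold out a test set, then divide the remaining samples into cross-validation folds. This is an illustration under stated assumptions, not the Olympus implementation. The function make_splits, its seed argument, the round-robin fold assignment, and the returned layout are all hypothetical.

```python
import random


def make_splits(num_samples, test_frac=0.2, num_folds=5, test_indices=None, seed=None):
    """Hypothetical sketch of a train/validation/test split over sample indices."""
    rng = random.Random(seed)
    indices = list(range(num_samples))
    if test_indices is None:
        # draw the test set at random according to test_frac
        rng.shuffle(indices)
        num_test = int(round(test_frac * num_samples))
        test = sorted(indices[:num_test])
        rest = indices[num_test:]
    else:
        # use the caller-supplied test set and shuffle only the remainder
        test = sorted(test_indices)
        held_out = set(test)
        rest = [i for i in indices if i not in held_out]
        rng.shuffle(rest)
    folds = []
    for k in range(num_folds):
        # every num_folds-th remaining sample forms the validation set of fold k
        validate = sorted(rest[k::num_folds])
        in_fold = set(validate)
        train = sorted(i for i in rest if i not in in_fold)
        folds.append({"train": train, "validate": validate})
    return test, folds


test_set, cv_folds = make_splits(num_samples=50, test_frac=0.2, num_folds=5, seed=0)
fixed_test, _ = make_splits(num_samples=10, num_folds=2, test_indices=[0, 1], seed=1)
```

In this sketch, passing test_indices takes precedence over test_frac, and fixing the seed makes the random splits reproducible, mirroring the role described for random_seed in the constructor.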

dataset_info()

Provide summary info about dataset.

get_cv_fold(fold)

Get the data for a specific cross-validation fold.

Parameters

fold (int) – fold id.

Returns

data for the chosen fold.

Return type

data (DataFrame)

infer_param_space()

Guess the parameter space from the dataset. The range for all parameters will be defined based on the minimum and maximum values in the dataset for each variable. All variables will be assumed not to be periodic.
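
The min/max rule described above can be sketched in a few lines. This is a simplified illustration, not the Olympus implementation: the function infer_ranges and the dictionary layout it returns are hypothetical and do not reflect the structure of a ParameterSpace object.

```python
def infer_ranges(rows, columns):
    """Hypothetical sketch: derive a range per column from the observed data."""
    space = {}
    for j, name in enumerate(columns):
        values = [row[j] for row in rows]
        # bounds come from the observed extremes; no variable is treated as periodic
        space[name] = {"low": min(values), "high": max(values), "periodic": False}
    return space


rows = [(20.0, 0.5), (35.0, 1.5), (27.5, 1.0)]
space = infer_ranges(rows, ["temperature", "loading"])
```

Because the bounds are taken from the data, an inferred range can be narrower than the true experimental range whenever the dataset does not sample the extremes.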

set_param_space(param_space)

Define the parameter space of the dataset.

Parameters

param_space (ParameterSpace) – ParameterSpace object with information about all variables in the dataset.

to_disk(folder='custom_dataset')

Save the dataset to disk in the format expected by Olympus for its own datasets. This can be useful if you plan to upload the dataset to the community datasets available online.

Parameters

folder (str) – Folder in which to save the dataset files.

Models

Planners