API Documentation
Olympus
- class olympus.Olympus(*args, **kwargs)
Master class of the olympus package. Creates an empty object and loads defaults.
- Parameters
me (str) – arbitrary name to identify the object
indent (int) – number of spaces used in string representation
- add(prop, attr)
Dynamically adds a property and its attribute to the object.
- Parameters
prop (any) – property associated with attribute
attr (any) – property value
- benchmark(dataset='alkox', planners='all', database=<Database (name=olympus_ce09b6d1, kind=sqlite)>, num_ind_runs=5, num_iter=3)
- Parameters
dataset (str) – the dataset to use
- from_dict(info_dict)
Returns an object representation of the given dictionary.
- Parameters
info_dict (dict) – dictionary to be represented
- Returns
Object representation of dictionary
- Return type
Object
- get(prop)
Returns the attribute associated with the given property.
- Parameters
prop (any) – valid property
- Returns
attribute associated with property
- Return type
any
- to_dict(exclude=[])
Returns a dictionary representation of the object.
- Parameters
exclude (list of any) – properties to be excluded
- Returns
dictionary representation of the object
- Return type
dict
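The add, get, to_dict and from_dict methods above describe a simple property container. The sketch below illustrates that pattern with a hypothetical stand-in class, PropertyBag, which is not part of olympus. How olympus stores properties internally is not stated here, so the dictionary-backed storage is an assumption made for illustration only.

```python
class PropertyBag:
    """Hypothetical stand-in for illustration; not the olympus implementation."""

    def __init__(self, me="", indent=0):
        self.me = me          # arbitrary name to identify the object
        self.indent = indent  # spaces used in the string representation
        self._attrs = {}      # assumed storage: property -> attribute

    def add(self, prop, attr):
        # Dynamically register a property together with its value.
        self._attrs[prop] = attr

    def get(self, prop):
        # Return the attribute stored for a valid property.
        return self._attrs[prop]

    def to_dict(self, exclude=()):
        # Copy all properties except the excluded ones into a plain dict.
        return {p: a for p, a in self._attrs.items() if p not in exclude}

    @classmethod
    def from_dict(cls, info_dict):
        # Build an object whose properties mirror the dictionary entries.
        obj = cls()
        for prop, attr in info_dict.items():
            obj.add(prop, attr)
        return obj


bag = PropertyBag(me="demo")
bag.add("temperature", 300)
bag.add("solvent", "water")

full = bag.to_dict()
partial = bag.to_dict(exclude=["solvent"])
restored = PropertyBag.from_dict(full)
```

In this sketch, to_dict followed by from_dict round-trips the stored properties, which is the behaviour the method descriptions suggest.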
Datasets
- class olympus.datasets.Dataset(kind=None, data=None, columns=None, target_ids=None, test_frac=0.2, num_folds=5, random_seed=None)
A Dataset object stores the data of a dataset by wrapping a pandas.DataFrame in its data attribute, provides additional information on the dataset, and provides convenience methods to access features and targets as well as to generate training/validation/test splits.
- Parameters
kind (str) – kind of the Olympus dataset to load.
data (array) – custom dataset. Same input as for pandas.DataFrame.
columns (list) – column names. Same input as for pandas.DataFrame.
target_ids (list) – list of column indices, or names if provided, that identify the targets for the predictions.
test_frac (float) – fraction of the data to be used as test set.
num_folds (int) – number of cross validation folds the training set will be split into.
random_seed (int) – random seed for numpy. Setting a seed makes the random splits reproducible.
- create_train_validate_test_splits(test_frac=0.2, num_folds=5, test_indices=None)
- Parameters
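test_frac (float) – fraction of the data to be used as test set.
num_folds (int) – number of cross validation folds the training set will be split into.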
test_indices (array) – Array with the indices of the samples to be used as test set.
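To make the roles of test_frac, num_folds and test_indices concrete, here is a minimal standard-library sketch of a train/validation/test split over sample indices. The function make_splits and its seed argument are hypothetical. The rounding of the test-set size, the shuffling, and the round-robin fold assignment are assumptions, not a description of how olympus computes its splits.

```python
import random


def make_splits(num_samples, test_frac=0.2, num_folds=5, test_indices=None, seed=None):
    """Return (test_indices, folds); folds is a list of (train, validate) index lists."""
    rng = random.Random(seed)
    indices = list(range(num_samples))

    if test_indices is None:
        # No explicit test set given: draw one at random (assumed behaviour).
        num_test = int(round(test_frac * num_samples))
        test_indices = sorted(rng.sample(indices, num_test))
    else:
        test_indices = sorted(test_indices)

    test_set = set(test_indices)
    remaining = [i for i in indices if i not in test_set]
    rng.shuffle(remaining)

    # Deal the non-test samples round-robin into num_folds validation chunks.
    chunks = [remaining[k::num_folds] for k in range(num_folds)]

    folds = []
    for k in range(num_folds):
        validate = sorted(chunks[k])
        train = sorted(i for j, chunk in enumerate(chunks) if j != k for i in chunk)
        folds.append((train, validate))
    return test_indices, folds


test_idx, folds = make_splits(num_samples=50, test_frac=0.2, num_folds=5, seed=0)
train0, validate0 = folds[0]  # index lists for the first cross-validation fold
```

Each fold uses one chunk for validation and the remaining chunks for training, so every non-test sample is validated on exactly once across the folds.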
- get_cv_fold(fold)
Get the data for a specific cross-validation fold.
- Parameters
fold (int) – fold id.
- Returns
data for the chosen fold.
- Return type
data (DataFrame)
- infer_param_space()
Guess the parameter space from the dataset. The range of each parameter is defined by the minimum and maximum values of the corresponding variable in the dataset. All variables are assumed to be non-periodic.
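The min/max rule described above can be sketched in a few lines of plain Python. The helper infer_ranges and the keys low, high and is_periodic are hypothetical names chosen for this example. The object that infer_param_space actually returns is not specified in this section.

```python
def infer_ranges(rows, feature_names):
    """Return {name: {"low": min, "high": max, "is_periodic": False}} per feature."""
    space = {}
    for name in feature_names:
        values = [row[name] for row in rows]
        space[name] = {
            "low": min(values),    # smallest value observed in the data
            "high": max(values),   # largest value observed in the data
            "is_periodic": False,  # every variable is assumed non-periodic
        }
    return space


# Toy data standing in for the feature columns of a dataset.
rows = [
    {"temperature": 20.0, "ph": 6.5},
    {"temperature": 35.0, "ph": 7.2},
    {"temperature": 28.0, "ph": 5.9},
]
space = infer_ranges(rows, ["temperature", "ph"])
```

Because the bounds come directly from the observed data, a space inferred this way cannot extend beyond the sampled region of each variable.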