Emulators

Olympus provides many pre-trained emulators that can be used as benchmarks for optimization and experiment planning algorithms. In particular, any combination of Datasets and Models specifies an Emulator that you can load from Olympus as follows:

# we want to load the emulator that uses a Bayesian neural network model for the HPLC dataset
from olympus import Emulator
emulator = Emulator(dataset='hplc_n9', model='BayesNeuralNet')

Once an Emulator instance has been created, it can be used to simulate the outcome of an experimental evaluation of the query parameters:

next_point = [[0.01, 0.02, 0.5, 1.0, 100., 7.]]
emulator.run(next_point)
>>>> [ParamVector(peak_area = 244.39515369060487)]
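To illustrate how an emulator can serve as a benchmark for an optimization algorithm, here is a minimal sketch of a random search that repeatedly queries an object through a run method. The StandInEmulator class, its quadratic response, and the random_search helper are assumptions made for this example; they are not part of Olympus, and the stand-in returns a plain list of lists rather than Olympus objects.

```python
import random


class StandInEmulator:
    """Hypothetical stand-in for a trained emulator (NOT the Olympus class)."""

    def run(self, features):
        # One output row per input row; the response is a simple quadratic bowl
        # with its minimum at 0.3 in every dimension.
        return [[sum((x - 0.3) ** 2 for x in row)] for row in features]


def random_search(emulator, num_dims, budget, seed=0):
    """Query the emulator `budget` times and keep the lowest observation."""
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(budget):
        params = [rng.random() for _ in range(num_dims)]
        value = emulator.run([params])[0][0]
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value


best_params, best_value = random_search(StandInEmulator(), num_dims=2, budget=200)
print(best_params, best_value)
```

Because the search only depends on the run call, the stand-in could in principle be swapped for a real Emulator instance, provided the return value is unpacked accordingly.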

In addition to loading pre-trained emulators, you can define your own custom emulators by creating custom instances of Dataset and Model. For example, to train an Emulator using different settings for the BayesNeuralNet model:

from olympus import Emulator
from olympus.models import BayesNeuralNet

model = BayesNeuralNet(hidden_depth=3, hidden_nodes=48, out_act='sigmoid')
emulator = Emulator(dataset='hplc_n9', model=model)

You can then evaluate the performance of the model via cross validation:

emulator.cross_validate()

And then finally train the model:

emulator.train()

The same can be done for a custom dataset. In this case you would load your own dataset (see Datasets) and train the emulator:

import pandas as pd
from olympus import Emulator, Dataset
from olympus.models import BayesNeuralNet

# load the custom data from a CSV file into a DataFrame
mydata = pd.read_csv('mydata.csv')
dataset = Dataset(data=mydata)
model = BayesNeuralNet(hidden_depth=3, hidden_nodes=48, out_act='sigmoid')
emulator = Emulator(dataset=dataset, model=model)
emulator.train()

To save the Emulator instance to file, such that you do not have to re-train it every time you’d like to use it, you can use the save method:

emulator.save('my_new_emulator')

You can then retrieve this emulator with the load_emulator function:

from olympus.emulators import load_emulator
emulator = load_emulator('my_new_emulator')

Emulator Class

class olympus.emulators.emulator.Emulator(dataset=None, model=None, feature_transform='identity', target_transform='identity')

Generic experiment emulator.

This class is intended to provide the interface to the user.

Note: emulators are uniquely determined by dataset + model + emulator_id.

Parameters

dataset (str, Dataset) – dataset used to train a model. Either a string, in which case a standard dataset is loaded, or a Dataset object. To see the list of available datasets …
model (str, Model) – the model used to create the emulator. Either a string, in which case a default model is loaded, or a Model object. To see the list of available models …
feature_transform (str, list) – the data transform to be applied to the features. See DataTransformer for the available transformations.
target_transform (str, list) – the data transform to be applied to the targets. See DataTransformer for the available transformations.

Methods

`run`(features[, num_samples, return_paramvector])	Run the emulator and return a value given the features provided.
`train`([plot, retrain])	Trains the model on the emulator dataset, using the emulator model.
`save`([path, include_cv])	Save the emulator in a specified location.
`cross_validate`([rerun, plot])	Performs cross validation on the emulator dataset, using the emulator model.

cross_validate(rerun=False, plot=False)

Performs cross validation on the emulator dataset, using the emulator model. The number of folds used is defined in the Dataset object.

Parameters: rerun (bool) – whether to run cross validation again, in case it had already been performed.
Returns: dictionary with the list of train and validation R2 scores.
Return type: scores (dict)
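As a rough illustration of what per-fold train and validation R2 scores look like, the sketch below runs k-fold cross validation on a toy least-squares line fit. Everything in it is an assumption made for the example: the fit_line model, the strided fold assignment, and the dictionary keys "train" and "validate" are not taken from the Olympus implementation.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def cross_validate(xs, ys, num_folds=5):
    """Return a dict with one train and one validation R2 score per fold."""
    scores = {"train": [], "validate": []}  # key names are assumptions
    indices = list(range(len(xs)))
    for fold in range(num_folds):
        valid_idx = indices[fold::num_folds]  # every num_folds-th point
        held_out = set(valid_idx)
        train_idx = [i for i in indices if i not in held_out]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        for name, idx in (("train", train_idx), ("validate", valid_idx)):
            y_true = [ys[i] for i in idx]
            y_pred = [a * xs[i] + b for i in idx]
            scores[name].append(r2_score(y_true, y_pred))
    return scores


rng = random.Random(0)
xs = [i / 50 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.05) for x in xs]
scores = cross_validate(xs, ys, num_folds=5)
print(scores)
```

A large gap between the train and validation scores of a fold would suggest overfitting, which is the main reason to inspect both lists rather than a single number.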

run(features, num_samples=1, return_paramvector=False)

Run the emulator and return a value given the features provided.

Parameters

features (ndarray) – 2d array with the input features used for the predictions
num_samples (int) – number of samples to average. Only useful for probabilistic models.
return_paramvector (bool) – Whether to return a ParameterVector object instead of a list of lists. Default is False.

Returns

model prediction(s).

Return type

y (float or array)
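The role of num_samples can be pictured with a small sketch: a probabilistic model returns a different draw on each call, and averaging several draws gives a steadier prediction. The noisy_predict function and the averaged_prediction helper below are hypothetical and only mimic this behaviour; they do not reproduce how Olympus samples its models.

```python
import random


def noisy_predict(features, rng):
    """Hypothetical stochastic model: a linear response plus Gaussian noise."""
    return sum(features) + rng.gauss(0.0, 1.0)


def averaged_prediction(features, num_samples=1, seed=0):
    """Average `num_samples` independent draws for a single input row."""
    rng = random.Random(seed)
    draws = [noisy_predict(features, rng) for _ in range(num_samples)]
    return sum(draws) / num_samples


row = [0.5, 1.5]  # the noise-free response for this row is 2.0
print(averaged_prediction(row, num_samples=1))
print(averaged_prediction(row, num_samples=1000))
```

For a deterministic model every draw is identical, so averaging changes nothing, which is consistent with the parameter being useful only for probabilistic models.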

save(path='./olympus_emulator', include_cv=False)

Save the emulator in a specified location. This will save the emulator object as a pickle file, and the associated TensorFlow model, in the specified location. The saved emulator can then be loaded with the olympus.emulators.load_emulator function.

Parameters

path (str) – relative path at which to save the emulator.
include_cv (bool) – whether to include the cross validation models. Default is False.
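The pickle round trip behind this kind of save and load can be sketched with the standard library alone. In the example below a plain dictionary stands in for the emulator object, and the save_state and load_state helpers and the emulator.pickle file name are assumptions made for illustration; the actual on-disk layout used by Olympus, including the TensorFlow model files, is not shown here.

```python
import pickle
import tempfile
from pathlib import Path


def save_state(state, directory, filename="emulator.pickle"):
    """Pickle `state` into `directory`, creating the directory if needed."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / filename
    with open(target, "wb") as handle:
        pickle.dump(state, handle)
    return target


def load_state(directory, filename="emulator.pickle"):
    """Read back whatever `save_state` wrote."""
    with open(Path(directory) / filename, "rb") as handle:
        return pickle.load(handle)


# A plain dict stands in for the emulator object in this sketch.
state = {"dataset": "hplc_n9", "model": "BayesNeuralNet", "trained": True}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_state(state, Path(tmp) / "my_new_emulator")
    restored = load_state(Path(tmp) / "my_new_emulator")

print(restored == state)
```

As with any pickle file, a saved emulator should only be loaded from a source you trust, since unpickling can execute arbitrary code.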

train(plot=False, retrain=False)

Trains the model on the emulator dataset, using the emulator model. The train/test split is defined in the Dataset object emulator.dataset. Note that the test set is used for testing the model performance, and for early stopping.

Parameters

plot (bool) –
retrain (bool) – whether to retrain the model, in case it had already been trained.

Returns

dictionary with the train and test R2 scores.

Return type

scores (dict)
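Early stopping on a held-out set, as mentioned in the description of train, can be sketched as follows: training continues while the loss on the held-out data keeps improving and halts after a fixed number of epochs without improvement. The one-parameter model, the learning rate, the patience value, and the improvement threshold are all assumptions chosen for this example and do not reflect the settings Olympus uses.

```python
def mse(w, data):
    """Mean squared error of the model y = w * x on (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train_with_early_stopping(train, test, lr=0.1, max_epochs=500, patience=5):
    """Gradient descent on y = w * x, stopped when the test loss stalls."""
    w, best_w, best_loss, stale = 0.0, 0.0, float("inf"), 0
    epochs_run = 0
    for epoch in range(max_epochs):
        epochs_run = epoch + 1
        # Gradient of the training MSE with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
        w -= lr * grad
        test_loss = mse(w, test)
        if test_loss < best_loss - 1e-9:
            # Held-out loss improved: remember these weights, reset the counter.
            best_w, best_loss, stale = w, test_loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_w, best_loss, epochs_run


# Noise-free toy data generated from y = 3 * x.
train = [(x / 10, 3.0 * x / 10) for x in range(10)]
test = [(x / 10 + 0.05, 3.0 * (x / 10 + 0.05)) for x in range(10)]
best_w, best_loss, epochs_run = train_with_early_stopping(train, test)
print(best_w, best_loss, epochs_run)
```

Because the same held-out data both decides when to stop and measures the final score, that score tends to be somewhat optimistic; this is worth keeping in mind when reading the returned test R2.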