Emulators¶
Olympus provides many pre-trained emulators that can be used directly as benchmarks for optimization and experiment
planning algorithms. In particular, any combination of Datasets and Models specifies an Emulator
that you can load from Olympus as follows:
# we want to load the emulator that uses a Bayesian neural network model for the HPLC dataset
from olympus import Emulator
emulator = Emulator(dataset='hplc_n9', model='BayesNeuralNet')
Once an Emulator instance has been created, it can be used to simulate the outcome of an experimental evaluation
of the query parameters:
next_point = [[0.01, 0.02, 0.5, 1.0, 100., 7.]]
emulator.run(next_point)
>>>> [ParamVector(peak_area = 244.39515369060487)]
In addition to loading pre-trained emulators, you can define your own custom emulators by creating custom instances of
Dataset and Model. For example, to train an Emulator using different settings for the BayesNeuralNet model:
from olympus import Emulator
from olympus.models import BayesNeuralNet
model = BayesNeuralNet(hidden_depth=3, hidden_nodes=48, out_act='sigmoid')
emulator = Emulator(dataset='hplc_n9', model=model)
You can then evaluate the performance of the model via cross validation:
emulator.cross_validate()
And then finally train the model:
emulator.train()
The same can be done for a custom dataset. In this case you would load your own dataset (see Datasets) and train the emulator:
import pandas as pd
from olympus import Emulator, Dataset
from olympus.models import BayesNeuralNet
# load the custom data with pandas (read_csv is the pandas CSV reader)
mydata = pd.read_csv('mydata.csv')
dataset = Dataset(data=mydata)
model = BayesNeuralNet(hidden_depth=3, hidden_nodes=48, out_act='sigmoid')
emulator = Emulator(dataset=dataset, model=model)
emulator.train()
To save the Emulator instance to file, so that you do not have to re-train it every time you’d like to use it,
you can use the save method:
emulator.save('my_new_emulator')
You can then retrieve this emulator with the load_emulator function:
from olympus.emulators import load_emulator
emulator = load_emulator('my_new_emulator')
Emulator Class¶
class olympus.emulators.emulator.Emulator(dataset=None, model=None, feature_transform='identity', target_transform='identity')
Generic experiment emulator.
This class is intended to provide the interface to the user.
Note: emulators are uniquely determined by the combination of dataset, model, and emulator_id.
Experiment emulator.
- Parameters
dataset (str, Dataset) – dataset used to train a model. Either a string, in which case a standard dataset is loaded, or a Dataset object. To see the list of available datasets …
model (str, Model) – the model used to create the emulator. Either a string, in which case a default model is loaded, or a Model object. To see the list of available models …
feature_transform (str, list) – the data transform to be applied to the features. See DataTransformer for the available transformations.
target_transform (str, list) – the data transform to be applied to the targets. See DataTransformer for the available transformations.
Methods
run(features[, num_samples, return_paramvector]) – Run the emulator and return a value given the features provided.
train([plot, retrain]) – Trains the model on the emulator dataset, using the emulator model.
save([path, include_cv]) – Save the emulator in a specified location.
cross_validate([rerun, plot]) – Performs cross validation on the emulator dataset, using the emulator model.
cross_validate(rerun=False, plot=False)
Performs cross validation on the emulator dataset, using the emulator model. The number of folds used is defined in the Dataset object.
- Parameters
rerun (bool) – whether to run cross validation again, in case it had already been performed.
- Returns
dictionary with the list of train and validation R2 scores.
- Return type
scores (dict)
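To make the returned quantity concrete, here is a minimal, library-independent sketch of k-fold cross validation that collects train and validation R2 scores per fold. It does not reproduce the Olympus implementation: the straight-line model, the helper names (`r2_score`, `fit_line`, `cross_validate`), the fold assignment, and the dictionary keys `"train"` and `"validate"` are all assumptions made for illustration.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept; a stand-in for a real model."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def cross_validate(xs, ys, num_folds=5):
    """Fit one model per fold and collect train/validation R2 scores."""
    scores = {"train": [], "validate": []}  # key names are hypothetical
    for fold in range(num_folds):
        # every num_folds-th point, starting at `fold`, is held out
        val_idx = set(range(fold, len(xs), num_folds))
        x_tr = [x for i, x in enumerate(xs) if i not in val_idx]
        y_tr = [y for i, y in enumerate(ys) if i not in val_idx]
        x_va = [x for i, x in enumerate(xs) if i in val_idx]
        y_va = [y for i, y in enumerate(ys) if i in val_idx]
        slope, intercept = fit_line(x_tr, y_tr)
        scores["train"].append(r2_score(y_tr, [slope * x + intercept for x in x_tr]))
        scores["validate"].append(r2_score(y_va, [slope * x + intercept for x in x_va]))
    return scores

# synthetic data: a noisy straight line
rng = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]
scores = cross_validate(xs, ys)
print(scores)
```

A large gap between the train and validation scores of a fold would suggest overfitting, which is the main reason to inspect both lists rather than a single number.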
run(features, num_samples=1, return_paramvector=False)
Run the emulator and return a value given the features provided.
- Parameters
features (ndarray) – 2d array with the input features used for the predictions
num_samples (int) – number of samples to average. Only useful for probabilistic models.
return_paramvector (bool) – Whether to return a ParameterVector object instead of a list of lists. Default is False.
- Returns
model prediction(s).
- Return type
y (float or array)
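The role of num_samples can be illustrated with a small standalone sketch: a probabilistic model returns a different prediction on every call, and averaging several draws reduces that noise. The stochastic model below (`sample_prediction`) and this `run` function are hypothetical stand-ins, not the Olympus code; only the idea of averaging num_samples draws over a 2D array of features is taken from the description above.

```python
import random

def sample_prediction(features, rng):
    """Hypothetical probabilistic model: one noisy prediction per input row."""
    return [sum(row) + rng.gauss(0.0, 1.0) for row in features]

def run(features, num_samples=1, seed=0):
    """Average num_samples stochastic predictions for each row of features."""
    rng = random.Random(seed)
    draws = [sample_prediction(features, rng) for _ in range(num_samples)]
    # zip(*draws) groups together the draws that belong to the same input row
    return [sum(row_draws) / num_samples for row_draws in zip(*draws)]

features = [[1.0, 2.0], [3.0, 4.0]]          # 2D input: one row per query point
single = run(features)                        # one noisy draw per row
averaged = run(features, num_samples=2000)    # noise largely averaged out
print(single, averaged)
```

For a deterministic model every draw is identical, so the average equals a single prediction, which is why the parameter only matters for probabilistic models.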
save(path='./olympus_emulator', include_cv=False)
Save the emulator in a specified location. This will save the emulator object as a pickle file, and the associated TensorFlow model, in the specified location. The saved emulator can then be loaded with the olympus.emulators.load_emulator function.
- Parameters
path (str) – relative path where to save the emulator.
include_cv (bool) – whether to include the cross validation models. Default is False.
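As a rough illustration of the pickle part of this workflow, the sketch below writes a stand-in object into a directory and reads it back using only the standard library. The helper names (`save_state`, `load_state`), the file name `emulator.pickle`, and the dictionary standing in for a trained emulator are assumptions; the actual on-disk layout used by Olympus, including the TensorFlow model files, is not shown here.

```python
import os
import pickle
import tempfile

def save_state(state, path):
    """Write a picklable object into the directory `path` (created if missing)."""
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "emulator.pickle"), "wb") as handle:
        pickle.dump(state, handle)

def load_state(path):
    """Read back the object written by save_state."""
    with open(os.path.join(path, "emulator.pickle"), "rb") as handle:
        return pickle.load(handle)

# stand-in for a trained emulator: metadata plus some fitted parameters
trained = {"dataset": "hplc_n9", "model": "BayesNeuralNet", "weights": [0.5, -1.2, 3.0]}

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "my_new_emulator")
    save_state(trained, target)
    restored = load_state(target)

print(restored == trained)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.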
train(plot=False, retrain=False)
Trains the model on the emulator dataset, using the emulator model. The train/test split is defined in the Dataset object emulator.dataset. Note that the test set is used for testing the model performance, and for early stopping.
- Parameters
plot (bool) –
retrain (bool) – whether to retrain the model, in case it had already been trained.
- Returns
dictionary with the train and test R2 scores.
- Return type
scores (dict)
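The note about early stopping can be sketched with a toy training loop: the model is updated on the training set, the loss on the test set is monitored after every epoch, and training halts once that loss has stopped improving. Everything below is a hypothetical illustration (a straight-line model fitted by gradient descent, with assumed `lr`, `patience`, and tolerance values), not the training routine used by Olympus.

```python
import random

def mse(params, xs, ys):
    """Mean squared error of the line y = a * x + b."""
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_with_early_stopping(train, test, lr=0.3, max_epochs=5000, patience=20):
    """Gradient descent on the train set, stopped when test loss stalls."""
    (x_tr, y_tr), (x_te, y_te) = train, test
    a, b = 0.0, 0.0
    best_loss, best_params, stale = float("inf"), (a, b), 0
    n = len(x_tr)
    for epoch in range(max_epochs):
        # gradient of the training MSE with respect to a and b
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(x_tr, y_tr)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(x_tr, y_tr)) / n
        a, b = a - lr * grad_a, b - lr * grad_b
        test_loss = mse((a, b), x_te, y_te)
        if test_loss < best_loss - 1e-9:
            best_loss, best_params, stale = test_loss, (a, b), 0
        else:
            stale += 1
            if stale >= patience:  # no improvement for `patience` epochs
                break
    return best_params, best_loss, epoch + 1

# synthetic data: noisy line, every 5th point held out as the test set
rng = random.Random(0)
xs = [i / 49 for i in range(50)]
ys = [3.0 * x - 1.0 + rng.gauss(0.0, 0.05) for x in xs]
train = ([x for i, x in enumerate(xs) if i % 5], [y for i, y in enumerate(ys) if i % 5])
test = (xs[::5], ys[::5])

best_params, best_loss, epochs_run = train_with_early_stopping(train, test)
print(best_params, best_loss, epochs_run)
```

Because the test set influences when training stops, the test score is not a fully independent estimate of performance, which is one reason to also look at the cross validation scores.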