Finding the best hyperparameters for your machine learning algorithms is a difficult job. Oscar has been built to relieve your data scientists from this tedious task.
He can be up and running for you in 5 minutes:
```shell
# Python client
pip install git+git://github.com/sensout/Oscar-Python.git
```

```shell
# Lua client
luarocks install Oscar --from=https://raw.githubusercontent.com/sensout/Oscar-Lua/master/
```
Sign in and get your API access token in the Account panel.
```python
# Get Oscar
import math
from Oscar import Oscar

scientist = Oscar('API_ACCESS_TOKEN')

# Describe your experiment
experiment = {'name': 'Square',
              'parameters': {'x': {'min': -10, 'max': 10}}}

for i in range(1, 10):
    # Get the next parameters to try from Oscar
    job = scientist.suggest(experiment)
    print(job)

    # Run your complex, time-consuming algorithm
    loss = math.pow(job['x'], 2)

    # Tell Oscar the result
    scientist.update(job, {'loss': loss})
```
```lua
-- Get Oscar
local Oscar = require('Oscar')
local scientist = Oscar('API_ACCESS_TOKEN')

-- Describe your experiment
local experiment = {name = 'Square',
                    parameters = {x = {min = -10, max = 10}}}

for i = 1, 10 do
    -- Get the next parameters to try from Oscar
    local job = scientist:suggest(experiment)
    print(job)

    -- Run your complex, time-consuming algorithm
    local loss = math.pow(job.x, 2)

    -- Tell Oscar the result
    scientist:update(job, {loss = loss})
end
```
After a few runs you should see some interesting insights about your experiment in the Trials panel.
You are ready to run your own experiment!
The `experiment` object

The `experiment` object describes your experiment to Oscar when you call the `suggest(experiment)` method.
It has a JSON-like format (dict in Python/table in Lua) with mandatory and optional key-value pairs:
```python
experiment = {
    'name': 'Square',
    'description': 'This is a very simple experiment',
    'parameters': ...,
    'resume': True
}
job = scientist.suggest(experiment)
```
```lua
local experiment = {
    name = 'Square',
    description = 'This is a very simple experiment',
    parameters = ...,
    resume = true
}
local job = scientist:suggest(experiment)
```
| Key | Description | Required | Possible values | Default value |
|---|---|---|---|---|
| `name` | The name of your experiment | Yes | string | None |
| `description` | A description of your experiment for easier monitoring in the dashboard | No | string | `''` |
| `parameters` | Describes the parameters of your experiment | Yes | see below | None |
| `resume` | Resumes an experiment with the same name if `true`; overrides it if `false` | No | boolean | `true` |
The `parameters` object

The `parameters` object has the same JSON-like format as the `experiment` object.
It describes the hyperparameter space of your machine learning algorithm.
Besides the range of each hyperparameter, you can describe its probabilistic distribution. Think of this as a way to convey your experience and intuition to Oscar and speed up the exploration of the hyperparameter space.
Each hyperparameter is defined as a key-value pair in the `parameters` object:

- The key is the name of the hyperparameter.
- The value is another key-value object describing the space of that hyperparameter.
If your parameter takes discrete values which are not ordered or related, it is a categorical parameter.
Describe it with an array of its possible values.
```python
'parameters': {'activation': ['relu', 'tanh']}
```

```lua
parameters = {activation = {'relu', 'tanh'}}
```
If your parameter takes continuous or discrete values which are uniformly distributed, you must specify its range with the `min` and `max` keys.
If your parameter takes discrete values, you can specify the step size with the `step` key.
If your parameter has a log-uniform distribution, use the `log` key.
```python
'parameters': {
    'layers': {'min': 1, 'max': 3, 'step': 1},
    'batch_size': {'min': 10, 'max': 100, 'step': 10}
}
```
```lua
parameters = {
    layers = {min = 1, max = 3, step = 1},
    batch_size = {min = 10, max = 100, step = 10}
}
```
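The snippets above don't show the `log` key in use. A minimal sketch of a log-uniform parameter, assuming `log` takes a boolean (the text names the key but does not specify its value format):

```python
# Hypothetical sketch: a learning rate explored on a log-uniform scale.
# The 'log' key is named in the text above, but its exact value format
# (a boolean here) is an assumption, not confirmed by this document.
parameters = {
    'learning_rate': {'min': 1e-5, 'max': 1e-1, 'log': True}
}
```

A log-uniform range is the usual choice for parameters like learning rates, where candidates spanning several orders of magnitude should be sampled evenly per decade rather than per unit.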
If your parameter takes continuous or discrete values which are normally distributed, you must specify its mean and standard deviation with the `mu` and `sigma` keys.
If your parameter takes discrete values, you can specify the step size with the `step` key.
If your parameter has a log-normal distribution, use the `log` key.
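No snippet accompanies the normal-distribution case above, so here is a sketch. The `mu` and `sigma` key names come from the text; combining them with `step` and `log` in the same way as the uniform case is an assumption:

```python
# Hypothetical sketch of normally distributed parameters.
# The 'mu' and 'sigma' key names come from the text above; combining
# them with 'step' and 'log' as shown is an assumption.
parameters = {
    # continuous value, normally distributed around 0.5
    'dropout': {'mu': 0.5, 'sigma': 0.1},
    # discrete value, log-normally distributed, sampled in steps of 16
    'hidden_units': {'mu': 128, 'sigma': 64, 'step': 16, 'log': True},
}
```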
Once you've run your experiment, you must tell Oscar the result so he can take it into account for his next hyperparameter suggestion.
The result of your experiment is encoded in a JSON-like format with the mandatory `loss` key, which Oscar will try to minimize (negate your objective if you want to maximize it).
You can also add any other custom parameters in order to visualize them in the report.
```python
result = {
    # We want to optimize the validation loss
    'loss': 2.0,
    'val_time': 5,
    'train_loss': 3.0,
    'train_time': 20
}
scientist.update(job, result)
```
```lua
local result = {
    -- We want to optimize the validation loss
    loss = 2.0,
    val_time = 5,
    train_loss = 3.0,
    train_time = 20
}
scientist:update(job, result)
```
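Since Oscar always minimizes `loss`, maximizing a metric such as accuracy means reporting its negation. A minimal sketch:

```python
# Oscar minimizes 'loss', so to maximize a metric such as accuracy
# we report its negation as the loss.
accuracy = 0.92             # the metric we actually want to maximize
result = {
    'loss': -accuracy,      # minimized by Oscar => accuracy is maximized
    'accuracy': accuracy,   # extra custom key, visible in the report
}
# scientist.update(job, result)  # reported as in the snippet above
```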
Oscar is ready to scale and handle many experiments simultaneously.
Define your experiment and install Oscar's client on as many compute nodes as you like. Then simply run your experiment as long as you want on every node:
```shell
#!/bin/bash
# loop infinitely
while true
do
    python experiment.py
done
```
```shell
#!/bin/bash
# loop infinitely
while true
do
    th experiment.lua
done
```
Follow your results in Oscar's dashboard.
Run your experiments in the cloud with a few lines:
Adapt and paste the following script into the User data field of your instance:
```shell
#!/bin/bash

# install the requirements for your experiment
# ex: install theano

# download your experiment script and data
# wget http://www.yoursite.com/yourexperiment.tar.gz
# tar xvf yourexperiment.tar.gz

# install Oscar's Python client
pip install git+git://github.com/sensout/Oscar-Python.git

# loop infinitely
while true
do
    python experiment.py
done
```
Oscar can cope with experiment interruptions and resume them after, for example, a node reboot.
This lets you reduce costs by running your long-duration experiments on low-cost infrastructure such as AWS EC2 Spot Instances.