How to leverage Auto ML with MLOS?

Use built-in templates to get started with Auto ML and scale your hyperparameter optimization.


May 6, 2019, by Nicolas Narbais

As mentioned in our previous article, implementing Auto ML algorithms raises a few challenges that can be overcome with Activeeon Machine Learning Open Studio (MLOS).

In this article, we will quickly describe the various parts to help you get started and scale.

Configure the studio

Figure: Machine Learning Open Studio controls

Before going further, add the “Auto ML Optimization” bucket to your studio: click on the + sign next to the existing buckets and select it.

Auto ML workflow description

First, let’s import the Auto ML workflow called “Multi_Tuners_Auto_ML” from the bucket added previously. You should see a workflow similar to the one below.

Figure: automated machine learning workflow

A few parts can be identified at this level:

  1. Launch Visdom and get its endpoint. Visdom is used to visualize the progress of the running trainings and the results of the completed ones.
  2. Launch MongoDB and get its endpoint. MongoDB stores information about each training (inputs and loss value). This information can be used by the tuning algorithm.
  3. Generate batches of hyperparameter combinations with the Chocolate library. It takes as input the type of algorithm, the search space to explore and the previous training history stored in MongoDB.
  4. Submit multiple trainings in parallel with the replication structure within ProActive. (The training launched comes from another workflow stored within the catalog.) It uses the hyperparameter combinations generated previously and waits for the results in the merge task.
  5. Loop to generate further batches of trainings.
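
The five parts above can be summarized in a short, self-contained Python sketch. It is not the template's code: the `train` function, the in-memory `history` list (standing in for MongoDB), the random sampler (standing in for Chocolate) and the thread pool (standing in for the ProActive replication structure) are all illustrative assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical search space, in the spirit of the choice([...]) syntax of the template.
SEARCH_SPACE = {
    "LEARNING_RATE": [0.0001, 0.00025],
    "BATCH_SIZE": [32, 64],
}

def sample_batch(space, size, rng):
    """Stand-in for the tuner (part 3): draw `size` random combinations."""
    return [{k: rng.choice(v) for k, v in space.items()} for _ in range(size)]

def train(params):
    """Stand-in for the training workflow (part 4): returns a made-up loss."""
    return params["LEARNING_RATE"] * 1000 + 1.0 / params["BATCH_SIZE"]

def run_auto_ml(space, max_iterations, parallel_samples, seed=0):
    rng = random.Random(seed)
    history = []  # stand-in for the MongoDB store (part 2)
    for _ in range(max_iterations):  # the loop (part 5)
        batch = sample_batch(space, parallel_samples, rng)
        # Run the whole batch in parallel, then wait for every result (merge task).
        with ThreadPoolExecutor(max_workers=parallel_samples) as pool:
            losses = list(pool.map(train, batch))
        history.extend({"params": p, "loss": l} for p, l in zip(batch, losses))
    best = min(history, key=lambda record: record["loss"])
    return best, history

best, history = run_auto_ml(SEARCH_SPACE, max_iterations=3, parallel_samples=2)
print(len(history), best)
```

Visdom (part 1) is left out of the sketch since it only affects visualization, not the control flow.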

Tuning library with Chocolate / Hyperparameters creation (part 3)

The tuning library integrated in this template is Chocolate. It has been selected following multiple criteria. The main ones are its community, the variety of available tuning algorithms and the ability to work in batches.

As most tuning algorithms require historical training results, a storage solution is needed. Chocolate is well integrated with MongoDB, so we selected this database for better support and more flexibility.

Note: At the beginning of the task, if trainings were submitted in a previous loop, their results are added to MongoDB.
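
To illustrate why the tuner needs the training history, here is a minimal sketch of a history-aware batch generator. It does not use or reproduce Chocolate: it is a simple grid-style stand-in written for this example, and the `next_batch` function, the record layout and the plain Python list replacing MongoDB are all assumptions.

```python
import itertools
import json

def next_batch(space, history, batch_size):
    """Return up to `batch_size` combinations that are not yet in `history`."""
    keys = sorted(space)
    # Serialize the already evaluated combinations so they can be compared easily.
    tried = {json.dumps(record["params"], sort_keys=True) for record in history}
    batch = []
    for values in itertools.product(*(space[k] for k in keys)):
        if len(batch) == batch_size:
            break
        params = dict(zip(keys, values))
        if json.dumps(params, sort_keys=True) not in tried:
            batch.append(params)
    return batch

space = {"BATCH_SIZE": [32, 64], "OPTIMIZER": ["Adam", "SGD"]}
history = [{"params": {"BATCH_SIZE": 32, "OPTIMIZER": "Adam"}, "loss": 0.42}]

batch = next_batch(space, history, batch_size=2)
print(batch)
```

A real tuning algorithm would also use the stored loss values to decide where to sample next, which is what makes the history essential.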

Job submission (part 4)

The submission of the jobs uses the scheduler API to submit a training algorithm stored within the ProActive catalog.

Note: This task may have performance issues at scale since the submission time and the time to compute new hyperparameter combinations are not negligible. Some adjustments can be made depending on the use case. (More details in the conclusion.)

Setup your algorithm for the Auto ML template

Once you’ve developed your algorithm on your favorite Notebook, you can quickly integrate it within a workflow. Just create a task and copy your code within it. Once it is done, follow the few steps below.

Expose a token variable

The training algorithm will be executed multiple times with various hyperparameter combinations. A token ID is generated by the Auto ML workflow and sent to the training algorithm so that each output result can be linked to its ID.

Expose input variables

Add workflow variables as an input to your workflow. This is required to expose relevant variables to the Auto ML workflow.

To get those variables within the task itself, add this at the top:

######################## AUTOML SETTINGS ##########################
# SEARCH_SPACE:
# {"OPTIMIZER": choice(["Adam", "SGD", "RMSprop"]),
#  "LEARNING_RATE": choice([0.0001, 0.00025]), 
#  "BATCH_SIZE": choice([32, 64]), 
#  "WEIGHT_DECAY": choice([0.0005, 0.005])}
#"""
import json  # needed to parse INPUT_VARIABLES below
NUM_EPOCHS    = int(variables.get("NUM_EPOCHS"))
LEARNING_RATE = float(variables.get("LEARNING_RATE"))
BATCH_SIZE    = int(variables.get("BATCH_SIZE"))
WEIGHT_DECAY  = float(variables.get("WEIGHT_DECAY"))
OPTIMIZER     = variables.get("OPTIMIZER")

input_variables = variables.get("INPUT_VARIABLES")
if input_variables is not None and input_variables != '':
    input_variables = json.loads(input_variables)
    LEARNING_RATE = input_variables["LEARNING_RATE"]
    BATCH_SIZE = input_variables["BATCH_SIZE"]
    WEIGHT_DECAY = input_variables["WEIGHT_DECAY"]
    OPTIMIZER = input_variables["OPTIMIZER"]
#"""
###################################################################

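Here we can see five input variables. Four of them (LEARNING_RATE, BATCH_SIZE, WEIGHT_DECAY and OPTIMIZER) can be passed either directly or through the INPUT_VARIABLES variable, which takes precedence when it is set. NUM_EPOCHS is only read from its own workflow variable.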
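
The override logic can be tried outside of ProActive with a small sketch. Here a plain Python dictionary stands in for the ProActive `variables` binding, and `resolve_hyperparameters` is a hypothetical helper name introduced only for this example.

```python
import json

def resolve_hyperparameters(variables):
    """Read the defaults, then let INPUT_VARIABLES override the tunable ones."""
    params = {
        "NUM_EPOCHS": int(variables["NUM_EPOCHS"]),
        "LEARNING_RATE": float(variables["LEARNING_RATE"]),
        "BATCH_SIZE": int(variables["BATCH_SIZE"]),
        "WEIGHT_DECAY": float(variables["WEIGHT_DECAY"]),
        "OPTIMIZER": variables["OPTIMIZER"],
    }
    raw = variables.get("INPUT_VARIABLES")
    if raw:  # covers both None and the empty string
        overrides = json.loads(raw)
        for name in ("LEARNING_RATE", "BATCH_SIZE", "WEIGHT_DECAY", "OPTIMIZER"):
            params[name] = overrides[name]
    return params

# Hypothetical default values, as they could be set on the workflow.
defaults = {
    "NUM_EPOCHS": "10",
    "LEARNING_RATE": "0.001",
    "BATCH_SIZE": "32",
    "WEIGHT_DECAY": "0.0005",
    "OPTIMIZER": "Adam",
}
combination = {"LEARNING_RATE": 0.00025, "BATCH_SIZE": 64,
               "WEIGHT_DECAY": 0.005, "OPTIMIZER": "SGD"}

standalone = resolve_hyperparameters(defaults)
tuned = resolve_hyperparameters({**defaults, "INPUT_VARIABLES": json.dumps(combination)})
print(standalone)
print(tuned)
```

This way the same workflow can run standalone with its default values or be driven by the Auto ML workflow.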

Expose loss value

The Auto ML workflow requires some feedback to assess the quality of the trained model. It will try to minimize the loss value, so you need to expose this information. To do so, add the following at the end of your task:

######################## AUTOML SETTINGS ##########################
#"""
token = variables.get("TOKEN")
# Convert from JSON to dict
token = json.loads(token)

# Return the loss value
# (scores is assumed to come from your own training code, with the loss as its first element)
result_map = {'token': token, 'loss': scores[0]}
print('result_map: ', result_map)

resultMap.put("RESULT_JSON", json.dumps(result_map))
#"""
###################################################################

The token is the ID given by the Auto ML workflow in order to identify your training workflow. The loss value is the result you aim to minimize.

Note: The resultMap is a mechanism within Activeeon that helps workflows expose some values. Those values can, for instance, be visualized within ProActive Job Analytics.

Figure: automated machine learning analytics
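
As a rough illustration of the exchange, the sketch below builds the JSON payload on the training side and reads it back on the optimizer side. The token value `{"id": 7}`, the helper names and the dictionary standing in for the result store are hypothetical. The actual token format is defined by the Auto ML workflow.

```python
import json

def build_result_json(token, loss):
    """Training side: serialize the token and the loss into one JSON string."""
    # float() avoids serialization errors if the loss is, e.g., a NumPy float32.
    return json.dumps({"token": token, "loss": float(loss)})

def record_result(history, result_json):
    """Optimizer side: link the reported loss back to its token."""
    result = json.loads(result_json)
    history[json.dumps(result["token"], sort_keys=True)] = result["loss"]
    return history

payload = build_result_json({"id": 7}, 0.25)
history = record_result({}, payload)
print(payload)
print(history)
```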

Add a standard pre-script

Finally, add a pre-script to decode any encoded input.

params_encoded = variables.get('params_encoded')
token_encoded = variables.get('token_encoded')

// If encoded variables are found
if ((params_encoded != null && params_encoded.length() > 0) &&
    (token_encoded != null && token_encoded.length() > 0))
{
    println "Found encoded variables:"
    println "params_encoded: " + params_encoded
    println "token_encoded: " + token_encoded
    
    byte[] params_decoded = params_encoded.decodeBase64()
    byte[] token_decoded = token_encoded.decodeBase64()
    
    input_variables = new String(params_decoded)
    token = new String(token_decoded)
    
    variables.put('INPUT_VARIABLES', input_variables)
    variables.put('TOKEN', token)
}
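
For readers more familiar with Python, the same decoding step can be sketched as follows. The sketch assumes that the encoded variables are Base64-encoded JSON strings, which is what the pre-script and the task code suggest. The `encode_variable` helper only simulates what the Auto ML workflow might send, and a plain dictionary stands in for the ProActive `variables` binding.

```python
import base64
import json

def encode_variable(value):
    """Simulate the sender: JSON-serialize, then Base64-encode."""
    return base64.b64encode(json.dumps(value).encode("utf-8")).decode("ascii")

def decode_variable(encoded):
    """Mirror of the pre-script: Base64-decode back to a string."""
    return base64.b64decode(encoded).decode("utf-8")

# Hypothetical values, for illustration only.
variables = {
    "params_encoded": encode_variable({"LEARNING_RATE": 0.0001, "OPTIMIZER": "Adam"}),
    "token_encoded": encode_variable({"id": 7}),
}

# Only decode when both encoded variables are present and non-empty.
if variables.get("params_encoded") and variables.get("token_encoded"):
    variables["INPUT_VARIABLES"] = decode_variable(variables["params_encoded"])
    variables["TOKEN"] = decode_variable(variables["token_encoded"])

print(variables["INPUT_VARIABLES"])
print(variables["TOKEN"])
```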

Store it within the catalog

As mentioned above, the training algorithm to optimize needs to be stored within the catalog. Do not forget to push it onto a bucket.

Get faster with a template

To save time, do not hesitate to import one of the workflows from the Auto ML bucket and edit the code to fit your needs. The structure described above will already be in place.

Launch it!

Finally, just launch the Auto ML workflow template.

Figure: automated machine learning job submission

Select your tuning algorithm, input the catalog path to your training algorithm, input your search space and finally configure the depth of optimization you are interested in.

Usually, PARALLEL_SAMPLES_PER_LOOP is set to the number of GPUs connected to MLOS. MAX_ITERATIONS affects the precision of the model finally identified.
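
To get a feel for these two settings, the short calculation below compares a training budget with the size of the search space shown earlier. It assumes that each iteration launches exactly one batch of PARALLEL_SAMPLES_PER_LOOP trainings. The values 3 and 4 are arbitrary examples.

```python
from math import prod

# Search space from the example above: number of choices per hyperparameter.
search_space = {
    "OPTIMIZER": ["Adam", "SGD", "RMSprop"],
    "LEARNING_RATE": [0.0001, 0.00025],
    "BATCH_SIZE": [32, 64],
    "WEIGHT_DECAY": [0.0005, 0.005],
}

MAX_ITERATIONS = 3             # example value
PARALLEL_SAMPLES_PER_LOOP = 4  # example value, e.g. 4 GPUs

total_combinations = prod(len(choices) for choices in search_space.values())
total_trainings = MAX_ITERATIONS * PARALLEL_SAMPLES_PER_LOOP
coverage = total_trainings / total_combinations

print(total_combinations)  # 24
print(total_trainings)     # 12
print(coverage)            # 0.5
```

With these example values, at most half of the 24 possible combinations would be evaluated.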

Select a relevant tuning algorithm

If you do not know which tuning algorithm to use, do not hesitate to check the quick analysis we made. This analysis will give you an idea of compute cost and results expected.

Go further

In conclusion, you can quickly use this template within MLOS to leverage tuning algorithms with very few changes. To go further, you can look at optimizing the creation and submission of trainings. This is particularly useful with Bayesian methods, which take longer as the training history grows. Do not hesitate to check the V2 version to get some optimization ideas.

Overall, with MLOS you can focus on building your training algorithm. The platform takes care of identifying the hyperparameters most relevant to your objective. The Auto ML workflow template automatically launches the trainings for you and collects the results. Moreover, the distribution mechanism within the ProActive solution enables workload parallelization and eases access to any type of resource (cloud, HPC, etc.).

Let us know your feedback on this implementation and your suggestions.

