Tuesday, 21 August 2018

New Features in MLflow v0.5.0 Release

Today, we’re excited to announce MLflow v0.5.0, released last week with several new features. MLflow 0.5.0 is already available on PyPI, and the docs have been updated. If you do pip install mlflow as described in the MLflow quickstart guide, you will get the latest release.

In this post, we’ll describe the new features and fixes in this release.

Keras and PyTorch Model Integration

As part of MLflow 0.5.0 and our continued effort to support a range of machine learning frameworks, we’ve extended support to save and load Keras and PyTorch models using the log_model APIs. These model-flavor APIs export models in their respective native formats, so Keras and PyTorch applications can reuse them not only from MLflow but also natively from Keras or PyTorch code.

Using Keras Model APIs

Once you have defined, trained, and evaluated your Keras model, you can log it as an MLflow artifact, exported in Keras HDF5 format so that others can load it or serve it for predictions. For example, the following Keras code snippet shows how:

from keras.layers import Dense
import mlflow
import mlflow.keras

# Build, compile, and train your model
keras_model = ...
keras_model.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])
results = keras_model.fit(x_train, y_train, epochs=20, batch_size=128, validation_data=(x_val, y_val))
...
# Log metrics and log the model
with mlflow.start_run() as run:
    ...
    mlflow.keras.log_model(keras_model, "keras-model")

# Load the model as a Keras model or as a pyfunc and use its predict() method
keras_model = mlflow.keras.load_model("keras-model", run_id="96771d893a5e46159d9f3b49bf9013e2")
predictions = keras_model.predict(x_test)
...

Using PyTorch Model APIs

Similarly, you can use the model APIs to log models in PyTorch. The code below mirrors the Keras example, with minor changes reflecting how PyTorch exposes its methods. With pyfunc, however, the method is the same: predict():

import torch
import mlflow
import mlflow.pytorch

# Build and train your model
pytorch_model = ...
pytorch_model.train()
...
pytorch_model.eval()
y_pred = pytorch_model(x_data)

# Log metrics and log the model
with mlflow.start_run() as run:
    ...
    mlflow.pytorch.log_model(pytorch_model, "pytorch-model")

# Load the model as a PyTorch model or as a pyfunc and use its predict() method
pytorch_model = mlflow.pytorch.load_model("pytorch-model")
y_predictions = pytorch_model(x_test)
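Because every logged model can also be loaded through the generic pyfunc flavor, downstream code can be written once against a single predict() interface, regardless of the original framework. A minimal sketch of that contract follows; StubModel is a hypothetical stand-in for the object a pyfunc load call would return:

```python
# Sketch of the uniform pyfunc contract, not MLflow's implementation.
# StubModel is a hypothetical stand-in for what a call like
# mlflow.pyfunc.load_pyfunc("pytorch-model", run_id="<run id>") would return.

class StubModel:
    """Stand-in for a loaded pyfunc model: anything with predict()."""
    def predict(self, data):
        return [x * 2 for x in data]

def score(model, data):
    """Works for any pyfunc-style model, regardless of the original flavor."""
    return model.predict(data)

predictions = score(StubModel(), [1, 2, 3])  # -> [2, 4, 6]
```

Code written against this interface never needs to know whether the underlying model came from Keras, PyTorch, or any other flavor.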

Python APIs for Experiment and Run Management

To query past runs and experiments, we added new public APIs as part of the mlflow.tracking module. In the process, we also refactored the old APIs for logging parameters and metrics for the current run into the mlflow module. For example, to log basic parameters and metrics for the current run, you can use the mlflow.log_xxxx() calls:

import mlflow

run_uuid = 'v.0.5'
with mlflow.start_run(run_uuid=run_uuid) as run:
   mlflow.log_param("game", 1)
   mlflow.log_metric("score", 25)
   ...

However, to access this run’s results, say in another part of the application, you can use the mlflow.tracking APIs as follows:

import mlflow.tracking

# Get the service; defaults to the tracking URI or the local 'mlruns' directory
run_uuid = 'v.0.5'
run = mlflow.tracking.get_service().get_run(run_uuid)
score = run.data.metrics[0]
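Note that run.data.metrics here is a list of metric entities rather than a dictionary, which is why the snippet above indexes by position. A small helper makes lookups by name more convenient; Metric below is a namedtuple standing in for MLflow's metric entity (the real entity also carries a timestamp):

```python
from collections import namedtuple

# Hypothetical stand-in for MLflow's metric entity; the real one also
# carries a timestamp field.
Metric = namedtuple("Metric", ["key", "value"])

def metrics_by_key(metrics):
    """Index a run's metric entities by metric name."""
    return {m.key: m.value for m in metrics}

metrics = [Metric("score", 25.0), Metric("loss", 0.3)]
indexed = metrics_by_key(metrics)  # {'score': 25.0, 'loss': 0.3}
```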

While the former deals with persisting metrics, parameters, and artifacts for the currently active run, the latter lets you manage experiments and runs, especially historical runs.

With these new APIs, developers have access to a Python CRUD interface to MLflow experiments and runs. Because it is a lower-level API, it maps well to REST calls, so you can build a REST-based service around your experiment runs.
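For instance, the service object can be used to scan an experiment's historical runs. The sketch below assumes the 0.5 tracking service exposes list_run_infos() and get_run() accessors; StubService is a hypothetical stand-in for mlflow.tracking.get_service() so the example is self-contained:

```python
from collections import namedtuple

# Minimal stand-ins for the MLflow run/metric entities used below.
Info = namedtuple("Info", ["run_uuid"])
Metric = namedtuple("Metric", ["key", "value"])
Data = namedtuple("Data", ["metrics"])
Run = namedtuple("Run", ["data"])

def best_run(service, experiment_id, metric_key):
    """Return (run_uuid, value) of the run with the highest metric_key."""
    best = None
    for info in service.list_run_infos(experiment_id):
        run = service.get_run(info.run_uuid)
        for m in run.data.metrics:
            if m.key == metric_key and (best is None or m.value > best[1]):
                best = (info.run_uuid, m.value)
    return best

class StubService:
    """Hypothetical stand-in for mlflow.tracking.get_service()."""
    _runs = {
        "run-a": Run(Data([Metric("score", 25.0)])),
        "run-b": Run(Data([Metric("score", 40.0)])),
    }
    def list_run_infos(self, experiment_id):
        return [Info(uuid) for uuid in self._runs]
    def get_run(self, run_uuid):
        return self._runs[run_uuid]

winner = best_run(StubService(), experiment_id=0, metric_key="score")  # -> ("run-b", 40.0)
```

The same loop, pointed at the real service object, is the kind of building block a REST-backed reporting service could be assembled from.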

UI Improvements for Comparing Runs

Thanks to Toon Baeyens (Issue #268, @ToonKBC), the MLflow Tracking UI can now compare two runs with a scatter plot. For example, such a plot can show the number of trees against the corresponding RMSE metric.

Also, improved columnar and tabular presentation of experiment runs, metrics, and parameters makes it easier to visualize outcomes and compare runs. Together with navigation breadcrumbs, this makes for a better overall UI experience.

Other Features and Bug Fixes

In addition to these features, a number of other improvements, bug fixes, and documentation fixes are included in this release. Some items worthy of note:

  • [SageMaker] Users can specify a custom VPC when deploying SageMaker models (#304, @dbczumar)
  • [Artifacts] Added an SFTP artifact store (#260, @ToonKBC)
  • [Pyfunc] Pyfunc serialization now includes the Python version and warns if the major version differs (can be suppressed by using load_pyfunc(suppress_warnings=True)) (#230, @dbczumar)
  • [Pyfunc] Pyfunc serve/predict will activate conda environment stored in MLModel. This can be disabled by adding --no-conda to mlflow pyfunc serve or mlflow pyfunc predict (#225, @0wu)
  • [CLI] mlflow run can now be run against projects with no conda.yaml specified. By default, an empty conda environment will be created — previously, it would just fail. You can still pass --no-conda to avoid entering a conda environment altogether (#218, @smurching)

The full list of changes and contributions from the community can be found in the 0.5.0 Changelog. We welcome more input on mlflow-users@googlegroups.com or by filing issues or submitting patches on GitHub. For real-time questions about MLflow, we’ve also recently created a Slack channel for MLflow, and you can follow @MLflowOrg on Twitter.

Credits

MLflow 0.5.0 includes patches, bug fixes, and doc changes from Aaron Davidson, Adrian Zhuang, Alex Adamson, Andrew Chen, Arinto Murdopo, Corey Zumar, Jules Damji, Matei Zaharia, @RBang1, Siddharth Murching, Stephanie Bodoff, Tomas Nykodym, Tingfan Wu, Toon Baeyens, and Yassine Alouini.

--


The post New Features in MLflow v0.5.0 Release appeared first on Databricks.
