Experiment Tracking and Experiment Management Tools

Gauging the popularity of a new class of tools for machine learning.

Individuals and teams who build machine learning models need tools to keep track of and analyze experiments they run. It can be very challenging to organize all the information needed to evaluate, identify, and reproduce specific experiments. In the course of building a single model, teams may need to try different models, different model hyperparameters and training/test data sets, and they may need to tweak the code or the computation environment they use. 

Fortunately there are now several open source and commercial tools designed to help ML teams during the model development phase. Experiment tracking and experiment management tools log all relevant metadata and results, and most of these tools include collaboration and visualization features to make ML experiments easier to manage and analyze.
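
To make "log all relevant metadata and results" concrete, here is a minimal sketch of what a single tracked run might contain. It uses only the Python standard library and is not the API of any tool covered in this post. The function names (`log_run`, `best_run`), the record fields, and the `runs.jsonl` file name are all assumptions made for illustration.

```python
import json
import time
import uuid
from pathlib import Path


def log_run(log_dir, params, metrics, data_version, code_version):
    """Append one experiment record to a JSON Lines file and return it.

    Hypothetical helper: real tracking tools store far richer metadata.
    """
    record = {
        "run_id": uuid.uuid4().hex,      # unique identifier for this run
        "timestamp": time.time(),        # when the run was logged
        "params": params,                # hyperparameters, model type, etc.
        "metrics": metrics,              # evaluation results
        "data_version": data_version,    # which train/test data was used
        "code_version": code_version,    # e.g. a commit hash
    }
    path = Path(log_dir) / "runs.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def best_run(log_dir, metric):
    """Return the logged run with the highest value of `metric`."""
    path = Path(log_dir) / "runs.jsonl"
    with path.open(encoding="utf-8") as f:
        runs = [json.loads(line) for line in f if line.strip()]
    return max(runs, key=lambda r: r["metrics"][metric])
```

Even this toy version shows why dedicated tools exist: once parameters, metrics, data versions, and code versions are recorded together, identifying and reproducing a specific experiment becomes a lookup rather than guesswork.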

The purpose of this post is to compare several popular experiment tracking and management solutions using an index that measures popularity. Beyond tools that collect and store information related to experiments, we also include visualization tools used during model development, and software used to version machine learning models and related elements. As with our previous post on BI tools, we use an index that relies on public data and is modeled after TIOBE’s programming language index. Our index consists of the following components:

  • Search: We used a subset from TIOBE’s list (Google, Wikipedia, Amazon) and added Reddit, Twitter, and Stack Overflow into the mix.
  • Supply (of talent):  This component is based on the number of people who have listed a specific experiment management tool as a skill on their LinkedIn profiles.
  • Demand (for talent): We examine the number of U.S. online job postings that mention a specific experiment management tool.
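
The components above can be combined into a single score in many ways. The sketch below shows one plausible approach and should not be read as the exact formula behind our index: it assumes each component's raw counts are converted to a tool's share of the total, and that components are weighted equally unless weights are supplied. The function name `popularity_index`, the equal default weights, and the input layout are illustrative assumptions.

```python
def popularity_index(raw_counts, weights=None):
    """Combine per-component counts into one score per tool (0 to 100).

    raw_counts: {component_name: {tool_name: raw_count}}
    weights:    {component_name: weight}; defaults to equal weights
                (an assumption, not a documented choice).
    """
    components = list(raw_counts)
    if weights is None:
        weights = {c: 1.0 / len(components) for c in components}

    # Collect every tool mentioned in any component.
    tools = sorted({t for counts in raw_counts.values() for t in counts})
    scores = {t: 0.0 for t in tools}

    for comp, counts in raw_counts.items():
        total = sum(counts.values())
        if total == 0:
            continue  # no data for this component; it contributes nothing
        for tool in tools:
            share = counts.get(tool, 0) / total  # tool's share of this component
            scores[tool] += weights[comp] * share * 100
    return scores
```

With weights that sum to one, the scores across all tools sum to 100, so each score reads as a share of overall attention. A tool missing from a component (common in a young category with sparse data) simply receives a zero share for that component.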

In comparison to our previous BI tool index, this fairly young category has relatively sparse data for a number of the tools we examined. But while scores exhibit more volatility compared to mature categories, the following key segments are quite stable:

  1. MLflow and TensorBoard are by far the most popular tools.
  2. Amazon SageMaker Experiments is the sole solution in the middle tier.
  3. Three startups (Weights & Biases, Neptune AI, Comet) occupy the next tier, and a set of startups and open source projects trail closely below.

Figure 1: Experiment Tracking and Management Index – an indicator of the popularity of tools for tracking and managing machine learning experiments. MLflow and TensorBoard are far and away the most popular tools.

We also plot the supply and demand sides of the labor market for each tool. Many of these tools are so new that labor market data is sparse for most of them. Still, MLflow and TensorBoard clearly lead on both talent metrics, which suggests they have the most developed user ecosystems.

Figure 2: Ranking experiment tracking and management solutions using two talent pool metrics – Supply (size of worldwide talent pool) and Demand (number of online job postings).

Next Steps

We plan to use our index to monitor trends in machine learning tools. Let us know (using the form below) what tools you want us to include in future editions. 

Suggestion Form

Use this form to suggest companies to include in future editions of this Index.
