How Comet can streamline machine learning on The GitLab DevOps Platform

Building machine learning-powered applications comes with numerous challenges. When we talk about these challenges, there is a tendency to overly focus on problems related to the quality of a model’s predictions—things like data drift, changes in model architectures, or inference latency.

While these are all problems worthy of deep consideration, an often overlooked challenge in ML development is the process of integrating a model into an existing software application.

If you’re tasked with adding an ML feature to a product, you will almost certainly run into an existing codebase that must play nicely with your model. This is, to put it mildly, not an easy task.

ML is a highly iterative discipline. Teams often make many changes to their codebase and pipelines in the process of developing a model. Coupling an ML codebase to an application’s dependencies, unit tests, and CI/CD pipelines will significantly reduce the velocity with which ML teams can deliver on a solution, since each change would require running these downstream dependencies before a merge can be approved.

In this post, we’re going to demonstrate how you can use Comet with GitLab’s DevOps platform to streamline the workflow for your ML and software engineering teams, allowing them to collaborate without getting in each other's way.

The challenge for ML teams working with application teams

Let’s say your team is working on improving a feature engineering pipeline. You will likely have to test many combinations of features with some baseline model for the task to see which combinations make an impact on model performance.

It is hard to know beforehand which features might be significant, so having to run multiple experiments is inevitable. If your ML code is a part of your application codebase, this would mean having to run your application’s CI/CD pipeline for every feature combination you might be trying.

This will certainly frustrate your Engineering and DevOps teams, since you would be unnecessarily tying up system resources, given that software engineering teams do not need to run their pipelines with the same frequency as ML teams do.

The other issue is that despite having to run numerous experiments, only a single set of outputs from these experiments will make it to your production application. Therefore, the rest of the assets produced through these experiments are not relevant to your application code.

Keeping these two codebases separated will make life a lot easier for everyone – but it also introduces the problem of syncing the latest model between two codebases.

Use The GitLab DevOps Platform and Comet for your model development process

With The GitLab DevOps Platform and Comet, we can keep the workflows of the ML and engineering teams separate while still enabling cross-team collaboration, because the entire model development process remains visible and auditable to both teams.

We will use two separate projects to demonstrate this process. One project will contain our application code for a handwritten digit recognizer, while the other will contain all the code relevant to training and evaluating our model.

We will adopt a process where discussions, code reviews, and model performance metrics get automatically published and tracked within The GitLab DevOps Platform, increasing the velocity and opportunity for collaboration between data scientists and software engineers for machine learning workflows.

Project setup

Our setup consists of two projects: comet-model-trainer and ml-ui.


The comet-model-trainer repository contains scripts to train and evaluate a model on the MNIST dataset. We have configured The GitLab DevOps Platform to run the training and evaluation pipeline whenever a new merge request is opened with the necessary changes.

The ml-ui repository contains the necessary code to build the frontend of our ML application.

Since the code is integrated with Comet, your ML team can easily track the source code, hyperparameters, metrics, and other details related to the development of the model.

Once the training and evaluation steps are completed, we can use Comet to fetch summary metrics from the project, as well as metrics from the candidate model, and display them within the merge request. This allows the ML team to easily review the changes to the model.


In our case, the average accuracy of the models in the project is 97%. Our candidate model achieved an accuracy of 99%, so it looks like a good fit to promote to production. The metrics displayed here are completely configurable and can be changed as necessary.

When the merge request is approved, the deployment pipeline is triggered and the model is pushed to Comet’s Model Registry. The Model Registry versions each model and links it back to the Comet Experiment that produced it.

Once the model is pushed to the Model Registry, it is available to the application code. When the application team wishes to deploy this new version of the model to their app, they simply have to trigger their specific deployment pipeline.

Running the pipeline

Pipeline outline

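The pipeline is triggered every time a team member creates a merge request that changes code in the build-neural-network script. It consists of four jobs, each described in the sections that follow: build-neural-network, write-report-mr, register-model, and deploy-ml-ui.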

The CI/CD pipeline is defined in a YAML config file, .gitlab-ci.yml. Let's break it down job by job so you can use it and customize it to your needs.

We start by instructing our GitLab runners to use the python:3.8 image to run the jobs specified in the pipeline:

image: python:3.8

Then, we define the job where we want to build and train the neural network:

build-neural-network

In this step, we start by creating a folder where we will store the artifacts generated by this job, install dependencies using the requirements.txt file, and finally execute the corresponding Python script that will be in charge of training the neural network. The training runs in the GitLab runner using the Python image defined above, along with its dependencies.
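The steps described above might look roughly like the fragment below. This is a minimal sketch, not the original config: the stage names, the artifacts folder name, and the script filename build_neural_network.py are assumptions made for illustration.

```yaml
# Default image for every job that does not override it
image: python:3.8

# Stage names are assumptions; adjust them to your own pipeline
stages:
  - build
  - report
  - register
  - deploy

build-neural-network:
  stage: build
  script:
    # Folder that will hold the files produced by this job (name is an assumption)
    - mkdir -p artifacts
    # Install the training dependencies
    - pip install -r requirements.txt
    # Train the model; the script filename is hypothetical
    - python build_neural_network.py
  artifacts:
    # Keep the generated files so later jobs can download them
    paths:
      - artifacts/
```

Declaring the folder under artifacts: paths is what makes the trained model and any plots available to the jobs that run afterwards.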

Once the build-neural-network job has finished successfully, we move to the next job: write-report-mr

Here, we use another image created by DVC that allows us to publish a report right in the merge request opened by the contributor who changed code in the neural network script. In this way, we bring software development workflows to the development of ML applications. With the report provided by this job, code and model review can be carried out within the merge request view, enabling teams to collaborate not only around the code but also around the model's performance.

From the merge request page, we get access to loss curves and other relevant performance metrics from the model we are training, along with a link to the Comet Experiment UI, where richer details are provided to evaluate the model performance. These details include interactive charts for model metrics, the model hyperparameters, and Confusion Matrices of the test set performance, to name a few.
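A reporting job of this kind could be sketched as follows. The sketch assumes the DVC image in question is a CML (Continuous Machine Learning) image and that a hypothetical script, write_report.py, prints the Comet metrics as Markdown. The image tag, the stage name, and the script name are assumptions, and the exact CML command and flags for targeting a merge request vary between CML versions.

```yaml
write-report-mr:
  stage: report                          # assumed stage name, must be declared under stages:
  image: iterativeai/cml:0-dvc2-base1    # assumption: a CML image published by the DVC team
  needs:
    # Wait for the training job and download its artifacts
    - job: build-neural-network
      artifacts: true
  script:
    - pip install comet_ml
    # Hypothetical script that fetches metrics from Comet and prints Markdown
    - python write_report.py > report.md
    # Post the report as a comment; CML expects a REPO_TOKEN CI/CD variable
    - cml comment create report.md
```

Overriding image at the job level leaves the other jobs on the default python:3.8 image.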


When the team is done with the code and model review, the merge request gets approved, and the script that generated the model is merged into the main codebase, along with its respective commit and the CI pipeline associated with it. This takes us to the next job:

register-model

This job uses an integration between GitLab and Comet to upload the reviewed and accepted version of the model to the Comet Model Registry. If you recall, the Model Registry is where models intended for production can be logged and versioned. In order to run the commands that will register the model, we need to set up these variables:

COMET_WORKSPACE
COMET_PROJECT_NAME

In order to do that, follow the steps described here.

It is worth noting that the register-model job only runs once the merge request has been reviewed, approved, and merged. This behavior comes from setting only: main at the end of the job.
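Putting the variables and the only: main setting together, the job might look something like this. It is a sketch under stated assumptions: the stage name and the script register_model.py are hypothetical, and COMET_API_KEY is an additional variable that the Comet SDK normally reads for authentication, although it is not listed above.

```yaml
register-model:
  stage: register                  # assumed stage name, must be declared under stages:
  script:
    - pip install comet_ml
    # Hypothetical script that pushes the accepted model to the Comet Model Registry.
    # It reads COMET_WORKSPACE and COMET_PROJECT_NAME (and, by assumption, COMET_API_KEY)
    # from CI/CD variables defined in the project settings.
    - python register_model.py
  only:
    # Run only for pipelines on the main branch, i.e. after the merge
    - main
```

Defining the Comet variables in the project's CI/CD settings rather than in the YAML file keeps credentials out of the repository.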

Finally, we want a team member to have final control over the deployment, so we define a manual job: deploy-ml-ui


When triggered, this job will import the model from Comet’s Model Registry and automatically create the necessary containers to build the user interface and deploy to a Kubernetes cluster.


This job triggers a downstream pipeline, which means that the UI for this MNIST application resides in a different project. This keeps the codebase for the UI and model training separated but integrated and connected at the moment of deploying the model to a production environment.
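A manual job that triggers a downstream pipeline in another project can be expressed with GitLab's trigger keyword. The fragment below is an illustrative sketch: the project path my-group/ml-ui, the branch, and the stage name are assumptions.

```yaml
deploy-ml-ui:
  stage: deploy            # assumed stage name, must be declared under stages:
  # Wait for a team member to start the job from the GitLab UI
  when: manual
  trigger:
    # Path of the downstream project; my-group is a hypothetical namespace
    project: my-group/ml-ui
    # Branch of the downstream project whose pipeline should run (assumption)
    branch: main
```

A trigger job has no script section of its own. The work of pulling the model from the registry and deploying it lives in the ml-ui project's pipeline, which is what keeps the two codebases decoupled.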


Key takeaways

In this post, we addressed some of the challenges faced by ML and software teams when it comes to collaborating on delivering ML-powered applications. Some of these challenges include:

The discrepancy in the frequency with which each of these teams needs to iterate on its codebase and CI/CD pipelines.
The fact that only a single set of experiment assets from an ML experimentation pipeline is relevant to the application.
The challenge of syncing a model or other experiment assets across independent codebases.

Using The GitLab DevOps Platform and Comet, we can start bridging the gap between ML and software engineering teams over the course of a project.

By bringing model performance metrics into software development workflows like the merge request we saw above, we can keep track of the code changes, discussions, experiments, and models created in the process. All the operations executed by the team are recorded, auditable, end-to-end traceable, and (most importantly) reproducible.


About Comet: Comet is an MLOps Platform that is designed to help data scientists and teams build better models faster! Comet provides tooling to Track, Explain, Manage, and Monitor your models in a single place!

Learn more about Comet here and get started for free!
