October 31 - November 1 - Co-Located Events
October 28-30 - Conference
Lyon Convention Centre - Lyon, France
More information for Open Source Summit + Embedded Linux Conference Europe 2019
Monday, October 28 • 11:30 - 12:05
Machine Learning Models and Datasets Versioning Practices and Tools - Dmitry Petrov & Ruslan Kuprieiev, Iterative AI

The rise of AI and ML is changing the development workflow and calls for new development tools: data versioning, ML pipeline versioning, experiment metrics tracking, and others that have not yet been formalized or even named.

The machine learning workflow is data-centric, in contrast to the source code-centric workflow of software engineering. The traditional software engineering toolset does not fully cover an ML team's needs. We will discuss current practices for organizing ML projects with traditional open source tools such as Git and Git-LFS, along with their limitations. These limitations motivate the development of new ML-specific data management systems.

Data Version Control, or DVC (DVC.ORG), is an open source command-line tool. We will show how to version ML models and datasets of dozens of gigabytes, how to use your preferred storage (S3, GCS, or a bare-metal SSH server) as a data file backend, and how to apply engineering best practices in your ML projects.
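
As a rough illustration of the idea behind data versioning described above, the following Python sketch stores a data file in a content-addressed cache and leaves behind a small pointer file that could be committed to Git in place of the data. It is a conceptual sketch only, not DVC's actual implementation: the function names (`file_md5`, `track`, `restore`), the `.ptr.json` pointer format, and the cache layout are all hypothetical choices made for this example.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets never need to fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_path(cache_dir: Path, checksum: str) -> Path:
    """Hypothetical cache layout: first two hex characters form a subdirectory."""
    return cache_dir / checksum[:2] / checksum[2:]


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into the cache and write a small pointer file next to it.

    The pointer file (hypothetical format) is what would go into Git.
    """
    checksum = file_md5(data_file)
    cached = cache_path(cache_dir, checksum)
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():  # identical content is stored only once
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".ptr.json")
    pointer.write_text(json.dumps({"md5": checksum, "path": data_file.name}))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file named in the pointer from the cache."""
    meta = json.loads(pointer.read_text())
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_path(cache_dir, meta["md5"]), target)
    return target


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    dataset = workdir / "train.csv"
    dataset.write_bytes(b"feature,label\n1,0\n")
    ptr = track(dataset, workdir / "cache")
    dataset.unlink()  # simulate a fresh checkout without the data
    print(restore(ptr, workdir / "cache").read_text())
```

In a real setup the cache directory would be synchronized with remote storage such as S3, GCS, or an SSH server, while only the small pointer file would be versioned in Git.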

Dmitry Petrov

Co-Founder & CEO, DVC
Dmitry is a former Data Scientist at Microsoft with a Ph.D. in Computer Science and an active open source contributor. He wrote and open sourced the first version of DVC.org, a machine learning workflow management tool. He also implemented a wavelet-based image hashing algorithm (wHash...
Ruslan Kuprieiev

Software Engineer, Iterative AI
Ruslan is a Software Engineer at Iterative AI. Previously he worked on live container migration at Parallels, on Linux kernel live-patching at CloudLinux, and at a few startups. Ruslan started his career working on an open source project called CRIU, and he continues to contribute...

Monday October 28, 2019 11:30 - 12:05 CET
St. Clair 3
  LF AI Summit (AI/ML/DL)
  • Session Slides Included: Yes