[Concept] Let's Learn About MLOps, Part 1 (feat. public cloud platforms: GCP, Azure, AWS)

What is MLOps?

MLOps is a methodology for ML engineering that unifies ML system development (the ML element) with ML system operations (the Ops element) [1].

MLOps supports ML development and deployment in the same way that DevOps and DataOps support application engineering and data engineering (analytics).

What is the difference between MLOps, DataOps, and DevOps?

Like DevOps, MLOps cares about resilience, queries per second, load balancing, and so on.

Unlike DevOps, it also has to worry about changes in the data, changes in the model, users trying to game the system, and so on.
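One of the ML-specific concerns above, changes in the data, can be made concrete with a small check. The following is a minimal sketch, assuming a single numeric feature; the function name `mean_shift_detected` and the default threshold of 3 standard errors are illustrative assumptions and do not come from [1].

```python
import math
import statistics


def mean_shift_detected(training_values, serving_values, threshold=3.0):
    """Return True if the serving mean deviates from the training mean
    by more than `threshold` standard errors (a crude drift signal).

    Hypothetical helper: the name and the threshold are assumptions.
    """
    train_mean = statistics.fmean(training_values)
    train_std = statistics.stdev(training_values)
    serve_mean = statistics.fmean(serving_values)
    # Standard error of the serving mean, assuming the training spread.
    std_err = train_std / math.sqrt(len(serving_values))
    if std_err == 0:
        return serve_mean != train_mean
    return abs(serve_mean - train_mean) / std_err > threshold


training = [1.0, 2.0, 3.0, 4.0, 5.0] * 20
print(mean_shift_detected(training, training))
print(mean_shift_detected(training, [v + 5.0 for v in training]))
```

A real system would likely compare full distributions per feature rather than only the mean, but the idea is the same: the data seen in production is compared against the data the model was trained on.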

Why was MLOps developed, and why is it getting attention?

It advocates formalizing and automating critical steps of ML system construction.

MLOps provides a set of standardized processes and technology capabilities for building, deploying, and operationalizing ML systems rapidly and reliably.

The expected benefits include:

- Shorter development cycles and, as a result, shorter time to market

- Better collaboration between teams

- Increased reliability, performance, scalability, and security of ML systems

- Streamlined operational and governance processes

- Increased return on investment of ML projects

Overview of MLOps lifecycle and core capabilities

0. Building an ML-enabled system

MLOps combines data engineering, ML engineering, and application engineering tasks, as shown in Fig. 1.

1) Data engineering

Data engineering involves ingesting, integrating, curating, and refining data to facilitate a broad spectrum of operational tasks, data analytics tasks, and ML tasks. Data engineering can be crucial to the success of the analytics and ML initiatives. If an organization does not have robust data engineering processes and technologies, it might not be set up for success with downstream business intelligence, advanced analytics, or ML projects.

2) ML engineering

ML models are built and deployed in production using curated data that is usually created by the data engineering team. The models do not operate in silos; they are components of, and support, a large range of application systems, such as business intelligence systems, line of business applications, process control systems, and embedded systems.

3) Application engineering

Integrating an ML model into an application is a critical task that involves making sure first that the deployed model is used effectively by the applications, and then monitoring model performance. In addition to this, you should also collect and monitor relevant business KPIs (for example, click-through rate, revenue uplift, and user experience). This information helps you understand the impact of the ML model on the business and adapt accordingly.
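As an illustration of monitoring a business KPI next to model performance, click-through rate can be computed per model version. This is a minimal sketch, assuming each impression is logged as a `(model_version, clicked)` pair; that record layout and the function name `click_through_rate_by_model` are hypothetical.

```python
from collections import defaultdict


def click_through_rate_by_model(events):
    """Compute click-through rate per model version.

    `events` is an iterable of (model_version, clicked) pairs,
    one pair per impression (an assumed logging format).
    """
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for model_version, clicked in events:
        impressions[model_version] += 1
        if clicked:
            clicks[model_version] += 1
    # Every key in `impressions` has at least one impression,
    # so the division is always safe.
    return {v: clicks[v] / impressions[v] for v in impressions}


events = [("v1", True), ("v1", False), ("v2", True), ("v2", True)]
print(click_through_rate_by_model(events))
```

Grouping the KPI by model version is one way to relate a business outcome back to a specific deployed model.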

1. MLOps lifecycle and workflow

- The MLOps lifecycle encompasses seven integrated and iterative processes, including a central data and model management function.

1) ML development : concerns experimenting and developing a robust and reproducible model training procedure (training pipeline code), which consists of multiple tasks from data preparation and transformation to model training and evaluation.

2) Training operationalization : concerns automating the process of packaging, testing, and deploying repeatable and reliable training pipelines

3) Continuous training : concerns repeatedly executing the training pipeline in response to new data or to code changes, or on a schedule, potentially with new training settings.

4) Model deployment : concerns packaging, testing, and deploying a model to a serving environment for online experimentation and production serving.

5) Prediction serving : is about serving the model that is deployed in production for inference.

6) Continuous monitoring : is about monitoring the effectiveness and efficiency of a deployed model.

7) Data and model management : is a central, cross-cutting function for governing ML artifacts to support auditability, traceability, and compliance. Data and model management can also promote shareability, reusability, and discoverability of ML assets.
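The three triggers named under continuous training (new data, code changes, a schedule) can be sketched as a simple decision function. This is only an illustration under stated assumptions: `PipelineState`, `retraining_reasons`, and the default thresholds (7 days, 1000 new rows) are hypothetical and not taken from [1].

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PipelineState:
    """What was recorded at the last training run (assumed fields)."""
    last_run: datetime
    last_code_version: str
    rows_at_last_run: int


def retraining_reasons(state, now, code_version, current_rows,
                       max_age=timedelta(days=7), min_new_rows=1000):
    """Return the list of reasons to re-run the training pipeline.

    An empty list means no trigger fired.
    """
    reasons = []
    if current_rows - state.rows_at_last_run >= min_new_rows:
        reasons.append("new_data")
    if code_version != state.last_code_version:
        reasons.append("code_change")
    if now - state.last_run >= max_age:
        reasons.append("schedule")
    return reasons


state = PipelineState(datetime(2024, 1, 1), "abc", 10_000)
print(retraining_reasons(state, datetime(2024, 1, 9), "def", 12_000))
```

Returning the reasons rather than a single boolean keeps a record of why each run happened, which fits the auditability goal of data and model management.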

2. MLOps: An end-to-end Workflow

2.1. ML development

- The core activity during the ML development phase is experimentation.

Roles and responsibilities (R&R)

- Data scientists and ML researchers : prototype model architectures and training routines

- MLOps engineers : create labeled datasets, and use features and other reusable ML artifacts that are governed through the data and model management process

Primary output : a formalized training procedure, which includes data preprocessing, model architecture, and model training settings.
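One way to picture a formalized training procedure is as a small, serializable record of those three parts. This is a minimal sketch, assuming a plain dataclass is enough; the class name `TrainingProcedure`, its field names, and the example values are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class TrainingProcedure:
    """Illustrative container for the output of ML development."""
    preprocessing_steps: tuple
    model_architecture: str
    training_settings: dict = field(default_factory=dict)

    def to_json(self):
        # Sorted keys give a stable string that can be versioned or diffed.
        return json.dumps(asdict(self), sort_keys=True)


procedure = TrainingProcedure(
    preprocessing_steps=("impute_missing", "standard_scale"),
    model_architecture="gradient_boosted_trees",
    training_settings={"learning_rate": 0.1, "n_estimators": 200},
)
print(procedure.to_json())
```

Storing the procedure as data rather than only as notebook code makes it easier to hand over to the training operationalization step.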

2.2. Training operationalization

2.3. Continuous training

2.4. Model deployment

2.5. Model serving

2.6. Continuous monitoring process

3. MLOps capabilities

3.1. Experimentation

3.2. Data processing

3.3. Model training

3.4. Model evaluation

3.5. Model serving

3.6. Online experimentation

3.7. Model monitoring

3.8. ML pipelines

3.9. Model registry

3.10. Dataset and feature repository

3.11. ML metadata and artifact tracking

2. Individual capabilities that are required for a robust MLOps implementation

Executive Summary

Google Cloud's AI Adoption Framework :

Practitioners Guide to MLOps :

- A deeper dive into the themes of scale and automate, illustrating the requirements for building and operationalizing ML systems

- Intended for technology leaders and enterprise architects who want to understand MLOps

- It is organized in two parts:

1) An overview of the MLOps lifecycle, for all readers

-> It introduces MLOps processes and capabilities and explains why they are important for successful adoption of ML-based systems

2) A deep dive on the MLOps processes and capabilities, for readers who want to understand the concrete details of tasks like running a continuous training pipeline, deploying a model, and monitoring the predictive performance of an ML model

Overview of MLOps lifecycle and core capabilities

1) Problem : Successful deployments and effective operations are a bottleneck for getting value from AI

Deep dive of MLOps processes

MLOps is a methodology

[1] Practitioners Guide to MLOps. (2021). Google Cloud. https://cloud.google.com/resources/mlops-whitepaper
