Forefront of technology: MLOps practice in an on-premises environment during the COVID-19 pandemic

2023-02-13

Background

In this article, we introduce some MLOps practices from a machine learning project for the back-office department of a telecommunications company. In line with the client's policies, the project was subject to the following prerequisites and constraints.

  • Dedicated servers are needed because:
    • On-demand execution requires that the servers always be running.
    • AI inference and model retraining require a great deal of computing resources.
  • AI applications need to be hosted on an on-premises server in accordance with the client's data policy.
  • The end business users work from home as a general rule due to COVID-19, so they need to be able to operate the AI tool and its data remotely.
  • Data cannot be ingested via API.

Solution

We used robotic process automation (RPA) to ingest the data, online storage and local drives to store the data and programs, and periodic batch processing to execute AI model inference and retraining.

Data storage and ingestion (MLOps: Data preparation)

We decided to use an online storage service combined with allocated local drives for data storage. This made it possible to upload data and configuration files, download data, and validate and correct data remotely.

We used RPA to ingest data from different systems and store it on the local drive, from which it is synchronised to the online storage service.
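
As a minimal, hypothetical sketch of this handoff (the staging and synchronised folder paths and the CSV-only convention are assumptions, not the project's actual layout), a small script can move files that the RPA bot has dropped into a staging folder into the folder mirrored by the storage sync client:

    import shutil
    from pathlib import Path

    # Hypothetical paths -- the real folder layout depends on the RPA tool
    # and the storage sync client used in the project.
    RPA_DROP_DIR = Path(r"D:\rpa_output")        # where the RPA bot saves extracted data
    SYNCED_INPUT_DIR = Path(r"D:\synced\input")  # folder mirrored by the online storage client

    def hand_off_new_files() -> None:
        """Move newly ingested files into the synchronised input folder."""
        SYNCED_INPUT_DIR.mkdir(parents=True, exist_ok=True)
        for src in RPA_DROP_DIR.glob("*.csv"):
            # A move within the same drive is a rename, so the sync client
            # never sees a half-written file.
            shutil.move(str(src), SYNCED_INPUT_DIR / src.name)

    if __name__ == "__main__":
        hand_off_new_files()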

AI application deployment (MLOps: Deploy)

The final AI application is also deployed to the online storage service and then synchronised to a local drive in the production environment. This makes it possible to update the AI application programs from anywhere as needed. 

AI model inference (MLOps: Inference)

When the input data is ready, the operator can trigger the AI application manually by updating the trigger file (a plain text file containing the name of the target folder) on the local drive; the change is then synchronised to the online storage service.
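
On the operator's side, updating the trigger file amounts to writing a single folder name. A minimal sketch, assuming a hypothetical trigger file location on the synchronised drive:

    from pathlib import Path

    # Hypothetical location of the trigger file on the synchronised drive.
    TRIGGER_FILE = Path(r"D:\synced\trigger.txt")

    def request_inference(target_folder: str) -> None:
        """Write the target folder name into the trigger file; the storage
        sync client uploads the change, and the on-premises server picks
        it up on its next batch cycle."""
        TRIGGER_FILE.write_text(target_folder, encoding="utf-8")

    request_inference("2023-02_batch01")  # illustrative folder name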

On the on-premises server, the data to be inferred, the trigger file and the other configuration files are synchronised down to the local drive.

Periodic batch processing on that server then executes the AI application at fixed intervals (e.g. every five minutes).

The AI application logic checks the content of the trigger file. If a valid execution condition is set, the AI application executes the following steps:

  • Loading the latest AI models
  • Validating and pre-processing the input data specified in the trigger file
  • Converting the data to vectorised values and inferring the result

To prevent duplicate execution, the AI application resets the trigger file after reading it, as shown in the sketch below.
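
The following is a minimal sketch of the batch entry point under these assumptions: hypothetical paths and file names, and a scikit-learn-style model whose pipeline is assumed to handle validation, pre-processing and vectorisation internally (the article does not show the project's actual code).

    from pathlib import Path

    import joblib
    import pandas as pd

    # Hypothetical folder layout on the synchronised drive and in the
    # AI application folder.
    TRIGGER_FILE = Path(r"D:\synced\trigger.txt")
    DATA_ROOT = Path(r"D:\synced\input")
    MODEL_FILE = Path(r"D:\app\models\latest.joblib")

    def run_batch() -> None:
        """Entry point executed periodically, e.g. every five minutes via
        cron ('*/5 * * * *') or Windows Task Scheduler."""
        target = TRIGGER_FILE.read_text(encoding="utf-8").strip()
        if not target:  # no valid execution condition set -> nothing to do
            return
        # Reset the trigger immediately so the next cycle cannot re-run it.
        TRIGGER_FILE.write_text("", encoding="utf-8")

        model = joblib.load(MODEL_FILE)  # load the latest deployed model
        data = pd.read_csv(DATA_ROOT / target / "input.csv")
        # Validation, pre-processing and vectorisation are assumed to be
        # wrapped inside the model pipeline.
        predictions = model.predict(data)
        confidence = model.predict_proba(data).max(axis=1)
        data["prediction"], data["confidence"] = predictions, confidence
        data.to_csv(DATA_ROOT / target / "results.csv", index=False)

    if __name__ == "__main__":
        run_batch()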

Monitoring and feedback (MLOps: Monitoring)

The AI application infers the new data and outputs the results to spreadsheet files, which contain both the inference result and a confidence level for each data item.

The operator can check the result data after it is synchronised to the local drive, prioritising checks of data with low confidence. The output spreadsheets also include a field in which the operator can fill in the correct answer.
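
As a minimal sketch of this output format (the column names 'confidence' and 'correct_answer' are illustrative, not the project's actual schema), sorting low-confidence rows to the top puts the riskiest inferences first when the operator opens the file:

    import pandas as pd

    def write_review_sheet(results: pd.DataFrame, path: str) -> None:
        """Write the operator-facing spreadsheet."""
        sheet = results.sort_values("confidence")  # low-confidence rows first
        sheet["correct_answer"] = ""               # the operator fills this in
        sheet.to_excel(path, index=False)          # requires openpyxl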

Retraining and deploying AI models automatically (MLOps: Retraining)

The automation of AI model retraining is an indispensable part of MLOps.

The retraining program is executed periodically (e.g. monthly). It reads the correction data provided by the operator and merges those corrections into the current training data, then optimises the hyperparameters against the updated data. Finally, new AI models are generated and deployed to the AI application folder.
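
A retraining job along these lines might look like the following scikit-learn sketch; the model family, hyperparameter grid, file paths and column names are all illustrative assumptions, as the article does not describe the actual models.

    from pathlib import Path

    import joblib
    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import make_pipeline

    # Hypothetical paths and columns.
    TRAIN_FILE = Path(r"D:\app\training\train.csv")
    CORRECTIONS_FILE = Path(r"D:\synced\corrections.csv")
    MODEL_FILE = Path(r"D:\app\models\latest.joblib")

    def retrain() -> None:
        """Scheduled periodically (e.g. monthly): merge operator corrections
        into the training data, re-optimise hyperparameters, and deploy
        the best model to the AI application folder."""
        train = pd.read_csv(TRAIN_FILE)
        corrections = pd.read_csv(CORRECTIONS_FILE)  # operator-corrected rows
        merged = pd.concat([train, corrections], ignore_index=True)
        # Keep the operator's answer when the same item appears twice.
        merged = merged.drop_duplicates(subset="text", keep="last")
        merged.to_csv(TRAIN_FILE, index=False)  # corrections become the new baseline

        pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
        search = GridSearchCV(pipeline, {"logisticregression__C": [0.1, 1.0, 10.0]}, cv=5)
        search.fit(merged["text"], merged["label"])

        joblib.dump(search.best_estimator_, MODEL_FILE)  # deploy the new model

    if __name__ == "__main__":
        retrain()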

Conclusion

By using a periodic scheduler on an on-premises server and an online storage service in combination with local drives, we implemented an architecture that

  • ingests new input data periodically;
  • executes the AI model inference remotely and on demand;
  • enables operators to monitor AI inference results and correct incorrect inference results remotely; and
  • retrains new AI models periodically based on the operators' manual corrections, and deploys them to the AI application.

This enables the AI models to evolve continuously by obtaining new knowledge from humans. After a year of live operation, the AI model accuracy was maintained at over 99.3%.

Of course, this architecture is not a complete MLOps practice. For example, we have not automated deployment using continuous integration and continuous deployment (CI/CD). However, given the background and limitations described above, it represents a solid effort to put MLOps into practice in an on-premises environment during the COVID-19 pandemic.

Author

G. Zhao
Before joining PwC Consulting LLC, G. Zhao worked for a system development company. He is currently engaged in data analysis and AI tool development for the telecommunications industry, telework environment improvement surveys for government agencies, and management of the digital product (intelligent business analytics tool) development team.