In data science projects, the derivation of business value follows something akin to the Pareto Principle: the vast majority of business value is generated not from planning, scoping, or even producing a viable machine learning model, but from the final few steps, the operationalization of that project.
Operationalization simply means deploying a machine learning model for use across the organization.
More often than not, there is a disconnect between the worlds of development and production. Some teams may choose to re-code everything in an entirely different language, while others may change core elements such as testing procedures, backup plans, or even the programming language itself.
Operationalizing analytics products can become complicated as different opinions and methods vie for supremacy, resulting in projects that needlessly drag on for months past their promised deadlines.
The goal of this guide is to explore common ground and introduce strategies and procedures designed to bridge the gap between development and operationalization. The topics range from best operating procedures (managing environment consistency, data scalability, and consistent packaging of code and data) to risk management for unforeseen situations (rollback and failover strategies). We also discuss modeling (continuous retraining of models, A/B testing, and multivariate optimization) and implementing communication strategies (auditing and functional monitoring).
Successfully building an analytics product and then operationalizing it is not an easy task, and it becomes twice as hard when teams are isolated and playing by their own rules.
This guide will help your organization find the common ground needed to empower your data science and IT teams to work together for the benefit of your data and analytics projects as a whole.