Unified Pipeline is a case study on the design and implementation of a unified machine learning pipeline in a banking environment. The goal was to accelerate model deployment, increase model stability over time, and create a single experimental framework across teams and use cases.
Project Overview
Design and implementation of a unified machine learning (ML) pipeline for a banking institution, aimed at accelerating model deployment, increasing out-of-time stability (robustness outside the training period), and standardizing the experimental workflow across teams. The pipeline enables consistent development, validation, and deployment of models for more than 100 products and their variants.
The Challenge
The client faced a fragmented environment for ML model development:
- Inconsistent modeling approaches and workflows across teams
- Long model deployment times (typically 7–10+ days)
- Overestimation of model quality due to improper validation (data leakage, random splits)
- Low model stability over time (poor performance on out-of-time data)
- Limited reproducibility of experiments and a weak audit trail
The Solution
Architecture
- Databricks as the main execution platform (distributed processing, orchestration)
- MLflow for experiment tracking and model registry
- Optuna for hyperparameter optimization with a focus on efficient search strategies
- Spark (PySpark DataFrames) for scalable feature processing
- Migration from the original Hadoop-based solution to a more modern Databricks-based architecture
Key Design Principles
- Time-aware validation (a key differentiator from common practice)
  - Walk-forward validation instead of random K-Fold cross-validation.
  - Simulation of real-world model deployment over time.
  - Elimination of data leakage.
  - Significant reduction in the gap between training and production performance.
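As an illustration, walk-forward splitting over monthly snapshots can be sketched in a few lines of plain Python; the window sizes and month labels below are illustrative, not the production configuration:

```python
def walk_forward_splits(months, train_window=6, test_window=1):
    """Yield (train, test) period pairs where the test period always
    lies strictly after the training window -- unlike random K-Fold,
    no future observation can leak into training."""
    splits = []
    last_start = len(months) - train_window - test_window
    for start in range(last_start + 1):
        train = months[start : start + train_window]
        test = months[start + train_window : start + train_window + test_window]
        splits.append((train, test))
    return splits

# Example: twelve monthly snapshots -> six rolling train/test pairs
months = [f"2023-{m:02d}" for m in range(1, 13)]
for train, test in walk_forward_splits(months):
    print(train[0], "..", train[-1], "->", test[0])
```

Each fold simulates one real deployment: train on a fixed historical window, score the next unseen period, then roll forward.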
- Unified training framework
  - A single pipeline for both classification and regression.
  - Shared data preprocessing and feature engineering steps.
  - Parameterization of the pipeline for different use cases.
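The parameterization idea can be sketched as a small config object that maps a use case onto task-specific settings. The field names and hyperparameter values below are illustrative, not the client's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class PipelineConfig:
    """One config drives the whole pipeline; only the task type and
    target differ between use cases (names here are hypothetical)."""
    task: str                     # "classification" or "regression"
    target: str
    features: list = field(default_factory=list)
    train_window_months: int = 6
    eval_metric: str = "lift_top_decile"

    def model_params(self):
        # A real pipeline would hand these to e.g. CatBoostClassifier /
        # CatBoostRegressor; values are placeholders.
        common = {"iterations": 500, "learning_rate": 0.05}
        if self.task == "classification":
            return {**common, "loss_function": "Logloss"}
        return {**common, "loss_function": "RMSE"}

cfg = PipelineConfig(task="regression", target="exposure",
                     features=["age", "balance"])
print(cfg.model_params()["loss_function"])  # RMSE
```

Keeping the branching inside the config, rather than in per-team scripts, is what makes one pipeline serve 100+ products.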
- Advanced hyperparameter tuning (Optuna)
  - Bayesian optimization combined with QMCSampler (Sobol sequences) for better coverage of the search space.
  - Optimization against time-stability metrics, not just in-sample performance.
  - Explicit management of the trade-off between performance and training time.
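Optuna's QMCSampler draws quasi-random (Sobol) points, which fill the search space far more evenly than independent uniform draws. As a stdlib-only illustration of that property, here is a minimal Halton sequence, a related low-discrepancy construction, compared against random sampling by the largest uncovered gap:

```python
import random

def halton(index, base):
    """Radical-inverse (van der Corput / Halton) value in [0, 1)."""
    result, f = 0.0, 1.0 / base
    while index > 0:
        result += f * (index % base)
        index //= base
        f /= base
    return result

def max_gap(xs):
    """Largest gap between consecutive sorted values within [0, 1]."""
    xs = sorted(xs)
    return max(b - a for a, b in zip([0.0] + xs, xs + [1.0]))

random.seed(0)
qmc = [halton(i, 2) for i in range(1, 65)]       # low-discrepancy draws
rnd = [random.random() for _ in range(64)]        # plain uniform draws
print(max_gap(qmc) < max_gap(rnd))                # quasi-random leaves smaller gaps
```

The same intuition carries over to Sobol startup points in Optuna: fewer blind spots early on, so the Bayesian phase refines from a better-covered space.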
- Model stability over raw performance
  - Optimization for metrics such as top-decile lift, F1 score, and the stability of R² / accuracy over time.
  - Emphasis on robustness across time periods.
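Top-decile lift, the headline ranking metric here, is simple to compute; a minimal sketch on toy data:

```python
def top_decile_lift(scores, labels):
    """Lift = positive rate among the top 10% of scores divided by
    the overall positive rate; 1.0 means no better than random."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    k = max(1, len(ranked) // 10)
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate

# Toy example: the highest-scored customer is the only top-decile pick
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1,   1,   0,   0,   0,   0,   0,   0,   0,   0]
print(top_decile_lift(scores, labels))  # 5.0
```

Tracking this value per out-of-time period, rather than once on a random holdout, is what ties the metric back to stability.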
Key Features
- Automated feature preparation pipeline
  - Scalable data transformation using Spark.
  - Data quality checks, for example min/max validation instead of expensive operations such as `countDistinct`.
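A minimal sketch of the min/max idea, in plain Python for brevity. The Spark equivalent would be a single aggregation over `min()`/`max()` columns, which is far cheaper than `countDistinct` on wide tables; the column names and bounds below are illustrative:

```python
def check_ranges(rows, expected_bounds):
    """Cheap data-quality gate: compare observed min/max per column
    against expected bounds. Returns a list of (column, detail)
    failures; an empty list means the batch passes."""
    failures = []
    for col, (lo, hi) in expected_bounds.items():
        values = [r[col] for r in rows if r.get(col) is not None]
        if not values:
            failures.append((col, "no data"))
            continue
        observed = (min(values), max(values))
        if observed[0] < lo or observed[1] > hi:
            failures.append((col, observed))
    return failures

rows = [{"age": 34, "balance": 1200.0},
        {"age": 210, "balance": -50.0}]        # age 210 is clearly bad
print(check_ranges(rows, {"age": (0, 120), "balance": (-1e6, 1e6)}))
```

The check catches gross ingestion errors (shifted columns, unit mix-ups) at a fraction of the cost of cardinality profiling.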
- Centralized experiment tracking (MLflow)
  - Complete audit trail of experiments, model versioning, and associated parameters.
- Model registry and standardized deployment
  - A unified interface for model deployment and support for rapid rollout of new versions.
- Framework for intertemporal evaluation
  - Systematic evaluation of models over time and identification of degradation patterns.
- Flexible pipeline orchestration
  - Support for multiple model types within a single pipeline through a modular design.
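As a simple illustration of intertemporal evaluation, one can track a single metric per period and flag periods that fall more than a tolerance below the first out-of-time period; the threshold and data below are illustrative:

```python
def flag_degradation(metric_by_period, tolerance=0.05):
    """Return the periods whose metric drops more than `tolerance`
    (absolute) below the first out-of-time period -- a basic pattern
    for spotting models that decay after deployment."""
    periods = sorted(metric_by_period)
    baseline = metric_by_period[periods[0]]
    return [p for p in periods[1:]
            if baseline - metric_by_period[p] > tolerance]

# Hypothetical monthly AUC-like scores after deployment
history = {"2023-01": 0.81, "2023-02": 0.80,
           "2023-03": 0.73, "2023-04": 0.70}
print(flag_degradation(history))  # ['2023-03', '2023-04']
```

Production monitoring would add more nuance (rolling baselines, confidence bands), but the essence is the same: compare every period against a time-anchored reference, not against the training set.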
The Results
| Metric | Before | After | Improvement |
|---|---|---|---|
| Model deployment time | 7–10+ days | 1–2 days | 6–8+ days faster |
| Model stability (out-of-time) | Low | Significantly higher | Major improvement |
| Model lift (top decile) | Baseline | +5 to +30% over baseline | +5–30% |
| Pipeline execution time | ~4 hours | ~2–3 hours | 30–40% faster |
Technology Stack
- Python, PySpark (Spark DataFrames)
- Databricks (distributed computing, orchestration)
- MLflow (experiment tracking and model registry)
- Optuna (hyperparameter optimization)
- CatBoost (classifier and regressor)
Unified Pipeline Series
This case study is followed by a five-part series that explores the architecture, validation logic, and practical decisions behind the solution in more detail:
- Unified Pipeline – Part 1: Why Was the Unified Pipeline Created? – why improving the model alone was not enough and the real problem was system fragmentation.
- Unified Pipeline – Part 2: From Experiments to a System – how architecture and configuration replaced improvisation.
- Unified Pipeline – Part 3: Time as the Enemy of the Model – why validation without time discipline fails in production.
- Unified Pipeline – Part 4: MLOps Without the Buzzwords – what truly delivered value and what only added complexity.
- Unified Pipeline – Part 5: What I Would Do Differently Today – lessons learned, dead ends, and transferable principles.