Unified Pipeline

Unified Pipeline is a case study on the design and implementation of a unified machine learning pipeline in a banking environment. The goal was to accelerate model deployment, increase model stability over time, and create a single experimental framework across teams and use cases.


Project Overview

Design and implementation of a unified machine learning (ML) pipeline for a banking institution, aimed at accelerating model deployment, increasing out-of-time stability (robustness outside the training period), and standardizing the experimental workflow across teams. The pipeline enables consistent development, validation, and deployment of models for more than 100 products and their variants.


The Challenge

The client faced a fragmented environment for ML model development:

  • Inconsistent modeling approaches and workflows across teams
  • Long model deployment times (typically 7–10+ days)
  • Overestimation of model quality due to improper validation (data leakage, random splits)
  • Low model stability over time (poor performance on out-of-time data)
  • Limited reproducibility of experiments and a weak audit trail

The Solution

Architecture

  • Databricks as the main execution platform (distributed processing, orchestration)
  • MLflow for experiment tracking and model registry
  • Optuna for hyperparameter optimization with a focus on efficient search strategies
  • Spark (PySpark DataFrames) for scalable feature processing
  • Migration from the original Hadoop-based solution to a more modern Databricks-based architecture

Key Design Principles

  • Time-aware validation (a key differentiator from common practice)
    • Use of walk-forward validation instead of random K-Fold cross-validation.
    • Simulation of real-world model deployment over time.
    • Elimination of data leakage.
    • Significant reduction in the gap between training and production performance.
  • Unified training framework
    • A single pipeline for both classification and regression.
    • Shared data preprocessing and feature engineering steps.
    • Parameterization of the pipeline for different use cases.
  • Advanced hyperparameter tuning (Optuna)
    • Combination of Bayesian optimization and QMCSampler (Sobol sequences) for better coverage of the search space.
    • Optimization with respect to time-stability metrics, not just in-sample performance.
    • Managing the trade-off between performance and training time.
  • Model stability over raw performance
    • Optimization for metrics such as Lift (within the top decile), F1 score, and stability of R² / accuracy over time.
    • Emphasis on robustness across time periods.
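
The walk-forward idea above can be reduced to a small splitter. As an assumption for this sketch, the data arrives as monthly snapshots keyed by a period label; each fold trains on everything up to a cutoff and validates on the next period, so no future rows can leak into training.

```python
# Minimal walk-forward splitter (illustrative; real folds would carry the
# underlying rows, not just the period labels).
def walk_forward_splits(periods, min_train=3):
    """Yield (train_periods, validation_period) pairs in chronological order."""
    periods = sorted(periods)
    for cutoff in range(min_train, len(periods)):
        # Train only on periods strictly before the validation period.
        yield periods[:cutoff], periods[cutoff]

months = [f"2024-{m:02d}" for m in range(1, 7)]  # 2024-01 .. 2024-06
for train, valid in walk_forward_splits(months):
    print(train[-1], "->", valid)  # last training month -> validation month
```

Unlike random K-Fold, every fold here respects time order, which is what makes the validation score a realistic preview of out-of-time performance.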

Key Features

  • Automated feature preparation pipeline
    • Scalable data transformation using Spark.
    • Data quality checks, for example min/max validation instead of expensive operations such as countDistinct.
  • Centralized experiment tracking (MLflow)
    • Complete audit trail of experiments, model versioning, and associated parameters.
  • Model registry and standardized deployment
    • A unified interface for model deployment and support for rapid rollout of new versions.
  • Framework for intertemporal evaluation
    • Systematic evaluation of models over time and identification of degradation patterns.
  • Flexible pipeline orchestration
    • Support for multiple model types within a single pipeline through a modular design.

The Results

| Metric | Before | After | Improvement |
| --- | --- | --- | --- |
| Model deployment time | 7–10+ days | 1–2 days | -6 to -8 days |
| Model stability (OOT) | Low | Significantly higher | Major improvement |
| Model lift (top decile) | Baseline | +5 to +30 % | +5 to +30 % |
| Pipeline execution time | ~4 hours | ~2–3 hours | -30 to -40 % |
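
For reference, top-decile lift is the positive rate among the highest-scoring 10 % of cases divided by the overall positive rate. A plain-Python sketch (the function name and toy data are ours, not from the pipeline):

```python
# Top-decile lift: how much more concentrated the positives are in the top 10 %
# of model scores compared to the population baseline.
def top_decile_lift(scores, labels):
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, len(ranked) // 10)                      # size of the top decile
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate

# Toy data: 100 cases, 10 positives overall, 5 of them in the top decile,
# so lift = 0.5 / 0.1 = 5.0.
scores = list(range(100))
labels = [1 if i in {0, 1, 2, 3, 4, 90, 91, 92, 93, 94} else 0 for i in range(100)]
print(top_decile_lift(scores, labels))
```

A lift of 1.0 means the model ranks no better than random; the reported +5 to +30 % improvement is relative to the baseline model's lift, not to random.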

Technology Stack

  • Python, PySpark (Spark DataFrames)
  • Databricks (distributed computing, orchestration)
  • MLflow (experiment tracking and model registry)
  • Optuna (hyperparameter optimization)
  • CatBoost (classifier and regressor)

Unified Pipeline Series

This case study is followed by a five-part series that explores the architecture, validation logic, and practical decisions behind the solution in more detail:

  1. Unified Pipeline – Part 1: Why Was the Unified Pipeline Created? – why improving the model alone was not enough and the real problem was system fragmentation.
  2. Unified Pipeline – Part 2: From Experiments to a System – how architecture and configuration replaced improvisation.
  3. Unified Pipeline – Part 3: Time as the Enemy of the Model – why validation without time discipline fails in production.
  4. Unified Pipeline – Part 4: MLOps Without the Buzzwords – what truly delivered value and what only added complexity.
  5. Unified Pipeline – Part 5: What I Would Do Differently Today – lessons learned, dead ends, and transferable principles.

© 2026 Michael Princ. All rights reserved.
