Unified ML Pipeline

Project Overview

Design and implementation of a unified ML pipeline for a banking institution, which accelerated model deployment and improved the stability of predictions.

The Challenge

The client had fragmented ML processes with different approaches across individual teams:

Long deployment times for new models (averaging 10+ days)
Inconsistent model quality
Difficulty in tracking versions and experiments

The Solution

Architecture

Databricks as the central platform for ML
MLflow for experiment tracking and the model registry
Optuna for automated hyperparameter tuning
Docker containers for a consistent environment
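
To illustrate how Optuna and MLflow can work together, here is a minimal sketch, not the project's actual code. It assumes both packages are installed and that MLflow's default local tracking store is acceptable. The scoring function, parameter names, and search ranges are hypothetical stand-ins for a real training and validation step.

```python
def toy_validation_score(learning_rate: float, max_depth: int) -> float:
    """Hypothetical stand-in for training a model and scoring it on validation data.

    The score peaks at learning_rate=0.1 and max_depth=6.
    """
    return 1.0 - (learning_rate - 0.1) ** 2 - 0.01 * (max_depth - 6) ** 2


def tune(n_trials: int = 20) -> dict:
    """Run an Optuna study and log every trial to MLflow as a nested run."""
    # Imported here so the scoring function stays usable without the tuning stack.
    import mlflow
    import optuna

    def objective(trial):
        # Assumed search space; a real project would choose its own ranges.
        params = {
            "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.5, log=True),
            "max_depth": trial.suggest_int("max_depth", 2, 12),
        }
        score = toy_validation_score(**params)
        # One nested MLflow run per trial keeps parameters and metrics together.
        with mlflow.start_run(nested=True):
            mlflow.log_params(params)
            mlflow.log_metric("val_score", score)
        return score

    # The parent run groups all trials and records the best configuration.
    with mlflow.start_run(run_name="tuning-sketch"):
        study = optuna.create_study(direction="maximize")
        study.optimize(objective, n_trials=n_trials)
        mlflow.log_params({f"best_{k}": v for k, v in study.best_params.items()})
        mlflow.log_metric("best_val_score", study.best_value)
    return study.best_params
```

Calling tune() writes the runs to a local mlruns directory unless a tracking URI is configured. On Databricks the same calls would log to the workspace's tracking server.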

Key Features

Automated feature engineering pipeline
Centralized model registry with versioning
A/B testing for gradual deployment
Monitoring and alerting for model drift
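
Two of the features above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the implementation used in the project: the function names, the request-ID routing key, and the choice of the population stability index (PSI) as drift measure are all hypothetical.

```python
import hashlib
import math
from typing import Sequence


def assign_variant(request_id: str, challenger_share: float) -> str:
    """Deterministically route a request to the 'champion' or 'challenger' model."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 4 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "challenger" if bucket < challenger_share else "champion"


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a reference sample and a live sample (both non-empty)."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range go into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Hashing the request ID keeps a given request on the same model variant across retries, and raising challenger_share step by step gives a gradual rollout. For PSI, a commonly quoted rule of thumb treats values above roughly 0.25 as a significant shift; the alerting threshold is a choice each team has to make for itself.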

The Results

Metric	Before	After	Improvement
Model deployment time	10 days	2 days	-8 days (-80%)
Model accuracy	baseline	+15%	+15%
Pipeline execution time	4 hours	2.4 hours	-40%
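
The two time rows in the table can be checked with a short calculation. The helper name relative_change is only for illustration.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from before to after as a fraction of before."""
    return (after - before) / before


# Model deployment time: 10 days -> 2 days
print(f"{relative_change(10, 2):+.0%}")     # prints -80%
# Pipeline execution time: 4 hours -> 2.4 hours
print(f"{relative_change(4.0, 2.4):+.0%}")  # prints -40%
```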

Technology Stack

Python, PySpark
Databricks, MLflow
Optuna, Docker
GitHub Actions (CI/CD)

michaelprinc.com

Data Scientist & ML Engineer from Prague
