Series: Unified Pipeline – Experiences from Building a Production ML System
Series Goal:
To show how theoretical data science differs from production reality and why infrastructure, process, and governance are often more important than the model itself.
Planned Parts
- Why the Unified Pipeline Was Created in the First Place – a problem that couldn’t be solved with a better model
- From Experiments to a System – architectural principles and decisions
- Time as the Enemy of the Model – time-aware validation, stability, and the reality of operations
- MLOps Without the Buzzwords – what actually increased speed and quality
- What I Would Do Differently Today – lessons learned, dead ends, and transferable principles
Part 1: Why the Unified Pipeline Was Created in the First Place
When a Better Model Isn’t Enough
At a certain stage of data science work, further model improvements stop delivering proportional value.
Not because the models are "good enough," but because the problem is no longer statistical.
It was at this exact point that the idea for the Unified Pipeline was born.
At first glance, everything was fine:
- predictive models existed,
- the results were not bad,
- the data was available.
Yet, development was slow, changes were risky, and knowledge transfer was difficult. Every new use-case meant:
- re-solving data preparation,
- re-solving validation,
- re-solving deployment,
- and often, re-discovering the same mistakes.
This is not a failure of people.
This is a failure of the work architecture.
The Hidden Debt: Fragmentation
The fundamental problem was not in the individual models, but in the fact that each one:
- was created slightly differently,
- validated differently,
- handled time differently,
- was deployed differently.
The result was fragmentation:
- fragmentation of code,
- fragmentation of responsibility,
- fragmentation of knowledge.
And most importantly: no change was cheap.
One Pipeline ≠ One Model
The Unified Pipeline was not an attempt to create "one universal model."
It was an effort to create one universal way of thinking about how models are built, tested, and operated.
The basic idea was simple:
If two models solve a different problem, but run at the same time, on the same data, and in the same production environment,
they should share the maximum amount of infrastructure and the minimum amount of variability.
In other words:
variability should be explicit,
not hidden in ad-hoc scripts.
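The principle of explicit variability can be sketched in code. This is a hypothetical illustration, not the author's actual implementation: every name here (`ModelSpec`, `run_pipeline`, `split_by_time`) is invented for the example. The idea is that everything model-specific lives in one declared spec object, while the step sequence itself is shared by all models.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical sketch: all model-to-model variability is declared in one
# place, instead of being scattered across ad-hoc scripts.
@dataclass
class ModelSpec:
    name: str
    target: str
    train_fn: Callable[[List[Dict[str, Any]], str], Callable]  # the only custom code
    validation_days: int = 30  # time handling is an explicit, declared choice

def split_by_time(rows: List[Dict[str, Any]], validation_days: int):
    """Shared step: the last `validation_days` of data become the holdout."""
    cutoff = max(r["day"] for r in rows) - validation_days
    train = [r for r in rows if r["day"] <= cutoff]
    holdout = [r for r in rows if r["day"] > cutoff]
    return train, holdout

def run_pipeline(spec: ModelSpec, rows: List[Dict[str, Any]]):
    """Shared infrastructure: the same sequence runs for every model."""
    train, holdout = split_by_time(rows, spec.validation_days)
    model = spec.train_fn(train, spec.target)
    # Shared validation: mean absolute error on the time-based holdout.
    errors = [abs(model(r) - r[spec.target]) for r in holdout]
    return model, sum(errors) / len(errors)

# Toy usage: a baseline "model" that predicts the training mean of the target.
def mean_model(train, target):
    mean = sum(r[target] for r in train) / len(train)
    return lambda row: mean

rows = [{"day": d, "sales": float(d % 7)} for d in range(100)]
spec = ModelSpec("baseline", "sales", mean_model, validation_days=10)
model, mae = run_pipeline(spec, rows)
print(f"{spec.name}: holdout MAE = {mae:.2f}")
```

Two models solving different problems would differ only in their `ModelSpec`; the splitting, validation, and orchestration code stays identical, which is what makes changes cheap and auditable.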
Speed as a Consequence, Not a Goal
There is often talk of "speeding up development."
But the Unified Pipeline was not created to be fast.
It was created to be:
- predictable,
- auditable,
- repeatable.
Speed came as a consequence:
- less ad-hoc decision making,
- less re-inventing the wheel,
- fewer "heroic" interventions.
And this is what made it possible to:
- deploy new models significantly faster,
- test more variants without chaos,
- and focus more on the purpose of the model than on its surroundings.
Why "Unified"
The word Unified was not for marketing.
It was chosen intentionally.
The Pipeline unified:
- the way of working with time,
- the method of validation,
- the versioning method,
- the deployment method,
- and even the way of thinking about models.
And that is perhaps its greatest contribution:
it unified the team’s mental model, not just the code.
What’s Next
In the next part, I will look at:
- why it was necessary to abandon a purely experimental approach,
- which architectural decisions were key,
- and where it turned out that "best practices from blogs" often don’t work in real operation.