Projects

  • PENB Energy Label Approximation

    PENB Energy Label Approximation is a public application that converts standard apartment operational data into an indicative estimate of energy performance. The goal is not to replace a certified building energy performance certificate, but to provide owners, buyers, or consultants with a quick and defensible basis for further decisions.


    Project Overview

    The project combines several layers that are often handled separately in similar tools:

    • validation of apartment and heating input data,
    • working with meteorological data for the given location,
    • calibration of a simplified RC model based on actual consumption,
    • translating the result into an understandable energy class and reliability commentary,
    • public bilingual deployment that can be shared instantly.

    The result is not just a calculation. It’s a small product with an input workflow, computational core, report, and operational layer.


    Problem

    In practice, there is often a gap between what someone needs to know immediately and what a formal audit provides:

    • an owner wants to quickly check if consumption matches the apartment’s parameters,
    • an investor needs a preliminary screening before deeper due diligence,
    • a consultant or portfolio manager seeks a way to prioritize units for further action.

    A formal PENB is the right path for legal purposes, but for initial orientation it’s often too slow and expensive. This project was created precisely for situations where a quick, data-driven decision guide is needed.


    Solution

    Architecture

    • Streamlit app as a public interface for entering parameters, calculation, and displaying results.
    • Domain core in Python that validates inputs, prepares time series, calibrates the model, and simulates annual energy demand.
    • Hybrid weather layer combining WeatherAPI, Open-Meteo, and a fallback mechanism for missing data (sketched below).
    • HTML report that converts the result into a format suitable for sharing or further review.
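
    A minimal sketch of how such a fallback chain can look follows. The Open-Meteo archive endpoint is public; the WeatherAPI call is only stubbed here, and the function names and the flat climatological default are illustrative, not the app's actual code:

    ```python
    import requests

    def fetch_open_meteo(lat: float, lon: float, start: str, end: str):
        """Primary source: daily mean temperatures from the public Open-Meteo archive."""
        try:
            resp = requests.get(
                "https://archive-api.open-meteo.com/v1/archive",
                params={
                    "latitude": lat,
                    "longitude": lon,
                    "start_date": start,          # "YYYY-MM-DD"
                    "end_date": end,
                    "daily": "temperature_2m_mean",
                    "timezone": "auto",
                },
                timeout=10,
            )
            resp.raise_for_status()
            return resp.json()["daily"]["temperature_2m_mean"]
        except (requests.RequestException, KeyError, ValueError):
            return None  # tell the caller to try the next layer

    def fetch_weatherapi(lat, lon, start, end):
        """Secondary source (stub only): a WeatherAPI history request would go here."""
        return None  # omitted; requires an API key

    def daily_temperatures(lat, lon, start, end):
        """Try the sources in order; fall back to a flat climatological default."""
        for source in (fetch_open_meteo, fetch_weatherapi):
            series = source(lat, lon, start, end)
            if series:
                return series
        return [4.0] * 365  # hypothetical long-term mean so the model can still run
    ```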

    Design Principles

    • Simple inputs for the user, without oversimplifying the model to achieve it.
    • Transparent limitations: the app openly communicates that it provides an indicative estimate.
    • Multiple calculation modes: from a quick estimate to more demanding calibration.
    • Separation of computational logic from UI to allow independent iteration of the model and presentation.
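
    The last principle is easiest to show in code. A minimal sketch, with hypothetical names, of how the domain core can stay a pure function while Streamlit remains a thin wrapper (shown as a single snippet for brevity; in the app these layers live in separate modules):

    ```python
    import streamlit as st

    # --- domain core (would live in e.g. core/model.py, no Streamlit imports) ---
    def annual_demand_kwh(heat_loss_w_per_k: float, degree_days: float) -> float:
        """Indicative annual heating demand: H [W/K] x degree days [K*day] -> kWh."""
        return heat_loss_w_per_k * degree_days * 24 / 1000

    # --- presentation layer (would live in e.g. app.py, importing the core) ---
    area = st.number_input("Floor area (m²)", min_value=10.0, value=65.0)
    demand = annual_demand_kwh(heat_loss_w_per_k=120.0, degree_days=3400.0)
    st.metric("Indicative heating demand", f"{demand:,.0f} kWh/yr")
    st.caption(f"≈ {demand / area:.0f} kWh/m²·yr for {area:.0f} m²")
    ```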

    Application Workflow

    The app currently covers five main steps:

    1. Location – enter the place for which meteorological data is prepared.
    2. Apartment parameters – area, ceiling height, temperature regime, and heating type.
    3. Operational data – upload CSV or use sample data, select non-heating months, and estimate hot water usage.
    4. Calculation – select mode, calibrate the model, and simulate the reference year (calibration sketched below).
    5. Result – energy class, qualitative comment, basic metrics, and report export.

    This is also important from a product perspective: the user is not exposed to technical model details until the input is properly prepared.
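
    To make step 4 concrete, here is a minimal sketch of one plausible calibration, assuming monthly consumption data and a degree-day simplification of the RC model; the numbers and variable names are illustrative, not the app's actual implementation:

    ```python
    import numpy as np

    # Hypothetical inputs: monthly billed heat [kWh] and heating degree days [K*day]
    monthly_kwh = np.array([1450, 1300, 1050, 600, 180, 150, 140, 150, 350, 800, 1150, 1400])
    monthly_hdd = np.array([520, 470, 380, 210, 60, 0, 0, 0, 110, 290, 410, 500])

    # Linear model: kWh ~= H * HDD * 24/1000 + baseload  (H in W/K)
    # A least-squares fit yields the heat-loss coefficient and the non-heating baseload.
    A = np.column_stack([monthly_hdd * 24 / 1000, np.ones_like(monthly_hdd, dtype=float)])
    (h_coeff, baseload), *_ = np.linalg.lstsq(A, monthly_kwh, rcond=None)
    print(f"Calibrated heat loss H ≈ {h_coeff:.0f} W/K, baseload ≈ {baseload:.0f} kWh/month")

    # Simulate a reference year: apply H to the location's long-term degree days
    reference_hdd = 3400.0  # hypothetical long-term total for the location
    annual_heating_kwh = h_coeff * reference_hdd * 24 / 1000
    print(f"Indicative annual heating demand ≈ {annual_heating_kwh:,.0f} kWh")
    ```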


    Key Features

    • Three calculation modes based on the quality and scope of input data.
    • Five logical UI screens guiding the user from input to result.
    • Three weather and fallback layers to ensure the app remains robust if one source fails.
    • Calibration based on actual consumption, not just static table approximation.
    • Output focused on interpretation, not just a number without context.

    | Area | Current Status |
    | --- | --- |
    | Public access | Yes, via a separate subdomain |
    | Language versions | Czech and English |
    | Result export | HTML report |
    | Calculation modes | Basic, Standard, Advanced |
    | Model type | Simplified RC model with calibration |
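
    The energy class itself is obtained by banding specific annual demand (kWh/m²·yr). A minimal sketch of the idea follows; the thresholds are illustrative placeholders, not the official PENB bands, which are defined relative to a reference building:

    ```python
    def indicative_class(annual_kwh: float, area_m2: float) -> str:
        """Map specific demand (kWh/m2*yr) to an indicative letter class.
        Thresholds are illustrative, NOT the legally defined PENB bands."""
        specific = annual_kwh / area_m2
        bands = [(50, "A"), (100, "B"), (150, "C"), (200, "D"), (250, "E"), (300, "F")]
        for limit, label in bands:
            if specific <= limit:
                return label
        return "G"

    print(indicative_class(annual_kwh=8500, area_m2=65))  # ~131 kWh/m2*yr -> "C" here
    ```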

    Deployment and Operations Layer

    The project is deployed as a containerized Streamlit app. Importantly, this is not just a local experiment:

    • there is a public Czech and English instance,
    • it runs in Docker with separate storage and reporting,
    • it’s ready for further iterations without needing to rewrite the entire interface,
    • UI, domain logic, and reporting layers are clearly separated.

    Just as important as the model itself is the fact that the result can be presented publicly and clearly. This makes the project a true case study, not just an internal analytics script.


    Important limitations

    The app output is an indicative estimate, not an official PENB. That’s why the project emphasizes explaining uncertainty, interpretation recommendations, and a clear distinction between quick screening and formal audit.


    Article series on the app’s development

    The project is accompanied by an introductory article and a five-part series covering the journey from problem definition through data to deployment:

    1. EPC from operational data: where estimation ends and decision begins – business and product context of the app.
    2. PENB Energy Label Approximation – Part 1: Why Waiting for a Formal Audit Isn’t Enough – why an indicative calculation from operational data makes sense.
    3. PENB Energy Label Approximation – Part 2: Turning Regular Consumption Data into Valid Input – how to prepare data for the model without unnecessary chaos.
    4. PENB Energy Label Approximation – Part 3: Weather, Heating Season, and the RC Model Without Magic – how weather, calibration, and simulation work together.
    5. PENB Energy Label Approximation – Part 4: Turning the Calculation into an App for Everyday Users – why UX is as important as the model.
    6. PENB Energy Label Approximation – Part 5: Deployment, Limitations, and What’s Next – a transparent look at operations, limits, and future direction.
  • Unified Pipeline

    Unified Pipeline is a case study on the design and implementation of a unified machine learning pipeline in a banking environment. The goal was to accelerate model deployment, increase model stability over time, and create a single experimental framework across teams and use cases.


    Project Overview

    Design and implementation of a unified machine learning (ML) pipeline for a banking institution, aimed at accelerating model deployment, increasing out-of-time stability (robustness outside the training period), and standardizing the experimental workflow across teams. The pipeline enables consistent development, validation, and deployment of models for more than 100 products and their variants.


    The Challenge

    The client faced a fragmented environment for ML model development:

    • Inconsistent approaches and workflows across teams
    • Long model deployment times (typically 7–10+ days)
    • Overestimation of model quality due to improper validation (data leakage, random splits)
    • Low model stability over time (poor performance on out-of-time data)
    • Limited reproducibility of experiments and a weak audit trail

    The Solution

    Architecture

    • Databricks as the main execution platform (distributed processing, orchestration)
    • MLflow for experiment tracking and model registry
    • Optuna for hyperparameter optimization with a focus on efficient search strategies
    • Spark (PySpark DataFrames) for scalable feature processing
    • Migration from the original Hadoop-based solution to a more modern Databricks-based architecture
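
    As a minimal sketch of the MLflow layer above (parameter values, metric names, and the toy data are illustrative, not the production pipeline):

    ```python
    import mlflow
    import mlflow.catboost
    import numpy as np
    from catboost import CatBoostClassifier

    # Toy data standing in for the feature table produced by the Spark layer
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 8))
    y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

    params = {"depth": 6, "learning_rate": 0.05, "iterations": 200}

    with mlflow.start_run(run_name="unified-pipeline-demo"):
        mlflow.log_params(params)                      # audit trail of settings
        model = CatBoostClassifier(**params, verbose=False)
        model.fit(X, y)
        mlflow.log_metric("train_accuracy", model.score(X, y))
        mlflow.catboost.log_model(model, artifact_path="model")  # to the registry
    ```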

    Key Design Principles

    • Time-aware validation (a key differentiator from common practice)
      • Use of walk-forward validation instead of random K-Fold cross-validation (a minimal sketch follows this list).
      • Simulation of real-world model deployment over time.
      • Elimination of data leakage.
      • Significant reduction in the gap between training and production performance.
    • Unified training framework
      • A single pipeline for both classification and regression.
      • Shared data preprocessing and feature engineering steps.
      • Parameterization of the pipeline for different use-cases.
    • Advanced hyperparameter tuning (Optuna)
      • Combination of Bayesian optimization and QMCSampler (Sobol sequences) for better coverage of the search space.
      • Optimization with respect to time-stability metrics, not just in-sample performance.
      • Managing the trade-off between performance and training time.
    • Model stability over raw performance
      • Optimization for metrics such as Lift (within the top decile), F1 score, and stability of R² / accuracy over time.
      • Emphasis on robustness across time periods.
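
    A minimal sketch of the validation and tuning principles above: walk-forward splits over monthly snapshots, an Optuna study driven by a Sobol QMC sampler, and an objective that rewards stability (mean minus standard deviation of fold scores). Column names, the toy data, and the exact objective are illustrative, and only the QMC half of the sampling strategy is shown:

    ```python
    import numpy as np
    import optuna
    import pandas as pd
    from catboost import CatBoostClassifier
    from sklearn.metrics import roc_auc_score

    # Toy monthly snapshots standing in for the real banking feature table
    rng = np.random.default_rng(42)
    df = pd.DataFrame({
        "month": pd.period_range("2022-01", periods=12, freq="M").repeat(500),
        "x1": rng.normal(size=6000),
        "x2": rng.normal(size=6000),
    })
    df["y"] = ((df["x1"] > 0) ^ (rng.random(6000) < 0.2)).astype(int)
    FEATURES = ["x1", "x2"]

    def walk_forward_score(params: dict) -> float:
        """Train on an expanding window, score on the next month, reward stability."""
        months = sorted(df["month"].unique())
        aucs = []
        for i in range(6, len(months)):          # first six months serve as warm-up
            train, test = df[df["month"] < months[i]], df[df["month"] == months[i]]
            model = CatBoostClassifier(**params, verbose=False)
            model.fit(train[FEATURES], train["y"])
            aucs.append(roc_auc_score(test["y"], model.predict_proba(test[FEATURES])[:, 1]))
        return float(np.mean(aucs) - np.std(aucs))   # penalize unstable models

    def objective(trial: optuna.Trial) -> float:
        params = {
            "depth": trial.suggest_int("depth", 3, 8),
            "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
            "iterations": 100,
        }
        return walk_forward_score(params)

    study = optuna.create_study(
        direction="maximize",
        sampler=optuna.samplers.QMCSampler(qmc_type="sobol", seed=42),
    )
    study.optimize(objective, n_trials=8)
    print(study.best_params)
    ```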

    Key Features

    • Automated feature preparation pipeline
      • Scalable data transformation using Spark.
      • Data quality checks, for example min/max validation instead of expensive operations such as countDistinct (sketched after this list).
    • Centralized experiment tracking (MLflow)
      • Complete audit trail of experiments, model versioning, and associated parameters.
    • Model registry and standardized deployment
      • A unified interface for model deployment and support for rapid rollout of new versions.
    • Framework for out-of-time evaluation
      • Systematic evaluation of models over time and identification of degradation patterns.
    • Flexible pipeline orchestration
      • Support for multiple model types within a single pipeline through a modular design.
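
    A minimal sketch of the cheap data-quality check mentioned above: one aggregation pass collecting min/max per column instead of a per-column countDistinct. Table name, column names, and bounds are illustrative:

    ```python
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("dq-checks").getOrCreate()
    df = spark.read.table("features.monthly_snapshot")     # illustrative table name

    # One aggregation pass collects min/max for every numeric column; a per-column
    # countDistinct would trigger far more expensive computation for little benefit.
    numeric_cols = [f.name for f in df.schema.fields
                    if f.dataType.typeName() in ("integer", "long", "float", "double")]
    stats = df.agg(
        *[F.min(c).alias(f"{c}__min") for c in numeric_cols],
        *[F.max(c).alias(f"{c}__max") for c in numeric_cols],
    ).first()

    # Flag columns whose observed range escapes the expected bounds
    EXPECTED_BOUNDS = {"age": (18, 100), "balance": (-1e7, 1e9)}  # illustrative
    for col, (lo, hi) in EXPECTED_BOUNDS.items():
        observed_min, observed_max = stats[f"{col}__min"], stats[f"{col}__max"]
        if observed_min < lo or observed_max > hi:
            print(f"DQ WARNING: {col} range ({observed_min}, {observed_max}) "
                  f"outside expected ({lo}, {hi})")
    ```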

    The Results

    | Metric | Before | After | Improvement |
    | --- | --- | --- | --- |
    | Model deployment time | 7–10+ days | 1–2 days | -6 to -8 days |
    | Model stability (OOT) | Low | Significantly higher | Major improvement |
    | Model lift (top decile) | Baseline | +5 to +30 % | +5–30 % |
    | Pipeline execution time | ~4 hours | ~2–3 hours | -30 to -40 % |

    Technology Stack

    • Python, PySpark (Spark DataFrames)
    • Databricks (distributed computing, orchestration)
    • MLflow (experiment tracking and model registry)
    • Optuna (hyperparameter optimization)
    • CatBoost (classifier and regressor)

    Unified Pipeline Series

    This case study is followed by a five-part series that explores the architecture, validation logic, and practical decisions behind the solution in more detail:

    1. Unified Pipeline – Part 1: Why Was the Unified Pipeline Created? – why improving the model alone was not enough and the real problem was system fragmentation.
    2. Unified Pipeline – Part 2: From Experiments to a System – how architecture and configuration replaced improvisation.
    3. Unified Pipeline – Part 3: Time as the Enemy of the Model – why validation without time discipline fails in production.
    4. Unified Pipeline – Part 4: MLOps Without the Buzzwords – what truly delivered value and what only added complexity.
    5. Unified Pipeline – Part 5: What I Would Do Differently Today – lessons learned, dead ends, and transferable principles.
