
  • PENB Label Approximation – Part 3: Weather, Heating Season, and RC Model Without Magic

    Part 3: Weather, Heating Season, and the RC Model Without Magic

    Why Consumption Alone Isn’t Enough

    The same energy use can mean something different in January than it does in April. Without the context of weather and season, it’s impossible to reasonably estimate how much energy is actually explained by heating.

    That’s why the app isn’t just about uploading a CSV. Alongside operational data, it also adds meteorological context for the specific location.


    Hybrid Weather Layer as a Practical Choice

    In an ideal world, there would be a single perfect data source, always available and never down. In reality, it’s better to assume that the network, API, or historical data coverage won’t always be perfect.

    That’s why the project uses a multi-layered approach:

    • recent data comes from WeatherAPI,
    • older history is filled in via Open-Meteo,
    • and only as a last fallback does it use a synthetic approximation.

    This isn’t just a technical detail. It’s an example of how robustness is built into the data layer from the start.
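    The layered fallback described above can be sketched as a simple provider chain. Everything here is illustrative: the function names, the provider order, and the synthetic sinusoid are assumptions for the sketch, not the app's actual implementation.

```python
import math
from typing import Callable

def synthetic_temperature(month: int) -> float:
    """Last-resort approximation: a crude sinusoidal annual temperature
    cycle. The coefficients are illustrative, not calibrated for any
    particular location."""
    return 10.0 - 10.5 * math.cos(2 * math.pi * (month - 1) / 12)

def mean_temperature(month: int, providers: list[Callable[[int], float]]) -> float:
    """Try each weather provider in order (e.g. a WeatherAPI wrapper first,
    then an Open-Meteo wrapper); fall back to the synthetic model only
    when every provider fails."""
    for fetch in providers:
        try:
            return fetch(month)
        except Exception:
            continue  # provider offline, rate-limited, or out of coverage
    return synthetic_temperature(month)
```

    A real provider would wrap an HTTP call; the chain itself only encodes the order and the failure policy, which is the robustness point the text is making.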


    Where the Heating Season Comes In

    Energy consumption isn’t homogeneous. Some months are mainly about heating, others reflect regular operation and hot water. If the model doesn’t distinguish this, it starts calibrating the wrong signal.

    That’s why the user selects non-heating months in the app, and the system uses them when estimating consumption components. It’s not an unnecessary detail—it’s one of the most important steps in the entire logic.
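    One minimal way to use the selected non-heating months is to treat their average consumption as the year-round baseload (hot water plus regular use) and attribute the remainder of each month to heating. This is a simplified sketch of the idea, not the app's exact decomposition:

```python
def split_heating(monthly_kwh: dict[int, float], non_heating: set[int]) -> dict[int, float]:
    """Estimate the baseload as the mean of the user-selected non-heating
    months, then treat each month's excess over that baseload as the
    heating-related component (clipped at zero)."""
    base = sum(monthly_kwh[m] for m in non_heating) / len(non_heating)
    return {m: max(kwh - base, 0.0) for m, kwh in monthly_kwh.items()}
```

    If the non-heating months are chosen badly, the baseload is wrong and every heating estimate inherits that error, which is why the text calls this one of the most important steps in the logic.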


    Why the RC Model

    A simplified RC model isn’t interesting for its theoretical sophistication. It’s valuable because it offers a reasonable balance between:

    • domain interpretability,
    • computational simplicity,
    • the ability to calibrate with real data.

    The model helps translate apartment behavior into a structure you can actually work with. It’s not a “black box,” but an explainable approximation of thermal dynamics.
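    For intuition, a single-node ("1R1C") model reduces the apartment to one thermal resistance R and one heat capacitance C. A minimal explicit-Euler time step, with illustrative units, might look like this (a sketch of the general technique, not the app's calibrated model):

```python
def rc_step(t_in: float, t_out: float, heat_kw: float,
            r_k_per_kw: float, c_kwh_per_k: float, dt_h: float = 1.0) -> float:
    """One explicit-Euler step of the lumped model
        C * dT_in/dt = (T_out - T_in) / R + P_heat
    r_k_per_kw: envelope resistance [K/kW], c_kwh_per_k: capacitance [kWh/K],
    heat_kw: heating power [kW], dt_h: step length [h]."""
    d_t = ((t_out - t_in) / r_k_per_kw + heat_kw) * dt_h / c_kwh_per_k
    return t_in + d_t
```

    The explainability the text mentions falls out directly: at steady state the heating power that holds the setpoint is simply (T_in − T_out) / R, and calibrating R against measured consumption is what ties the model to the real apartment.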


    Multiple Calculation Modes Matter

    The app now offers several calculation modes. This matters not just for performance, but also because of the nature of the data that is actually available.

    • sometimes a quick estimate is enough,
    • other times local optimization makes sense,
    • and for more demanding cases, robust calibration is possible.

    This is a good example of a product compromise: instead of forcing everyone into one “right” mode, offer several paths based on input quality and user expectations.
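    As an illustration of the quick end of that spectrum, a degree-day regression needs nothing more than ordinary least squares: the slope approximates the overall heat-loss coefficient and the intercept the weather-independent baseload. The function below is a sketch of that idea, not the app's actual estimator:

```python
def fit_degree_days(hdd: list[float], kwh: list[float]) -> tuple[float, float]:
    """Closed-form least squares of monthly consumption against heating
    degree days. Returns (slope, intercept): slope ~ heat-loss coefficient
    [kWh per degree day], intercept ~ baseload [kWh/month]."""
    n = len(hdd)
    mx = sum(hdd) / n
    my = sum(kwh) / n
    sxx = sum((x - mx) ** 2 for x in hdd)
    sxy = sum((x - mx) * (y - my) for x, y in zip(hdd, kwh))
    slope = sxy / sxx
    return slope, my - slope * mx
```

    The more demanding modes would replace this with an optimizer fitting the RC parameters directly, which is slower but uses the data more fully.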


    What’s Next

    In the next part, I’ll move from the calculation core to the user layer: why having the right model isn’t enough, how the interface steps were designed, and why UX is part of technical quality for tools like this.

  • PENB Label Approximation – Part 4: Turning the Calculation into an App for Everyday Users

    Part 4: Turning a Calculation into an App for Everyday Users

    Why the Right Model Isn’t Enough

    Many technical projects fail not because the model is wrong, but because real users can’t interact with it. For a PENB approximation app, this is especially important since the target audience isn’t just analysts.

    A usable public app must meet three requirements at once:

    • the user must understand what to enter,
    • the system must receive consistent input,
    • the output must be readable even without deep technical knowledge.

    Five Steps Instead of One Overwhelming Screen

    The current interface uses a logical breakdown into several steps: location, apartment, data, calculation, and result. This is important for practical reasons.

    When people see everything at once, it’s easy to lose context. Guiding them step by step increases the chance of correct input.

    This isn’t just a UX rule. It’s also a way to improve the quality of data that ultimately reaches the model.


    The Form as Part of Domain Logic

    A public app shouldn’t just be a thin layer over the backend. In this project, the form actively helps structure the input:

    • guides users to the truly essential information,
    • distinguishes between quick and detailed calculation modes,
    • handles months without heating and hot water right at input,
    • sets the stage for interpreting the result.

    From this perspective, UX is not separate from data science. It’s one of the layers that determines whether the model receives meaningful input.


    The Result Must Be Understandable, Not Just Accurate

    The user usually doesn’t need to know all the internal calibration parameters. They need to understand:

    • what energy class the calculation yields,
    • how reliable the estimate is,
    • what the interpretation limits are,
    • what the next logical step is.

    That’s why the output combines the energy class, key metrics, a written comment, and an exportable report. The result serves as a communication artifact, not just a technical intermediate output.
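    The class itself can be as simple as a threshold lookup over specific consumption. The bands below are placeholder values for illustration only; the real PENB classes are defined by Czech regulation and also depend on the reference building:

```python
def energy_class(kwh_per_m2_year: float) -> str:
    """Map specific consumption [kWh/m2/year] to a letter class.
    Thresholds are illustrative round numbers, NOT the official bands."""
    bands = [(50, "A"), (100, "B"), (150, "C"), (200, "D"), (250, "E"), (300, "F")]
    for limit, label in bands:
        if kwh_per_m2_year <= limit:
            return label
    return "G"
```

    The communication work then happens around this letter: the metrics, the written comment, and the report give it context the bare class cannot carry.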


    Bilingualism as a Product Feature

    An interesting part of the project is that the application isn’t just prepared for local testing. It has both Czech and English language versions. This increases usability for project presentations, sharing with clients, and further development.

    Technically, this means more work. But from a product perspective, it significantly boosts the application’s overall value.


    What’s Next

    In the final part, I’ll cover deployment, report export, current project limitations, and what I would expand or refine in the next iteration.

  • PENB Label Approximation – Part 5: Deployment, Limitations, and What’s Next

    Part 5: Deployment, limitations, and what’s next

    When does a project become a real project

    As long as the calculation only runs locally, it’s just an experiment. The moment you can open it at a public URL, switch languages, go through the workflow, and download a report, it starts to become a real product.

    That’s the case with this app. The computational logic matters, but it’s just as important that it’s deployed as a publicly accessible service.


    What brings operational value

    Today, the project stands on several practical building blocks:

    • containerized deployment in Docker,
    • separate Czech and English versions,
    • persistent storage for local state and reports,
    • HTML export of results,
    • clear separation of UI, model, and reporting layer.

    These are exactly the elements that determine whether the app can be further developed without rewriting it from scratch.
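    The HTML export, for instance, can stay a thin, dependency-free layer on top of the model output. This sketch assumes a hypothetical `export_report` helper and a far simpler layout than the real report; the bilingual title lookup hints at how the CS/EN split can reach the reporting layer too:

```python
from pathlib import Path

def export_report(path: str, energy_class: str, metrics: dict[str, float],
                  comment: str, lang: str = "en") -> None:
    """Write a minimal standalone HTML report (illustrative structure)."""
    title = {"en": "Energy label estimate", "cs": "Odhad energetického štítku"}[lang]
    rows = "".join(f"<tr><td>{k}</td><td>{v:.1f}</td></tr>" for k, v in metrics.items())
    html = (f"<!doctype html><html><head><meta charset='utf-8'>"
            f"<title>{title}</title></head><body>"
            f"<h1>{title}: {energy_class}</h1>"
            f"<table>{rows}</table>"
            f"<p>{comment}</p></body></html>")
    Path(path).write_text(html, encoding="utf-8")
```

    Keeping the export as plain HTML means the report opens anywhere and survives the app itself being offline, which matches the "communication artifact" role from Part 4.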


    Transparency about limitations is part of quality

    For a tool like this, it’s important not only what it can do, but also what it can’t do yet or only handles approximately.

    With the current implementation, it’s good to be open about a few things:

    • the output is indicative, not a certified PENB,
    • the reference year in the MVP is an approximated typical year, not a full TMY dataset,
    • result quality depends on the scope and consistency of input data,
    • some parts of the result presentation still have room for further development.

    This isn’t a weakness in communication. It’s what makes the communication professional.


    What I would develop in the next iteration

    If the project were to continue to the next version, I believe these directions would make the most sense:

    • more precise handling of the reference year and climate scenarios,
    • expanding the interpretation of results with further recommendations,
    • deeper work with visualizations and calibration explanations,
    • more robust handling of a wider range of input situations.

    These steps would not only advance the technical side of the model. They would also increase user trust in the output and the ability to use the tool in real decision-making.


    Key takeaways from the whole series

    The PENB approximation project clearly shows that a quality data application doesn’t arise from a single clever idea. It emerges from the interplay of several disciplines:

    • choosing the right problem,
    • a reasonable model,
    • a quality data workflow,
    • a usable interface,
    • and deployment that allows the result to be truly used.

    This combination, in my view, is more interesting than the mere fact that the application returns an energy class.

  • PENB Label Approximation – Part 1: Why Waiting for a Formal Audit Isn’t Enough

    Series: PENB Energy Label Approximation – Lessons from Building a Public Application

    Series goal:
    Describe the project from problem definition through data and modeling to public deployment, highlighting where data science, product thinking, and quality implementation meet.


    Related Parts

    1. Why waiting for a formal audit isn’t enough – when an indicative calculation is more useful than waiting.
    2. How to turn regular consumption into valid input – what needs to be prepared before the model can start.
    3. Weather, heating season, and an RC model without magic – where domain logic meets data.
    4. How to turn a calculation into an app for everyday users – why UX is more than just cosmetics.
    5. Deployment, limitations, and what’s next – what works today and what should come in the next iteration.

    Part 1: Why waiting for a formal audit isn’t enough

    Where the real problem started

    A formal PENB makes sense when you need to meet a legal requirement or need a final document for sale or rental. But most decisions happen earlier.

    People want to know:

    • whether their consumption is normal,
    • whether it’s worth investing in apartment improvements,
    • whether a specific apartment is suspiciously energy-intensive,
    • whether it makes sense to proceed with deeper analysis.

    At this stage, a formal audit is usually too slow. You need a quick but still defensible signal.


    The hardest part isn’t the calculation, but defining the problem correctly

    Projects like this clearly show the difference between a technically interesting model and a practically useful product.

    The technical question is:

    Can you estimate an apartment’s energy consumption from operational data?

    The product question is different:

    Can you quickly and clearly help someone decide whether it’s worth analyzing their apartment in detail?

    The second question is more important. That’s why this app wasn’t created as a replacement for a certified PENB, but as a tool for initial orientation.


    Why operational data makes sense

    People often don’t have the building’s technical documentation at hand, but they do have:

    • bills and consumption data,
    • basic apartment parameters,
    • information about the heating source,
    • a rough idea of how they use the apartment.

    It’s not a perfect dataset. But it’s a dataset that actually exists. And a good product often starts where data is truly available, not where it would be ideal.


    What such a tool must deliver

    For the app to be useful, it must meet four criteria:

    • be fast, so it helps even before a formal audit,
    • be clear, so the result isn’t just another technical barrier,
    • openly acknowledge limitations, because uncertainty can’t be hidden,
    • be publicly accessible, so the principle can be demonstrated immediately in practice.

    That’s also why the project didn’t end up as just a notebook script. From the start, it was headed toward becoming an application.


    The value for users is in decisions, not just numbers

    The energy class itself is only part of the result.

    The greater value is that the application helps answer practical questions:

    • is it worth continuing with due diligence,
    • is the consumption consistent with the apartment’s parameters,
    • is it appropriate to plan a renovation,
    • is it worth ordering a detailed audit.

    In other words: the project stands out because it turns unclear operational data into an actionable framework for the next step.


    What’s next

    In the next part, I’ll look at why it’s critical for this kind of application to properly prepare input data, how to separate heating from regular usage, and why input validation often matters more than model optimization itself.

  • PENB Label Approximation – Part 2: Turning Regular Consumption Data into Valid Input

    Part 2: Turning Regular Consumption Data into Valid Input

    A model is only as good as its input

    In projects working with operational data, the biggest mistake is often assuming the main value lies in the algorithm itself. In reality, the quality of the outcome is often determined before any calculation happens.

    For PENB approximation, it’s especially critical that the application correctly understands:

    • what consumption data is available,
    • which period it covers,
    • when the user is heating and when not,
    • which part of the energy likely relates to heating and which to hot water or regular use.

    What the application actually needs from the user

    The practical input is intentionally kept fairly simple:

    • location,
    • apartment area and ceiling height,
    • type of heating,
    • temperature regime,
    • consumption time series,
    • selection of non-heating months,
    • method for hot water approximation.

    This is an important compromise. If the application asked for too many details, most users wouldn’t finish. If it asked for too little, the result would lose its grounding in reality.
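    Collected together, the input set is small enough to fit in one structure. The field names and types below are assumptions for illustration, not the app's actual schema:

```python
from dataclasses import dataclass

@dataclass
class ApartmentInput:
    """Sketch of the user-facing input set (hypothetical field names)."""
    location: str                    # for the weather lookup
    area_m2: float
    ceiling_height_m: float
    heating_type: str                # e.g. "gas", "electric", "district"
    setpoint_c: float                # temperature regime
    monthly_kwh: dict[int, float]    # consumption time series by month
    non_heating_months: set[int]
    hot_water_mode: str              # method for hot water approximation
```

    A single typed structure like this is also where validation naturally attaches: every rule in the next section is a constraint on one or more of these fields.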


    Why uploading a CSV isn’t enough

    Uploading a file is technically easy, but insufficient from a data standpoint. Consumption alone doesn’t tell you:

    • whether it’s heating or another component,
    • whether there are gaps in the data,
    • whether the observations match the heating season,
    • whether the measurement period is sufficient for the chosen calculation mode.

    That’s why the workflow includes selecting non-heating months and splitting energy into heating-related and hot water or regular usage parts.
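    The checks a raw upload can’t answer by itself are cheap to express in code. Here is a sketch of two of them, minimum coverage and gaps between months; the helper is hypothetical and assumes periods arrive as 'YYYY-MM' strings:

```python
def check_coverage(months: list[str], min_months: int = 12) -> list[str]:
    """Return human-readable issues found in a monthly series: too few
    distinct months, or gaps between consecutive observations.
    Sketch only; the app's real validation is broader."""
    issues = []
    distinct = sorted(set(months))
    if len(distinct) < min_months:
        issues.append(f"need at least {min_months} distinct months, got {len(distinct)}")
    # convert 'YYYY-MM' to a linear month index so gaps are a simple difference
    idx = [(m, int(m[:4]) * 12 + int(m[5:7])) for m in distinct]
    for (m1, i1), (m2, i2) in zip(idx, idx[1:]):
        if i2 - i1 > 1:
            issues.append(f"gap between {m1} and {m2}")
    return issues
```

    Returning messages instead of raising makes it easy to show all problems at once in the form, rather than one per upload attempt.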


    Validation isn’t about restricting the user

    Good validation doesn’t feel like a barrier. It’s a way to prevent the app from returning a confident result based on inconsistent data.

    In this project, validation handles for example:

    • minimum data length based on calculation mode,
    • input field logic for heating type,
    • consistency of temperature regime,
    • presence of expected columns in the input file.

    From a product perspective, this matters because users get feedback early—not after several minutes of calculation.
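    A minimal version of mode-aware validation might look like the following; the thresholds, mode names, and column names are illustrative, not the app's real rules:

```python
# Hypothetical minimum number of monthly readings per calculation mode
MIN_MONTHS = {"quick": 6, "local": 12, "robust": 24}

def validate_input(mode: str, n_months: int, columns: set[str]) -> list[str]:
    """Collect early-feedback errors before any calculation starts."""
    errors = []
    required = {"period", "consumption_kwh"}
    missing = required - columns
    if missing:
        errors.append(f"missing columns: {sorted(missing)}")
    if n_months < MIN_MONTHS[mode]:
        errors.append(f"mode '{mode}' needs >= {MIN_MONTHS[mode]} months, got {n_months}")
    return errors
```

    Because the rules run at submit time, the user learns about a short series or a missing column immediately, not after the optimizer has already spent minutes on bad input.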


    Why this is interesting for data science

    A workflow like this shows that data science in production isn’t just about modeling. It’s also about designing how data enters the system so results are repeatable and interpretable.

    This is exactly where several things meet:

    • data quality,
    • domain logic,
    • form UX,
    • and the operational reality of everyday users.

    What’s next

    In the next part, I’ll look at the core of the estimation: how weather data enters the app, why it’s important to distinguish the heating season, and the role of a simplified RC model in calibrating the apartment’s energy behavior.

© 2026 Michael Princ. All rights reserved.
