  • PENB Label Approximation – Part 2: Turning Regular Consumption Data into Valid Input

    A model is only as good as its input

    In projects that work with operational data, a common mistake is to assume that the main value lies in the algorithm itself. In practice, the quality of the result is largely decided before any calculation runs.

    For PENB approximation, it’s especially critical that the application correctly understands:

    • what consumption data is available,
    • which period it covers,
    • when the user is heating and when not,
    • which part of the energy likely relates to heating and which to hot water or regular use.

    What the application actually needs from the user

    The practical input is intentionally kept fairly simple:

    • location,
    • apartment area and ceiling height,
    • type of heating,
    • temperature regime,
    • consumption time series,
    • selection of non-heating months,
    • method for hot water approximation.
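
    As a rough illustration, the list above could map onto a single typed record. This is only a sketch: the class name, the field names, the two hot water options, and the reading of "temperature regime" as a day and a night setpoint are all assumptions, not the application's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class HotWaterMethod(Enum):
    # Hypothetical options; the article only says that a method is selected.
    BASELINE_FROM_NON_HEATING_MONTHS = "baseline"
    FIXED_SHARE = "fixed_share"


@dataclass
class ApartmentInput:
    """One record holding everything the user is asked for (illustrative names)."""

    location: str
    area_m2: float
    ceiling_height_m: float
    heating_type: str
    # Assumption: the temperature regime is a day and a night setpoint.
    day_temp_c: float
    night_temp_c: float
    # Consumption time series: one reading in kWh per date.
    consumption_kwh: dict[date, float]
    # Months (1-12) the user marked as non-heating.
    non_heating_months: set[int] = field(default_factory=set)
    hot_water_method: HotWaterMethod = HotWaterMethod.BASELINE_FROM_NON_HEATING_MONTHS

    @property
    def volume_m3(self) -> float:
        # Heated air volume follows directly from area and ceiling height.
        return self.area_m2 * self.ceiling_height_m
```

    Keeping the input this small is what makes the later validation and splitting steps tractable: every downstream function can rely on the same handful of fields.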

    This is an important compromise. If the application asked for too many details, most users wouldn’t finish. If it asked for too little, the result would lose its grounding in reality.


    Why uploading a CSV isn’t enough

    Uploading a file is technically easy, but the file alone doesn’t carry enough information. Raw consumption figures don’t tell you:

    • whether it’s heating or another component,
    • whether there are gaps in the data,
    • whether the observations match the heating season,
    • whether the measurement period is sufficient for the chosen calculation mode.

    That’s why the workflow includes selecting non-heating months and splitting energy into heating-related and hot water or regular usage parts.
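
    One simple way to do such a split is to treat the average daily consumption in the non-heating months as a baseline (hot water plus regular use) and attribute everything above it to heating. The sketch below assumes daily readings in kWh and a hypothetical function name; the application may use a different method.

```python
from datetime import date
from statistics import mean


def split_heating(
    daily_kwh: dict[date, float],
    non_heating_months: set[int],
) -> dict[date, tuple[float, float]]:
    """Return {day: (heating_kwh, baseline_kwh)} using a non-heating baseline."""
    baseline_days = [
        kwh for day, kwh in daily_kwh.items() if day.month in non_heating_months
    ]
    if not baseline_days:
        raise ValueError("no observations fall in the selected non-heating months")

    # Assumption: hot water and regular use are roughly constant over the year.
    baseline = mean(baseline_days)

    result = {}
    for day, kwh in daily_kwh.items():
        if day.month in non_heating_months:
            # By definition nothing is attributed to heating in these months.
            result[day] = (0.0, kwh)
        else:
            base_part = min(kwh, baseline)  # never attribute more than was used
            result[day] = (kwh - base_part, base_part)
    return result
```

    For example, with 4 and 6 kWh on two July days marked as non-heating, the baseline is 5 kWh, so a January day with 30 kWh is split into 25 kWh of heating and 5 kWh of baseline.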


    Validation isn’t about restricting the user

    Good validation doesn’t feel like a barrier. It’s a way to prevent the app from returning a confident result based on inconsistent data.

    In this project, validation covers, for example:

    • minimum data length based on calculation mode,
    • input field logic for heating type,
    • consistency of temperature regime,
    • presence of expected columns in the input file.
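
    Two of these checks, expected columns and minimum data length per mode, can be sketched in a few lines. The column names, mode identifiers, and thresholds below are invented for illustration; only the kinds of checks come from the project.

```python
import csv
import io

# Hypothetical column names and thresholds, not the project's actual values.
REQUIRED_COLUMNS = {"date", "consumption_kwh"}
MIN_ROWS_BY_MODE = {"quick": 7, "local": 30, "robust": 90}


def validate_consumption_csv(text: str, mode: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    if mode not in MIN_ROWS_BY_MODE:
        return [f"unknown calculation mode: {mode!r}"]

    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        # Without the expected columns the remaining checks are meaningless.
        return ["missing columns: " + ", ".join(sorted(missing))]

    errors = []
    row_count = sum(1 for _ in reader)
    required = MIN_ROWS_BY_MODE[mode]
    if row_count < required:
        errors.append(
            f"mode {mode!r} needs at least {required} rows, got {row_count}"
        )
    return errors
```

    Returning a list of messages instead of raising on the first problem lets the form show every issue at once, which fits the goal of early feedback.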

    From a product perspective, this matters because users get feedback early, not after several minutes of calculation.


    Why this is interesting for data science

    A workflow like this shows that data science in production isn’t just about modeling. It’s also about designing how data enters the system so results are repeatable and interpretable.

    This is exactly where four things meet:

    • data quality,
    • domain logic,
    • form UX,
    • the operational reality of everyday users.

    What’s next

    In the next part, I’ll look at the core of the estimation: how weather data enters the app, why it’s important to distinguish the heating season, and the role of a simplified RC model in calibrating the apartment’s energy behavior.

  • PENB Label Approximation – Part 3: Weather, Heating Season, and RC Model Without Magic

    Why Consumption Alone Isn’t Enough

    The same energy use can mean something different in January than it does in April. Without the context of weather and season, it’s impossible to reasonably estimate how much energy is actually explained by heating.

    That’s why the app isn’t just about uploading a CSV. Alongside operational data, it also adds meteorological context for the specific location.


    Hybrid Weather Layer as a Practical Choice

    In an ideal world, there would be a single perfect data source, always available and never down. In reality, it’s better to assume that the network, API, or historical data coverage won’t always be perfect.

    That’s why the project uses a multi-layered approach:

    • recent data comes from WeatherAPI,
    • older history is filled in via Open-Meteo,
    • and only as a last fallback does it use a synthetic approximation.
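
    The layering can be pictured as a small fallback chain. The sketch below deliberately avoids real API calls: the two fetchers are passed in as plain functions, and the seven-day "recent" window and the sinusoidal synthetic curve (mean 9 °C, amplitude 10 °C, coldest in mid-January) are arbitrary assumptions, not values taken from the project or from either provider.

```python
import math
from datetime import date
from typing import Callable, Optional

# A fetcher returns the mean outdoor temperature in °C, or None if unavailable.
Fetcher = Callable[[date], Optional[float]]


def synthetic_temperature(day: date) -> float:
    """Last-resort seasonal curve; the constants are illustrative assumptions."""
    day_of_year = day.timetuple().tm_yday
    return 9.0 - 10.0 * math.cos(2 * math.pi * (day_of_year - 15) / 365.25)


def daily_temperature(
    day: date,
    today: date,
    recent: Fetcher,       # e.g. a wrapper around the recent-data provider
    historical: Fetcher,   # e.g. a wrapper around the historical provider
    recent_window_days: int = 7,  # assumed cut-off between the two layers
) -> tuple[float, str]:
    """Try each layer in order and report which one produced the value."""
    sources: list[tuple[str, Fetcher]] = []
    if (today - day).days <= recent_window_days:
        sources.append(("recent", recent))
    sources.append(("historical", historical))

    for name, fetch in sources:
        try:
            value = fetch(day)
        except Exception:
            # Sketch only: production code would catch narrower errors and log.
            value = None
        if value is not None:
            return value, name

    return synthetic_temperature(day), "synthetic"
```

    Returning the source name alongside the value keeps the result interpretable: the rest of the pipeline can flag estimates that relied on synthetic weather.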

    This isn’t just a technical detail. It’s an example of how robustness is built into the data layer from the start.


    Where the Heating Season Comes In

    Energy consumption isn’t homogeneous. Some months are mainly about heating, others reflect regular operation and hot water. If the model doesn’t distinguish this, it starts calibrating the wrong signal.

    That’s why the user selects non-heating months in the app, and the system uses them when estimating consumption components. This is not an incidental detail; it is one of the most important steps in the entire logic.


    Why the RC Model

    A simplified RC (resistance-capacitance) model isn’t attractive because it’s the most sophisticated option in theory. It’s valuable because it offers a reasonable balance between:

    • domain interpretability,
    • computational simplicity,
    • the ability to calibrate with real data.

    The model helps translate apartment behavior into a structure you can actually work with. It’s not a “black box,” but an explainable approximation of thermal dynamics.
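
    To make the idea concrete, here is a minimal sketch of the simplest member of this family, a single-resistance, single-capacitance (1R1C) model. The article does not say which RC structure the application uses, so the model order, the function name, and the explicit Euler time stepping are assumptions. R is the thermal resistance between indoor and outdoor air (K/W), and C is the heat capacity of the apartment (J/K).

```python
def simulate_1r1c(
    t_out_c: list[float],   # outdoor temperature per step, °C
    heat_w: list[float],    # heating power per step, W
    r_k_per_w: float,       # thermal resistance to the outside, K/W
    c_j_per_k: float,       # thermal capacity of the apartment, J/K
    t_start_c: float,       # initial indoor temperature, °C
    dt_s: float = 3600.0,   # step length in seconds (hourly by default)
) -> list[float]:
    """Integrate C * dT/dt = (T_out - T_in) / R + Q with explicit Euler.

    Explicit Euler is only stable for dt_s well below R * C.
    """
    temps = [t_start_c]
    for t_out, q in zip(t_out_c, heat_w):
        t_in = temps[-1]
        # Net heat flow: exchange through the envelope plus heating power.
        net_flow_w = (t_out - t_in) / r_k_per_w + q
        temps.append(t_in + net_flow_w * dt_s / c_j_per_k)
    return temps
```

    With R = 0.005 K/W and 0 °C outside, holding 20 °C indoors requires 20 / 0.005 = 4000 W in steady state. Calibration then means searching for the R and C that make the simulated behavior match the heating-related part of the measured consumption.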


    Multiple Calculation Modes Matter

    The app now offers several calculation modes. This matters not only for performance but also because the amount and quality of available data differ from case to case:

    • sometimes a quick estimate is enough,
    • other times local optimization makes sense,
    • and for more demanding cases, robust calibration is possible.

    This is a good example of a product compromise: instead of forcing everyone into one “right” mode, offer several paths based on input quality and user expectations.


    What’s Next

    In the next part, I’ll move from the calculation core to the user layer: why having the right model isn’t enough, how the interface steps were designed, and why UX is part of technical quality for tools like this.

© 2026 Michael Princ. All rights reserved.
