You’ve heard the Big Data buzzword here, there and everywhere… but as an energy manager, how well do you understand the importance of data quality? Does energy data quality control really play a big role in your day-to-day work?
Here at DEXMA we know that it’s important to take a much closer look at Big Data in energy management. That’s why we interviewed the newest member of our energy data science team, Juan Carlos Fernández, who tells us that energy data quality matters because “when it comes to making decisions, it can mean the difference between right and wrong” for energy managers.
Hi Juan Carlos! Tell us a bit about yourself and your role at DEXMA?
I’m part of the Data Science Team here at DEXMA. Our job is to use Big Data, Machine Learning, mathematical models and statistics to leverage the DEXMA database as much as we can. By extracting and leveraging the insights in that database, we can create useful algorithms to help our clients save more energy.
How would you define the Big Data concept in 1 sentence?
In just one sentence? – The Industrial Revolution of our time.
What effect does energy data quality have on energy consumption?
Basically, without quality data, the artificial intelligence algorithms would be less precise or might not even be usable in production. When it comes to making smart energy decisions, poor quality data can lead energy managers down the wrong path.
What are the most common errors that can occur if we have a data quality problem?
The most common mistake tends to be the blind accumulation of data with the idea of simply “saving” or storing them without planning or controlling for quality.
Many also tend to overlook the importance of metadata, the part that gives the data its context and is essential for obtaining insights and making informed decisions.
What is the most difficult part of achieving high data quality? Is there a secret to it?
The “data supply chain”, from when the data is received until it is stored in a database, has many critical points along it. A small error at one of these points can cause a catastrophic error in the database, because errors propagate and become amplified as the data keeps flowing.
There is no magic formula for this, unfortunately. The key lies in data quality metrics and constant monitoring and follow-up. And of course there are applications that can help with that, like the Data Quality App we built. Quality data allows for modelling on a robust basis.
What are some things that energy managers should know about data quality?
Energy managers should ask themselves the following three questions:
- What percentage of data will I keep? Which parts of my installation or site are not being monitored?
- How much data am I losing? Do they make sense from a statistical point of view?
- Is all the data that comes in labelled correctly? Is metadata well-defined and labelled?
If the answer to any of these is no, or unknown, energy managers might run into quality problems with their database.
How do we ensure energy data quality here at DEXMA?
At DEXMA we have a replicated Big Data infrastructure with metrics and alarms. A quality metric is a mathematical measure of how good or robust our data are. DEXMA metrics are calculated in real time, which is good because if the quality falls below certain levels at any time, the alarms go off and we can act immediately to correct it.
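To make the idea of a quality metric with an alarm concrete, here is a minimal Python sketch. It is not DEXMA’s implementation: the function names, the choice of completeness (the share of expected readings that actually arrived) as the metric, and the 95% threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def completeness(timestamps, start, end, interval):
    """Fraction of expected readings that arrived in the window [start, end).

    Hypothetical metric: assumes the meter should report once per `interval`.
    """
    expected = int((end - start) / interval)
    if expected <= 0:
        raise ValueError("window must contain at least one interval")
    # Count each expected slot at most once, so duplicates do not inflate the score.
    slots = {int((ts - start) / interval) for ts in timestamps if start <= ts < end}
    return len(slots) / expected


def check_quality(score, threshold=0.95):
    """Return "ALARM" when the metric falls below the (assumed) threshold."""
    return "ALARM" if score < threshold else "OK"


# Example: a 15-minute meter that missed one of four readings in an hour.
start = datetime(2024, 1, 1)
step = timedelta(minutes=15)
received = [start, start + step, start + 3 * step]
score = completeness(received, start, start + 4 * step, step)
status = check_quality(score)
```

In this example the score is 0.75, which is below the assumed threshold, so the status is "ALARM". A real deployment would compute such metrics continuously rather than on a fixed list.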
Can you share some specific examples of energy data verification we do here at DEXMA?
Our development team carries out multiple verifications, some of which are:
- Filters that detect abnormal values produced by errors in the meters (zeros or very high values).
- Alerts for missing data: you can set an alert to let you know if your meter has stopped sending data.
- We also weight data after periods with missing readings. If a meter sends nothing for a while and then delivers a large amount of data all at once, that data actually corresponds to the whole silent period, so we apply a weighting across that period to recover the actual consumption.
- “Tunnel alerts” for energy consumption. The system calculates the consumption pattern a meter should follow, based on its consumption over the previous weeks, and can indicate the margin of error.
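The four checks in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DEXMA’s code: every function name is hypothetical, the weighting is a plain even split (a real system might weight by the expected load profile instead), and the tunnel is taken as the mean plus or minus two standard deviations of past values for the same time slot.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def filter_abnormal(readings, max_kwh):
    """Replace zeros and implausibly high values with None (suspect reading)."""
    return [r if r is not None and 0 < r <= max_kwh else None for r in readings]


def is_silent(last_seen, now, max_gap):
    """True when the meter has sent nothing for longer than max_gap."""
    return now - last_seen > max_gap


def spread_accumulated(readings):
    """Spread a reading that follows a gap evenly over the gap and itself.

    Assumption: the first value after a run of None holds the consumption
    of the whole silent period. A trailing gap is left as None.
    """
    result = list(readings)
    gap_start = None
    for i, value in enumerate(readings):
        if value is None:
            if gap_start is None:
                gap_start = i
        elif gap_start is not None:
            share = value / (i - gap_start + 1)
            for j in range(gap_start, i + 1):
                result[j] = share
            gap_start = None
    return result


def tunnel(history, width=2.0):
    """Expected band (low, high) from past values; needs at least two values."""
    centre, spread = mean(history), stdev(history)
    return centre - width * spread, centre + width * spread


def outside_tunnel(value, history, width=2.0):
    """True when a new reading falls outside the expected band."""
    low, high = tunnel(history, width)
    return value < low or value > high


cleaned = filter_abnormal([1.5, 0.0, 900.0, 2.0], max_kwh=50.0)
silent = is_silent(datetime(2024, 1, 1, 8), datetime(2024, 1, 1, 12), timedelta(hours=1))
weighted = spread_accumulated([2.0, None, None, 6.0, 2.0])
past_weeks = [10.0, 12.0, 11.0, 9.0]  # same hour on the previous four weeks
alert = outside_tunnel(20.0, past_weeks)
```

Here the filter keeps 1.5 and 2.0 and flags the other two readings, the meter counts as silent after four hours without data, the 6.0 reading is spread as 2.0 over each of the three intervals it covers, and a reading of 20.0 falls outside the tunnel. The filter and the weighting are shown separately on purpose: a large accumulated reading would otherwise be discarded as abnormal before it could be redistributed.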
Any extra advice for energy professionals working with Big Data in energy management?
If you are reading this you probably already know it, but in the future, the difference between a good and an excellent product will be the quality of the data on which each one is based.
Improving your energy database today can be a great competitive advantage tomorrow. Think of it as investing in the future.
After this interview with Juan Carlos, we can see how data and its quality form an integral part of our business as energy professionals. Thanks to Juan Carlos for his time and for sharing his knowledge to help us learn more about Big Data in energy management.
To see real examples of energy data quality control in action, why not check out a free video training session: