by Greg Shank, Principal, Altura
’Tis the season when political pundits bring us new revelations from the polls every day, such as “Candidate A will outperform with people aged 18-35.” As an advocate and practitioner of data analytics for energy efficiency in buildings, I recognize the value of quality data and appropriate interpretation of the right information. So when I read the results of various polls, I ask myself: “Who did they actually talk to in that poll?” “Can you really draw THAT conclusion from THAT poll?” “Is this information really adding any value?”
For instance, I’ve been of voting age for over twenty years and I’ve never been asked to participate in any type of political poll, and as far as I’m aware, neither has anyone I know. This experience makes me suspicious of how representative political polls are of the entire voting community. But mostly, it reminds me to understand data in its proper context. A full understanding of the context, trends, and people behind the numbers can make the difference between data-driven value and misleading data.
How does political polling relate to the current state of data analytics in the management and optimization of energy use in buildings? The answer is that many of the same questions can be used to separate real from false conclusions. In energy management, as in politics, there is a rush to make a snap judgement based on little analysis. It’s common to see statements from consultants along the lines of, “we reduced your energy use by 20%.” But the reality is that such conclusions require deeper analysis. Over what time period are you claiming those savings? 20% of what? How did you verify the savings?
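To see why “20% of what?” matters, here is a minimal Python sketch of the arithmetic. The function name and all of the kWh figures are hypothetical, chosen only to illustrate the point; a real savings claim would also need weather and occupancy adjustments that this sketch ignores.

```python
def percent_savings(baseline_kwh: float, reporting_kwh: float) -> float:
    """Percent reduction in energy use relative to a chosen baseline."""
    if baseline_kwh <= 0:
        raise ValueError("baseline_kwh must be positive")
    return 100.0 * (baseline_kwh - reporting_kwh) / baseline_kwh


# Hypothetical numbers: the same reporting-period usage, two different baselines.
reporting_kwh = 800.0
peak_month_baseline_kwh = 1000.0   # assumed: an unusually hot month
prior_year_baseline_kwh = 850.0    # assumed: the same month one year earlier

print(round(percent_savings(peak_month_baseline_kwh, reporting_kwh), 1))  # 20.0
print(round(percent_savings(prior_year_baseline_kwh, reporting_kwh), 1))  # 5.9
```

The same building, with the same measured usage, can be described as saving 20% or about 6% depending on which baseline is chosen, which is why the baseline and time period need to be stated alongside the claim.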
Thus, amidst the rush to leverage Big Data and analytics tools, I offer the following reminder questions for consideration:
1. What are your underlying data sources, and are you seeing everything you need?
a. In polls, if your data set is biased toward a certain socioeconomic group or is otherwise not representative of the spectrum of people in the area you are polling, then the conclusions you can draw are very limited. Similarly, if you ask biased or ambiguous questions, you get junk information. And you need more than statistical tools to answer this question. Your response rate may be statistically significant, but do the method and technology you use automatically preclude participation by some group(s)?
b. In building energy analytics, if your sensors are not calibrated or if you’re excluding certain data points, your fancy graphs will lead you to the wrong conclusions. If your sampling interval is too long, you may miss critical issues; if it is too short, you may lose the important signals in the noise. If you don’t understand how the machine you are analyzing actually works, you will have difficulty knowing whether the data is signaling good or bad performance.
2. What is the necessary timeframe over which to analyze the data given the question you are asking?
a. In polling, the result of a single poll is much less meaningful if the result changes dramatically from week to week. Similarly, when you look back at the relationship between poll results and eventual winners over the course of a year-long campaign, early polls may appear much less predictive of the eventual outcome than the pollsters would have you believe in the moment. Sidebar: this may be the manner in which polls and energy management are most alike, since both require enormous stamina and consistency to achieve long-term results!
b. In buildings, it is critical to avoid overreacting to a single data point or even a single data stream. A space temperature that goes out of range for 45 seconds and recovers quickly is wholly different from one that stays out of range for 30 minutes, and both are wholly different from one that goes out of range in 3-minute cycles consistently throughout the day. Most importantly, we frequently must analyze multiple variables simultaneously to understand how meaningful any single data stream is in the context of overall system health.
3. Do you know the people behind the numbers and their motivations?
a. In polling, the risks here are obvious. Political campaign managers are experts at spinning the results of a given poll in their favor. In many cases, they don’t even have to stretch the truth. They may ask only questions that serve their interests, or they may present only the portion of the data that supports their case, excluding other valuable results.
b. In buildings, we must keep an appreciation of what we actually hope to do with the data, and who really controls how the building is operating. We must be able to see the building systems from the perspectives of the operating engineers and occupants in order to focus the analytics on questions that really matter to the people in that building and to the owner paying the bills. And the data must be managed so that the operators share ownership of any changes that may be needed.
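The timeframe question (number 2 above) can be made concrete with a small sketch. The Python below groups out-of-range space temperature readings into excursions and labels the pattern as a brief blip, a sustained problem, or repeated cycling. Everything specific here is an assumption for illustration: the 15-second sampling interval, the 68 to 76 degree comfort band, the 15-minute “sustained” threshold, and the count of 10 excursions that flags cycling. A real building would need thresholds agreed upon with its operators.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Excursion:
    start_s: float      # time of the first out-of-range sample, in seconds
    duration_s: float   # number of out-of-range samples times the interval


def make_samples(temps: List[float], interval_s: float = 15) -> List[Tuple[float, float]]:
    """Pair each reading with a timestamp, assuming a fixed sampling interval."""
    return [(i * interval_s, t) for i, t in enumerate(temps)]


def find_excursions(samples, low: float, high: float, interval_s: float) -> List[Excursion]:
    """Group consecutive out-of-range samples into excursions."""
    excursions = []
    run_start = None
    run_len = 0
    for t, temp in samples:
        if temp < low or temp > high:
            if run_start is None:
                run_start, run_len = t, 0
            run_len += 1
        elif run_start is not None:
            excursions.append(Excursion(run_start, run_len * interval_s))
            run_start = None
    if run_start is not None:  # still out of range when the data ends
        excursions.append(Excursion(run_start, run_len * interval_s))
    return excursions


def classify(excursions: List[Excursion], sustained_s: float = 900,
             cycling_count: int = 10) -> str:
    """Label the pattern; both thresholds are hypothetical defaults."""
    if not excursions:
        return "in range"
    if any(e.duration_s >= sustained_s for e in excursions):
        return "sustained"
    if len(excursions) >= cycling_count:
        return "cycling"
    return "brief"


# A single 45-second blip versus repeated 3-minute cycles (90 s out, 90 s in).
blip = make_samples([72] * 10 + [78] * 3 + [72] * 10)
cycles = make_samples(([78] * 6 + [72] * 6) * 20)
print(classify(find_excursions(blip, 68, 76, 15)))    # brief
print(classify(find_excursions(cycles, 68, 76, 15)))  # cycling
```

This looks at only one data stream. As noted above, a meaningful diagnosis usually means reading it alongside other variables from the same system.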
These are just a few of the interesting similarities between political polls and building energy analytics. As with all exercises in gathering, analyzing, and drawing critical conclusions from large data sets, the key questions raised above (underlying data sources, timeframe for analysis, and the motivations of the people behind the data) should always be well understood, regardless of the subject matter.
Ultimately, I’m left frustrated with the constant chatter around the latest political polls, because I don’t know or trust the data sources, the timeframes are often too short to yield any meaningful analysis, and it’s often difficult to discover the bias, motivation, or spin of the pollsters.
The good news for the building energy analytics industry is that these issues are easily addressed in the environments in which we work. Technology has made it faster and cheaper to collect data from every component of the building, every few seconds, around the clock. We also have managers and operators to collaborate with in the operation of a building, who can help us understand some of the “why” behind the data and who take pride in making the improvements. If only politics could work as smoothly!