Demand Side Analytics has conducted multiple Home Energy Report (HER) evaluations for many utilities across a range of geographies, fuels, and cohort sizes. In this post, we review the results of a persistence study conducted in Pennsylvania after the conclusion of report delivery at four of FirstEnergy’s Pennsylvania Electric Distribution Companies (EDCs). The full report can be found here. The effect of treatment is typically measured through comparison of the group of customers receiving the HER, known as the treatment group, to a statistically identical control group. The comparison is done both in the period prior to receiving the reports (to assess treatment and control equivalence), and after treatment (to measure the impact of treatment on consumption). The goal of the study was to identify how long energy savings persisted after reports were discontinued.
What is it?
Behavioral conservation programs, such as residential Home Energy Reports (HERs), are well understood to provide small, yet measurable reductions in energy use when appropriately deployed. These programs are relatively inexpensive, geographically widespread, and effective at reducing consumption for most residential customer segments. The treatment effect is related to behavior changes brought about by providing customers information about their energy consumption relative to their peers. By showing how much energy the customer is using compared to similar households, the HER induces behavior changes using the power of social norms. This effect is facilitated by having the HER provide energy efficiency and conservation tips to the customer, which induces temporary and permanent behavior changes. Because of this, the conservation effect can persist in treated groups even after customers stop receiving reports.
Why is it important?
The persistence of HER impacts means that even after the discontinuation of HER delivery, treated customers continue to provide energy savings relative to customers who never received a report. Accurately quantifying how long savings persist in the previously-treated group is important as it helps determine program cost-effectiveness and assessments of the effective useful life of any HER program.
How did we do the analysis?
The HER program in question was implemented as a randomized control trial for each of the EDCs. A randomized control trial is an evaluation technique that provides very precise and unbiased estimates of the effect of treatment – that is, the receipt of HER bill comparisons. If properly implemented, randomized control trials (RCTs) are a very effective framework for estimating HER impacts for two key reasons, related to how HER programs are designed:
- Expected effect size: Because the HER effect is generally small – on the order of 1-3% – the experimental design must be precise enough to detect the effect and must be able to account for any other factors that could bias energy consumption in the treatment group. By comparing consumption in the treatment group to the control group, external influences that are experienced by both the treatment and control groups are netted out of the treatment effect, reducing the amount of noise around the treatment’s impact.
- Treatment duration: HER programs can run for many years; some Pennsylvania households have been receiving them for over five consecutive years. Over such a long period, many things can change at an individual home that would affect energy consumption (e.g., occupancy changes, renovations, or weather pattern changes). These factors are not all directly observed or measured, so they cannot be modeled and therefore may be misattributed to the effect of treatment in a regression. However, because these changes will equally affect the control and the treatment group, they will be netted out of an RCT impact estimate.
To isolate the impact of treatment while controlling for other factors that may influence energy use, DSA applied a lagged dependent variable regression approach. This model works particularly well at providing precise savings estimates when there is good pretreatment equivalence between the treatment and control groups. The model uses information about individual household seasonal consumption patterns collected through billing data analysis to estimate the impact of treatment in each month after the start of report delivery, including after reports were stopped for the persistence test.
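As a rough illustration of the idea, the sketch below fits one common form of a lagged dependent variable model: post-period usage regressed on a constant, the same household's pre-period usage, and a treatment flag. The function name `estimate_her_effect`, the single-period setup, and the synthetic data are all assumptions made for this example; the specification DSA actually used (monthly terms, seasonal lags, and so on) is not reproduced here.

```python
import numpy as np

def estimate_her_effect(post_kwh, pre_kwh, treated):
    """Return the treatment coefficient c from an OLS fit of
    post_kwh = a + b * pre_kwh + c * treated (hypothetical, simplified model)."""
    post_kwh = np.asarray(post_kwh, dtype=float)
    pre_kwh = np.asarray(pre_kwh, dtype=float)
    treated = np.asarray(treated, dtype=float)
    # Design matrix: constant, lagged (pre-period) usage, treatment indicator.
    design = np.column_stack([np.ones_like(pre_kwh), pre_kwh, treated])
    coefs, *_ = np.linalg.lstsq(design, post_kwh, rcond=None)
    return float(coefs[2])

# Synthetic, made-up households: the "true" effect is set to -18 kWh.
rng = np.random.default_rng(0)
n = 20_000
pre = rng.normal(900.0, 200.0, n)
treated = rng.integers(0, 2, n)
post = 50.0 + 0.95 * pre - 18.0 * treated + rng.normal(0.0, 60.0, n)

effect = estimate_her_effect(post, pre, treated)
control_mean = post[treated == 0].mean()
pct_savings = -100.0 * effect / control_mean
print(f"estimated effect: {effect:.1f} kWh ({pct_savings:.2f}% savings)")
```

Including pre-period usage as a regressor absorbs much of the household-to-household variation, which is why this kind of model can detect an effect of only a few percent.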
To model the effect of persistence, a simple regression specification was used to determine the decay of impacts as a function of the number of months since the cohort received their last report. Because impacts can be seasonal and have uncertainty around them, a weighted average of the prior year’s monthly impacts was used to create an average pre-cessation savings level.
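The pre-cessation savings level can be sketched as a load-weighted average of monthly percent savings. The function name `pre_cessation_savings` and the numbers in the example are hypothetical, and weighting by monthly reference load is an assumption carried over from how the weighting is described later in this post.

```python
def pre_cessation_savings(monthly_pct_savings, monthly_reference_kwh):
    """Load-weighted average of monthly percent savings.

    Both arguments are equal-length sequences covering the months
    before reports stopped (twelve months in the study).
    """
    if len(monthly_pct_savings) != len(monthly_reference_kwh):
        raise ValueError("inputs must have the same length")
    total_load = sum(monthly_reference_kwh)
    if total_load <= 0:
        raise ValueError("reference load must be positive")
    weighted = sum(s * w for s, w in zip(monthly_pct_savings, monthly_reference_kwh))
    return weighted / total_load

# Made-up example: savings of 2% in a low-load month and 1% in a high-load month.
avg = pre_cessation_savings([2.0, 1.0], [100.0, 300.0])
print(avg)
```

Because the high-load month carries three times the weight, the average lands closer to 1% than to 2%.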
The key metric used to quantify the effect of persistence is how many months it takes for impacts to reach zero. Once the regression was run, DSA used the intercept and slope from the regression output to calculate the number of months it would take for the trend in impacts to reach zero. This is shown graphically below, where it takes approximately 37 months for the orange trend line to reach zero savings. The intercept for the persistence regression line is set equal to the average savings in the prior 12 months (shown in blue circles and the grey squares at month = 0). The underlying assumption with this model is that the HER savings will continue to decay at the same rate observed in months 1-24 until reaching zero.
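For a linear trend, the months-to-zero calculation is simply the intercept divided by the magnitude of the slope. The sketch below shows that arithmetic; the function name `months_to_zero` and the input values are illustrative assumptions, not figures taken from the report.

```python
def months_to_zero(intercept_pct, slope_pct_per_month):
    """Months until a linear trend savings = intercept + slope * month hits zero."""
    if slope_pct_per_month >= 0:
        raise ValueError("savings are not decaying; the trend never reaches zero")
    return -intercept_pct / slope_pct_per_month

# Hypothetical inputs: 1.85% pre-cessation savings decaying by 0.05
# percentage points per month gives roughly 37 months.
print(round(months_to_zero(1.85, -0.05), 1))
```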
Were there any project-specific considerations?
Each EDC studied in this project had multiple cohorts of customers that were included in the HER program and persistence study. Not all of these cohorts showed robust pretreatment equivalence. Because of this, it is best to carefully consider which cohorts’ impacts should be included in an analysis of HER persistence. DSA used three criteria to categorize cohort quality:
- Pretreatment equivalence must be established: Without this condition, the lagged seasonal regression model cannot provide unbiased estimates of the savings associated with a HER program.
- The cohort must be large enough in the persistence period to provide a precise impact: Cohorts with 10,000 or more unique – and active – customers after June 2016 provided enough information to ensure that impact estimates during the persistence period could be estimated precisely.
- Enough of the original cohort must remain active through the persistence period to feel confident in the internal validity of the impact: It is possible that there were systematic reasons for customer account churn in the persistent cohorts, which could create a biased estimate of the cohort’s savings. In other words, if customers who left the group responded to the HERs differently than customers who remained active, and enough of them left, the cohort’s result would reflect only the customers who remained and could misstate the savings of the original cohort. We focused our efforts on cohorts that had at least 50% of their original size still active in the persistence period.
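The three criteria above can be sketched as a simple screening function. The thresholds (10,000 active customers, 50% retention) come from the text; the `Cohort` class, the `screen_cohort` function, and the example cohorts are hypothetical, and pretreatment equivalence is taken as a precomputed flag because the equivalence test itself is not described here.

```python
from dataclasses import dataclass

MIN_ACTIVE_CUSTOMERS = 10_000  # threshold stated in the text
MIN_RETENTION = 0.50           # share of the original cohort still active

@dataclass
class Cohort:
    name: str
    original_size: int
    active_in_persistence: int
    pretreatment_equivalent: bool  # assumed to be determined elsewhere

def screen_cohort(cohort):
    """Return 'included' or the first reason for exclusion."""
    if not cohort.pretreatment_equivalent:
        return "excluded: no pretreatment equivalence"
    if cohort.active_in_persistence < MIN_ACTIVE_CUSTOMERS:
        return "excluded: too few active customers"
    if cohort.active_in_persistence / cohort.original_size < MIN_RETENTION:
        return "excluded: too much attrition"
    return "included"

# Made-up cohorts for illustration only.
examples = [
    Cohort("A", 40_000, 28_000, True),
    Cohort("B", 15_000, 8_000, True),
    Cohort("C", 60_000, 25_000, True),
    Cohort("D", 30_000, 20_000, False),
]
for c in examples:
    print(c.name, screen_cohort(c))
```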
These criteria are illustrated graphically in Figure 2 for one of the EDCs. The x-axis plots the average number of customers still active in the period between June 2016 and May 2018 for each cohort, while the y-axis shows the percentage of the original cohort size that is still active during this period. The markers for each cohort are also color-coded to highlight whether the cohort was used in the final analysis, or what the reason was for its exclusion.
Figure 2: Cohort Characterization
What are the results?
The cohort characterization resulted in five cohorts analyzed in the persistence study: two from Met-Ed and three from Penelec. The five cohorts that qualified were then fed into a second-stage model that sought to determine the monthly decay rate of the savings estimates. Since there is noise in each savings estimate and seasonal variation in the savings estimates, DSA thought it most appropriate to set the intercept of each cohort’s regression to equal the average savings percentage over the twelve months immediately prior to the persistence test. That is, the starting point of this regression was not simply what the customers saved in May of 2016 but a weighted average of the full year prior to the test. Figure 3 shows the raw data used to construct this analysis. The five cohorts that were identified as having good equivalence and the appropriate cohort size are shown in the figure below. The trend line of persistent savings is shown in blue. This figure displays the trend for FirstEnergy cohorts only, and approaches zero nearly 30 months after HERs stop being sent to customers. This estimate is combined with other Pennsylvania studies, below, to provide an overall decay rate estimate.
Figure 3: Persistence Trends by Cohort
To estimate the HER effect duration more precisely, DSA fit a simple linear model that related the percent savings estimates – again weighted by the aggregate reference load – to the number of months it had been since the cohort received a HER. The weighting of the percent savings is necessary in this case because we are using percent savings as our variable of interest. Doing the weighting ensures that larger cohorts have more impact than smaller ones, and that a 2% savings in a high-consumption month counts more than a 2% savings in a low-consumption month, while still creating a percentage metric that can be directly compared to other studies.
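A minimal sketch of this kind of fit is shown below: a weighted least-squares line with the intercept held fixed at the pre-cessation savings level, so only the slope is estimated. With the intercept fixed, the weighted slope has a closed form. The function name `fit_decay` and the example data are assumptions for illustration; this is not necessarily the exact estimator DSA used.

```python
import numpy as np

def fit_decay(months_since_last_report, pct_savings, reference_kwh, intercept):
    """Weighted fit of pct_savings = intercept + slope * months, intercept fixed.

    Weights are the aggregate reference load behind each savings estimate.
    Returns (slope, months_to_zero); months_to_zero is inf if savings do not decay.
    """
    t = np.asarray(months_since_last_report, dtype=float)
    s = np.asarray(pct_savings, dtype=float)
    w = np.asarray(reference_kwh, dtype=float)
    # Minimizing sum(w * (s - intercept - slope * t)**2) over slope alone:
    slope = float(np.sum(w * t * (s - intercept)) / np.sum(w * t ** 2))
    months = -intercept / slope if slope < 0 else float("inf")
    return slope, months

# Made-up monthly estimates that decay from a 2.0% starting level.
t = [1, 2, 3, 4, 5, 6]
s = [1.97, 1.90, 1.88, 1.79, 1.76, 1.70]
w = [900.0, 1200.0, 800.0, 1000.0, 1100.0, 950.0]
slope, months = fit_decay(t, s, w, intercept=2.0)
print(f"slope: {slope:.3f} points/month, months to zero: {months:.0f}")
```

Fixing the intercept keeps a single noisy month right after cessation from shifting the starting point of the trend line.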
Table 1: Persistence Trends by Cohort
|Months to No Impact
|July 2012 Market Rate
|Jan 2014 Market Rate
|July 2012 Market Rate
|Jan 2014 Market Rate
|Nov 2014 Remediation
How do these results compare to a larger set of recent HER persistence studies?
In 2015, the Pennsylvania evaluation team conducted a similar analysis of residential HER persistence for cohorts from PPL and Duquesne Energy that stopped receiving HERs. Three cohorts across these two EDCs experienced between 16 and 24 months of no report delivery, with resumption of HERs after that period had passed. Before the persistence test began, the two PPL cohorts had been receiving reports since 2010 (Legacy) and since 2011 (Expansion). Duquesne’s HER program began in PY4 (between June 2012 and May 2013), so customers received at most 11 months of HER treatment prior to report discontinuation.
Table 2: Persistence Trends for Other Pennsylvania HER Studies
|Persistence Test Start
|Persistence Test End
|Months of Test
|Months to No Impact
In general, the FirstEnergy results are quite similar to those of the two PPL cohorts, with between 29.7 and 51 months of expected impact decay time. The PPL customers in the HER program had been receiving reports for a longer period than most FirstEnergy customers, but had generally similar savings rates prior to the start of the persistence test. This generally corresponds to the common understanding of HERs: namely, that they can deliver relatively consistent savings after a maturation period of one to two years when customers first start receiving reports. The decay rates, or slopes of percent savings decay, in the PPL study are quite similar to those of FirstEnergy, with between a 0.04% and 0.06% drop in savings per month (roughly a 0.5% to 0.75% annual decay).