## Understanding Timing Bias in Survey Administration

A frequent question I get is: Is there some sort of seasonality to administering a Parent Satisfaction Survey? Are there better and worse times to send a survey?

*The answer is yes, but the effect of timing in survey administration is not that great.* My goal here is to definitively settle the issue – I hope for years. To do so, I am using 108,203 respondents to our Parent Satisfaction and Referral Survey who have taken our survey from 2006 to the present. And I am explaining exactly what I did and presenting the results as publicly as possible.

## Statistical Analysis

***** Warning: Statistics. Permission Granted to Skip to Results at the End *****

108,203 is the exact number of respondents (from 2006 to 2021) for whom we had the survey submittal date and answers to both “How Satisfied are you?” and “How willing are you to refer to our school?” The scale for these two questions is the same – 0 to 10, where 10 is high.

I sorted and scored all responses *by day*, from January 1^{st} to December 31^{st}. Since willingness to refer and overall satisfaction correlate highly (Spearman correlation of .72), I *doubled* the number of data points by using the answers to both questions on each survey.

I determined the averages, standard deviations, and counts for each and every day of the year. That includes Christmas Day (17 responses) and February 29 (142 responses). With this many respondents, most days of the year had hundreds of responses. The notable exception was summer and much of September, when we actively discourage sending out satisfaction surveys.
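The per-day tally described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `daily_stats` name, the `(date, score)` input shape, and the sample rows are hypothetical and are not the actual GraceWorks data or tooling.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, stdev


def daily_stats(responses):
    """Group (date, score) pairs by calendar day, ignoring the year.

    Returns {(month, day): (average, standard_deviation, count)}.
    The standard deviation is None when a day has only one score.
    """
    by_day = defaultdict(list)
    for submitted, score in responses:
        by_day[(submitted.month, submitted.day)].append(score)

    stats = {}
    for day, scores in by_day.items():
        sd = stdev(scores) if len(scores) > 1 else None
        stats[day] = (mean(scores), sd, len(scores))
    return stats


# Hypothetical rows: the same calendar day in different years pools together.
sample = [
    (date(2010, 11, 1), 9),
    (date(2015, 11, 1), 10),
    (date(2020, 11, 1), 8),
    (date(2012, 12, 25), 7),
]
stats = daily_stats(sample)
```

Keying on `(month, day)` rather than the full date is what lets sixteen years of responses stack up on one 366-day calendar.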

In statistics, a confidence interval range runs from the lowest to the highest plausible average for a given group of respondents. It is the range in which the “real” average can be expected to fall practically all of the time – regardless of how many more responses are obtained. You can calculate confidence interval ranges for any time period – a day, a week, a month, or custom dates.

You can also decide how confident you want to be of your conclusions. Pretty darn sure is the 99% confidence interval range. That’s how we calculated the overall average range, as well as the ranges of possible averages for the 12 periods of time in the table below.

For daily responses, we calculated the confidence interval range at 95% – pretty certain. Fifty-five days – mostly summer and early September – had too few respondents to trust the results of these calculations.

Confidence interval ranges are calculated based on means, standard deviations, and counts. For the whole group of 108,203 respondents, the average satisfaction / willingness to refer score was 8.36. Because the number of respondents was HUGE, we can say that 99% of the time, even if we get another 100,000 respondents, the overall average will stay between 8.37632 and 8.34542. That is the confidence interval range for all these respondents.
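For readers who want the formula: the article does not spell out the exact calculation, so what follows is the standard large-sample (normal approximation) confidence interval for a mean, offered as an assumption about the method rather than a record of it. Here $\bar{x}$ is the average, $s$ the standard deviation, $n$ the count, and $z$ the critical value (about 2.576 for 99%, about 1.960 for 95%).

```latex
\[
\text{range} \;=\; \bar{x} \;\pm\; z \cdot \frac{s}{\sqrt{n}}
\]
```

Because $n$ sits under a square root in the denominator, a huge count shrinks the range dramatically. For the overall range quoted above, the half-width works out to $(8.37632 - 8.34542)/2 = 0.01545$.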

If there were no timing bias whatsoever, you would expect that the “real” (confidence interval range) average for any given day would fall somewhere within our “grand average” range of 8.37632 to 8.34542. Calculating all those confidence interval ranges for each day’s average, *we can say with 94% certainty that on about 231 days of the year, the daily and overall averages do just that.* In other words, from a statistics point of view, for 231 *individual* days, we find no statistically significant timing bias.

However, 80 days had confidence interval ranges for average scores that were either too high (21) or too low (59) to be some sort of statistical fluke, at least 94% of the time.

***** Trigger Warning: Heavy Duty Statistics Ahead – May Cause Headaches or Brain Fog *****

The confidence level for all 108,203 surveys across the entire year was figured at p<.01. Likewise, the confidence levels for each period of time in the table below were also calculated at p<.01. Thus, the designation of a period of time as “higher”, “lower”, or “average” has a 98% chance of being objectively true (99% times 99% = 98%).

The confidence levels *per day* were calculated at *p<.05* due to the far lower numbers of surveys taken in a day. (The confidence interval ranges at p<.01 would have been too large to give meaningful results.) So we have 94% certainty for comparing the daily mean to the grand mean: 99% times 95% = 94%.

Following the usual rule of thumb for the Central Limit Theorem, all days with fewer than 30 respondents were not considered – that included 55 days (mostly summer and early September). The data for these days is inconclusive for timing bias. Summer appears “average” largely because of this problem. We really don’t know.

Of the remaining 311 days with adequate data, 80 days had averages that, even accounting for confidence levels above, were higher (21) or lower (59). At 94% certainty, there is a 6% chance that this calculation is wrong for any given day. That means that 18 days (311 times 6%) might be categorized wrongly.

So instead of 80 days that are not properly aligned with our grand mean, it could be there are really 96 (80+16). That’s the worst case. In the best case, only 64 days (80 – 16) were off the mean for the year due to timing bias. Dividing these numbers by the 311 days, roughly one-fifth to one-third of days in a year could be higher or lower in average satisfaction and willingness to refer scores than you would normally expect.
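The arithmetic in this passage can be checked with a short Python sketch. The function names are hypothetical, and multiplying the two confidence levels simply mirrors what the text does. Note that 311 times 6% comes to about 18.7, while the best and worst cases use a margin of 16, so the sketch takes the margin as a parameter and either value can be tried.

```python
def joint_confidence(level_a, level_b):
    """Multiply two confidence levels, as the text does (e.g. 99% x 95%)."""
    return level_a * level_b


def bias_day_bounds(flagged_days, margin, usable_days):
    """Best and worst case counts of biased days, plus their share of usable days."""
    best = flagged_days - margin
    worst = flagged_days + margin
    return best, worst, best / usable_days, worst / usable_days


period_certainty = joint_confidence(0.99, 0.99)  # about 0.98
daily_certainty = joint_confidence(0.99, 0.95)   # about 0.94

# 311 usable days at the rounded 6% error rate used in the text.
possibly_miscategorized = 311 * 0.06             # about 18.7 days

# The text's range uses a margin of 16 days around the 80 flagged days.
best, worst, best_share, worst_share = bias_day_bounds(80, 16, 311)
```

With a margin of 16, the shares come out near 21% and 31% of the 311 usable days, which is the "one-fifth to one-third" quoted above.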

***** End Heavy Duty Statistics. Deep Breath. *****

Let me explain all of this at a granular level. Take Christmas Eve, where 58 people filled out the survey. Their average SAT/REF score was 7.91. It’s a lot less than the overall average for the year of 8.36, but is the difference real, or just a statistical fly-by-night?

To answer that, we must calculate the confidence interval *for that particular day.* We’re glad only 58 people responded on Christmas Eve, for reasons other than statistics. However, because it is 58 responses – not hundreds – the “real” average for that day could be found in a *much larger* confidence interval range than usual. Why? Because with far fewer respondents, we can’t be really certain what the “real” score would be if everyone took their survey on Christmas Eve (heaven forbid). Doing the math, the confidence interval for Christmas Eve is plus or minus 0.63.

So our Christmas Eve REAL average satisfaction and willingness to refer is somewhere between 8.54 (7.91 + 0.63) and 7.28 (7.91 – 0.63). *This confidence interval DOES include our 8.36 mean*. This overlap is a big deal.

We might conclude that people taking satisfaction surveys on Christmas Eve need a life. But because of the overlap, we simply cannot conclude they are any more biased when taking our survey.

On the other hand, sometimes the 95% confidence interval range for a given day in question could not, by the numbers, include the magic 8.37632 to 8.34542 range we calculated for all respondents. For example, on January 5 (205 respondents), the 95% confidence interval range calculation gets us a range of possible means of 8.29 to 7.75. We could have hundreds more parents take our survey on January 5^{th}, but 95% of the time, the mean won’t budge much past 8.29. This falls short of 8.34, the bottom of our confidence interval range for everyone. Concretely then, this is a “lower” timing bias day.

It can go the other way, too. On All Saints’ Day (November 1), the 95% confidence interval range is between 8.72 and 8.43. Since our overall grand mean target is between 8.37632 and 8.34542, *parents who take the survey on November 1^{st} are likely to be more enthusiastic than all parents over the course of the year.* On this particular day, it is unlikely that the satisfaction/referral mean will dip all the way down to 8.37632, regardless of how many more people take the survey that day.

Note these respondents didn’t have some sort of All Saints’ Day spiritual revival – Halloween’s numbers are similar. So by day, October 31^{st} and November 1^{st} are two of the 21 higher timing bias days. That they are back-to-back hints at a higher timing bias *for a period of time* surrounding these dates.
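The overlap test used in the three examples above (Christmas Eve, January 5, and November 1) can be written as a small Python sketch. The `classify` and `interval` names are hypothetical; the numbers are the ones quoted in the text.

```python
GRAND_LOW, GRAND_HIGH = 8.34542, 8.37632  # 99% range for all respondents


def interval(mean, margin):
    """Turn an average and its plus-or-minus margin into (low, high)."""
    return (mean - margin, mean + margin)


def classify(day_low, day_high, grand_low=GRAND_LOW, grand_high=GRAND_HIGH):
    """Label a day's confidence interval range relative to the overall range."""
    if day_high < grand_low:
        return "lower"    # the whole daily range sits below the overall range
    if day_low > grand_high:
        return "higher"   # the whole daily range sits above the overall range
    return "average"      # the ranges overlap, so no bias can be claimed


christmas_eve = interval(7.91, 0.63)  # roughly (7.28, 8.54)
print(classify(*christmas_eve))       # average
print(classify(7.75, 8.29))           # January 5: lower
print(classify(8.43, 8.72))           # November 1: higher
```

A day is only labeled "higher" or "lower" when its entire range clears the overall range. Any overlap at all leaves it "average".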

**Summarizing the tale of the tape:**

| Category | Days |
| --- | --- |
| Mean too high (positive timing bias) | 21 |
| Mean too low (negative timing bias) | 59 |
| Not enough responses (<30) | 55 |
| Average | 231 |
| Total | 366 |

Using these confidence interval range comparisons, is there some sort of pattern to the 80 total “higher” and “lower” days? Yes, there is, but you do have to tease it out.

For example, 12 of the 23 days from April 26^{th} to May 19^{th} were “lower” timing bias days. (That’s a big clue!) On the other hand, between January 12^{th} and January 23^{rd}, there is only one day out of 11 that is “lower.”

I graphed out these “higher” and “lower” days for the entire year, using a 7-day rolling average (3 days back, 3 days forward.) That’s how I determined the periods in the table below. As you would expect, these periods did not start or stop at the beginning or end of months, or quarters, or semesters. (I also double-checked my period demarcations using 9- and 15-day rolling counts as well.)
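A centered 7-day rolling average of this kind might look like the Python sketch below. Two things here are assumptions rather than details from the analysis: coding each day as -1 (lower), 0 (average), or +1 (higher), and wrapping the window around from December 31 to January 1 so every day has a full window. The function name is hypothetical.

```python
def centered_rolling_mean(values, back=3, forward=3):
    """Centered rolling average over a circular sequence of days.

    Each output is the mean of the day itself, `back` days before it,
    and `forward` days after it, wrapping around the ends of the list.
    """
    n = len(values)
    window = back + forward + 1
    return [
        sum(values[(i + offset) % n] for offset in range(-back, forward + 1)) / window
        for i in range(n)
    ]


# Hypothetical flags for ten consecutive days: -1 lower, 0 average, +1 higher.
flags = [0, 0, -1, -1, -1, 0, -1, 0, 0, 0]
smoothed = centered_rolling_mean(flags)
```

Passing `back=4, forward=4` or `back=7, forward=7` gives the 9- and 15-day windows mentioned above as a cross-check.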

Next, I ran the confidence interval ranges for each of the periods I had identified (p<.01). If I had identified the correct start and stop dates, you would expect that many of these time periods would be higher or lower in overall satisfaction and willingness to refer, using confidence interval statistics. That was the case for 7 of the 12 time periods. Three of these had higher timing biases and four had lower; the remaining five were average. (Two of the “average” periods had limited data, however.)

Finally, using the average satisfaction/willingness to refer for each period, I used our own internal survey results from 780 Christian schools to determine the percentile impact of taking surveys in a higher or lower time bias period. The overall percentile rank impact of survey timing bias was no more than 7 percentile ranks up, or 9 percentile ranks down (out of 50 possible either way.)

Over the years, I have heard all kinds of “authorities” talk about the best time to take a survey – never explaining how they came to their conclusions. That’s why I want you to know exactly how I came to mine, in the most public way possible.

Now here are the results.

## Non-Statistical Summary Starts Here

The table below presents the result of my analysis:

| Date | Conclusion | Period Average (Willingness to Refer / Overall Satisfaction) | Difference From Mean | Boost / Decline Percentile | Top Range (99%) | Bottom Range (99%) |
| --- | --- | --- | --- | --- | --- | --- |
| All Dates | Average All | 8.36 | 0 | 50 | 8.37632 | 8.34542 |
| 12-25 to 1-21 | AVERAGE | 8.37 | 0.01 | 51 | 8.44 | 8.31 |
| 1-22 to 2-19 | HIGHER | 8.47 | 0.11 | 57 | 8.51 | 8.42 |
| 2-20 to 3-12 | LOWER | 8.27 | -0.09 | 44 | 8.32 | 8.21 |
| 3-13 to 4-3 | AVERAGE | 8.33 | -0.03 | 48 | 8.38 | 8.27 |
| 4-4 to 4-25 | HIGHER | 8.43 | 0.07 | 54 | 8.47 | 8.38 |
| 4-26 to 5-19 | LOWER | 8.22 | -0.14 | 41 | 8.26 | 8.18 |
| 5-20 to 6-21 | LOWER | 8.29 | -0.07 | 45 | 8.34 | 8.24 |
| 6-22 to 7-27 | AVERAGE* | 8.28 | -0.08 | 45 | 8.39 | 8.17 |
| 7-28 to 9-16 | AVERAGE* | 8.43 | 0.07 | 54 | 8.59 | 8.26 |
| 9-17 to 10-7 | AVERAGE | 8.25 | -0.11 | 43 | 8.41 | 8.10 |
| 10-8 to 12-6 | HIGHER | 8.44 | 0.08 | 55 | 8.48 | 8.41 |
| 12-7 to 12-24 | LOWER | 8.25 | -0.11 | 43 | 8.31 | 8.18 |

* Limited data available; inconclusive

## Conclusions

- Timing bias in survey administration is real.
- Timing bias working in your favor can boost a theoretically average score from the 50^{th} percentile rank to the 57^{th} percentile rank. That’s as good as it gets.
- Timing bias working against you can depress a theoretically average score from the 50^{th} percentile rank to the 41^{st} percentile rank. That’s as bad as it gets.
- Timing bias cannot – repeat cannot – be determined by month. That’s too artificial. Likewise, you also cannot describe timing bias in some sort of “Fall – good, Spring – bad” terms. Ditto any broad statements of 1^{st} and 2^{nd}-semester results. The time periods where timing bias occurs are far more precise than that.
- Timing bias works in your favor from October 8^{th} to December 6^{th}. Moreover, January 22^{nd} to February 19^{th} is also a good time to send a survey, as well as the first three weeks of April.
- There is very little timing bias either way from December 25^{th} to January 21^{st} (Christmas break after Christmas). Other average times include March 13^{th} to April 3^{rd}.
- The most negative timing bias occurs between April 26^{th} and May 19^{th}. February 20^{th} to March 12^{th} also has a “low” timing bias. Possibly end of school or spring break issues, respectively. (I am not sure.) The first three weeks of December are also subject to timing bias working against you. Most likely that is related to the Christmas rush.
- The start of the school year is also timing neutral, but we do NOT recommend sending surveys then, because new parents particularly do not know you very well. In addition, there are likely program adjustments that you have made from last year, which not all parents have yet experienced enough to make a judgment.
- Now that GraceWorks knows precisely about timing bias, we will adjust for it on Parent Satisfaction and Referral Surveys going forward.

There are other considerations for the timing of survey administration. The most notable is avoiding sending surveys when response rates seem to be the lowest. Here May and ALL of summer are most problematic.

In addition, the results of a top leadership change may take up to a year to be reflected in survey scores. Therefore, a school with a new leader would be wise to wait for the last possible good window for survey administration, which is the first three weeks of April.

The overall conclusion is that *there are* times of the year when parents seem to rate Christian schools slightly higher or slightly lower in their overall satisfaction and willingness to refer. Depending on the time of year, this means that a theoretically “average” score – 50^{th} percentile – can come in with a timing bias ranging from the 57^{th} to 41^{st} percentile rank.

It is important to note that in the worst possible case, no more than 96 days of the 311 days of the year (for which we had sufficient data) have timing biases. Most likely it’s fewer. Therefore, of the 311, at least 215 days of the year do NOT have a significant timing bias for survey administration – figuring the worst case. That number is probably more like 231. It could be as high as 247.

While there appear to be timing biases in survey administration, the consequences of these are relatively small and easily corrected.

**What now?**

Why did we go through all the effort to conduct a deep statistical analysis on the timing bias of over 100,000 survey respondents? Because it matters. GraceWorks has been refining our Parent Satisfaction & Referral Surveys for over 15 years. The outcome of this constant improvement is a survey that is more accurate and more useful to Christian schools than anything else available to you.

If you’re ready to conduct a survey for your school, click here to learn more about what we offer and take your next steps.