Abstract:
No-shows in medical offices create significant clinical practice disruption and financial stress. Overbooking is a standard and popular practice used to pad the schedule to make up for no-shows, but determining how many patients to overbook is a constant challenge. No-show data were converted to time series and analyzed using both the exponential smoothing state space model (ETS) and the neural network autoregression model (NNAR) to produce point-forecasts and long-term forecasts in the R programming language. Validated ETS and NNAR modeling resulted in numeric predictions that closely matched the actual no-show numbers. Eight-week horizon forecasts intended for long-term planning also were obtained with both the ETS and NNAR models. These numeric eight-week forecasts compared favorably with the actual number of no-shows. Horizon forecasts may be used for long-term planning, such as targets, goals, and internal benchmarks.
The rate of missed appointments (defined as scheduled visits that were not canceled or rescheduled) or “no-shows” in medical clinics ranges from 15% to 30% as a global phenomenon.(1) Twenty percent is typical, but in some settings the rate is over 50%.(2)
The impact of no-shows on a medical system often is measured in financial costs. For example, in 2015 Kheirkhah et al.(3) stated that in 2008 the average cost of no-shows, per patient, was $196. There may also be clinical effects on the patient,(4-6) such as possible complications of incomplete medical care. Much less is published on the anxiety experienced by the attending physician as a result of worrying over a no-showing patient’s clinical status.
Because it causes significant disruption of clinical practice, patient flow, and the business of healthcare, the no-show phenomenon has spurred many studies and interventions. Successful strategies for reducing no-show rates usually combine several methods at the same time: “scheduling manipulation” (i.e., overbooking), physician flexibility, and some form of direct reminder contact with patients.(7,8) With the emergence and adoption of data technology in medical practice (e.g., electronic health records, electronic practice management systems) and the rise of data analytics, new techniques of statistical modeling have been brought to bear on the no-show problem. For example, scheduling manipulation is no longer performed blindly: there are now computer programs and modeling algorithms that can provide optimized decision support by forecasting no-show rates,(9) which are relied on to adjust the schedule. Other computer models can produce a no-show profile of each patient in the practice by calculating the statistical probability of no-show for that particular patient(10); such information is taken into account in preparing the final schedule or dynamically modifying it.
One way that schedules can be modified to compensate for no-shows is overbooking or padding—a practice well-known and widely used in the airline industry. The effectiveness of overbooking with regard to counteracting no-show is not in doubt; neither is the downside: overflow. The challenge then is to pad the schedule with just the right amount of overbooking. A few methods of estimating no-show are available, including regression or classification modeling, and some of these have evolved into commercial products.(11) This article explores the use of time series analytical methods applied to the stochastic representation of no-show as a function of time to predict no-show rates as real numbers. This method does not result in any type of patient profiles(10); such “profiling” usually carries the risk of being viewed negatively as patient stigmatization.
This article presents the prediction of no-show rate and number of no-shows using predictive analytical tools in time series and compares the results with the actual no-show figures for the occasion or period. Both point-forecasts and interval predictions are demonstrated, much like the day’s weather predictions and the extended seven-day forecast. When desired, long-term forecasts can be obtained and used for planning, for internal benchmarking, and as targets.
Methods
No-show data in this study were collated from the electronic practice manager (EPM) application database at a primary care medical clinic. All data have been de-identified, with no mechanism available to re-identify the patients. Pre-consultation with an internal review board consultant cleared the study as exempt, being classified as nonhuman research. Note that the terms “predict” / “prediction”, “forecast” / “forecasting” are used interchangeably in this study.
Data Collection
Using programmed instructions, the EPM flags an appointment as “no-show” if it meets the conditions of not kept, not canceled, and not rescheduled after the time slotted for that appointment. Lateness (showing up on the same day but after the scheduled time) and early-show are not categorized as no-show.
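The flagging rule described above can be sketched as a simple predicate. This is a minimal illustration in Python rather than the EPM's actual logic; the `Appointment` record and its field names are hypothetical, since the article does not describe the EPM schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Appointment:
    # Hypothetical record layout; the EPM's real schema is not described.
    slot_time: datetime
    kept: bool         # patient was seen that day (on time, early, or late)
    canceled: bool
    rescheduled: bool


def is_no_show(appt: Appointment, now: datetime) -> bool:
    """Flag a no-show only after the slot has passed and no other outcome applies."""
    if now <= appt.slot_time:
        return False  # the slotted time has not elapsed, so no verdict yet
    return not (appt.kept or appt.canceled or appt.rescheduled)
```

Because lateness and early-show count as kept visits, they fall under `kept=True` in this sketch and are never flagged.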
A report run against the EPM database aggregates the data as a dataset of weekly totals of no-shows and corresponding weekly totals of scheduled patients, from which the percent no-show is calculated. For this study, the start of the data series is the first week of year 2014 (week #1), and the end is week #39 of year 2016, yielding a total of 143 weekly observations. The dataset was complete, with no missing data, due mainly to automatic data collection and management by the EPM application software system. Data analytics was performed using the R programming language; the R packages required are tseries, fBasics, forecast, zoo, and nnet.
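As an illustration of how the weekly percent no-show series is derived from the two weekly totals, the following Python sketch may help. The study itself used R; the function name and the four weeks of numbers below are made up for illustration and are not the study's data.

```python
def percent_no_show(no_shows, scheduled):
    """Return weekly no-show percentages from paired weekly totals."""
    if len(no_shows) != len(scheduled):
        raise ValueError("both series must cover the same weeks")
    return [100.0 * n / s for n, s in zip(no_shows, scheduled)]


# Made-up weekly totals for illustration only (not the study's data).
weekly_no_shows = [170, 182, 165, 190]
weekly_scheduled = [1100, 1125, 1090, 1150]
weekly_pct = percent_no_show(weekly_no_shows, weekly_scheduled)

# Length of the study series: two 52-week years plus 39 weeks of 2016.
n_weeks = 52 + 52 + 39  # 143
```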
Data Analytics
Analytical models were built with nnetar-fitted NNAR, a neural network method, and ts-fitted ETS, a non-neural network time series method. For NNAR, the fitted model is NNAR(8,4): an average of 20 networks, each of which is an 8-4-1 network (8 lagged inputs, 4 hidden nodes, 1 output) with 41 weights and linear output units. For ETS, the fitted model is ETS(M,N,N), that is, multiplicative error with no trend and no seasonal component. Both models, NNAR and ETS, were applied to the same dataset, enabling a comparison between them. Close correspondence between their respective outputs increased confidence in the overall method.
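The figure of 41 weights follows from the 8-4-1 architecture once bias terms are counted, and the 8 inputs are the 8 preceding weekly values. The Python sketch below shows both points; it is not the nnetar implementation, and the function names are hypothetical.

```python
def mlp_weight_count(n_inputs, n_hidden, n_outputs=1):
    """Count weights in a single-hidden-layer network, including bias terms."""
    hidden_layer = (n_inputs + 1) * n_hidden    # +1 for each hidden node's bias
    output_layer = (n_hidden + 1) * n_outputs   # +1 for each output's bias
    return hidden_layer + output_layer


def lagged_rows(series, p):
    """Pair each value with the p values that precede it (oldest first)."""
    return [(series[t - p:t], series[t]) for t in range(p, len(series))]


print(mlp_weight_count(8, 4))  # 41
```

Each pair returned by `lagged_rows(series, 8)` is one training example: eight consecutive weeks as inputs and the following week as the target.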
Results
For the 143-week study period, there were 159,832 scheduled appointments, out of which 25,002 were no-shows.
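For orientation, these totals imply an overall no-show rate of about 15.6% for the whole period. The short Python calculation below derives that figure and the average weekly totals from the numbers reported above; the variable names are illustrative.

```python
scheduled_total = 159_832
no_show_total = 25_002
n_weeks = 143

overall_pct = 100.0 * no_show_total / scheduled_total
mean_weekly_scheduled = scheduled_total / n_weeks
mean_weekly_no_shows = no_show_total / n_weeks

print(f"{overall_pct:.2f}%")  # 15.64%
```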
The weekly totals from the first week of 2014 (week #1) to week #39 of 2016, a total of 143 weekly time units (Table 1 and Figure 1), were used to produce a time series of the percentage of scheduled patients who no-showed.
Figure 1. The time series plot of the no-show data, in percent, from week #1 to week #143. Percent no-show was derived from the numerator, total no-show for the week and the denominator, total number of scheduled patients for that week.
Figure 2 shows the actual data series compared with the ts-fitted ETS and nnetar-fitted NNAR models. Visual inspection suggests that both models fit the actual data similarly and agree closely with each other. While this agreement boosts confidence in the methodology, very close agreement between the original data and model-fitted data could represent “overfitting,” which might result in poor predictions.
Figure 2. Comparison of model fits to the actual series. The NNAR model is the dotted line while the ETS model fit is dashed. The actual data series is illustrated by the solid line. The vertical line is at week #135 (which corresponds to training plus test data subsets for the model fittings).
Table 2 compares the model validation parameters, which are error measures calculated from each model’s performance on the training data subset. The lower the value, the better the fit. By this criterion, NNAR scores consistently better than ETS on every measure on the training data subset.
Starting with week #136 through week #143, weekly forecasts are plotted for each of the NNAR and ETS models (Figure 3). The NNAR model produces predictions which vary from week to week, whereas the ETS model’s predictions are a straight line throughout.
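The flat ETS forecast follows from the structure of an ETS(M,N,N) model, which has a level but no trend or seasonal component, so every point forecast over the horizon equals the last estimated level. The Python sketch below shows the level recursion, which for point forecasts is the same as simple exponential smoothing. It is not the R `forecast` package's estimation procedure: there the smoothing parameter and initial level are estimated from the data, whereas here `alpha` and the starting level are assumed values chosen for illustration.

```python
def ses_level(series, alpha, initial_level=None):
    """Run the level recursion l_t = l_{t-1} + alpha * (y_t - l_{t-1})."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    # Assumption: start from the first observation unless a level is supplied.
    level = series[0] if initial_level is None else initial_level
    for y in series:
        level = level + alpha * (y - level)
    return level


def flat_forecast(series, alpha, horizon):
    """With no trend or seasonality, every horizon step repeats the final level."""
    return [ses_level(series, alpha)] * horizon


# Made-up percent no-show values for illustration only.
forecast = flat_forecast([15.0, 16.0, 14.0, 15.5], alpha=0.3, horizon=8)
```

All eight values in `forecast` are identical, which is why the ETS line in the plot is horizontal while its prediction intervals can still widen with the horizon.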
Figure 3. Eight-week horizon forecasts with confidence intervals. For weeks #136 through #143, weekly forecasts are plotted for each of the NNAR and ETS models. The NNAR model produces predictions that vary from week to week (top panel, right-side end of the plot), whereas the ETS model forecast (bottom panel) is a straight line throughout. The shaded “fans” represent the confidence intervals for the predictions; the inner (darker) interval represents 80% confidence limits while the larger cone represents the 95% confidence limits. (See also Table 3 for the real numbers which define the confidence interval.)
Figure 4 illustrates the eight-week horizon forecasts, combined on one panel. Table 3 presents the eight-week horizon point-forecast numbers with prediction/confidence intervals. Each model’s weekly predictions over the eight-week window are presented at two confidence levels, 80% and 95%.
Figure 4. Eight-week horizon forecasts. The point-forecasts of each of the NNAR and ETS models on the unseen data of weeks #136 through #143. The actual data, when they became available, were later superimposed on the predictions to produce this plot. The vertical line is at week #136, the start of the predictions. As noted in Figure 3, the ETS prediction is a straight line, whereas the NNAR forecast is variable.
Table 4 shows the point-forecast converted into numbers and compared with actual no-shows for the eight-week horizon, assuming 32 patients scheduled per day per week.
Discussion
Time series analysis and modeling have been well researched and have been applied to qualifying problems for decades.(12) The popularity of time series analytics probably is most evident in its applications in financial portfolio management and its subtending econometrics, for which Engle received a Nobel Prize in 2003.(13) Applying time series modeling to stochastic time-related data is challenging, as explained by Längkvist et al.,(14) who also point out that this challenge carries over to widely successful deep neural network methods. No-show series are stochastic time series: their values are random.
In spite of such challenges, the no-show problem can be subjected to regular time series modeling and, by extension, to neural network modeling, as was done here, with reasonably acceptable outcomes.
In this study, the neural network (NNAR) model shows better training data subset performance based on error measures of root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and mean absolute scaled error (MASE). However, better training-set performance, which often is reflected as a better model fit, does not automatically mean superior predictive performance of the model or better forecast accuracy on unseen data.
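For readers unfamiliar with these error measures, the Python sketch below computes them from paired actual and fitted values using their standard definitions. It is an illustration, not the code behind Table 2; the function name is hypothetical, and the MASE shown is the non-seasonal form, scaled by the in-sample error of a naive forecast that repeats the previous week.

```python
import math


def accuracy_measures(actual, fitted):
    """Return RMSE, MAE, MAPE (in percent), and non-seasonal MASE."""
    errors = [a - f for a, f in zip(actual, fitted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    # Scale for MASE: mean absolute error of the one-step naive forecast.
    naive_mae = sum(abs(actual[i] - actual[i - 1])
                    for i in range(1, len(actual))) / (len(actual) - 1)
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "MASE": mae / naive_mae}


# Made-up values for illustration only.
measures = accuracy_measures([10.0, 12.0, 11.0, 13.0], [11.0, 11.0, 12.0, 12.0])
```

A MASE below 1 means the model's average error is smaller than that of the naive forecast on the same data.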
Predicting Number of No-Shows
Table 4 presents the real numbers of no-shows (i.e., actual), and those predicted by NNAR and ETS models for the respective weeks.
Considering week #136, for example, based on the percentages and a 32-patient schedule, the actual number of no-shows, as well as the predicted numbers according to the two different models, are available. When the numbers are rounded to the nearest whole number, the result is 5 for the actual number, the NNAR model, and the ETS model. If, instead, the numbers are truncated to their integer part, the result is 4 across the board. The implication is that the predicted number matches the actual number of no-shows with either model, for the week in question. Therefore, when overbooking, scheduling four extra patients daily for the week (#136) would, on average, compensate for the daily no-shows for that week. Scheduling five more patients would accomplish the same thing. Each week is, of course, different, but a similar exercise will show no more than one patient difference between the actual number of no-shows and the predicted number, regardless of the model used. Thus, one can conclude that using the predicted numbers to guide overbooking for no-shows is very useful, because the predicted values are close to the actual values.
One advantage of time series modeling is the production of forecast horizons that can be several time units (weeks, in this case) in duration. The point-forecasts can be presented (and used) with confidence intervals of choice. Figure 3 and Table 3 present a forecast horizon of eight weeks for each of the models. The values are in % no-show; these are easily calculated as real numbers when the denominator (number of patients scheduled per day) is known, as in this case (where it is 32) (Table 4).
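The conversion from percent no-show to a daily count is a single multiplication followed by a rounding choice. The Python sketch below applies the three rounding regimens discussed in this article to the ends of the actual weekly range reported in the Conclusion (12.66% and 17.12%); the function name and dictionary keys are illustrative.

```python
import math


def daily_no_shows(pct, patients_per_day=32):
    """Convert a percent no-show into daily counts under three rounding regimens."""
    expected = pct / 100.0 * patients_per_day
    return {
        "expected": expected,
        "integer": math.floor(expected),      # truncate to the integer part
        "nearest": round(expected),           # nearest whole number
        "next_integer": math.ceil(expected),  # round up
    }


low = daily_no_shows(12.66)   # expected is about 4.05
high = daily_no_shows(17.12)  # expected is about 5.48
```

Here `low` gives 4, 4, and 5 and `high` gives 5, 5, and 6 for the integer, nearest, and next-integer regimens respectively, so whichever regimen is chosen should be applied consistently to both actual and predicted values.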
Clearly, applying any rounding regimen (e.g., integer, next integer, nearest whole number) consistently, one can show that the number of no-shows predicted by either model in an eight-week horizon forecast varies by no more than one no-show (equivalent to one no-showing patient) between the actual no-show number and the predicted number per day for that week. Also, the variation between the two different predictive models is no more than one no-show (again, equivalent to one no-showing patient) per day for the same week.
Long forecast horizons also can be used to establish goals, targets, and internal benchmarks with regard to no-shows. The real numbers and percentages derived from the models, presented in tables or as plots, make a valuable guide. When presented in dynamic dashboard format with added “what-if scenario” modeling, they constitute a powerful tool for managing schedules, especially where no-show rates are high and highly variable, and overbooking could cause more hardship.
Two different modeling methods were used in this exercise for comparison and in order to demonstrate parity and boost confidence in the forecasts. In practice, either model alone may be sufficient.
Conclusion
This study used time series analysis, with a neural network model and an exponential smoothing model, to predict the number of no-shows and produce a long-term forecast horizon using data from a medical office. The eight-week forecasts were compared against the actual number of no-shows based on an office schedule of 32 patients per day and an actual no-show rate range of 12.66% to 17.12% per week during the eight-week period. This is equivalent to four to six actual no-shows per day (after rounding), per week. The nnetar-fitted NNAR model predicted a range of four to five daily no-shows per week. The ts-fitted ETS model predicted five no-shows daily for each week. This means a difference of 0 to 1 no-show between either model’s prediction and the actual values, and the same difference between the models themselves. The closeness of the predicted numbers to the actual numbers and the closeness of the predictions of the two different models strongly favor the usefulness of this method in deciding by how many extra patients the schedule should be overbooked in anticipation of no-shows.
In summary, the predicted number of no-shows can thus be reliably obtained (within the limits of statistics) and used to inform decisions on padding schedules to mitigate the adverse effects of no-shows. Long-term forecasts also may be used for planning as targets, goals, and internal benchmarks. Finally, the focus on the number of no-shows in office scheduling and medical practice in this study avoids any controversy surrounding the use of predictive analytics in forecasting human behavior (in this case, no-show behavior) and the labeling or tracking of subjects (“no-show profiling”).(15,16) This study has no data that identify any individual patient and no information that could be used to trace, track, or profile any specific patient as a potential no-show.
References
1. Davies ML, Goffman RM, May JH, Monte RJ, Rodriguez KL, Tjader YC, Vargas DL. Large-scale no-show patterns and distributions for clinic operational research. Healthcare. 2016;4(1):15.
2. Denton BT, ed. Handbook of Healthcare Operations Management. New York: Springer; 2013; 252.
3. Kheirkhah P, Feng Q, Travis LM, Tavakoli-Tabasi S, Sharafkhaneh A. Prevalence, predictors and economic consequences of no-shows. BMC Health Serv Res. 2015;16:13. Published online 2016 Jan 14.
4. Nuti LA, Lawley M, Turkcan A, et al. No-shows to primary care appointments: subsequent acute care utilization among diabetic patients. BMC Health Serv Res. 2012;12:304.
5. Bowser DM, Utz S, Glick D, Harmon R. A systematic review of the relationship of diabetes mellitus, depression, and missed appointments in a low-income uninsured population. Arch Psychiatr Nurs. 2010;24:317-329.
6. Nguyen DL, DeJesus RS, Wieland ML. Missed appointments in resident continuity clinic: patient characteristics and health care outcomes. J Grad Med Educ. 2011;3:350-355.
7. Johnson BJ, Mold JW, Pontious JM. Reduction and management of no-shows by family medicine residency practice exemplars. Ann Fam Med. 2007;5(6):534-539. doi: 10.1370/afm.752.
8. DuMontier C, Rindfleisch K, Pruszynski J, Frey JJ III. A multi-method intervention to reduce no-shows in an urban residency clinic. Fam Med. 2013;45:634-641.
9. Berg B, Murr M, Chermak D, Woodall J, Pignone M, Sandler MS, Denton B. Estimating the cost of no-shows and evaluating the effects of mitigation strategies. Med Decis Making. 2013;33:976-985.
10. Huang Y, Hanauer DA. Patient no-show predictive model development using multiple data sources for an effective overbooking approach. Appl Clin Inform. 2014;5:836-860.
11. Farrell MB. Data-driven scheduling predicts patient no-shows. Boston Globe. July 14, 2014. https://www.bostonglobe.com/business/2014/07/13/high-tech-cure-for-doctors-scheduling-pains/ylLD4Fwar8EElFJ32frI9I/story.html . Accessed June 28, 2017.
12. Keogh E, Kasetty S. On the need for time series data mining benchmarks: a survey and empirical demonstration. Data Mining and Knowledge Discovery. 2003;7:349-371.
13. The Prize in Economic Sciences 2003. Press Release. Nobelprize.org . October 8, 2003. www.nobelprize.org/nobel_prizes/economic-sciences/laureates/2003/press.html .
14. Längkvist M, Karlsson L, Loutfi A. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recognition Letters. 2014;42(1):11-24.
15. Perry WL, McInnis B, Price CC, Smith SC, Hollywood JS. Predictive Policing: The Role of Crime Forecasting in Law Enforcement Operations. Rand Corporation Safety and Justice Program. www.rand.org/content/dam/rand/pubs/research_reports/RR200/RR233/RAND_RR233.pdf . Accessed June 18, 2017.
16. Gibbs M. Predicting crime with Big Data . . . welcome to “Minority Report” for real. Network World. September 20, 2014. www.networkworld.com/article/2686051/big-data-business-intelligence/predicting-crime-with-big-data-welcome-to-minority-report-for-real.html .