A public transportation company is expecting increasing demand for its services and is planning to acquire new buses and to extend its terminals. These investments require a reliable forecast of future demand, which should be based on historic demand stored in the company's data warehouse. For each 15-minute interval between 6:30 and 22:00, the number of passengers arriving at the terminal has been recorded and stored. As a forecasting consultant, you have been asked to forecast the number of passengers arriving at the terminal.
Part of the historic information is available in the file bicup2006.xls. The file contains the worksheet “Historic Information” with known demand for a 3-week period, separated into 15-minute intervals. The second worksheet (“Future”) contains dates and times for a future 3-day period, for which forecasts should be generated (as part of the 2006 competition).
Your goal is to create a model/method that produces accurate forecasts. To evaluate your accuracy, partition the given historic data into two periods: a training period (the first two weeks) and a validation period (the last week). Models should be fitted only to the training data and evaluated on the validation data.
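The partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `partition_by_time`, the synthetic start date, the placeholder demand values, and the choice to include the 22:00 interval are all hypothetical and are not taken from bicup2006.xls.

```python
from datetime import datetime, timedelta

def partition_by_time(records, train_days=14):
    """Split (timestamp, demand) pairs into training and validation lists.

    The first `train_days` calendar days, counted from the earliest
    timestamp, go to training; everything later goes to validation.
    """
    records = sorted(records, key=lambda r: r[0])
    # Midnight of the first recorded day is the reference point.
    start = records[0][0].replace(hour=0, minute=0, second=0, microsecond=0)
    cutoff = start + timedelta(days=train_days)
    train = [r for r in records if r[0] < cutoff]
    valid = [r for r in records if r[0] >= cutoff]
    return train, valid

# Synthetic stand-in for the "Historic Information" worksheet:
# 21 days of 15-minute intervals from 6:30 to 22:00 inclusive.
# The start date is hypothetical, and so is the demand value of 0.
first_day = datetime(2000, 1, 3, 6, 30)
records = []
for day in range(21):
    t = first_day + timedelta(days=day)
    end = t.replace(hour=22, minute=0)
    while t <= end:
        records.append((t, 0))
        t += timedelta(minutes=15)

train, valid = partition_by_time(records)
print(len(train), len(valid))
```

Splitting by calendar date rather than by row count keeps the partition correct even if some intervals are missing from the data.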
Although the competition's winning criterion was the lowest Mean Absolute Error (MAE) on the future 3-day data, this is not the goal for this assignment. Instead, if we consider a more realistic business context, our goal is to create a model that generates reasonably good forecasts at any time of day and on any day of the week. Consider not only predictive metrics such as MAE, MAPE, and RMSE, but also look at actual and forecasted values, overlaid on a time plot.
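As a reminder of how the three metrics are computed, here is a minimal sketch using their standard definitions. The function names and the small example series are illustrative only and do not come from the data set. MAPE is left undefined here when an actual value is zero.

```python
import math

def mae(actual, forecast):
    """Mean Absolute Error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root Mean Squared Error."""
    squared = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.

    Undefined when any actual value is zero, so this sketch raises
    instead of silently returning a misleading number.
    """
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100 * total / len(actual)

# Made-up numbers, for illustration only.
actual = [10, 20, 40]
forecast = [12, 18, 40]
print(mae(actual, forecast), rmse(actual, forecast), mape(actual, forecast))
```

Computing each metric separately for the training and validation periods, and plotting the forecast errors over time, shows whether a low average error hides poor forecasts at particular times of day.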
For your final model, present the following summary:
1. Name of the method/combination of methods
2. A brief description of the method/combination
3. All estimated equations associated with constructing forecasts from this method
4. The MAPE and MAE for the training period and the validation period
5. Forecasts for the future period (March 22-24), in 15-minute intervals
6. A single chart showing the fit of the final version of the model to the entire period (including training, validation, and future). Note that this model should be fitted using the combined training + validation data.
Tips and Suggested Steps
1. Use exploratory analysis to identify the components of this time series. Is there a trend? Is there seasonality? If so, how many “seasons” are there? Are there any other visible patterns? Are the patterns global (the same throughout the series) or local?
2. Consider the frequency of the data from a practical and technical point of view. What are some options?
3. Compare the weekdays and weekends. How do they differ? Consider how these differences can be captured by different methods.
4. Examine the series for missing values or unusual values. Suggest solutions.
5. Based on the patterns that you found in the data, which models or methods should be considered?
6. Consider how to handle actual counts of zero within the computation of MAPE.
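For the last tip, two common workarounds for zero counts in MAPE are sketched below. Neither is prescribed by the assignment. The function names are hypothetical, and the second measure (total absolute error divided by total actual demand, sometimes called weighted MAPE or WAPE) is a substitute for MAPE, not the same quantity.

```python
def mape_excluding_zeros(actual, forecast):
    """MAPE in percent, computed only over periods with nonzero actuals."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("all actual values are zero")
    return 100 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)

def wape(actual, forecast):
    """Total absolute error as a percentage of total actual demand."""
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("total actual demand is zero")
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return 100 * total_error / total_actual

# Made-up numbers, including one interval with zero arrivals.
actual = [0, 10, 20]
forecast = [1, 12, 18]
print(mape_excluding_zeros(actual, forecast), wape(actual, forecast))
```

Excluding zero periods ignores the error made in exactly those intervals, while the ratio-of-sums version keeps them in the numerator. Whichever option is chosen, it should be stated alongside the reported MAPE.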