The purpose of the study is to analyze a prediction model for the Capital Bikeshare system that estimates the net count of bikes rented per day from nine independent variables. Central to the work is the application of linear regression to predict the daily net count of rented bikes at Capital Bikeshare. The core data set is the two-year historical log for 2011 and 2012 from the Capital Bikeshare system, Washington D.C., USA, which is publicly available at http://capitalbikeshare.com. We aggregated the data on a daily basis and then extracted and added the corresponding weather and seasonal information. Weather information was extracted from http://www.freemeteo.com.
The dataset consists of 730 rows and 10 columns: season, holiday (0 or 1), weekday (0-6), working day (0 or 1), weather category (listed in the dataset description), temperature, feeling temperature, humidity, wind speed, and rented bikes (total count of bikes rented per day).
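As a rough illustration of how a linear regression over such columns can be fitted, the following minimal sketch solves the ordinary least squares normal equations in plain Python. The study only states that linear regression was used; the solving method, the function names (fit_ols, predict, solve_linear_system), and the tiny two-predictor data set are assumptions made for this example and are not taken from the Capital Bikeshare data.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(features, targets):
    """Return [intercept, b1, ..., bk] minimizing the squared error."""
    # Prepend a constant 1.0 to each row so the first coefficient is the intercept.
    design = [[1.0] + [float(v) for v in row] for row in features]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Apply fitted coefficients to one row of predictor values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Synthetic toy data (NOT from the study): y = 3 + 2*x1 - 1*x2, no noise.
features = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
targets = [3, 5, 2, 4, 6]
coefs = fit_ols(features, targets)
```

With the real data, each row of features would hold the nine predictor values for one day and targets would hold the corresponding rented bikes count.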
From an implementation standpoint, a custom application was developed for this purpose. It provides visualization and analysis capabilities, such as identifying outliers and removing them easily before building the prediction model.
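The abstract does not say which rule the application uses to flag outliers. One common choice, shown here purely as an assumption, is the interquartile range (IQR) rule: values further than 1.5 times the IQR from the quartiles are set aside. The function name split_outliers, the column name rented_bikes, and the sample counts are hypothetical.

```python
import statistics


def split_outliers(rows, column, k=1.5):
    """Split rows into (kept, removed) using the IQR rule on one column."""
    values = [row[column] for row in rows]
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [r for r in rows if low <= r[column] <= high]
    removed = [r for r in rows if not (low <= r[column] <= high)]
    return kept, removed


# Made-up daily counts for illustration only; 500 is the planted outlier.
counts = [10, 12, 11, 13, 12, 11, 500]
days = [{"day": i, "rented_bikes": c} for i, c in enumerate(counts)]
kept, removed = split_outliers(days, "rented_bikes")
```

Returning the removed rows alongside the kept ones makes it possible to inspect what was dropped before the regression is fitted.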
The results showed that not all nine independent variables are needed to obtain a reasonably accurate prediction model for the net count of bikes rented per day. The final prediction model for Capital Bikeshare is accurate, but not perfect.
Develop an application that provides data analysis and visualization functionality for the prediction model.