Prediction models
After consulting a machine learning expert, we identified several options for prediction models.
First of all, a good analysis of the input data is necessary. A careful selection of the input parameters should be made in a "less is more" manner: selecting too many parameters may lead to unpredictable behavior.
Suggested methods of examining the data were:
- Finding out whether the data is geometrically interpretable
- Inspecting parameter correlation (using Excel functions)
- Projecting to 2D/3D via PCA and inspecting for clusters (using MATLAB)
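The correlation check from the list above can also be scripted instead of done in a spreadsheet. The following is a minimal sketch in plain Python; the parameter names (`hour`, `card_swipes`, `temperature`), the sample values, and the 0.95 threshold are all made up for illustration and do not come from the project data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping parameter name -> values."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Hypothetical sample data (assumption, not real measurements).
sample = {
    "hour": [8, 9, 10, 11, 12, 13],
    "card_swipes": [16, 18, 20, 22, 24, 26],  # exactly 2 * hour
    "temperature": [5, 3, 6, 2, 7, 4],
}

matrix = correlation_matrix(sample)

# Pairs that are almost perfectly correlated carry nearly the same
# information, so one of the two is a candidate for removal.
redundant = [(a, b) for (a, b), r in matrix.items() if a < b and abs(r) > 0.95]
```

Strongly correlated pairs listed in `redundant` are candidates for dropping one parameter, in line with the "less is more" selection described above.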
If clusters are found, clustering could be used to determine several classes of expected "rush"; simple k-means clustering would be enough to determine these classes. Finding such clusters is, apparently, unlikely.
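As a rough illustration of the k-means idea, here is a small sketch in plain Python. The `(hour, occupancy)` points are invented, and starting the centroids at the first `k` points is a simplification that keeps the example deterministic; it is not a recommended initialisation.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Centroids start at the first k points (an assumption made only to
    keep this sketch deterministic).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids

# Hypothetical (hour, occupancy) samples: a quiet morning group and a
# busy midday group.
points = [(8, 12), (12, 95), (9, 15), (13, 90), (8.5, 10), (12.5, 99)]
labels, centroids = kmeans(points, k=2)
```

Each entry in `labels` is the index of the "rush" class assigned to the corresponding point. With real data, parameters measured on very different scales would probably need to be normalised first, since k-means relies on Euclidean distance.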
The classifier itself should likely be a Naive Bayes classifier, with the clustering results providing the supervision (that is, the training labels).
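A minimal sketch of this setup, assuming continuous input parameters and therefore the Gaussian variant of Naive Bayes (the notes do not say which variant was meant). The samples and the `cluster_labels` list are hypothetical; the labels stand in for whatever a clustering step would produce.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate per-class prior, feature means and feature variances."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # The small constant avoids division by zero for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[y] = (n / len(samples), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log-posterior for sample x."""
    def log_posterior(y):
        prior, means, variances = model[y]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Hypothetical (hour, occupancy) samples and the class index a
# clustering step might have assigned to each of them.
samples = [(8, 12), (12, 95), (9, 15), (13, 90), (8.5, 10), (12.5, 99)]
cluster_labels = [0, 1, 0, 1, 0, 1]

model = fit_gaussian_nb(samples, cluster_labels)
quiet = predict(model, (9, 14))   # a sample resembling the first group
busy = predict(model, (12, 92))   # a sample resembling the second group
```

The classifier never sees hand-made labels here: the class indices come entirely from the (assumed) clustering output, which is what "clustering results providing the supervision" amounts to in practice.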
Alternatively, aggregate JIS card, authorization system, and occupancy data may be used to verify a prediction model, together with an additional model of student "fall off" during the academic year based simply on the date (look into: selection model). With this approach, the goal is still to provide supervision for the Naive Bayes classifier.
Decision trees were also suggested.
Good introductory resource: http://www.r2d3.us/visual-intro-to-machine-learning-part-1/
Also worth looking into: DBSCAN, and the possible use of occupancy data.
Updated by Zuzana Káčereková almost 4 years ago · 4 revisions