Random Forests for Evaluating Pedagogy and Informing Personalized Learning

Kelly Spoon, Joshua Beemer, John C. Whitmer, Juanjuan Fan, James P. Frazee, Jeanne Stronach, Andrew J. Bohonak, Richard A. Levine


Random forests are presented as an analytics foundation for educational data mining tasks. The focus is on course- and program-level analytics, including evaluating pedagogical approaches and interventions and identifying and characterizing at-risk students. As part of this development, the concept of individualized treatment effects (ITE) is introduced as a method to provide personalized feedback to students. The ITE quantifies the effectiveness of intervention and/or instructional regimes for a particular student based on institutional student information and performance data. The proposed random forest framework and methods are illustrated on a study of the efficacy of a supplemental, weekly, one-unit problem-solving session in a large-enrollment, bottleneck introductory statistics course. The analytics tools are used to identify factors for student success; characterize students benefiting from the supplemental instruction section; develop an objective criterion to identify these students at the beginning of the semester and advise them into that section; and suggest intervention initiatives for at-risk groups in the course.
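The abstract does not spell out how an ITE is computed, so the following is only a minimal sketch of one common construction, assumed here for illustration: fit one random forest on students who received the intervention and another on those who did not, then take the difference of the two predictions for a given student. The function name `estimate_ite`, the covariates (`prior_gpa`, `sat`), and the simulated grades are all hypothetical and are not data or code from the study; scikit-learn's `RandomForestRegressor` stands in for whatever software the authors used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def estimate_ite(X, treated, y, X_new, n_trees=200, seed=0):
    """Two-forest ITE sketch (an assumed construction, not the paper's code).

    Returns, for each row of X_new, the predicted outcome with the
    intervention minus the predicted outcome without it.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)

    # One forest per arm, each trained only on that arm's students.
    rf_treat = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    rf_ctrl = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    rf_treat.fit(X[treated], y[treated])
    rf_ctrl.fit(X[~treated], y[~treated])

    return rf_treat.predict(X_new) - rf_ctrl.predict(X_new)


# Simulated (hypothetical) course data: two covariates per student.
rng = np.random.default_rng(42)
n = 600
prior_gpa = rng.uniform(1.5, 4.0, n)
sat = rng.uniform(400, 800, n)
X = np.column_stack([prior_gpa, sat])
treated = rng.random(n) < 0.5  # who enrolled in the supplemental session

# By construction, the session helps students with lower prior GPA more.
true_effect = np.where(prior_gpa < 2.5, 10.0, 1.0)
y = 40 + 10 * prior_gpa + true_effect * treated + rng.normal(0, 3, n)

# Estimated ITE for a lower-GPA student and a higher-GPA student.
new_students = np.array([[2.0, 600.0], [3.8, 600.0]])
ite = estimate_ite(X, treated, y, new_students)
print(ite)
```

In this simulation the first student's estimated ITE should be clearly larger than the second's, which is the kind of signal an advisor could use at the start of the semester. With observational enrollment data, unlike the randomized assignment simulated here, such differences can also reflect self-selection into the section.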


