
Predicting student dropout

01. Background

Developed as a Bachelor's thesis at Tallinn University of Technology, this web application interfaces with underlying machine learning algorithms that predict student dropout. Data visualization techniques offer both an overview of the prediction results and specific details about them.

Python back-end with scikit-learn and TensorFlow; front-end with HTML5/CSS3/JS.

Predictor system form screenshot

02. Mission

With dropout rates being a problem in most universities, a system was needed that could identify students at risk of discontinuing their education. A combined approach of machine learning and data visualization was chosen to give administrative personnel and academic staff a means of identifying students in need of additional counselling and support.

03. Machine learning

This part of the work started with identifying any information that could indicate whether a student is at risk. I conducted a literature search and compared existing models in similar programs. For this system, I decided to use all the available information on students' grades, demographics and course completion.

The algorithm was chosen by trial and error. For each candidate algorithm, a predictor model was developed that offered both a binary representation of dropout risk and risk factors in the form of variable importances. The best-performing models reached accuracy and recall values above 90%.
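To make the two reported metrics concrete, here is a minimal sketch of how accuracy and recall can be computed from binary dropout predictions. It is not the thesis code: the function names and the sample labels are made up for illustration, and it assumes 1 means "dropped out" and 0 means "continued". Recall matters in this setting because it measures how many of the students who actually dropped out were flagged by the model.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes, treating 1 as "dropped out" and 0 as "continued"."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # at-risk, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not at-risk, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed at-risk student
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Share of all students whose outcome was predicted correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def recall(y_true, y_pred):
    """Share of actual dropouts that the model flagged as at-risk."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels for ten students (illustrative, not thesis data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0, 1, 0]

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")
print(f"recall:   {recall(y_true, y_pred):.2f}")
```

In a trial-and-error comparison, the same two functions could be applied to each candidate model's predictions on a held-out set, so that models are ranked on identical terms.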

Predictor system results screenshot

04. Data visualization

A second focus of this thesis was visualizing the prediction results. The aim was to design a view that presents the machine learning results in a simple layout, using terms everyone can understand.

A granular results view was designed for each student, presenting risk percentages and a binary decision from the algorithms. A dashboard with an overview of the prediction results was also developed to offer important information at a glance. The dashboard focused on explaining the results, showing feature importances and the correlation between the most important variables and the probabilities of dropout.
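As a rough illustration of the data behind such views, the sketch below turns per-student dropout probabilities into rows with a risk percentage and a binary decision, and computes a Pearson correlation between one variable and the predicted probabilities. Everything here is an assumption made for the example: the 0.5 threshold, the field names, the student identifiers and the sample numbers are hypothetical and do not come from the thesis.

```python
from math import sqrt


def student_rows(probabilities, threshold=0.5):
    """Build one display row per student, highest risk first.

    `threshold` is an assumed cut-off for the binary decision.
    """
    rows = [
        {
            "student": student_id,
            "risk_percent": round(p * 100),
            "at_risk": p >= threshold,
        }
        for student_id, p in probabilities.items()
    ]
    return sorted(rows, key=lambda row: row["risk_percent"], reverse=True)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / sqrt(var_x * var_y)


# Hypothetical model output: probability of dropout per student.
probabilities = {"s1": 0.82, "s2": 0.15, "s3": 0.5}
for row in student_rows(probabilities):
    print(row)

# Hypothetical variable (credits earned) against predicted dropout probability.
credits_earned = [30, 24, 12, 6]
dropout_probability = [0.1, 0.2, 0.6, 0.9]
print(pearson(credits_earned, dropout_probability))
```

Sorting the rows by risk keeps the students most in need of attention at the top of the list, which matches the goal of offering important information at a glance.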

Predictor system visualizations screenshot


Overall, the goal was to offer a tool for identifying students at risk of dropping out. This information helps administrative personnel and departments provide support to students and combat high dropout rates.

Development of this system is continuing with a refactor of the back-end to facilitate extensibility, and with a separate notification system for the at-risk students themselves. In a proposed study, I will examine the effects that different types of notification and motivation methods have on students' risk of dropping out.