In our previous blog, we described how we used the interdisciplinarity of data science in a graduate course on research methods that was offered by the Department of Labor Studies at Tel Aviv University. Specifically, we shared how we handled the students' expression of programming anxiety. In this blog, we illustrate how we took advantage of the students' expertise in the application domain component of data science, which in our case was human resources and management.
In the questionnaire distributed at the onset of the first semester, we asked the students, among other things, to describe three challenges they were facing in their work at the time, for which they would be interested in learning new research tools. The students suggested challenges related to both the organization and the individual.
Examples of challenges suggested by the students that focus on the organization include:
Examples of challenges suggested by the students that focus on the individual include:
Not surprisingly, many of the issues the students mentioned addressed employee perseverance in the company from different perspectives and, specifically, the need to develop a model for the prediction of a candidate's suitability for the organization.
With respect to this research topic, the students were first asked, in a class activity, to define a research problem, research objectives, research questions, and data collection tools. We present two examples of research frameworks that students developed. As can be seen, the students expressed their expertise in human resources by explicitly describing the need to develop a tool for such a prediction, as well as the challenges and costs associated with recruiting candidates who are not suitable for the organization.
Example 1:
Example 2:
Next, developing such a prediction tool calls for predictive data science modeling techniques. We decided to introduce the KNN algorithm, since it is a simple algorithm that enables the introduction of many data science concepts (Hazzan and Mike, 2022).
Following the introduction of KNN, a small database was constructed in class for the purpose of predicting the suitability of individual employees for the organization, as well as the probability of their perseverance in the company. To enable the students to work with KNN in a 2D plane, only two of the proposed features were selected.
In the discussion that took place during the construction of this database, the students expressed their expertise in human resources management by suggesting relevant employee characteristics as well as by addressing details that only they, as experts in human resources, can discern.
For example, the students decided they should specify the kind of organization they are dealing with since different characteristics are relevant for different kinds of organizations. They chose a startup with up to 100 employees that was funded three years ago. Since the two chosen features were the number of monthly work hours and the number of absence days (times 9, so that both features are expressed in hours), another working assumption was that the database was constructed to predict employee suitability for a full-time position (182 monthly hours). A lively class discussion ensued on the question of the time period for which the perseverance prediction should be made. It was decided that a 2-year prediction is suitable for this kind of industry and for typical employee preferences.
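The reason for expressing both features in hours is that KNN compares employees by distance, so features measured in different units would not contribute comparably. The conversion can be sketched in Python as follows. This is only an illustration: the helper name `to_features` is hypothetical, and the factor of 9 hours per absence day is the one stated above.

```python
HOURS_PER_ABSENCE_DAY = 9  # conversion factor stated in the text


def to_features(monthly_work_hours: float, absence_days: float) -> tuple[float, float]:
    """Return (work hours, absence hours) so that both features share one unit."""
    return (float(monthly_work_hours), float(absence_days) * HOURS_PER_ABSENCE_DAY)


# Hypothetical employee: 180 monthly work hours and 1 absence day.
print(to_features(180, 1))  # (180.0, 9.0)
```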
Table 1 presents the structure of the database constructed in class. As soon as the database is constructed, the suitability of a specific employee (or candidate) to the organization can be predicted based on his or her personal data.
Table 1: The database constructed in class for the prediction of an employee's perseverance in the company
| Monthly work hours | Number of absence days | Perseverance in the company | Probability of persevering in the company over the next two years | Working assumptions |
| --- | --- | --- | --- | --- |
| 180 | 9 | High | High | Startup |
| 157 | 27 | Medium | Medium | Up to 100 employees |
| …. | …. | …. | Low | Funded three years ago |
| …. | …. | …. |  | Full-time position – 182 hours |
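To make the prediction step concrete, here is a minimal Python sketch of how a KNN classifier might use a database shaped like Table 1. It is not the code used in class. Only the first two rows are taken from Table 1; all other rows, the candidate, the function name `knn_predict`, and the choice of k = 3 are assumptions made for illustration.

```python
from collections import Counter
from math import dist

# Each row: ((monthly work hours, absence hours), perseverance label).
# The first two rows come from Table 1; the remaining rows are made up
# purely for illustration and are NOT data from the course.
EMPLOYEES = [
    ((180, 9), "High"),
    ((157, 27), "Medium"),
    ((182, 0), "High"),     # hypothetical
    ((175, 18), "High"),    # hypothetical
    ((160, 36), "Medium"),  # hypothetical
    ((150, 45), "Medium"),  # hypothetical
    ((130, 72), "Low"),     # hypothetical
    ((120, 90), "Low"),     # hypothetical
    ((125, 63), "Low"),     # hypothetical
]


def knn_predict(point, data=EMPLOYEES, k=3):
    """Label a point by majority vote among its k nearest neighbors."""
    # Sort the rows by Euclidean distance to the point in the 2D plane.
    neighbors = sorted(data, key=lambda row: dist(row[0], point))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical candidate: 178 monthly work hours and 9 absence hours.
print(knn_predict((178, 9)))  # High
```

With only two features, each employee is a point in the plane, so the neighbors that decide a prediction can also be found by eye on a scatter plot.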
The construction of such a database clearly requires an understanding of the application domain, which in our case was human resources. Indeed, this fact was highlighted in class to further emphasize two messages:
In this blog and the previous one, we reported on activities that we facilitated in a graduate course for human resources practitioners that focused on research methods. The challenge we set out to overcome was the fact that, on the one hand, almost none of the students had any programming experience, while on the other hand, they were experts in human resources management. These activities not only highlight the interdisciplinarity of data science but also illustrate the importance of the application domain in data science research. In such cases, we argue, situated learning, that is, learning through goal-directed activity situated in circumstances that are authentic in terms of the intended application of the learnt knowledge (Billett, 1996), is relevant.
Following the anecdotal observations reported in our last two blogs, we embarked on a comprehensive research project whose objective is to characterize the conceptions and feelings of students in this course with respect to the integration of data science in their graduate degree. Our research will use insights from the sociological and social psychological literature on women in STEM in order to understand the emotional responses that a data science course may ignite among human resources managers. We believe that our observations will contribute to the integration of data science into other application domains in the social sciences as well.
Billett, S. (1996). Situated Learning: Bridging Sociocultural and Cognitive Theorising. Learning and Instruction 6(3), pp. 263-280. http://dx.doi.org/10.1016/0959-4752(96)00006-0
Hazzan, O. and Mike, K. (March 2022). Teaching core principles of machine learning with a simple machine learning algorithm: the case of the KNN algorithm in a high school introduction to data science course. ACM Inroads 13, pp. 18–25. https://doi.org/10.1145/3514217
Orit Hazzan is a professor at the Technion's Department of Education in Science and Technology. Her research focuses on computer science, software engineering, and data science education. For additional details, see https://orithazzan.net.technion.ac.il/. Dafna Gelbgiser is a lecturer (tenure track) at the Department of Labor Studies at Tel Aviv University's Faculty of Social Sciences. Her research examines the sources and patterns of inequality in education and labor market outcomes by race, immigrant status, gender, and social class background. For additional details, see https://english.tau.ac.il/profile/dgelbgiser.