Showcasing Data Science Methods Analyzing Post-Retirement Employment Decisions

Activity: Talks and guest lectures › Conference talks › Research

Jan F. Deller - Speaker

Jürgen F. Deller - Speaker

Post-retirement employment of older individuals has gained increasing attention over the last decade. To better understand post-retirement decision making, Fasbender, Wang, Voltmer, and Deller (2016) investigated the meaning of work and its relationship to post-retirement employment using population-representative data from the German Transitions and Old Age Potential (TOP) study. Using data from 2,149 pensioners aged 60-70 years, they tested hypotheses on how the personal, financial, and generative meaning of work relates to post-retirement employment, applying logistic regression analysis to predict post-retirement employment. In an additional exploratory analysis, Fasbender et al. (2016) sought to answer a research question on the meaning of work and post-retirement volunteering.
Given recent developments in data science and artificial intelligence, this paper investigates whether new and evolving methods can answer the same scientific questions using the same data set. Additionally, it presents the methods’ upsides and downsides to provide guidance for others interested in applying these methods.
First, the authors provide an overview of data science methods and discuss their applicability. Second, selected methods are applied to the TOP dataset. The methods include both unsupervised and supervised learning. While unsupervised learning aims to identify patterns in the data and can be viewed as a method to extend exploratory data analysis, supervised learning models the effect of independent variables on a dependent variable (Hastie et al., 2009). This paper concentrates on the applicability of supervised learning. This category includes the generalized linear model (GLM) employed by Fasbender et al. (2016) as well as a wide range of other models, e.g., decision trees, support vector machines, or deep neural networks. These models differ in their assumptions about the functional relationship and therefore in their complexity. While GLMs assume a linear effect of each independent variable on some transformation of the expected value of the dependent variable, more complex models such as deep neural networks relax this linearity assumption and can approximate a much broader class of functional relationships. Thus, more complex data-generating processes can be fitted. After replicating the results of Fasbender et al. (2016), more complex models are fitted and their predictive performance is compared using cross-validation. As these models are more difficult to interpret, techniques from interpretable machine learning are employed to better understand them, including local interpretation methods such as individual conditional expectation plots and global ones such as partial dependence plots.
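The workflow described above (a GLM baseline, a more flexible model, cross-validated comparison, then partial dependence and individual conditional expectation curves) can be sketched as follows. This is only an illustrative sketch and not the authors' analysis: it uses synthetic stand-in data rather than the TOP data, the three predictors and the data-generating process are invented for the example, and scikit-learn with a gradient boosting classifier is an assumed choice of tooling and model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (NOT the TOP data): three hypothetical predictors.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))

# Assumed nonlinear data-generating process: a quadratic effect of the first
# predictor plus an interaction between the second and third.
logit = 1.5 * X[:, 0] ** 2 - 1.5 + X[:, 1] * X[:, 2]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# GLM baseline (logistic regression) versus a more flexible model.
models = {
    "glm_logistic": LogisticRegression(max_iter=1000),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Compare out-of-sample predictive performance with 5-fold cross-validation.
cv_auc = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}
for name, auc in cv_auc.items():
    print(f"{name}: mean cross-validated AUC = {auc:.3f}")

# Interpret the flexible model: partial dependence (global) and individual
# conditional expectation curves (local) for the first predictor.
gb = models["gradient_boosting"].fit(X, y)
pd_result = partial_dependence(
    gb, X[:200], features=[0], kind="both", grid_resolution=20
)
pd_curve = pd_result["average"][0]       # one averaged curve over the grid
ice_curves = pd_result["individual"][0]  # one curve per observation
```

Because the assumed data-generating process is nonlinear, the linear-in-the-logit baseline cannot capture it here, whereas the boosted trees can. On real survey data the gap may be much smaller or absent. The partial dependence curve is the average of the individual conditional expectation curves, which is why the two are usually read together.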
The discussion shows potential benefits and challenges of applying data science methods to extend the toolkit of quantitative analysis methods. Benefits of supervised methods include higher explained variance and more accurate modeling of complex relationships. The challenges surrounding their interpretability are discussed, and methods to mitigate them are presented.
This paper (a) provides researchers with a case that educates them on the use of data science methods for data analysis and (b) shows that these methods can be applied to explain more variance in post-retirement decision making, which is relevant for individuals and organizations alike.
01.01.2023 → …

Event

7th Age in the Workplace Meeting

15.11.23 → 17.11.23

Vilnius, Lithuania

Event: Conference

Documents