Project Details

Predicting Factors Behind Employee Attrition

Using Logistic Regression, Random Forest Classifier, and XGBoost



This is the capstone project from the Google Advanced Data Analytics Professional Certificate on Coursera.


Attrition Analysis


View the full notebook


Introduction

In a competitive job market, employee attrition poses a significant challenge for organizations. High turnover disrupts productivity, strains resources, and hurts overall business performance. To tackle this issue, I analyzed an HR dataset to uncover the factors behind employee attrition and identify potential solutions.


Stakeholders

The stakeholders for this project include the HR department, management team, and company executives, all of whom are invested in understanding employee turnover and identifying factors contributing to attrition.


Goal

The goal of this project is to analyze employee turnover and identify the key factors influencing attrition. This insight enables the company to take proactive measures to improve employee retention and enhance workforce management.


Objective

The objective was to analyze a comprehensive HR dataset and build predictive models to gain insights into why employees leave the company. By understanding the key drivers of attrition, we aimed to provide actionable recommendations to reduce turnover and boost employee retention.


Approach


Situation

Our dataset included various employee attributes such as satisfaction level, last evaluation, number of projects, average monthly hours, tenure, work accidents, promotions, salary, and department. It covered both employees who had left the company and those who had stayed.


Task

The primary task was to conduct Exploratory Data Analysis (EDA) to understand the relationships between variables and identify patterns and trends. We aimed to explore the impact of different factors on employee attrition and create predictive models for informed decision-making.


Action

During EDA, we identified and removed duplicate rows from the dataset. We renamed columns for consistency and encoded the categorical variables (such as salary and department) for model building. Box plots, scatter plots, and bar charts were used to uncover insights.
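The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows, the column names, and the choice of ordinal codes for salary and one-hot encoding for department are all assumptions made for the example.

```python
import pandas as pd

# Tiny made-up sample. Column names and values are hypothetical,
# not taken from the project's dataset.
raw = pd.DataFrame({
    "Satisfaction Level": [0.38, 0.80, 0.38, 0.72],
    "Department": ["sales", "technical", "sales", "support"],
    "Salary": ["low", "medium", "low", "high"],
    "Left": [1, 0, 1, 0],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    # Drop exact duplicate rows, keeping the first occurrence.
    out = df.drop_duplicates().reset_index(drop=True)
    # Rename columns to a consistent snake_case style.
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # Salary has a natural order, so map it to integer codes (assumed encoding).
    out["salary"] = out["salary"].map({"low": 0, "medium": 1, "high": 2})
    # Department has no order, so one-hot encode it (assumed encoding).
    out = pd.get_dummies(out, columns=["department"], dtype=int)
    return out


cleaned = clean(raw)
print(cleaned)
```

Ordinal codes keep the low/medium/high ordering available to the model, while one-hot columns avoid implying an order between departments.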

We then applied machine learning models, including Logistic Regression, Random Forest, and XGBoost, to predict employee attrition. Hyperparameters were optimized using GridSearchCV, and models were evaluated based on accuracy, precision, recall, F1-score, and AUC-ROC.
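As a rough sketch of the tuning and evaluation workflow described above, the example below runs GridSearchCV over a Random Forest and computes the five metrics on a held-out split. It uses synthetic data from scikit-learn rather than the HR dataset, and the parameter grid, scoring choice, and split sizes are assumptions for illustration, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced binary data standing in for the HR dataset (assumption).
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical grid; the project's actual grid is not stated in the text.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, scoring="f1", cv=3
)
search.fit(X_train, y_train)

# Evaluate the best model found by the search on the held-out test split.
best = search.best_estimator_
pred = best.predict(X_test)
proba = best.predict_proba(X_test)[:, 1]  # probability of the positive class

metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "precision": precision_score(y_test, pred, zero_division=0),
    "recall": recall_score(y_test, pred, zero_division=0),
    "f1": f1_score(y_test, pred, zero_division=0),
    "auc_roc": roc_auc_score(y_test, proba),
}
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that AUC-ROC is computed from predicted probabilities, while the other four metrics use the hard class predictions.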




Comparison of Evaluation Metrics:

Model                      Accuracy   Precision   Recall   F1-score   AUC-ROC
Logistic Regression        82.51%     46.69%      26.19%   33.56%     88.13%
Random Forest Classifier   98.50%     98.05%      92.80%   95.35%     98.06%
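For readers less familiar with these metrics, the F1-score is the harmonic mean of precision and recall. The short check below recomputes F1 from the precision and recall values reported in the table; the helper name `f1_from` is introduced here for illustration only.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as fractions, copied from the table above.
reported = {
    "Logistic Regression": (0.4669, 0.2619, 0.3356),
    "Random Forest Classifier": (0.9805, 0.9280, 0.9535),
}

for model, (p, r, f1) in reported.items():
    computed = f1_from(p, r)
    print(f"{model}: computed F1 = {computed:.4f}, reported F1 = {f1:.4f}")
```

Both recomputed values match the reported F1-scores to four decimal places, so the table is internally consistent.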

Result

Key insights from the analysis include:

  • Low satisfaction levels and low tenure were strongly associated with higher attrition rates.
  • Work accidents had a minimal impact on attrition.
  • Lack of promotions was linked to increased attrition.
  • Sales and technical departments experienced higher attrition.
  • Salary levels were crucial, with employees in lower salary brackets more likely to leave.
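Group-level findings like those above are typically read off attrition rates per category. The sketch below shows one way to compute such rates with pandas; the rows are made up, the column names `salary` and `left` are assumptions, and the output does not reproduce the project's figures.

```python
import pandas as pd

# Made-up rows for illustration only; `left` is 1 if the employee left.
df = pd.DataFrame({
    "salary": ["low", "low", "low", "medium", "medium", "high", "high", "high"],
    "left": [1, 1, 0, 1, 0, 0, 0, 0],
})

# The mean of a 0/1 column within each group is that group's attrition rate.
rate_by_salary = (
    df.groupby("salary")["left"].mean().sort_values(ascending=False)
)
print(rate_by_salary)
```

The same pattern applies to any categorical column, such as department or a promotion flag.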

Conclusion

Our data-driven approach provided valuable insights into employee attrition. Organizations can leverage these findings to take targeted actions aimed at improving employee retention.


Recommendations

Based on the results, we recommend:

  1. Conducting regular employee satisfaction surveys to identify and address areas of improvement.
  2. Implementing talent development programs to offer career advancement opportunities and promotions.
  3. Reviewing and adjusting salary scales to ensure competitiveness and retain top talent.
  4. Providing support for work-life balance to reduce burnout and stress.

Next Steps

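While the Random Forest model demonstrated high accuracy (98.50%) and recall (92.80%), the Logistic Regression model identified far fewer of the employees who left (26.19% recall), so continuous monitoring and refinement are essential. Organizations should regularly retrain the models on fresh data to maintain their relevance and effectiveness.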


Ethical Considerations

Throughout the project, we adhered to ethical principles, ensuring the confidentiality and privacy of sensitive employee data.

In conclusion, our data-driven insights into employee attrition offer a roadmap for organizations to build a motivated and engaged workforce. Employee retention is an investment in the future, and by taking informed actions, companies can create a thriving workplace and foster long-term success.