Project Details
Analyzing Customer Reviews and Predicting Buying Behavior
Project Overview
This project aimed to analyze customer satisfaction and predict customer booking behavior for an airline. The tasks involved data collection and analysis of customer reviews, as well as building a predictive model to identify customers likely to book flights. The analysis was conducted using Python, and the results were presented using PowerPoint.
View the full notebook
Task 1: Customer Satisfaction Analysis
Situation:
The airline company sought to understand customer satisfaction through an analysis of customer reviews. The primary source of data was reviews from the website Skytrax.
Task:
Collect and analyze customer reviews from Skytrax, focusing on the airline. The goal was to gather as much data as possible to ensure accurate analysis.
Action:
- Data Collection: Used Python scripts to scrape data from Skytrax, resulting in 3607 customer reviews.
- Data Cleaning: Processed the data to remove duplicates and incomplete reviews.
- Data Analysis:
  - Identified common topics in the reviews.
  - Conducted sentiment analysis to determine the overall tone of the reviews.
  - Created word clouds to visualize the frequency of key terms.
- Presentation: Summarized the findings in a PowerPoint slide, including visualizations and key metrics.
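The cleaning and term-counting steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the project's actual code: the sample reviews, the stop-word list, and the helper names `clean_reviews` and `term_frequencies` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample; the real input would be the scraped review texts.
raw_reviews = [
    "Great crew and friendly service.",
    "Great crew and friendly service.",        # duplicate
    "",                                         # incomplete review
    "Seat was cramped, food was cold.",
    None,                                       # missing review
    "Food was fine but the seat was broken.",
]

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "and", "but", "the", "was"}


def clean_reviews(reviews):
    """Drop missing or empty reviews and case-insensitive duplicates, keeping order."""
    seen = set()
    cleaned = []
    for text in reviews:
        if not text or not text.strip():
            continue
        key = text.strip().lower()
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(text.strip())
    return cleaned


def term_frequencies(reviews):
    """Count lowercase word tokens across all reviews, skipping stop words."""
    counts = Counter()
    for text in reviews:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


reviews = clean_reviews(raw_reviews)
frequencies = term_frequencies(reviews)
print(len(reviews))                # number of reviews kept after cleaning
print(frequencies.most_common(3))  # most frequent terms
```

A frequency table like `frequencies` is the kind of input a word cloud visualizes: the more often a term appears, the larger it is drawn.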
Result:
- Common Topics Identified:
  - Flight experience
  - Customer service
  - In-flight entertainment
  - Food and beverage
- Sentiment Analysis: Sentiment was mixed, with both positive and negative reviews.
- Visual Summary: Presented findings through visualizations and metrics in a PowerPoint slide.
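One simple way to arrive at a "mixed sentiment" summary is to label each review and tally the labels. The sketch below uses a tiny lexicon-based scorer as a stand-in. The write-up does not say which sentiment method was used, so the word lists, the sample reviews, and the `sentiment_label` helper are assumptions made for illustration only.

```python
import re
from collections import Counter

# Tiny illustrative lexicons (assumptions); a real analysis would rely on
# a full sentiment lexicon or a trained model.
POSITIVE = {"great", "friendly", "comfortable", "excellent", "good"}
NEGATIVE = {"cramped", "cold", "rude", "delayed", "broken"}


def sentiment_label(text):
    """Label a review by counting positive words minus negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical sample reviews.
sample_reviews = [
    "Great crew and friendly service.",
    "Seat was cramped and the food was cold.",
    "Flight was delayed but the crew was excellent.",
]

labels = [sentiment_label(review) for review in sample_reviews]
distribution = Counter(labels)
print(distribution)  # tally of positive, negative, and neutral reviews
```

Reporting the tally as shares of the total review count would turn the qualitative "mixed" finding into a metric that fits on a summary slide.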
Task 2: Predictive Modeling of Customer Bookings
Situation:
With increasing competition from low-cost carriers and online travel agencies, the airline needed to predict which customers were most likely to book flights. This information would help target marketing campaigns and enhance the customer experience.
Task:
Develop a predictive model to identify customers likely to book flights based on historical customer data, including purchase history, travel preferences, and demographics.
Action:
- Data Preparation: Cleaned and pre-processed the dataset.
- Model Training and Evaluation:
  - Split the data into training and test sets.
  - Evaluated several machine learning algorithms, including decision trees, random forests, and support vector machines.
  - Selected the random forest algorithm based on its performance on the test set.
- Model Fine-Tuning: Adjusted the hyperparameters to optimize the model's performance.
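The split, compare, and select workflow above can be illustrated without any external libraries. The sketch below is deliberately simplified: the data is synthetic, and a majority-class baseline and a one-split decision stump stand in for the decision trees, random forests, and support vector machines that were actually compared. In a real project these roles would typically be filled by a library such as scikit-learn (for example `train_test_split`, `RandomForestClassifier`, and `GridSearchCV` for tuning), but the write-up does not state which tools were used, so that is an assumption.

```python
import random

rng = random.Random(42)


def make_row():
    """Synthetic row (purchase_lead_days, booked); the rule is invented for illustration."""
    lead = rng.randint(0, 100)
    booked = lead < 30
    if rng.random() < 0.1:          # flip roughly 10% of labels as noise
        booked = not booked
    return lead, int(booked)


data = [make_row() for _ in range(500)]


def split_train_test(rows, test_fraction, rng):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, rows):
    """Share of rows where the prediction matches the label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def fit_majority(rows):
    """Baseline: always predict the most common training label."""
    positives = sum(y for _, y in rows)
    majority = int(positives * 2 >= len(rows))
    return lambda x: majority


def fit_stump(rows):
    """One-split stump: predict 'booked' when lead < threshold, tuned on training rows."""
    best_threshold, best_acc = 0, -1.0
    for threshold in range(0, 102):
        acc = accuracy(lambda x, t=threshold: int(x < t), rows)
        if acc > best_acc:
            best_threshold, best_acc = threshold, acc
    return lambda x: int(x < best_threshold)


train, test = split_train_test(data, 0.2, rng)
candidates = {"majority_baseline": fit_majority, "decision_stump": fit_stump}

# Fit each candidate on the training rows only, then score it on held-out rows.
scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The threshold search inside `fit_stump` is a miniature form of hyperparameter tuning: it tries each candidate value and keeps the one that scores best on the training rows, leaving the test rows untouched until the final comparison.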
Result:
- Model Accuracy: The random forest model achieved an accuracy of 84.68% on the test set.
- Important Features Identified:
  - Purchase lead: Days between search and purchase.
  - Flight hour: Hour of the day the flight departs.
  - Booking origin: Origin airport for the flight.
  - Length of stay: Duration between departure and return dates.
  - Flight day: Day of the week the flight departs.
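The write-up does not say how feature importance was measured. Random forest implementations commonly report impurity-based importances, and a model-agnostic alternative is permutation importance: shuffle one feature's values and see how much accuracy drops. The sketch below shows that alternative on synthetic data. The `model` rule, the two-feature rows, and the labelling rule are hypothetical stand-ins, not the project's model or data.

```python
import random

rng = random.Random(0)

FEATURES = ["purchase_lead", "flight_hour"]

# Synthetic rows: ((purchase_lead, flight_hour), booked).
# By construction, the label depends only on purchase_lead.
rows = []
for _ in range(400):
    lead = rng.randint(0, 100)
    hour = rng.randint(0, 23)
    rows.append(((lead, hour), int(lead < 30)))


def model(x):
    """Stand-in for a fitted classifier (hypothetical rule that ignores flight_hour)."""
    lead, _hour = x
    return int(lead < 30)


def accuracy(predict, data):
    """Share of rows where the prediction matches the label."""
    return sum(predict(x) == y for x, y in data) / len(data)


def permutation_importance(predict, data, column, rng):
    """Accuracy drop after shuffling one feature column across rows."""
    baseline = accuracy(predict, data)
    shuffled_values = [x[column] for x, _ in data]
    rng.shuffle(shuffled_values)
    permuted = []
    for (x, y), value in zip(data, shuffled_values):
        x_new = list(x)
        x_new[column] = value       # replace only the chosen feature
        permuted.append((tuple(x_new), y))
    return baseline - accuracy(predict, permuted)


importances = {
    name: permutation_importance(model, rows, i, rng)
    for i, name in enumerate(FEATURES)
}
print(importances)
```

A feature the model never uses, like `flight_hour` in this toy setup, shows no accuracy drop, while shuffling a feature the model relies on degrades accuracy. Ranking features by that drop gives a list comparable to the one above.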
Conclusion
The analysis provided insights into both customer satisfaction and booking behavior. The findings from the customer reviews helped identify key areas of improvement for the airline. The random forest model reached 84.68% accuracy on the test set, demonstrating the potential to forecast customer bookings and support targeted marketing and an improved customer experience. Future studies could explore additional factors such as flight prices, seat availability, and customer travel preferences to further refine the predictive model.