This project explores and analyzes youth unemployment trends in South Africa using a dataset derived from ILOSTAT's SDG indicator 8.5.2 — Unemployment Rate (%) by Age, Sex, and Country.
- Python (Google Colab) for data cleaning, encoding, and exploratory data analysis (EDA)
- Pandas for data manipulation; Matplotlib and Seaborn for visual analytics
- Power BI for interactive storytelling and dashboards
- GitHub for project version control and documentation
The goal is to understand the patterns, gender disparities, and changes over time in unemployment among South African youth (ages 15–24), and to present the insights using visual and statistical techniques.
- Data Cleaning
  - Filtered only South African data
  - Focused on youth age group (15–24)
  - Removed unnecessary columns with excessive null values
  - Encoded categorical features for further analysis
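
The cleaning steps above could be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the column names (`country`, `sex`, `age_group`, `year`, `unemployment_rate`, `note`), the helper `clean_youth_data`, the 50% null threshold, and the sample values are all assumptions made up for the example and will differ from the real ILOSTAT export.

```python
import pandas as pd

# Made-up placeholder rows; column names and values are hypothetical,
# not taken from the ILOSTAT dataset.
raw = pd.DataFrame({
    "country": ["South Africa", "South Africa", "South Africa", "Namibia"],
    "sex": ["Male", "Female", "Total", "Total"],
    "age_group": ["15-24", "15-24", "25+", "15-24"],
    "year": [2020, 2020, 2020, 2020],
    "unemployment_rate": [10.0, 12.0, 5.0, 8.0],
    "note": [None, None, None, "x"],
})


def clean_youth_data(df, country="South Africa", age_group="15-24",
                     max_null_share=0.5):
    """Filter to one country and age group, drop sparse columns, encode sex."""
    # Keep only rows for the chosen country and youth age band.
    subset = df[(df["country"] == country)
                & (df["age_group"] == age_group)].copy()

    # Drop columns whose share of missing values exceeds the threshold.
    null_share = subset.isna().mean()
    subset = subset.loc[:, null_share <= max_null_share]

    # Encode the categorical sex column as integer codes
    # (codes follow the alphabetical order of the categories present).
    subset["sex_code"] = subset["sex"].astype("category").cat.codes
    return subset.reset_index(drop=True)


cleaned = clean_youth_data(raw)
print(cleaned)
```

The null threshold is a judgment call; a stricter or looser value may suit the real data better.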
- Data Exploration
  - Correlation analysis
  - Time-series analysis by gender
  - Detected notable spikes (for example post-2020 due to COVID-19)
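
One way to carry out the exploration steps above is to pivot the data into one column per gender, then look at correlations and year-over-year changes. The sketch below assumes hypothetical column names (`year`, `sex`, `unemployment_rate`), uses made-up placeholder values rather than real ILOSTAT figures, and picks an arbitrary 5-percentage-point threshold for what counts as a spike.

```python
import pandas as pd

# Placeholder values for illustration only; not real ILOSTAT figures.
long_df = pd.DataFrame({
    "year": [2018, 2018, 2019, 2019, 2020, 2020, 2021, 2021],
    "sex": ["Male", "Female"] * 4,
    "unemployment_rate": [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 18.0, 21.0],
})

# Reshape to one row per year and one column per gender.
wide = long_df.pivot(index="year", columns="sex", values="unemployment_rate")

# Correlation between the male and female series.
correlation = wide.corr()

# Year-over-year change in percentage points.
yoy_change = wide.diff()

# Flag years where any series rose by more than the (assumed) threshold.
spike_threshold = 5.0
is_spike = (yoy_change > spike_threshold).any(axis=1)
spike_years = yoy_change[is_spike].index.tolist()

print(correlation)
print("Spike years:", spike_years)
```

A fixed threshold is the simplest choice; a rolling z-score would be an alternative if the series is noisy.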
- Visualization
  - Line plots show trends over time for Male, Female, and Total
  - Highest unemployment consistently observed among young females
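
A line plot of this kind can be produced with Matplotlib along the following lines. The sketch assumes the data has already been reshaped into a hypothetical `wide` table with one column per series; the numbers are placeholders, not the project's results.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values for illustration only; not real ILOSTAT figures.
wide = pd.DataFrame(
    {
        "Male": [10.0, 11.0, 12.0, 18.0],
        "Female": [12.0, 13.0, 14.0, 21.0],
        "Total": [11.0, 12.0, 13.0, 19.5],
    },
    index=pd.Index([2018, 2019, 2020, 2021], name="year"),
)

fig, ax = plt.subplots(figsize=(8, 4))

# Draw one line per series so the gender gap is visible over time.
for series_name in ["Male", "Female", "Total"]:
    ax.plot(wide.index, wide[series_name], marker="o", label=series_name)

ax.set_xlabel("Year")
ax.set_ylabel("Youth unemployment rate (%)")
ax.set_title("Youth unemployment by sex (placeholder data)")
ax.set_xticks(list(wide.index))  # show whole years on the x-axis
ax.legend()
fig.tight_layout()
```

In Colab the figure renders when the cell finishes; elsewhere, call `plt.show()` or `fig.savefig(...)` as needed.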
- Power BI Storytelling (Coming Next)
  - Visual dashboards with filters (gender, year)
  - Key insights for policymaking and socio-economic review
- Youth unemployment has fluctuated between 42% and 70% over the past two decades.
- Females consistently face higher unemployment rates than males.
- A sharp increase after 2020 is consistent with COVID-19’s economic impact on youth employment.
- Add forecasting models (e.g., ARIMA, Facebook Prophet)
- Compare trends across other Southern African countries
- Integrate education/training level to explore deeper insights