This project analyzes and visualizes the popularity of dog breeds, measured by how often each breed appears in a dataset. The dataset records each dog's breed, age, weight, color, and gender. The primary goal is to identify and visualize the most and least popular dog breeds.
The dataset used for this analysis is named dogs_dataset.csv and includes the following columns:
- Breed: The breed of the dog.
- Age (Years): The age of the dog in years.
- Weight (kg): The weight of the dog in kilograms.
- Color: The color of the dog.
- Gender: The gender of the dog.
Load the dataset into a Pandas DataFrame and inspect its structure.
import pandas as pd
# Load the dataset
dogie = pd.read_csv('dogs_dataset.csv') # Replace with your actual file path
# Inspect the first few rows and column names
print(dogie.head())
print(dogie.columns)
Ensure column names are free from leading or trailing spaces.
# Remove any leading/trailing spaces from column names
dogie.columns = dogie.columns.str.strip()
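After stripping the header names, it can help to confirm that the columns listed above are actually present before running the analysis. The following is a minimal sketch of such a check. The helper name `missing_columns` and the in-memory `sample` frame are illustrative assumptions and are not part of the project.

```python
import pandas as pd

# Column names as listed in this README for dogs_dataset.csv
EXPECTED_COLUMNS = ['Breed', 'Age (Years)', 'Weight (kg)', 'Color', 'Gender']

def missing_columns(df, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from df.

    Header names are compared after stripping leading/trailing spaces.
    (Hypothetical helper, shown for illustration only.)
    """
    present = set(df.columns.str.strip())
    return [name for name in expected if name not in present]

# Hypothetical sample: a stray space in ' Breed' and no 'Gender' column
sample = pd.DataFrame({
    ' Breed': ['Beagle'],
    'Age (Years)': [3],
    'Weight (kg)': [10.5],
    'Color': ['Brown'],
})

print(missing_columns(sample))  # lists 'Gender' as missing
```

With the real data, the same check would be called as `missing_columns(dogie)`. An empty list means every expected column was found.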
Count how often each breed appears in the dataset to determine popularity.
# Count occurrences of each breed
breed_counts = dogie['Breed'].value_counts().reset_index()
breed_counts.columns = ['Breed', 'Popularity']
# Display the result
print(breed_counts.head())
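Since the stated goal is to find the most and least popular breeds, the counts can be queried directly. The sketch below uses a small hypothetical `sample` frame in place of dogs_dataset.csv, so the breed names and numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for the 'Breed' column of the dataset
sample = pd.DataFrame({
    'Breed': ['Beagle', 'Poodle', 'Beagle', 'Bulldog', 'Beagle', 'Poodle']
})

# value_counts() returns a Series indexed by breed, sorted by count (descending)
counts = sample['Breed'].value_counts()

# idxmax()/idxmin() return the index label (breed) of the largest/smallest count
most_popular = counts.idxmax()
least_popular = counts.idxmin()

# If several breeds tie for the top count, idxmax() reports only one of them;
# this collects every breed that shares the maximum
tied_for_most = counts[counts == counts.max()].index.tolist()

print('Most popular:', most_popular, counts.max())
print('Least popular:', least_popular, counts.min())
```

On the real data, replacing `sample` with `dogie` should give the same kind of result. When ties are likely, the `tied_for_most` pattern is safer than relying on a single label.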
Create a bar chart to visualize the popularity of each breed.
import matplotlib.pyplot as plt
import seaborn as sns
# Bar Chart
plt.figure(figsize=(12, 6))
sns.barplot(x='Breed', y='Popularity', data=breed_counts, palette='viridis')
plt.title('Dog Breed Popularity')
plt.xlabel('Breed')
plt.ylabel('Popularity')
plt.xticks(rotation=45)
plt.show()
Create a pie chart to show the distribution of breed popularity.
# Pie Chart
plt.figure(figsize=(8, 8))
plt.pie(breed_counts['Popularity'], labels=breed_counts['Breed'], autopct='%1.1f%%', startangle=140, colors=sns.color_palette('viridis', len(breed_counts)))
plt.title('Dog Breed Popularity Distribution')
plt.show()
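A pie chart with one slice per breed can become hard to read when the dataset contains many breeds. One possible refinement, not part of the original script, is to keep the largest slices and merge the rest into a single "Other" slice. The helper name `group_small_breeds`, the `top_n` parameter, and the sample numbers below are assumptions made for this sketch.

```python
import pandas as pd

def group_small_breeds(breed_counts, top_n=5):
    """Keep the top_n breeds by Popularity and sum the rest into one 'Other' row.

    Expects a DataFrame with 'Breed' and 'Popularity' columns, like the
    breed_counts frame built earlier. (Hypothetical helper.)
    """
    ordered = breed_counts.sort_values('Popularity', ascending=False)
    top = ordered.head(top_n)
    rest_total = ordered['Popularity'].iloc[top_n:].sum()
    if rest_total > 0:
        other = pd.DataFrame({'Breed': ['Other'], 'Popularity': [rest_total]})
        top = pd.concat([top, other], ignore_index=True)
    return top.reset_index(drop=True)

# Hypothetical counts, for illustration only
sample_counts = pd.DataFrame({
    'Breed': ['Beagle', 'Poodle', 'Bulldog', 'Pug'],
    'Popularity': [5, 3, 1, 1],
})

grouped = group_small_breeds(sample_counts, top_n=2)
print(grouped)
```

The returned frame has the same two columns as `breed_counts`, so it can be passed to the `plt.pie` call above in its place.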
- pandas
- matplotlib
- seaborn
You can install the required libraries using pip:
pip install pandas matplotlib seaborn
- Ensure you have the dataset dogs_dataset.csv in the correct directory.
- Run the Python script to load the dataset, analyze breed popularity, and generate visualizations.
This project is licensed under the MIT License. See the LICENSE file for details.
- Thanks to the dataset providers for making this analysis possible.
- Special thanks to the open-source community for the libraries used.
For questions or comments, please contact