Skip to content

BridgingAISocietySummerSchools/Data-Science-AI-Python-Course

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

17 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

Learn Python: A Course Designed Specifically for Data Science and AI

Python Version Jupyter License: MIT Course Duration Difficulty Data Science

From "What's Python?" to analyzing real datasets in just 3 hours

Course Overview

This comprehensive course bridges the gap between complete programming beginners and functional data science practitioners. Unlike typical Python courses that teach theoretical concepts, every lesson directly prepares you for real data science work.

Duration: 3 hours (180 minutes)
Prerequisites: None - designed for complete beginners
Goal: Master the foundational Python skills needed to understand and execute advanced data science notebooks

What Makes This Course Different?

๐ŸŽฏ Focused on Data Science

  • Every concept connects directly to real data science workflows
  • Learn list slicing (X[0:3]) used in virtually every ML notebook
  • Master NumPy operations that power machine learning algorithms
  • Practice string formatting for data analysis reports

**๐Ÿ“Š Real-World Context **

  • Calculate financial interest instead of printing "Hello, World!"
  • Analyze test scores and weather data
  • Work with realistic datasets and scenarios
  • Build projects that mirror actual data science work

๐Ÿ—๏ธ Progressive Skill Building

  • Each notebook builds on the previous one
  • Concepts are introduced when you need them
  • No overwhelming theory dumps
  • Solid foundation that won't crumble with advanced topics

Course Structure

Module 1: Python Fundamentals (45 minutes)

  • Notebook 1: Python Basics (20 minutes) - Master variables, data types, and operations through practical examples like calculating investment returns and formatting data analysis reports
  • Notebook 2: Control Structures (25 minutes) - Learn to make decisions and repeat operations with real scenarios like temperature analysis and data quality checking

Module 2: Data Structures and Operations (50 minutes)

  • Notebook 3: Lists and Data Structures (25 minutes) - Master the list operations you'll use in every data science project, from indexing to slicing to nested structures
  • Notebook 4: Dictionaries and Advanced Operations (25 minutes) - Work with key-value structures that form the backbone of data manipulation and API interactions

Module 3: Pandas Introduction (15 minutes)

  • Notebook 5: Pandas Preview (15 minutes) - Get a sneak peek at the most important data science library without overwhelming complexity

Module 4: Functions and Code Organization (35 minutes)

  • Notebook 6: Functions and Modules (20 minutes) - Learn to write clean, reusable code that you can maintain and scale
  • Break (15 minutes)

Module 5: Data Science Libraries (50 minutes)

  • Notebook 7: NumPy Fundamentals (25 minutes) - Master the numerical computing library that powers everything from simple statistics to complex machine learning algorithms
  • Notebook 8: Matplotlib Basics (25 minutes) - Create visualizations that turn raw data into compelling insights and actionable intelligence

Capstone Project: Real-World Application (45-60 minutes)

  • Notebook 9: Weather Data Analysis - Put it all together in a comprehensive project analyzing real weather data from multiple cities

Learning Objectives

By the end of this course, students will be able to:

Core Python Skills

  1. Write clean, professional Python code using variables, data types, and control structures
  2. Master data structures including lists, dictionaries, and nested structures with confidence
  3. Understand and debug common Python errors with systematic approaches

Data Science Fundamentals

  1. Use NumPy for numerical computations and array operations that power ML algorithms
  2. Create professional visualizations using matplotlib for data storytelling
  3. Work with pandas DataFrames for data manipulation and analysis

Real-World Application

  1. Read and understand advanced data science notebooks and ML code
  2. Apply Python skills to solve realistic data science problems
  3. Think like a data scientist with proper problem-solving approaches

Getting Started

Prerequisites

Absolutely none. We start from "What is a variable?" and build from there. Perfect for:

  • Business professionals who want to make data-driven decisions
  • Researchers looking to analyze data more effectively
  • Students preparing for a career in tech
  • Anyone curious about the power of data science

Time Investment

  • Core Learning: 3 hours of focused study
  • Practice & Mastery: Additional 2-3 hours working through exercises
  • Total Value: A solid foundation for years of data science growth

What You'll Need

  • A computer with internet access
  • The desire to learn and experiment
  • Patience with yourself (every expert was once a beginner)

Hardware Requirements

Component Minimum Recommended Optimal
RAM 4 GB 8 GB 16 GB+
Storage 2 GB free 5 GB free 10 GB+ free
CPU Dual-core 2.0GHz Quad-core 2.5GHz 8+ cores 3.0GHz+
Python 3.7+ 3.9+ 3.11+
Internet Basic broadband Reliable connection High-speed

Performance Expectations

System Type Notebook Load Time Large Dataset Processing Visualization Rendering
Minimum 5-10 seconds 10-30 seconds 3-5 seconds
Recommended 2-5 seconds 3-10 seconds 1-2 seconds
Optimal <2 seconds <3 seconds <1 second

Quick Setup (Recommended)

The easiest way to get started is using the provided setup script:

# Clone or download this repository
cd Data-Science-AI-Python-Course

# Run the setup script (macOS/Linux)
./setup.sh

# Or manually run the commands:
# python3 -m venv venv
# source venv/bin/activate
# pip install -r requirements.txt

This will:

  • Create a virtual environment (venv/)
  • Install all required packages
  • Set up a Jupyter kernel specifically for this course

Manual Setup

If you prefer to set up manually:

Prerequisites

  • Python 3.7+ installed on your system
  • pip package manager

Installation

  1. Create a virtual environment:
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  1. Install required packages:
pip install -r requirements.txt
  1. Install Jupyter kernel:
python -m ipykernel install --user --name=data-science-course --display-name="Python (Data Science Course)"

Running the Course

  1. Activate your virtual environment:
source venv/bin/activate  # On Windows: venv\Scripts\activate
  1. Start Jupyter Notebook:
jupyter notebook
  1. In Jupyter, make sure to select the "Python (Data Science Course)" kernel
  2. Start with 01_python_basics.ipynb and work through in order
  3. Execute each cell by pressing Shift+Enter
  4. Complete the practice exercises in each notebook

Deactivating the Environment

When you're done working, deactivate the virtual environment:

deactivate

Course Files & Project Structure

๐Ÿ“ Data-Science-AI-Python-Course/
โ”œโ”€โ”€ ๐Ÿ““ 01_python_basics.ipynb          # Variables, data types, financial calculations
โ”œโ”€โ”€ ๐Ÿ““ 02_control_structures.ipynb     # Conditionals, loops, temperature analysis  
โ”œโ”€โ”€ ๐Ÿ““ 03_lists_data_structures.ipynb  # Lists, indexing, data manipulation
โ”œโ”€โ”€ ๐Ÿ““ 04_dictionaries_advanced.ipynb  # Dictionaries, nested structures, APIs
โ”œโ”€โ”€ ๐Ÿ““ 05_pandas_preview.ipynb         # First taste of data science ecosystem
โ”œโ”€โ”€ ๐Ÿ““ 06_functions_modules.ipynb      # Clean, reusable code practices
โ”œโ”€โ”€ ๐Ÿ““ 07_numpy_fundamentals.ipynb     # Numerical computing, ML foundations
โ”œโ”€โ”€ ๐Ÿ““ 08_matplotlib_basics.ipynb     # Professional data visualization
โ”œโ”€โ”€ ๐Ÿ““ 09_capstone_project.ipynb      # Comprehensive weather analysis
โ”œโ”€โ”€ ๐Ÿ“„ README.md                      # Course overview and instructions
โ”œโ”€โ”€ ๐Ÿ“„ requirements.txt               # Python package dependencies
โ”œโ”€โ”€ ๐Ÿ› ๏ธ setup.sh                       # Automated environment setup
โ”œโ”€โ”€ ๐Ÿ“Š Python Data Science Cheat Sheet.md  # Quick reference guide
โ”œโ”€โ”€ ๐Ÿ“š Course Enhancement Summary.md   # Development notes and features
โ””โ”€โ”€ ๐Ÿ“‹ Python for Data Science - 3 Hour Beginner Course.md  # Detailed curriculum

Notebook Descriptions

Notebook Duration Key Skills Real-World Application
01 Python Basics 20 min Variables, data types, operations Investment portfolio calculations
02 Control Structures 25 min If/else, loops, conditions Temperature analysis & data validation
03 Lists & Data Structures 25 min List operations, indexing, slicing Data manipulation workflows
04 Dictionaries & Advanced 25 min Key-value pairs, nested structures API data handling
05 Pandas Preview 15 min DataFrame basics, data loading Real dataset exploration
06 Functions & Modules 20 min Code organization, reusability Clean data science practices
07 NumPy Fundamentals 25 min Array operations, mathematics Machine learning foundations
08 Matplotlib Basics 25 min Data visualization, plotting Professional reporting
09 Capstone Project 45-60 min All skills combined Complete data analysis

Teaching Notes

For Instructors:

  • Each notebook includes detailed explanations and examples
  • Practice exercises are provided throughout
  • Notebooks build progressively - don't skip ahead
  • Encourage students to experiment with the code
  • Allow extra time for students who need it

Pacing Guidelines:

  • Beginners: May need 4-5 hours total
  • Some programming experience: 3 hours as designed
  • Quick learners: May finish in 2.5 hours

Common Issues:

  • Import errors: Ensure numpy and matplotlib are installed
  • Jupyter issues: Make sure Jupyter is properly installed and running
  • Syntax errors: Emphasize proper indentation in Python

Next Steps After Completion

Students will be ready to:

  1. Understand advanced notebooks with machine learning algorithms
  2. Work with pandas for data manipulation and cleaning
  3. Use scikit-learn for machine learning without syntax confusion
  4. Explore real datasets and perform meaningful analysis
  5. Build their own data science projects with confidence
  6. Read and contribute to open-source data science projects

Troubleshooting

Common Issues:

Jupyter won't start:

pip install --upgrade jupyter
jupyter notebook

Import errors:

pip install numpy matplotlib pandas
# Or reinstall all requirements
pip install -r requirements.txt

Plots not showing:

  • Make sure %matplotlib inline is executed
  • Try restarting the Jupyter kernel
  • Check if matplotlib is properly installed: import matplotlib

Code not working:

  • Check for proper indentation (Python is whitespace-sensitive)
  • Ensure all cells are executed in order
  • Restart kernel and run all cells if needed
  • Clear output and restart: Kernel โ†’ Restart & Clear Output

Virtual environment issues:

# Recreate virtual environment
rm -rf venv
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Permission errors (macOS/Linux):

chmod +x setup.sh
./setup.sh

Slow performance:

  • Close unnecessary browser tabs
  • Restart Jupyter notebook
  • Check available RAM: Activity Monitor (Mac) or Task Manager (Windows)
  • Consider upgrading hardware if below minimum requirements

Command Reference

Task Command Description
Environment Setup python3 -m venv venv Create virtual environment
Activate Environment source venv/bin/activate Activate on macOS/Linux
Activate Environment venv\Scripts\activate Activate on Windows
Install Packages pip install -r requirements.txt Install all dependencies
Start Jupyter jupyter notebook Launch Jupyter interface
Check Python Version python --version Verify Python installation
List Packages pip list Show installed packages
Deactivate deactivate Exit virtual environment

Advanced Troubleshooting

Kernel not found:

python -m ipykernel install --user --name=data-science-course

Package conflicts:

pip install --upgrade pip
pip install -r requirements.txt --force-reinstall

Jupyter extensions not working:

jupyter contrib nbextension install --user
jupyter nbextension enable --py widgetsnbextension

Support and Resources

Getting Help

  • ๐Ÿ› Found a bug? Open an issue on GitHub
  • โ“ Have questions? Check the Discussions section
  • ๐Ÿ’ก Want to contribute? See CONTRIBUTING.md
  • ๐Ÿ“ง Need direct support? Contact course maintainers

Additional Learning Resources:

Practice Datasets:

Advanced Courses & Specializations:

  • ๐Ÿค– Machine Learning: Andrew Ng's ML Course (Coursera)
  • ๐Ÿงฎ Deep Learning: fast.ai Practical Deep Learning
  • ๐Ÿ“Š Data Analysis: Python for Data Science (edX)
  • ๐Ÿ”ฌ Statistics: Statistical Learning (Stanford Online)

Development Tools:

  • IDEs: VS Code, PyCharm, Spyder
  • Cloud Notebooks: Google Colab, Azure Notebooks, AWS SageMaker
  • Version Control: Git and GitHub basics
  • Environment Management: Conda, pipenv, Docker

Community:

  • ๐Ÿ Python Community: r/Python, Python Discord
  • ๐Ÿ“Š Data Science: r/datascience, Kaggle Community
  • ๐Ÿง  Machine Learning: r/MachineLearning, ML Twitter
  • ๐Ÿ’ฌ Stack Overflow: python, pandas, matplotlib tags

Take the Next Step

โญ Star this repo if you find it helpful
๐Ÿด Fork it to customize for your own learning
๐Ÿ’ฌ Share your progress with the community
๐Ÿ“ Contribute improvements and suggestions


๐Ÿ“Š Repository Stats

Repository Size Last Commit Contributors Issues

๐Ÿค Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Contributors โœจ

Thanks to everyone who has contributed to making this course better!

๐Ÿ“‹ Changelog

See CHANGELOG.md for detailed version history and updates.

๐Ÿ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


๐ŸŒŸ Acknowledgments

  • Inspired by the data science community's need for practical Python education
  • Built with feedback from beginners and experienced practitioners
  • Designed to bridge the gap between theory and real-world application

Remember: Every expert was once a beginner. The only difference is they started.

What will you build with your data science skills? ๐Ÿš€


Made with โค๏ธ for the Data Science Community

โฌ† Back to Top