Fix exercises typos #7

Open
wants to merge 2 commits into
base: main
Original file line number Diff line number Diff line change
@@ -16,7 +16,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 1\n",
"##### Exercise 1\n",
"Use the adult.csv dataset and run the codes shown in the following Screenshots. Then answer the questions."
]
},
@@ -70,7 +70,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 2 \n",
"##### Exercise 2 \n",
"\n",
"For adult_df use the .groupby() function to run the following code and create the multi-index Series mlt_sr."
]
@@ -295,7 +295,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 3\n",
"##### Exercise 3\n",
"For this exercise you need to use a new dataset: billboard.csv. Visit https://www.billboard.com/charts/hot-100 and see the latest song rankings of the day. This dataset presents information and ranking of 317 song tracks in 80 columns. The first four columns are artist, track, time, and date_e. The first columns are intuitive descriptions of song tracks. The column date_e shows the date that the songs entered the hot-100 list. The rest of 76 columns are songs ranking at the end of each weeks from 'w1' to 'w76'. Download and read this dataset using pandas and answer the following questions."
]
},
@@ -431,7 +431,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 4 \n",
"##### Exercise 4 \n",
"\n",
"We will use LaqnData.csv for this exercise. Each row of this dataset shows an hourly measurement recording of one of the five following air pollutants: NO, NO2, NOX, PM10, and PM2.5. The data was collected in a location in Londan for the entirety of year 2017. Read the data using Pandas and perform the following tasks."
]
@@ -653,7 +653,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 5 \n",
"##### Exercise 5 \n",
"\n",
"We will continue working with LaqnData.csv. \n",
"\n",
Original file line number Diff line number Diff line change
@@ -25,7 +25,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 1\n",
"##### Exercise 1\n",
"Use adult.csv and Boolean Masking to answer the following questions. "
]
},
@@ -242,7 +242,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 2 \n",
"##### Exercise 2 \n",
" a)\tRepeat the analysis on Exercise 1. a), but this time use groupby function. \n",
" b)\tb) compare the runtime of using BM vs. groupby. (hint: you can import the module time and use the fuction .time()) \n"
]
@@ -265,7 +265,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 3 \n",
"##### Exercise 3 \n",
"\n",
" If you have not already, solve exercise 4 in the previous chapter. After you created pvt_df for Exercises 4, run the following code.\n"
]
Original file line number Diff line number Diff line change
@@ -16,7 +16,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 1\n",
"##### Exercise 1\n",
"1)\tFrom 5 colleagues or classmates ask to provide a definition for the term data. \n",
"\n",
" a)\tReport these definitions and indicate the similarity among them. \n",
@@ -36,7 +36,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 2\n",
"##### Exercise 2\n",
"\n",
"For this exercise, we are going to use covid_impact_on_airport_traffic.csv. Answer the following questions. This dataset is from Kaggle.com, use this link to see its page: https://www.kaggle.com/terenceshin/covid19s-impact-on-airport-traffic.\n",
"The key attribute of this dataset is PercentOfBaseline which shows the ratio of air traffic in the specific day compared to pre-pandemic time (1st Feb to 15th March 2020)"
@@ -335,7 +335,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 3 \n",
"##### Exercise 3 \n",
"\n",
"For this exercise, we are going to use US_Accidents.csv. Answer the following questions. This dataset is from Kaggle.com, use this link to see its page: https://www.kaggle.com/sobhanmoosavi/us-accidents.\n",
"This dataset shows all the car accidents in the US from February 2016 to Dec 2020. \n",
@@ -769,7 +769,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 4 \n",
"##### Exercise 4 \n",
"\n",
"For this exercise, we are going to use fatal-police-shootings-data.csv. There are a lot of debates, discussions, dialogues, and protests happening in the US surrounding police killings. The Washington Post has been collecting data on all fatal police shootings in the US. The dataset available to the government and the public alike has date, age, gender, race, location, and other situational information of these fatal police shootings. You can read more about this data on https://www.washingtonpost.com/graphics/investigations/police-shootings-database/, and you can download the last version of the data from https://github.com/washingtonpost/data-police-shootings"
]
@@ -980,7 +980,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 5\n",
"##### Exercise 5\n",
"For this exercise, we will be using electricity_prediction.csv. The screenshot below shows the 5 rows of this dataset and a linear regression model created to predict electricity consumption based on the weekday and daily average temperature. "
]
},
@@ -1137,7 +1137,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 6\n",
"##### Exercise 6\n",
"For this exercise, we will be using adult.csv. we used this dataset extensively in chapter 1. Read the dataset using Padans and call it adult_df."
]
},
Original file line number Diff line number Diff line change
@@ -16,7 +16,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 1\n",
"##### Exercise 1\n",
"In your own words, describe the difference between a dataset and a database. \n"
]
},
@@ -31,7 +31,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 2\n",
"##### Exercise 2\n",
"What are the advantages and disadvantages of structuring data for a relational database? Mention at least two advantages and two disadvantages. Use examples to elucidate. "
]
},
@@ -54,7 +54,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 3 \n",
"##### Exercise 3 \n",
"\n",
"In this chapter, we were introduced to 4 different types of databases: relational databases, unstructured databases, distributed databases, and blockchain. \n",
"\n",
@@ -118,7 +118,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 4 \n",
"##### Exercise 4 \n",
"In this chapter, we were introduced to five different methods of connecting to databases: direct connection, webpage connection, API connection, request connection, and publicly shared. Use the following table to indicate a ranking for each of the five methods of connecting to databases based on the specified criteria. Study the rankings and provides reasoning for why they are correct."
]
},
@@ -176,7 +176,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 5\n",
"##### Exercise 5\n",
"Using the Chinook database as a sample, we want to investigate and find an answer to the following question: Do tracks that are titled using positive words sell better on average than tracks that are titled with negative words. We would like to only focus on the following words in the investigations. \n",
"\n",
"- List of negative words: ['Evil', 'Night', 'Problem', 'Sorrow', 'Dead', 'Curse', 'Venom', 'Pain', 'Lonely', 'Beast']\n",
@@ -246,7 +246,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Excercise 6\n",
"##### Exercise 6\n",
"In the year 2020, which of the following 12 stocks experienced the highest growth. \n",
"\n",
"Stocks: [‘Baba’, ‘NVR’, ‘AAPL’, ‘NFLX’, ‘FB’, ‘SBUX’, ‘NOW’, ‘AMZN’, ‘GOOGL’, ‘MSFT’, ‘FDX’, ‘TSLA’]\n",
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
" AUTHOR: Dr. Roy Jafari \n",
"\n",
"### Chapter 5: Data Visualization \n",
"#### Excercises"
"#### Exercises"
]
},
{
@@ -30,7 +30,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 1\n",
"# Exercise 1\n",
"In this exercise, we will be using Universities_imputed_reduced.csv. Draw the following described visualizations.\n",
"\n",
" a.\tUse boxplots to compare the student to faculty ratio (stud./fac. ratio) for the two population public and private universities.\n",
@@ -233,7 +233,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 2\n",
"# Exercise 2\n",
"\n",
"In this exercise, we will continue using Universities_imputed_reduced.csv. Draw the following described visualizations.\n",
"\n",
@@ -288,7 +288,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 3\n",
"# Exercise 3\n",
"\n",
"For this example, we will be using WH Report_preprocessed.csv. Draw the following described visualizations.\n",
"\n",
@@ -352,7 +352,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 4\n",
"# Exercise 4\n",
"\n",
"For this exercise, we will continue using WH Report_preprocessed.csv. Draw the following described visualizations.\n",
"\n",
@@ -392,7 +392,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 5\n",
"# Exercise 5\n",
"\n",
"For this exercise, we will be using whickham.csv. Draw the following described visualizations.\n",
"\n",
@@ -587,7 +587,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 6\n",
"# Exercise 6\n",
"\n",
"For this exercise, we will be using WH Report_preprocessed.csv. \n",
"\n",
@@ -637,7 +637,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 7\n",
"# Exercise 7\n",
"\n",
"For this exercise, we will continue using WH Report_preprocessed.csv. \n",
"\n",
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
" AUTHOR: Dr. Roy Jafari \n",
"\n",
"### Chapter 6: Prediction \n",
"#### Excercises"
"#### Exercises"
]
},
{
@@ -30,7 +30,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 1\n",
"# Exercise 1\n",
"“MLP has the potential to create prediction models that are more accurate than predictions models that are created by linear regression.” This statement is generally correct. In this exercise, we want to explore one of the reasons why the statement is correct. Answer the following questions.\n",
"\n",
" a) The following formula shows the linear equation that we used to connect the dependent and independent attributes of the MSU number of applications problem. Count and report the number of coefficients that Linear Regression can play with to fit the equation to the data. \n",
@@ -53,7 +53,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 2\n",
"# Exercise 2\n",
"2.\tIn this exercise, we will be using ToyotaCorolla_preprocessed.csv. This dataset has the following columns: Age, Milage_KM, Quarterly_Tax, Weight, \tFuel_Type_CNG, Fuel_Type_Diesel, Fuel_Type_Petrol, and Price. Each data object in this dataset is a used Toyota Corolla car. We would like to use this dataset to predict the price of used Toyota Corolla cars. \n"
]
},
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
" AUTHOR: Dr. Roy Jafari \n",
"\n",
"### Chapter 7: Classification \n",
"#### Excercises"
"#### Exercises"
]
},
{
@@ -30,7 +30,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 1\n",
"# Exercise 1\n",
"The chapter asserts that before using KNN you will need to have your independent attributes normalized. This is certainly true, but how come we were able to get away with no-normalization when we performed KNN using visualization? See Figure 7.3. \n"
]
},
@@ -45,7 +45,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 2\n",
"# Exercise 2\n",
"We did not normalize the data when applying the Decision Tree to the Loan Application problem. For practice and deeper understanding, apply the Decision Tree to the normalized data, and answer the following questions. "
]
},
@@ -88,7 +88,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 3\n",
"# Exercise 3\n",
"For this exercise, we are going to use the Customer Churn.csv. This dataset is randomly collected from an Iranian telecom company’s database over a period of 12 months. A total of 3150 rows of data, each representing a customer, bear information for 13 columns. The attributes that are in this dataset are listed below:\n",
" \n",
" Call Failures: number of call failures\n",
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
" AUTHOR: Dr. Roy Jafari \n",
"\n",
"### Chapter 8: Clustering Analysis\n",
"#### Excercises"
"#### Exercises"
]
},
{
@@ -29,7 +29,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 1\n",
"# Exercise 1\n",
"In your own words, answer the following two questions. Use at most 200 words, to answer each question.\n",
"\n",
" a.\tWhat is the difference between Classification and Prediction?\n",
@@ -47,7 +47,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 2\n",
"# Exercise 2\n",
"Consider Figure 8.6 regarding the necessity of normalization before performing Clustering analysis. With this new appreciation you developed in this chapter, would you like to change your answer to the first exercise question from the previous chapter?\n"
]
},
@@ -62,7 +62,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 3\n",
"# Exercise 3\n",
"In this chapter, we used WH Report_preprocessed.csv to form meaningful clusters of countries only using 2019 data. In this exercise, we want to use the data of all the years 2010-2019. Perform the following steps to do this."
]
},
@@ -126,7 +126,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 4\n",
"# Exercise 4\n",
"For this exercise we will be using the dataset Mall_Customers.xlsx to form 4 meaningful clusters of customers. The following steps will help you to do this correctly. "
]
},
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
" AUTHOR: Dr. Roy Jafari \n",
"\n",
"### Chapter 9: Data Cleaning - Levels Ⅰ and Ⅱ \n",
"#### Excercises"
"#### Exercises"
]
},
{
@@ -28,7 +28,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 1\n",
"# Exercise 1\n",
"In your own words describe the relationship between analytics goals and data cleaning. Your response should answer the following questions.\n",
"\n",
" a.\t Is data cleaning a separate step of data analytics and can be done in isolation? In other words, can data cleaning be performed without knowing about the analytics?\n",
@@ -47,7 +47,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Excercise 2\n",
"# Exercise 2\n",
"\n",
"A local airport to analyze the usage of its parking has employed a Single Beam Infrared Detector (SBID) technology to count the number of people who pass the gate from the parking to the airport. \n",
"As shown in the following figure, an SBDI records the time every time the infrared connection is blocked signaling the entrance or the exit of a passenger."