YashMotwani/We-Rate-Dogs-Data-Wrangling

We Rate Dogs - Data Wrangling

Introduction

Real-world data rarely comes clean. The goal of this project is to wrangle WeRateDogs Twitter data so that it supports interesting and trustworthy analyses and visualizations. Using Python and its libraries, I gathered data from a variety of sources and in a variety of formats, assessed its quality and tidiness, and then cleaned it.
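As one illustration of handling data "in a variety of formats", the sketch below stores tweet records as one JSON object per line and reads them back. It is a minimal example, not the code used in this repository: the file name `tweet_json.txt`, the helper names, and the field names in the sample records are all assumptions made for illustration.

```python
import json
import os
import tempfile


def write_json_lines(records, path):
    """Write each record as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")


def read_json_lines(path):
    """Read a JSON-lines file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Made-up sample records; the field names are assumptions, not real data.
sample = [
    {"tweet_id": 1, "retweet_count": 10, "favorite_count": 25},
    {"tweet_id": 2, "retweet_count": 3, "favorite_count": 8},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tweet_json.txt")  # hypothetical file name
    write_json_lines(sample, path)
    loaded = read_json_lines(path)

print(loaded == sample)
```

A list of dicts like `loaded` can then be passed to `pandas.DataFrame(...)` for the assessing and cleaning steps.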

Along with documenting my wrangling efforts, I have showcased the results through analyses and visualizations built with Python and its libraries.

The dataset that I have wrangled (and analyzed and visualized) is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received international media coverage.
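The numerator/denominator format described above can be pulled out of tweet text with a regular expression. The following is a minimal sketch under stated assumptions: the pattern, the function name `extract_rating`, and the sample tweet texts are hypothetical and are not taken from this repository's notebook.

```python
import re

# Matches ratings such as "13/10" or "9.75/10" (assumed pattern, not the author's).
RATING_PATTERN = re.compile(r"(\d+(?:\.\d+)?)/(\d+)")


def extract_rating(text):
    """Return (numerator, denominator) for the first rating found, or None."""
    match = RATING_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


# Made-up tweet texts for illustration only.
print(extract_rating("This is Bella. 13/10 would pet"))
print(extract_rating("No rating in this one"))
```

Because only the first match is returned, text such as "open 24/7" appearing before the real rating would be misread, which is one reason extracted ratings need to be assessed before they are trusted.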

Software needed:

You will need an installation of Python, plus the following libraries:

  1. pandas
  2. NumPy
  3. requests
  4. tweepy
  5. json (part of the Python standard library, so no separate installation is needed)

You will also need:

  • A text editor, like VS Code or Atom.
  • A terminal application (Terminal on Mac and Linux or Cygwin on Windows).
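To confirm that the listed libraries are available before running the project, a quick check along these lines may help. It is a sketch only: the helper name `missing_modules` is hypothetical, and the import names (for example `numpy` for NumPy) are the standard ones for these packages.

```python
import importlib.util

# Import names for the libraries listed above.
REQUIRED = ["pandas", "numpy", "requests", "tweepy", "json"]


def missing_modules(names):
    """Return the subset of module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

Any library reported as missing can typically be installed with `pip install <name>`.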

Installation links for software:

Summary:

The report is summarized in the following two files in this repository:

  • For a brief overview of the data wrangling process, see wrangle_report.html
  • For visualizations and key insights, see act_report.pdf

About

Report emphasizing the wrangling efforts for the WeRateDogs tweet archive data
