This project explores the geopolitical relationships between countries during the 20th century. Using web scraping and natural language processing (NLP) techniques, the analysis visualizes interrelations and major events to provide insights into how historical ties shaped the current global political landscape.
The goal is to scrape historical data from Wikipedia, analyze it using NLP to identify relationships between countries, and present the findings through a dynamic network visualization.
- Mapped key geopolitical interactions throughout the 20th century.
- Identified central countries in the global network using degree, closeness, and betweenness centrality measures.
- Highlighted major historical milestones and their impact on international relations.
- Data Collection: Web scraping (BeautifulSoup, requests)
- Text Analysis: Natural Language Processing (NLTK, SpaCy)
- Visualization: NetworkX, Matplotlib, Plotly
- Environment: Python, Jupyter Notebook
Data was scraped from Wikipedia pages on 20th-century history and politics. Relationships were extracted using NLP algorithms to create a network graph of country interactions.
- Web Scraping:
Extracted relevant historical information from Wikipedia pages. - NLP Analysis:
Processed textual data to identify countries involved in major global events. - Network Visualization:
Developed an interactive network graph that visualizes country relationships and distinguishes between different communities and interactions.
- Clone the repository.
- Ensure Python and the required libraries are installed (see
requirements.txt
). - Run the Jupyter Notebook to execute the scraping, analysis, and visualization steps.
- Dynamic Network Graph: A clear, interactive visualization of 20th-century country relations.
- Analysis Report: A summary of key relationships and insights derived from the data.
- The analysis is based solely on data from Wikipedia, which may have inherent biases or missing details.
- The NLP approach relies on keyword extraction and may miss nuanced interactions.
- Expand the dataset by including additional historical sources.
- Apply more advanced NLP techniques, such as named entity recognition (NER) and sentiment analysis, for deeper insights.