FIFA World Cup Grouping and Prediction

In this work, we predict the FIFA World Cup matches and attempt to propose a method to generate ideal grouping of teams for a balanced tournament using past match results and rankings which can be used to faciliate a tournament end-to-end. In summary, we use the data to generate and extract relevant features, and then use multiple supervised techniques to predict winner of a match. We implement multiple algorithms and do a thorough comparison of each of them. Apart from real data, we also explore creating fictitious matches and use semi-supervised learning in an attempt to improve the models. Alongside match predictions, we also use unsupervised clustering techniques to create groups that can facilitate a good tournament. Through these two processes, we create an end-to-end tool that can take in participating teams, build groups, predict results of matches and ultimately, predict a complete tournament.

Please find the full report which contains detailed description and results here - Report

Team Members - Ananya Sharma, Snigdha Verma, Apoorva Sinha, Yu-Chen Lin, Neelabh Sinha

Datasets

Results on Base Models

We analyze the performance of the various classification schemes on our dataset using all standard metrics of evaluating a supervised learning algorithm. We primarily want to optimize the accuracy of the prediction, but since we have a balanced dataset and no bias towards avoiding either true negatives or false positives, we treat them fairly. Our final results are as shown below:

Technique	Accuracy	Precision	Recall	F-1 score	ROC-AUC
Logistic Regression	73.49%	73.59%	73.49%	73.49%	0.81
SVM	73.38%	73.51%	73.38%	73.37%	0.81
Decision Tree	70.29%	70.46%	70.29%	70.26%	0.77
kNN	71.41%	71.51%	71.41%	71.40%	0.80
Gaussian Naive Bayes	72.54%	72.54%	72.54%	72.53%	0.80
Random Forest	71.69%	71.81%	71.69%	71.68%	0.80
Gradient Boosting	71.52%	72.06%	71.52%	71.41%	0.80
Adaptive Boost (base model - logistic regression)	73.16%	73.30%	73.16%	73.14%	0.81

Please review the report linked above for a detailed result and other experiments conducted including using synthetic data and self-supervised learning.

Usage Guide

Installation

Clone the repository:

git clone https://github.com/neelabhsinha/fifa-world-cup-prediction-ml.git
cd fifa-world-cup-prediction-ml

Install the required dependencies:
```
pip install -r requirements.txt
```

Arguments

To run the script, you need to use Python's argparse to specify the task and various options. Below are the tasks and arguments available.

Tasks

train: Train a machine learning model.
preprocess: Preprocess the data and generate features.
simulate_tournament: Simulate a FIFA World Cup tournament.
generate_artificial_data: Generate artificial match data.

Arguments

--task: Specifies the task to perform. Choices are train, preprocess, simulate_tournament, and generate_artificial_data.
--model: Specifies the model to use for training and prediction. Choices are defined in the models list.
--tune: If provided, tunes the hyperparameters for the model.
--do_pca: If provided, performs PCA on the data (optional).
--select_features: If provided, generates features from the data automatically.
--simulate_tournament: If provided, simulates a tournament end-to-end (task is not required in this case).
--use_artificial_data: If provided, uses artificial data for training. If used, generate the artificial match data

Example Commands

To train a model, use the following command:

python script_name.py --task train --model model_name --do_pca --select_features --use_artificial_data

To preprocess data and generate features,

python script_name.py --task preprocess

To generate artificial match data for self-supervised learning, do the following -

python script_name.py --task generate_artificial_data

Name		Name	Last commit message	Last commit date
Latest commit History 65 Commits
checklists		checklists
data		data
feature		feature
metric		metric
model		model
model_hyperparameters		model_hyperparameters
model_parameters		model_parameters
results		results
utils		utils
.DS_Store		.DS_Store
.gitignore		.gitignore
README.md		README.md
const.py		const.py
env_ml_project.yaml		env_ml_project.yaml
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

FIFA World Cup Grouping and Prediction

Datasets

Results on Base Models

Usage Guide

Installation

Arguments

Tasks

Arguments

Example Commands

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 3

Uh oh!

Languages

neelabhsinha/fifa-world-cup-prediction-ml

Folders and files

Latest commit

History

Repository files navigation

FIFA World Cup Grouping and Prediction

Datasets

Results on Base Models

Usage Guide

Installation

Arguments

Tasks

Arguments

Example Commands

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

Packages