🧬 Omics Data Analysis: A Computational Science Workshop

The omics technology revolution has generated massive volumes of biological data that require differential analysis for correct interpretation. Deriving biological meaning from such data, across a variety of biological contexts, requires dedicated computational tools and methodologies.

This 32-hour intensive course offers practical and up-to-date training in data science applied to the analysis of omics data, such as metagenomics, transcriptomics, and proteomics.

The primary objective is to train researchers, bioinformaticians, and professionals in the biological and health sciences to use omics data analysis tools and methodologies to extract meaningful information. Through interactive lectures, practical exercises, and work with real data, participants will develop the skills to explore, visualize, and interpret omics data, and to apply biological network models to questions about those data.

The course is structured over four days, beginning with an introduction to the fundamentals of data science and the particular characteristics of omics data. Topics covered include processing techniques; analysis of metagenomic, transcriptomic, and proteomic data; visualization; functional enrichment; and multi-omics integration. A dedicated module covers the application, analysis, and visualization of biological networks derived from the data analyzed during the course.
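As a rough illustration of what "a biological network derived from data" can mean, the sketch below links features whose measurements are strongly correlated across samples and then counts each feature's connections. It is not taken from the course notebooks: the abundance values, the feature names, the `build_network` helper, and the 0.9 threshold are all hypothetical, and correlation thresholding is only one of several ways to derive a network.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy table: feature -> measurements across five samples.
abundance = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_b": [2.1, 3.9, 6.2, 8.0, 9.9],
    "gene_c": [5.0, 4.1, 2.9, 2.0, 1.1],
    "gene_d": [3.0, 1.0, 4.0, 1.0, 5.0],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def build_network(table, threshold=0.9):
    """Return an adjacency dict linking features with |r| >= threshold."""
    network = {name: set() for name in table}
    for a, b in combinations(table, 2):
        if abs(pearson(table[a], table[b])) >= threshold:
            network[a].add(b)
            network[b].add(a)
    return network


network = build_network(abundance)

# Degree = number of neighbours; a simple first measure of connectivity.
degree = {name: len(neighbours) for name, neighbours in network.items()}
print(degree)
```

In this toy table, `gene_a`, `gene_b`, and `gene_c` end up mutually connected (including the negative correlations, since the absolute value is thresholded), while `gene_d` stays isolated.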

The course culminates with a module where participants will apply everything learned to a real-world case study, working with public data. The practical sessions will be conducted in Python, utilizing Jupyter Notebooks and other visualization tools.

Keywords

Computational microbiology, networks, databases, Python, programming, data, pipelines, data science.

Syllabus

Time	DAY 1	DAY 2	DAY 3	DAY 4
8:00-8:45	Introduction and Housekeeping	Introduction to Python I	Introduction to Networks in Python	Analysing Networks I
8:45-9:30	From Omics to Multi-omics	Introduction to Python II	Introduction to Networks in Python	Analysing Networks II
9:30-10:00	Coffee break	Coffee break	Coffee break	Coffee break
10:00-10:45	Introduction to Networks	Working with Data in Python I	Visualising Networks -- Cytoscape	Invited Speaker - Professor Carlos Muskus
10:45-11:30	Open Science	Working with Data in Python II	Visualising Networks -- Cytoscape	Multi-omics
11:30-13:00	Lunch	Lunch	Lunch	Lunch
13:00-13:45	Standardising Omics Workflows with Nextflow	Visualizing Data in Python I	Omics: Transcriptomics	Multi-omics I
13:45-14:30	Omics: Metagenomics	Visualizing Data in Python II	Preprocessing Transcriptomics with nf-core/RNAseq	Multi-omics II
14:30-15:00	Coffee break	Coffee break	Coffee break	Coffee break
15:00-16:00	Preprocessing Metagenomics with nf-core/Taxprofiler	Metagenomics Basic Analysis I	Transcriptomics Basic Analysis I	Recap and Q&A
16:00-17:00	Recap and Q&A	Metagenomics Basic Analysis II	Transcriptomics Basic Analysis II	Recap and Q&A

Further Resources

References

Empowering bioinformatics communities with Nextflow and nf-core. Björn E Langer, Andreia Amaral, Marie-Odile Baudement, Franziska Bonath, Mathieu Charles, Praveen Krishna Chitneedi, Emily L Clark, Paolo Di Tommaso, Sarah Djebali, Philip A Ewels, Sonia Eynard, James A Fellows Yates, Daniel Fischer, Evan W Floden, Sylvain Foissac, Gisela Gabernet, Maxime U Garcia, Gareth Gillard, Manu Kumar Gundappa, Cervin Guyomar, Christopher Hakkaart, Friederike Hanssen, Peter W Harrison, Matthias Hörtenhuber, Cyril Kurylo, Christa Kühn, Sandrine Lagarrigue, Delphine Lallias, Daniel J Macqueen, Edmund Miller, Júlia Mir-Pedrol, Gabriel Costa Monteiro Moreira, Sven Nahnsen, Harshil Patel, Alexander Peltzer, Frederique Pitel, Yuliaxis Ramayo-Caldas, Marcel da Câmara Ribeiro-Dantas, Dominique Rocha, Mazdak Salavati, Alexey Sokolov, Jose Espinosa-Carrasco, Cedric Notredame, The Nf-Core Community.
nf-core/taxprofiler. Sofia Stamouli, Moritz E. Beber, Tanja Normark, Thomas A. Christensen II, Lili Andersson-Li, Maxime Borry, Mahwash Jamy, nf-core community, James A. Fellows Yates.
nf-core/rnaseq. Philip A. Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso, Sven Nahnsen.
quantms: a cloud-based pipeline for quantitative proteomics enables the reanalysis of public proteomics data. Chengxin Dai, Julianus Pfeuffer, Hong Wang, Ping Zheng, Lukas Käll, Timo Sachsenberg, Vadim Demichev, Mingze Bai, Oliver Kohlbacher, Yasset Perez-Riverol.
A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches. Ana R Baião, Zhaoxiang Cai, Rebecca C Poulos, Phillip J Robinson, Roger R Reddel, Qing Zhong, Susana Vinga, Emanuel Gonçalves.
Scikit-Bio: a community-driven Python library for bioinformatics, providing versatile data structures, algorithms and educational resources for biology.
HMDB 5.0: the Human Metabolome Database for 2022. David S Wishart, AnChi Guo, Eponine Oler, Fei Wang, Afia Anjum, Harrison Peters, Raynard Dizon, Zinat Sayeeda, Siyang Tian, Brian L Lee, Mark Berjanskii, Robert Mah, Mai Yamamoto, Juan Jovel, Claudia Torres-Calzada, Mickel Hiebert-Giesbrecht, Vicki W Lui, Dorna Varshavi, Dorsa Varshavi, Dana Allen, David Arndt, Nitya Khetarpal, Aadhavya Sivakumaran, Karxena Harford, Selena Sanford, Kristen Yee, Xuan Cao, Zachary Budinski, Jaanus Liigand, Lun Zhang, Jiamin Zheng, Rupasri Mandal, Naama Karu, Maija Dambrova, Helgi B Schiöth, Russell Greiner, Vasuk Gautam.
MicroPhenoDB Associates Metagenomic Data with Pathogenic Microbes, Microbial Core Genes, and Human Disease Phenotypes. Guocai Yao, Wenliang Zhang, Minglei Yang, Huan Yang, Jianbo Wang, Haiyue Zhang, Lai Wei, Zhi Xie, Weizhong Li.
The National Microbiome Data Collaborative: enabling microbiome science. Elisha M Wood-Charlson, Anubhav, Deanna Auberry, Hannah Blanco, Mark I Borkum, Yuri E Corilo, Karen W Davenport, Shweta Deshpande, Ranjeet Devarakonda, Meghan Drake, William D Duncan, Mark C Flynn, David Hays, Bin Hu, Marcel Huntemann, Po-E Li, Mary Lipton, Chien-Chi Lo, David Millard, Kayd Miller, Paul D Piehowski, Samuel Purvine, T B K Reddy, Migun Shakya, Jagadish Chandrabose Sundaramurthi, Pajau Vangay, Yaxing Wei, Bruce E Wilson, Shane Canon, Patrick S G Chain, Kjiersten Fagnan, Stanton Martin, Lee Ann McCue, Christopher J Mungall, Nigel J Mouncey, Mary E Maxon, Emiley A Eloe-Fadrosh.

Cheat Sheets

Basics:
Data Science:
- NumPy
- pandas
- SciPy
- scikit-learn
Visualization:
- Matplotlib
- Plot.ly
- Seaborn
- Bokeh

Basics

learnpython.org
- interactive Python basics tutorial
Springboard - Data Analysis with Python, SQL, and R
- starts with Solo Learn and Design of Computer Programs
Scipy Lectures
- Python introduction with a focus on scientific computing
Official Python tutorial

Python Installations

In this course we use Google Colab to execute notebooks. Notebooks are text files that combine text, code, and the output of that code. Colab comes with an extended set of pre-installed tools. See the tutorial series.
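To make "notebooks are text files" concrete, the sketch below builds a minimal notebook as a Python dict, serialises it to the JSON text that an `.ipynb` file contains, and reads it back to list each cell's type, source, and stored output. The cell contents and the `summarise` helper are made up for illustration. Real notebooks written by Colab or Jupyter carry additional metadata.

```python
import json

# A minimal notebook in the nbformat 4 layout: one text cell, one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Toy analysis\n"],
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "source": ["print(1 + 1)"],
            # The output is stored next to the code that produced it.
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["2\n"]}
            ],
        },
    ],
}

# This string is what would be saved on disk as an .ipynb file.
ipynb_text = json.dumps(notebook, indent=1)


def summarise(nb):
    """Return (cell_type, source, stream output text) for every cell."""
    rows = []
    for cell in nb["cells"]:
        source = "".join(cell["source"])
        output = "".join(
            "".join(out.get("text", []))
            for out in cell.get("outputs", [])
            if out.get("output_type") == "stream"
        )
        rows.append((cell["cell_type"], source, output))
    return rows


for row in summarise(json.loads(ipynb_text)):
    print(row)
```

Because the file is plain JSON, notebooks can be opened in any text editor and tracked with Git, although stored outputs tend to make diffs noisy.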

For your own computer, Anaconda offers an extended installation that includes most of the tools you will ever need for Python.

Acknowledgements

Some of the slides and notebooks have been inspired by or reused from the Data Science Platform at the Informatics Platform of the Novo Nordisk Foundation Center for Biosustainability at the Technical University of Denmark. Other relevant courses can be found in the Biosustain GitHub (e.g., R viz, Python viz, Nextflow training, Proteomics, Transcriptomics, Metagenomics, Bash, ...).

Some notebooks have been inspired by the course Python Tsunami at the Center for Health Data Science (HeaDS) at the University of Copenhagen.
