|
26 | 26 | " <b>[1] Reference:</b> Virkud, Y. V., Kelly, R. S., Wood, C., & Lasky-Su, J. A. (2019). The nuts and bolts of omics for the clinical allergist. Annals of Allergy, Asthma and Immunology, 123(6), 558-563.\n",
|
27 | 27 | "</div>\n",
|
28 | 28 | "\n",
|
29 |
| - "The central dogma of molecular biology is the representation of omes and omics. Omic data of various kinds are frequently employed in human medical research. The fields of omic research include the omic data produced by the central dogma, which includes the fields of genomics (DNA), transcriptomics (RNA), proteomics (proteins), and metabolomics (small molecules, including amino acids, fatty acids, carbohydrates, vitamins, lipids, and nucleotides); however, new types of omic data have emerged, including the fields of epigenomics (methyl tags and histones), exposomics (allergens, toxins, diet (bacteria and microorganisms). As a result, the majority of early scientific efforts were devoted to describing the genome, transcriptome, and proteome. However, seven major omics disciplines are currently being investigated in great detail: the genome (DNA), transcriptome (RNA), proteome (proteins), epigenome (DNA modifications that influence expression), metabolome (metabolites), microbiome (microbiota), and exposome (exposures).\n", |
| 29 | + "The omes and their omics can be organized around the central dogma of molecular biology. Omic data of various kinds are frequently used in human medical research. The omic fields that follow directly from the central dogma are genomics (DNA), transcriptomics (RNA), proteomics (proteins), and metabolomics (small molecules, including amino acids, fatty acids, carbohydrates, vitamins, lipids, and nucleotides); newer types of omic data have since emerged, including epigenomics (methyl tags and histones) and exposomics (allergens, toxins, diet, bacteria and microorganisms). Most early scientific efforts were devoted to describing the genome, transcriptome, and proteome. Today, seven major omics disciplines are being investigated in great detail: the genome (DNA), transcriptome (RNA), proteome (proteins), epigenome (DNA modifications that influence expression), metabolome (metabolites), microbiome (microbiota), and exposome (exposures).\n", |
30 | 30 | "\n",
|
31 | 31 | "### <span> Next Generation Sequencing Technique for RNA-Seq. </span>\n",
|
32 | 32 | "\n",
|
|
121 | 121 | "\n",
|
122 | 122 | "A bioinformatics pipeline called nf-core/rnaseq can be used to analyze RNA sequencing data from organisms with an annotated reference genome. The pipeline comprises several stages of analysis. It contains all the analysis steps, starting with preprocessing of the fastq data, followed by genome alignment and quantification. Gene expression levels are generated from mRNA and miRNA sequencing data using RNA-Seq quantification. Next come pseudo-alignment and quantification, post-processing of the data, and final quality control. The different colors of the pipeline represent the different methods of processing the fastq files. For example, the black line represents processing the files with STAR and Salmon for alignment and quantification. The user can choose whichever method they prefer when processing their files. \n",
|
123 | 123 | "\n",
|
124 |
| - "This step is **optional** as it is the preprocessing step to let you experience generating your own gene counts table. To save on computational and storage resources, we have already provided the gene count table with this module that will be copied from our bucket in step 3. The gene counts can also be extracted from the NCBI's GEO website using the same data acccession under the supplementary files section. \n", |
| 124 | + "This step is **optional** as it is the preprocessing step to let you experience generating your own gene counts table. To save on computational and storage resources, we have already provided the gene count table with this module that will be copied from our bucket in step 3. The gene counts can also be extracted from the NCBI's GEO website using the same data accession under the supplementary files section. \n", |
125 | 125 | "\n",
|
126 | 126 | "If, however, you want to try the Nextflow analysis, here are a few tips to help you along. First, if you are not using NIH Cloud Lab as your environment, you need to configure the Nextflow Service Account following [this guide](https://github.com/NIGMS/NIGMS-Sandbox/blob/main/docs/HowToCreateNextflowServiceAccount.md). Second, you will need to configure your config file to point to Google Batch. We provide a template that you can modify with your GCP bucket (you need to create one with `gsutil mb gs://UNIQUE-BUCKET-NAME`) and your project ID. Again, you only need to create the service account if not using NIH Cloud Lab. For further details on how to use Nextflow for RNA-Seq analysis, please refer to [nf-core/rnaseq](https://nf-co.re/rnaseq) or the [Transcriptome-Assembly-Refinement-and-Applications](https://github.com/NIGMS/Transcriptome-Assembly-Refinement-and-Applications) module to learn more about pre-processing through Nextflow."
|
127 | 127 | ]
|
|
233 | 233 | "outputs": [],
|
234 | 234 | "source": [
|
235 | 235 | "system('export NXF_MODE=google') \n",
|
236 |
| - "#Install nexflow, make it exceutable, and update it\n", |
| 236 | + "#Install Nextflow, make it executable, and update it\n", |
237 | 237 | "system('curl https://get.nextflow.io | bash' , intern=TRUE)\n",
|
238 | 238 | "system('chmod +x nextflow' , intern=TRUE)\n",
|
239 | 239 | "system('./nextflow self-update' , intern=TRUE)"
|
|
275 | 275 | "source": [
|
276 | 276 | "<div class=\"alert alert-block alert-info\">\n",
|
277 | 277 | " <i class=\"fa fa-lightbulb-o\" aria-hidden=\"true\"></i>\n",
|
278 |
| - " <b>Tip: </b> If you don't immediately see a output on your screen check your output directory you have pointed to in your config file to insure that Nextflow is running. You should see some output directories/files.\n", |
| 278 | + " <b>Tip: </b> If you don't immediately see any output on your screen, check the output directory you pointed to in your config file to ensure that Nextflow is running. You should see some output directories/files.\n", |
279 | 279 | "</div>"
|
280 | 280 | ]
|
281 | 281 | },
|
|
340 | 340 | "source": [
|
341 | 341 | "<div class=\"alert alert-block alert-success\">\n",
|
342 | 342 | " <i class=\"fa fa-hand-paper-o\" aria-hidden=\"true\"></i>\n",
|
343 |
| - " <b>Note: </b> If you've used Nextflow to produce your gene counts table and would like to use it for the down processing analysis instead of the provided counts table enter your own files into the code above by copying the <b>salmon.merged.gene_counts.tsv</b> from the salmon subdirectory within your Nextflow output directory.\n", |
| 343 | + " <b>Note: </b> If you've used Nextflow to produce your gene counts table and would like to use it for the downstream analysis instead of the provided counts table, enter your own file into the code above by copying <b>salmon.merged.gene_counts.tsv</b> from the salmon subdirectory within your Nextflow output directory.\n", |
344 | 344 | "</div>"
|
345 | 345 | ]
|
346 | 346 | },
|
|
406 | 406 | "outputs": [],
|
407 | 407 | "source": [
|
408 | 408 | "DESeq.ds <- DESeq.ds[ rowSums(counts(DESeq.ds)) > 0, ]\n",
|
409 |
| - "#Inspect data after manupalation\n", |
| 409 | + "#Inspect data after manipulation\n", |
410 | 410 | "rowSums(counts(DESeq.ds)) %>% head\n",
|
411 | 411 | "colSums(readcounts)\n",
|
412 | 412 | "colSums(counts(DESeq.ds)) \n",
|
|
776 | 776 | }
|
777 | 777 | ],
|
778 | 778 | "metadata": {
|
| 779 | + "kernelspec": { |
| 780 | + "display_name": "conda_python3", |
| 781 | + "language": "python", |
| 782 | + "name": "conda_python3" |
| 783 | + }, |
779 | 784 | "language_info": {
|
780 |
| - "name": "python" |
| 785 | + "codemirror_mode": { |
| 786 | + "name": "ipython", |
| 787 | + "version": 3 |
| 788 | + }, |
| 789 | + "file_extension": ".py", |
| 790 | + "mimetype": "text/x-python", |
| 791 | + "name": "python", |
| 792 | + "nbconvert_exporter": "python", |
| 793 | + "pygments_lexer": "ipython3", |
| 794 | + "version": "3.10.16" |
781 | 795 | }
|
782 | 796 | },
|
783 | 797 | "nbformat": 4,
|
|