tjiagoM
diff --git a/14_track_hub.ipynb b/14_track_hub.ipynb
Lines changed: 9 additions & 7 deletions
diff --git a/track_hub/hg19.chrom.sizes b/track_hub/hg19.chrom.sizes
Lines changed: 0 additions & 93 deletions
@@ -13,11 +13,13 @@
     " - `communities/`: Contains all the files that will be generated following the code in this notebook.\n",
     " - `bedToBigBed`: Program to convert .bed to bigBed format as explained here: https://genome.ucsc.edu/goldenPath/help/bigBed.html\n",
     " - `genomes.txt` / `hub.txt`: Files needed for Tracking Hub\n",
-    " - `hg19.chrom.size`: File needed to execute this notebook, downloaded from https://genome.ucsc.edu/goldenPath/help/bigBed.html\n",
+    " - `hg38.chrom.size`: File needed to execute this notebook, downloaded from https://github.com/igvteam/igv/blob/master/genomes/sizes/hg38.chrom.sizes\n",
     " - `process_beds.sh`: Script that will convert all .bed files to bigBed format, deleting all .bed files. In practice, it executes Example \\#2 from this link: https://genome.ucsc.edu/goldenPath/help/bigBed.html\n",
     "\n",
     "\n",
-    "Link provided to Track Hub: https://raw.githubusercontent.com/tjiagoM/gtex-transcriptome-modelling/master/track_hub/hub.txt"
+    "The link provided to Track Hub is: https://raw.githubusercontent.com/tjiagoM/gtex-transcriptome-modelling/master/track_hub/hub.txt\n",
+    "\n",
+    "For a complete set of genes for all communities, we also provide the file `outputs/all_communities_genes.txt`. Unfortunately, some genes did not have a mapping resulting from the code in this notebook, therefore the tracking hub contains an incomplete set of genes. Instead, `all_communities_genes.txt` is complete. As in the paper, we only consider communities with more then 3 genes."
    ]
   },
   {
@@ -56,7 +58,7 @@
     "# Saving the chromosome's limits, based on the file `hg19.chrom.sizes` downloaded from https://genome.ucsc.edu/goldenPath/help/bigBed.html\n",
     "\n",
     "chr_limits = dict()\n",
-    "with open('track_hub/hg19.chrom.sizes', 'r') as reader:\n",
+    "with open('track_hub/hg38.chrom.sizes', 'r') as reader:\n",
     "    lines = reader.readlines()\n",
     "    for line in lines:\n",
     "        line_info = line.split('\\t')\n",
@@ -104,26 +106,26 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 6,
    "metadata": {},
    "outputs": [],
    "source": [
-    "with open('track_hub/communities/trackDb.txt', 'w') as f_track_hubs:\n",
+    "with open('track_hub/hg38/trackDb.txt', 'w') as f_track_hubs:\n",
     "    for tissue in TISSUES:\n",
     "        try:\n",
     "            for community_id in range(1, 999999):\n",
     "                arr_com = []\n",
     "                dic_community = pickle.load(open(\"svm_results/\" + tissue + '_' + str(community_id) + \".pkl\", \"rb\"))\n",
     "                len_common = len(dic_community['genes'])\n",
     "\n",
-    "                with open(f'track_hub/communities/{tissue}_{community_id}.bed', 'w') as f:\n",
+    "                with open(f'track_hub/hg38/{tissue}_{community_id}.bed', 'w') as f:\n",
     "                    for gene in dic_community['genes']:\n",
     "                        if gene in dic_all_genes_info.keys():\n",
     "                            gene_info = dic_all_genes_info[gene]\n",
     "                            f.write(f'{gene_info[\"chr\"]}\\t{gene_info[\"chr_start\"]}\\t{gene_info[\"chr_end\"]}\\n')\n",
     "                \n",
     "                f_track_hubs.write(f'track {tissue}_{community_id}\\n')\n",
-    "                f_track_hubs.write(f'bigDataUrl https://raw.githubusercontent.com/tjiagoM/gtex-transcriptome-modelling/master/track_hub/communities/{tissue}_{community_id}.bb\\n')\n",
+    "                f_track_hubs.write(f'bigDataUrl https://raw.githubusercontent.com/tjiagoM/gtex-transcriptome-modelling/master/track_hub/hg38/{tissue}_{community_id}.bb\\n')\n",
     "                f_track_hubs.write(f'shortLabel {tissue}_{community_id}\\n')\n",
     "                f_track_hubs.write(f'longLabel {tissue}_{community_id}\\n')\n",
     "                f_track_hubs.write(f'type bigBed\\n')\n",