Dylan8527/CryoCRAB

CryoCRAB-Scripts

CryoCRAB: A Large-scale Curated and Filterable Dataset for Cryo-EM Foundation Model Pre-training

Description

The CryoCRAB dataset comprises 152,385 sets of raw movie frames drawn from 746 EMPIAR datasets. Each EMPIAR dataset typically contributes approximately 200 cryo-EM images. The data consist of raw movies, motion-corrected full-diff micrographs in MRC format together with their estimated background images, and preprocessed full-diff micrographs in HDF5 format. The entire dataset, including micrographs and metadata, totals approximately 12.18 TB.
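Since the motion-corrected micrographs are distributed in MRC format, a quick way to sanity-check a downloaded file is to read the size and data type from its header. The following is a minimal sketch using only the Python standard library, not part of the CryoCRAB scripts. It assumes the files follow the MRC2014 header layout and are little-endian; the function name `read_mrc_header` and the file name in the usage note are hypothetical.

```python
import struct

# Data type for a subset of MRC mode values, per the MRC2014 specification.
MRC_MODES = {
    0: "int8",
    1: "int16",
    2: "float32",
    6: "uint16",
    12: "float16",
}


def read_mrc_header(path):
    """Return basic shape and type information from an MRC file header."""
    with open(path, "rb") as f:
        header = f.read(1024)  # the fixed MRC header is 1024 bytes
    if len(header) < 1024:
        raise ValueError("file is too short to hold an MRC header")

    # The first four 32-bit integers are NX, NY, NZ and MODE.
    # Little-endian byte order is an assumption here.
    nx, ny, nz, mode = struct.unpack("<4i", header[:16])

    # MRC2014 files carry the identifier 'MAP ' at byte offset 208.
    has_map_id = header[208:212] == b"MAP "

    return {
        "nx": nx,
        "ny": ny,
        "nz": nz,
        "mode": mode,
        "dtype": MRC_MODES.get(mode, "unknown"),
        "has_map_id": has_map_id,
    }
```

Calling `read_mrc_header("micrograph.mrc")` on a local file returns a dictionary with the image dimensions and pixel type. For reading the pixel data itself, a dedicated library such as `mrcfile` is likely the more robust choice.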

Download

The complete dataset is available for download from the ScienceDB public repository: https://doi.org/10.57760/sciencedb.17922.

TODO

  • Method List
    • EMPIAR Crawling & Curation Notebook
    • MongoDB Dataset Generation Notebook
    • EMPIAR Download Example Notebook
    • CryoSPARC Automated Processing Notebook
    • CryoCRAB Preprocessing Example Notebook
    • CryoCRAB Full-Diff HDF5 Pair Generation Notebook
  • Visualization List
    • Motion Correction & Visualization Notebook
    • CTF Correction & Visualization Notebook
