📊 Codon usage tables in code-friendly format + Python bindings

Edinburgh-Genome-Foundry/python_codon_tables

Python Codon Tables

Provides codon usage tables as dictionaries, for Python.

Tables for the following organisms are provided with the package (other tables can be downloaded using a TaxID):

  • B. subtilis
  • C. elegans
  • D. melanogaster
  • E. coli
  • G. gallus
  • H. sapiens
  • M. musculus
  • M. musculus domesticus
  • S. cerevisiae

All tables are from kazusa.or.jp (codon usages were computed using NCBI sequence data). The original publication:

Codon usage tabulated from the international DNA sequence databases:
status for the year 2000.
Nakamura, Y., Gojobori, T. and Ikemura, T. (2000) Nucl. Acids Res. 28, 292.

Usage

import python_codon_tables as pct

# PRINT THE LIST OF NAMES OF ALL AVAILABLE TABLES
print('Available tables:', pct.available_codon_tables_names)

# LOAD ONE TABLE BY NAME
table = pct.get_codons_table("b_subtilis_1423")
print(table['T']['ACA'])  # returns 0.4
print(table['*']['TAA'])  # returns 0.61

# LOAD ONE TABLE BY TAXID (it will get it from the internet if it is not
# in the built-in tables)
table = pct.get_codons_table(1423)
print(table['T']['ACA'])  # returns 0.4
print(table['*']['TAA'])  # returns 0.61

# LOAD ALL BUILT-IN TABLES AT ONCE
codons_tables = pct.get_all_available_codons_tables()
print(codons_tables['c_elegans_6239']['L']['CTA'])  # returns 0.09
  • Note that by default the tables use nucleotide T instead of U. Using get_codons_table('e_coli', replace_U_by_T=False) will leave Us as Us.
  • In get_codons_table you can also pass a shorthand name such as b_subtilis, which is automatically expanded to b_subtilis_1423, the name of the matching built-in table (use this feature at your own risk!).
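
The sketch below illustrates the table layout used in the examples above (amino acid, then codon, then relative frequency) and what replacing U by T amounts to. It does not use the package itself: replace_u_by_t and the small hand-written rna_table are hypothetical names made up for this illustration, and the frequencies are taken from the CSV example in this README.

```python
def replace_u_by_t(table):
    """Return a copy of a nested codon table with U replaced by T in every codon.

    Hypothetical helper for illustration only; it is not part of python_codon_tables.
    """
    return {
        amino_acid: {codon.replace("U", "T"): freq for codon, freq in codons.items()}
        for amino_acid, codons in table.items()
    }


# Tiny hand-written table in RNA notation: {amino_acid: {codon: relative_frequency}}
rna_table = {
    "*": {"UAA": 0.64, "UAG": 0.07, "UGA": 0.29},
    "K": {"AAA": 0.76, "AAG": 0.24},
}

dna_table = replace_u_by_t(rna_table)
print(dna_table["*"]["TAA"])  # 0.64

# Because each amino acid maps to a {codon: frequency} dict, the most
# frequent codon for each amino acid is a one-line lookup.
most_frequent = {aa: max(codons, key=codons.get) for aa, codons in dna_table.items()}
print(most_frequent)  # {'*': 'TAA', 'K': 'AAA'}
```

The same dictionary comprehension should work on any table with this nested shape, including the ones returned by get_codons_table.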

The package can also use codon usage data from a CSV file in the form:

amino_acid,codon,relative_frequency
*,UAA,0.64
*,UAG,0.07
*,UGA,0.29
A,GCA,0.21
A,GCC,0.27
K,AAA,0.76
K,AAG,0.24
etc.
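
As a rough illustration of how this CSV layout maps onto the nested dictionaries shown in Usage, here is a minimal standard-library sketch. The function csv_to_codons_table and its replace_U_by_T argument are assumptions made for this example (the argument name simply mirrors the option of get_codons_table); this is not the package's own loader.

```python
import csv
import io

# Sample rows copied from the CSV example in this README
CSV_TEXT = """amino_acid,codon,relative_frequency
*,UAA,0.64
*,UAG,0.07
*,UGA,0.29
A,GCA,0.21
A,GCC,0.27
K,AAA,0.76
K,AAG,0.24
"""


def csv_to_codons_table(csv_text, replace_U_by_T=True):
    """Parse CSV text into {amino_acid: {codon: relative_frequency}}.

    Hypothetical helper for illustration; not part of python_codon_tables.
    """
    table = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        codon = row["codon"]
        if replace_U_by_T:
            codon = codon.replace("U", "T")
        # Create the per-amino-acid dict on first use, then store the frequency
        table.setdefault(row["amino_acid"], {})[codon] = float(row["relative_frequency"])
    return table


table = csv_to_codons_table(CSV_TEXT)
print(table["K"]["AAA"])  # 0.76
print(table["*"]["TAA"])  # 0.64
```

Passing replace_U_by_T=False to this sketch keeps the codons in RNA notation, so the stop codon would be looked up as table["*"]["UAA"].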

Contribute

This project was started at the Edinburgh Genome Foundry by Zulko and is released on GitHub under the CC0 (Public Domain) license. It comes with no warranty whatsoever, so please cross-check the codon usage with other sources if you are not sure. Feel free to add other tables if you think of more commonly used species.

Installation

Via pip:

pip install python_codon_tables

Manual:

python setup.py install

More biology software

This library is part of the EGF Codons synthetic biology software suite for DNA design, manufacturing and validation.