ChemGenome 2.0: An ab initio Gene Prediction Software

 

 

How to Use the Tool

1. You can upload an input file directly by clicking the Browse button. Alternatively, you can paste the genome sequence in FASTA format.

An example of the sequence input:

Nucleotide Sequence
>Gene Name (This comment line is necessary)
ATGTTGGTGTCCGCAAGGGTAGAGAAACAAAAGCGTGTTGCTTATCAGGGGAAGGCGACAGTGCTTGCTCTCGGTAAGG
CCTTGCCGAGCAATGTTGTTTCCCAGGAGAATCTCGTGGAGGAGTATCTCCGTGAAATCAAATGCGATAACCTTTCTAT
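
Before uploading, it can help to check a file against the format shown above. The following Python sketch is not part of ChemGenome. The function name check_fasta, the accepted character set (A, C, G, T, N), and the 6 MB constant (taken from the approximate upload limit mentioned in these notes) are assumptions made for illustration only.

```python
MAX_BYTES = 6 * 1024 * 1024  # assumption: based on the "around 6 MB" upload limit
ALLOWED = set("ACGTN")       # assumption: plain nucleotide codes only


def check_fasta(text):
    """Return a list of problems found in FASTA-formatted text (empty if none)."""
    problems = []
    if len(text.encode("utf-8")) > MAX_BYTES:
        problems.append("input is larger than the assumed 6 MB limit")

    # The first non-blank line must be the comment line.
    non_blank = [line.strip() for line in text.splitlines() if line.strip()]
    if not non_blank or not non_blank[0].startswith(">"):
        problems.append("first line must be a comment line starting with '>'")

    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith(">"):
            continue
        # Spaces inside a sequence line are ignored; other characters are reported.
        bad = set(line.upper().replace(" ", "")) - ALLOWED
        if bad:
            problems.append(f"line {number}: unexpected characters {sorted(bad)}")
    return problems


example = """>Gene Name
ATGTTGGTGTCCGCAAGGGTAGAGAAACAAAAGCGTGTTGCTTATCAGGGGAAGGCGACAGTGCTTGCTCTCGGTAAGG
CCTTGCCGAGCAATGTTGTTTCCCAGGAGAATCTCGTGGAGGAGTATCTCCGTGAAATCAAATGCGATAACCTTTCTAT
"""
print(check_fasta(example))      # an empty list means no problems were found
print(check_fasta("ATGTTGGTG"))  # reports the missing comment line
```

An empty result only means the text looks like FASTA under these assumptions; the server may apply different checks.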

2. You can run ChemGenome with the default parameters, or specify additional parameters as described in the following steps.

3. Optionally, you can provide an email address to have the results mailed to you.

4. Choose a threshold value from 70, 90, 100, or 300. If you have a small genome, you can specify a lower threshold value to find smaller genes. If you have a large genome, you can specify a higher threshold value to weed out false positives.

5. You can specify start codons, and the program will find genes starting with the chosen start codons. You can select ATG, CTG, GTG, and TTG individually or in any combination. For example, if you select ATG and GTG by clicking the respective check boxes, the program will find genes starting with either ATG or GTG.

6. Specify the method used to run ChemGenome: 'DNA Space', 'Protein Space', or 'Swissprot Space'.

DNA Space: The method searches for genes based on physico-chemical properties of double-helical deoxyribonucleic acid (DNA).

Protein Space: The method takes the result generated from DNA Space as its input file and works as a filter, based on stereochemical properties of protein sequences, to reduce false positives.

Swissprot Space: The method takes the result generated from Protein Space as its input file and calculates the standard deviation of a query nucleotide sequence (the predicted gene sequence) from the Swiss-Prot proteins, based on the frequency of occurrence of amino acids. A threshold standard deviation is chosen to keep false positives to a minimum and precision at a maximum.

7. The output is generated in both tabular and graphical form.

8. The server-side upload limit for file size is around 6 MB. We have tested the program on genome files of more than 5 MB available to us (taking into account the largest prokaryotic genome size present). If the program crashes on a large genome file of more than 5 MB, please inform us.

9. The computation may take 5-10 minutes, depending on the load on the web server and the size of the genome in the input file.
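
To picture how the start-codon and threshold choices in the steps above interact, here is a minimal Python sketch. It is not ChemGenome's algorithm, which relies on physico-chemical properties; it only scans the forward strand for open reading frames. The function name find_orfs, the stop codons TAA, TAG, and TGA, and the reading of the threshold as a minimum length in nucleotides are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard stop codons


def find_orfs(sequence, start_codons=("ATG",), threshold=90):
    """Return (start, end, start_codon) for forward-strand ORFs of at least
    `threshold` nucleotides. Positions are 0-based, end is exclusive, and the
    stop codon is counted in the length (all of this is an assumed convention).
    """
    seq = sequence.upper()
    allowed = set(start_codons)
    orfs = []
    for frame in range(3):
        start = None  # first allowed start codon of the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in allowed:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= threshold:
                    orfs.append((start, i + 3, seq[start:start + 3]))
                start = None
    return sorted(orfs)


# Toy sequence: a 96-nt ORF starting with GTG, then a 72-nt ORF starting with ATG.
toy = "CC" + "GTG" + "GCA" * 30 + "TAA" + "ATG" + "GCA" * 22 + "TGA"

print(find_orfs(toy, start_codons=("ATG", "GTG"), threshold=90))
# [(2, 98, 'GTG')]
print(find_orfs(toy, start_codons=("ATG", "GTG"), threshold=70))
# [(2, 98, 'GTG'), (98, 170, 'ATG')]
print(find_orfs(toy, start_codons=("ATG",), threshold=70))
# [(98, 170, 'ATG')]
```

In this toy case the higher threshold drops the shorter reading frame and the lower one keeps it, and leaving GTG unselected hides the reading frame that starts with it.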


For suggestions or to report errors, please contact us at scfbio@scfbio-iitd.res.in