Science (2003): Selenoprotein gene prediction in Human


Characterization of mammalian selenoproteomes


G. V. Kryukov, S. Castellano, S. V. Novoselov, A. V. Lobanov, O. Zehtab, R. Guigó and V. N. Gladyshev*.


Science, 300(5624):1439-1443 (2003) [Abstract] [Full Text]



*To whom correspondence should be addressed.

 

Contents

On this site we describe and provide all the programs and data used to predict selenoproteins in the human genome.
 

 

Selenoproteins overview

The major points about selenoproteins are:
 

  1. They incorporate the amino acid selenocysteine (U or Sec), the 21st amino acid. It has its own tRNA, whose anticodon recognizes UGA (which we were taught was only a STOP codon!).
     
  2. So why don't all UGA codons code for Sec? Because the alternative decoding of UGA is conferred by an mRNA secondary structure termed SECIS. This structure, by means of one or more proteins, directs the ribosome to incorporate Sec.
     
  3. They are everywhere: Eukarya, Bacteria and Archaea. The SECIS element is located in the 3' UTR in eukaryotes and archaea, but in the coding region (just after the UGA) in bacteria. The SECIS elements of Eukarya, Bacteria and Archaea differ substantially.
     
  4. Run standard gene prediction and, at best, you will get truncated selenoprotein genes. Why not accept that UGA codes for Sec as long as there is a potential SECIS nearby? Why not run SECIS prediction and pinpoint the real elements using a comparative approach? This is the work presented here.
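
The truncation problem described in point 4 can be illustrated with a minimal Python sketch. It is not the modified geneid code: the function name `translate` and the flag `secis_present` are hypothetical, and the flag simply stands in for "a candidate SECIS was found nearby". With the flag off, an in-frame TGA ends the protein early; with it on, TGA is read as Sec (U).

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, secis_present=False):
    """Translate an in-frame DNA coding sequence up to the first stop codon.

    secis_present is a hypothetical flag: when True, every in-frame TGA is
    read as selenocysteine (U) instead of as a stop. This is a simplification,
    since a real gene may also use TGA as its genuine terminator.
    """
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        codon = cds[pos:pos + 3].upper()
        residue = CODON_TABLE.get(codon, "X")  # X marks an ambiguous codon
        if codon == "TGA" and secis_present:
            residue = "U"
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Made-up example sequence: ATG GCT TGA GGT TAA
example = "ATGGCTTGAGGTTAA"
truncated = translate(example)                      # stops at the TGA
recoded = translate(example, secis_present=True)    # reads the TGA as U
```

In this toy sequence the standard reading yields a protein cut short at the TGA, while the recoded reading continues through it to the TAA stop.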
     
Genome sequences

 

SECISearch

The SECIS structure, located in the 3' UTR of both eukaryotic and archaeal mRNAs, is the secondary/tertiary RNA structure that directs recoding of the UGA codon. Eukaryotic and archaeal SECIS structures differ substantially.


SECISearch 2.0 identifies candidate SECIS elements in nucleotide sequence databases on the basis of their primary sequence structure and predicted free energy criteria. The program has 3 modules:

 

  1. Search for SECIS (based on PatScan)
  2. Thermodynamic evaluation (based on RNAfold)
  3. SECIS visualization (RNAnice)
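
To give a feel for what the first module does, here is a toy pattern scan in Python. It is only a sketch: the regular expression below is an illustrative assumption, not the PatScan descriptor used by SECISearch, and it ignores base pairing and free energy entirely. It looks for an ATGA core, a spacer, an AA dinucleotide, another spacer and a closing GA; the spacer lengths and the names `TOY_SECIS` and `find_candidates` are hypothetical.

```python
import re

# Toy descriptor, NOT the real SECISearch pattern: ATGA core, 8-14 nt spacer,
# AA dinucleotide, 10-20 nt spacer, closing GA. Spacer lengths are arbitrary.
TOY_SECIS = re.compile(r"ATGA[ACGT]{8,14}AA[ACGT]{10,20}GA")


def find_candidates(seq):
    """Return (start, matched_sequence) for each non-overlapping toy match.

    The input may be DNA or RNA in any case; it is normalized to upper-case
    DNA before scanning.
    """
    dna = seq.upper().replace("U", "T")
    return [(m.start(), m.group()) for m in TOY_SECIS.finditer(dna)]


# Made-up test sequence containing one element that fits the toy descriptor.
element = "ATGA" + "C" * 10 + "AA" + "C" * 12 + "GA"
hits = find_candidates("CCCC" + element + "CCCC")
```

In a real pipeline, each hit would then go on to the thermodynamic evaluation step rather than being accepted on sequence alone.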

 

The program can be accessed through an online web server. Connect to it to check the SECIS patterns used in this work.
 

geneid

For a general introduction, please browse the geneid page. The modified geneid version able to predict selenoproteins can be found just below (source code in ANSI C and some parameter files):

 

geneid_SP.tar.gz

 

The parameter file is an external flat file read by geneid at run time. Take a look at it! It carries the statistical information used to predict genes for a given organism, together with the gene model (which states the relationships among the exons predicted along a sequence). Please read the geneid handbook for details.

 

Human with SECIS: Seleno3iso.default.1TGA.both.15.param
 
Human without SECIS: Seleno3iso.default.1TGA.both.15.No_SECIS.0.75.param
 
Fugu (and Tetraodon) without SECIS: tetraodon.param.3.No_SECIS.1.8.param
 
Novel selenoproteins

 

Protein sequence (U stands for Sec) and SECIS sequence, divided into structural units, for each novel selenoprotein gene in human:

SelV: protein and SECIS
 
SelH: protein and SECIS
 
SelK: protein and SECIS
 
SelS: protein and SECIS
 
SelI: protein and SECIS
 
SelO: protein and SECIS
 
GPx6: protein and SECIS