Interplay between the “genetic code” and the “splicing code”. The genetic code is not a cipher
Alternative splicing is a major gene expression regulatory process that increases the diversity of the transcriptome and the proteome encoded by a limited number of genes. For example, 95% of human genes generate different splicing variants encoding different protein isoforms. While tremendous progress has been made in elucidating the molecular mechanisms of alternative splicing regulation (relying on RNA-binding proteins or splicing factors), two important related issues have been poorly addressed. The first issue concerns the biological information encoded by coregulated alternative exons (e.g., exons regulated by the same splicing factor). To date, it has been assumed that splicing factor-coregulated exons are diverse and unrelated in terms of coding information features. The second issue concerns the relationship between coding sequences and exonic splicing regulatory sequences. So far, it has been assumed that coding sequences can accommodate splicing regulatory sequences because of the redundancy of the genetic code (the same amino acid can be encoded by different codons). Since exons carry several layers of “biological information” (e.g., coding, nucleosome positioning, splicing regulatory sequences…), it is believed that the genetic code has evolved to accommodate overlapping and unrelated “codes”.
We addressed these questions by analyzing the nucleotide and codon composition of several large sets of human coregulated exons, as well as the nature and physicochemical properties of amino acids encoded by those exons. We demonstrate that the splicing regulation of an exon by any splicing factor is tightly interconnected with the physicochemical properties of the protein domain encoded by the regulated exon. We demonstrate that this interplay relies on two straightforward principles: 1) splicing factors bind to exonic sequences that have a nucleotide composition bias and 2) exon-encoded amino acids with similar physicochemical properties correspond to codons having the same nucleotide composition bias.
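The second principle can be illustrated with nothing more than the standard genetic code table. The sketch below is a minimal illustration, not the analysis pipeline used in this work: the two amino acid groups (`HYDROPHOBIC` and `CHARGED`) and the function name `codon_composition` are assumptions chosen for the example. It pools all codons that encode each group and reports the fraction of each nucleotide, showing that amino acids with similar physicochemical properties tend to share a codon nucleotide composition bias.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative groupings (an assumption for this example, not the
# classification used in the study).
HYDROPHOBIC = "FILMV"
CHARGED = "DEKR"


def codon_composition(amino_acids):
    """Return the fraction of each nucleotide across all codons
    that encode any of the given amino acids."""
    counts = Counter()
    for codon, aa in CODON_TABLE.items():
        if aa in amino_acids:
            counts.update(codon)  # counts each of the three nucleotides
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ACGT"}


if __name__ == "__main__":
    for name, group in (("hydrophobic", HYDROPHOBIC), ("charged", CHARGED)):
        comp = codon_composition(group)
        summary = ", ".join(f"{b}={comp[b]:.2f}" for b in "ACGT")
        print(f"{name:12s} {summary}")
```

In this toy grouping, codons for the hydrophobic set are T-rich while codons for the charged set are A- and G-rich. The sketch counts each codon once and ignores codon usage frequencies in real exons, so it says nothing about splicing factor binding itself (the first principle).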
Our work reveals how a complex phenomenon (e.g., the splicing regulatory process at the RNA level and its biological consequences in terms of protein physicochemical properties) can rely on very straightforward principles. I will present these observations and discuss how they challenge our current understanding of what the genetic code is.