We present Information for the Coordinates of Exons (ICE), a comprehensive database of genomic splice sites (SSs) for 10,803 human genes. ICE contains 91,846 donor-acceptor site pairs, each supported by the alignment of "full-length" human mRNAs (including transcript variants) to human genomic sequences. ICE is the largest collection of human SSs known to date and is a significant resource for both molecular biologists and bioinformaticians. Users can visualize and extract the genomic sequences around the SSs of each donor-acceptor pair and can also visualize the primary structure of individual genes. In this article we list the 22 most frequently found canonical and noncanonical splice sites. The four most represented donor-acceptor pairs (GT-AG, GC-AG, AT-AC, and GT-GG) account for 99.16% of our data set. In addition, we calculated SS matrix models for the three most common donor-acceptor pairs. The database focuses on SSs and their surrounding sequence, the associated SS and sequence characteristics, and their relation to overall transcript structure. It supports targeted searches and presents the evidence for each gene structure.
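To make the pair statistics and the matrix models mentioned above concrete, the following is a minimal sketch, not the procedure used to build ICE. It assumes intron sequences are available as plain strings, takes the donor and acceptor dinucleotides as the first and last two bases of each intron, and treats a "matrix model" as a simple per-position base-frequency matrix. The function names and the toy sequences are hypothetical and exist only for illustration.

```python
from collections import Counter


def pair_frequencies(introns):
    """Return the fraction of introns carrying each donor-acceptor pair.

    Assumption: the donor dinucleotide is the first two bases of the
    intron and the acceptor dinucleotide is the last two bases.
    """
    counts = Counter(f"{seq[:2]}-{seq[-2:]}" for seq in introns)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.most_common()}


def position_frequency_matrix(sites, alphabet="ACGT"):
    """Return per-position base frequencies for aligned, equal-length windows."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all splice-site windows must have the same length")
    matrix = []
    for i in range(length):
        column = Counter(site[i] for site in sites)
        matrix.append({base: column[base] / len(sites) for base in alphabet})
    return matrix


# Toy introns (invented for illustration, not taken from ICE).
toy_introns = ["GTAAGTCCAG", "GTGAGTTTAG", "GCAAGTCTAG", "ATATCCTTAC"]

freqs = pair_frequencies(toy_introns)

# Build a donor-side matrix from the first six bases of the GT-AG introns.
donor_windows = [
    seq[:6] for seq in toy_introns if seq.startswith("GT") and seq.endswith("AG")
]
pfm = position_frequency_matrix(donor_windows)
```

Here `freqs` maps each pair label such as `GT-AG` to its share of the toy set, and `pfm` holds one dictionary of base frequencies per window position. A real analysis would use far more sequences and would typically include flanking exonic bases in each window.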