We have produced a draft sequence of the rice genome for the most widely cultivated subspecies in China, Oryza sativa L. ssp. indica, by whole-genome shotgun sequencing. The genome was 466 megabases in size, with an estimated 46,022 to 55,615 genes. Functional coverage in the assembled sequences was 92.0%. About 42.2% of the genome was in exact 20-nucleotide oligomer repeats, and most of the transposons were in the intergenic regions between genes. Although 80.6% of predicted Arabidopsis thaliana genes had a homolog in rice, only 49.4% of predicted rice genes had a homolog in A. thaliana. The large proportion of rice genes with no recognizable homologs is due to a gradient in the GC content of rice coding sequences.
ASJC Scopus subject areas