Background
CpG islands, which are clusters of CpG dinucleotides in GC-rich regions, are considered gene markers and represent an important feature of mammalian genomes. Previous studies of CpG islands have largely focused on specific loci or on a single genome. To date, there seems to be no comparative analysis, at the DNA sequence level, of CpG islands and their density among mammalian genomes, or of their correlations with other genomic features.
Results
In this study, we performed a systematic analysis of CpG islands in ten mammalian genomes. We found that both the number of CpG islands and their density vary greatly among genomes, even though many of these genomes encode similar numbers of genes. We observed significant correlations between CpG island density and genomic features such as number of chromosomes, chromosome size, and recombination rate. We also observed a trend of higher CpG island density in telomeric regions. Furthermore, we evaluated the performance of three computational algorithms for CpG island identification. Finally, we compared our observations in mammals with those in non-mammalian vertebrates.
Conclusion
Our study revealed that CpG islands vary greatly among mammalian genomes. Some factors, such as recombination rate and chromosome size, might have influenced the evolution of CpG islands in the course of mammalian evolution. Our results suggest a scenario in which an increase in chromosome number increases the rate of recombination, which in turn elevates GC content to help prevent loss of CpG islands and maintain their density. These findings should be useful for studying mammalian genomes, the role of CpG islands in gene function, and molecular evolution.
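As a rough illustration of the quantities discussed in the abstract, the sketch below tests whether a sequence satisfies criteria of the kind used by the Gardiner-Garden and Frommer approach, and computes CpG island density as islands per megabase. The thresholds (length of at least 200 bp, GC content of at least 50%, observed/expected CpG ratio of at least 0.6) are the commonly cited values and are an assumption here, not parameters taken from this record. The function names are hypothetical, and this is not the authors' implementation.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def meets_ggf_criteria(seq: str, min_len: int = 200,
                       min_gc: float = 0.5, min_oe: float = 0.6) -> bool:
    """Check a candidate region against the assumed thresholds.

    The default thresholds are commonly cited values (an assumption).
    """
    return (len(seq) >= min_len
            and gc_content(seq) >= min_gc
            and cpg_obs_exp(seq) >= min_oe)


def cgi_density(n_islands: int, genome_bp: int) -> float:
    """CpG island density expressed as islands per megabase."""
    return n_islands / (genome_bp / 1e6)
```

A real pipeline would scan a genome with a sliding window and merge adjacent qualifying windows; this sketch only evaluates a single candidate region and a summary density.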
© 2008 Han et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Is Part Of
VCU Psychiatry Publications
Numbers of genes estimated in mammalian genomes.
gb-2008-9-5-r79-s2.pdf (19 kB)
Correlations between CGI density and genomic features in ten mammalian genomes (including platypus).
gb-2008-9-5-r79-s3.pdf (21 kB)
Correlations between intergenic CGI density and genomic features in nine mammalian genomes.
gb-2008-9-5-r79-s4.pdf (41 kB)
Correlations between CGI density and average recombination rate (cM/Mb) in the human, mouse and rat genomes.
gb-2008-9-5-r79-s5.pdf (29 kB)
Comparison of CpG islands and other genomic features between mammalian and non-mammalian genomes.
gb-2008-9-5-r79-s6.pdf (19 kB)
Correlations between CGI density and genomic features in mammalian genomes using the Gardiner-Garden and Frommer algorithm in the non-repeat portions of genomes. In both Additional data files 6 and 7, the platypus chromosomes were excluded because of incomplete genome sequence data and chromosome data. The conclusion would be the same if the platypus data were included.
gb-2008-9-5-r79-s7.pdf (19 kB)
Correlations between CGI density and genomic features in mammalian genomes using the CpGcluster algorithm. In both Additional data files 6 and 7, the platypus chromosomes were excluded because of incomplete genome sequence data and chromosome data. The conclusion would be the same if the platypus data were included.
gb-2008-9-5-r79-s8.xls (43 kB)
The first sheet ('overview') summarizes the total number of CGIs in each genome identified by each algorithm. The length distribution of CGIs in each genome is shown in each additional sheet.
gb-2008-9-5-r79-s9.xls (26 kB)
Body temperature and lifespan for each species.