Copy number variation
About 5% of the human genome is made up of segmental duplications - also known as low copy repeats (LCRs). These are stretches of DNA which at some point over the last 35 million years have been duplicated - or triplicated, or quadruplicated... in this sense the name can be a little misleading. They can be up to 400kb long, depending on whose definition you use, and they're interspersed around the genome.
In general, LCRs of a particular sequence which occur on the same chromosome are separated by less than 10Mb of intervening sequence. That intervening sequence is prone to all sorts of abnormal chromosomal rearrangements - the figure to the left (from the Nature article) demonstrates some of the possibilities.
This isn't a new discovery: we've known about structural variation for a while. What has surprised some geneticists, though, is that so many of us have large amounts of structural variation. Structural abnormalities that can be detected cytogenetically - essentially by looking at chromosomes under the microscope - are usually associated with disease, and most of the basic research done in the field has been driven by people focused on specific diseases that are caused by chromosomal abnormalities, like DiGeorge syndrome (caused by a large deletion on chromosome 22). This seems to have engendered an unspoken assumption that people with chromosomal variations are invariably afflicted with disease (check out this post at Evolgen for a perspective on this from RPM).
But it turns out that we all have lots of relatively small chromosomal variations rather than one or two major disease-causing deletions, duplications or inversions, and the phenotypic effect of all these variations can be subtle.
Sharp et al. from the Eichler lab at the University of Washington undertook a study earlier this year of copy number polymorphisms (CNPs) - the "copy number variants" on the figure. They screened a panel of 47 people who came from a variety of ethnic backgrounds and found 119 different regions where copy number variation had occurred - i.e. regions where a particular sequence has been repeated or deleted. 66 of those 119 regions had copy number variation in more than one person but none of the regions were associated with any particular ethnic group, indicating that they were old - that they'd become established in the population before humans started spreading out across the globe.
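To make the counting in that paragraph concrete, here is a minimal Python sketch of how you might tally variable regions from a table of copy number calls. The region names and numbers are made-up toy data, not results from the study, and it assumes clean integer copy number calls are already in hand, which is a big simplification of what real screening data looks like.

```python
DIPLOID = 2  # expected copy number for an autosomal region (assumption)

# Hypothetical toy data: region name -> copy number call for each person.
calls = {
    "region_A": [2, 2, 3, 2, 2],  # one person carries a duplication
    "region_B": [2, 1, 2, 1, 2],  # two people carry a deletion
    "region_C": [2, 2, 2, 2, 2],  # no variation at all
    "region_D": [4, 2, 3, 2, 1],  # gains and losses in three people
}


def variant_carriers(copy_numbers, expected=DIPLOID):
    """Count the people whose copy number differs from the expected value."""
    return sum(1 for cn in copy_numbers if cn != expected)


def summarize(calls):
    """Return (variable regions with carrier counts, regions seen in >1 person)."""
    variable = {}
    for region, copy_numbers in calls.items():
        carriers = variant_carriers(copy_numbers)
        if carriers > 0:
            variable[region] = carriers
    recurrent = [region for region, n in variable.items() if n > 1]
    return variable, recurrent


variable, recurrent = summarize(calls)
print(f"{len(variable)} variable regions, {len(recurrent)} seen in more than one person")

# The same ratio for the figures quoted above: 66 of 119 regions.
share_recurrent = 66 / 119
print(f"{share_recurrent:.0%} of the reported regions varied in more than one person")
```

On the figures quoted above, 66 of 119 works out to a little over half of the variable regions turning up in more than one person.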
Remember that these are substantial regions, not single nucleotides. Their results support the conclusion reached by Sebat et al. - who undertook a similar study the year before - that large-scale copy number polymorphisms contribute substantially to genomic variation between normal humans.
Some of the duplicated regions highlighted by the Sharp & Sebat studies contain genes, or parts of genes. Gene duplication is a good thing, in evolutionary terms, as duplicates are freed from evolutionary constraints (if one copy mutates and ceases to perform as it should there's a backup waiting in the wings, so the mutant is free to develop new functions over time - or to fall by the wayside). Indeed, many of the genetic differences between humans and other primates are the result of large duplications and deletions.
The phenotypic differences that arise from having different gene copy numbers are a hot topic for investigation, especially given that many of the association studies that set out to find the single point mutations influencing particular complex diseases haven't really lived up to the hopes and hype surrounding them. Sebat et al. are currently involved in exploring potential relationships between CNPs and autism, based on a hypothesis that alterations in gene dosage influence many neurological disorders. More famously, Gonzalez et al. recently showed that the number of copies of the CCL3L1 gene you carry influences your susceptibility to HIV / AIDS.
If you're interested, bioinformatics-wise, a good place to start is the Eichler lab's Human Structural Variation database, where the data from the Sharp and Sebat studies (amongst others) has been collected.
