Figures: Each point is an individual, and the axes are two principal components in the space of genetic variation. Colors correspond to individuals of different Asian ancestry.
Thanks to Chao Tian of UC Davis for sending me an early draft of the paper.
Analysis of East Asia Genetic Substructure: Population Differentiation and PCA Clusters Correlate with Geographic Distribution.
C. Tian1, R. Kosoy1, A. Lee2, P. Gregersen2, J. Belmont2, M. Seldin1
1) Rowe Program Human Genetics, Univ California Sch Medicine, Davis, CA; 2) North Shore-LIJ Res Inst, Manhasset, NY, Baylor Col Med., Houston TX.
Accounting for genetic substructure within European populations has been important in reducing type 1 errors in genetic studies of complex disease. As efforts to understand complex genetic disease are expanded to other continental populations, an understanding of genetic substructure within these continents will be useful in the design and execution of association tests. In this study, population differentiation (Fst) and Principal Components Analyses (PCA) are examined using >200K genotypes from multiple populations of East Asian ancestry (total 298 subjects). The population groups included those from the Human Genome Diversity Panel [Cambodian (CAMB), Yi, Daur, Mongolian (MGL), Lahu, Dai, Hezhen, Miaozu, Naxi, Oroqen, She, Tu, Tujia, and Xibo], HapMap (CHB and JPT), and East Asian or East Asian American subjects of Vietnamese (VIET), Korean (KOR), Filipino (FIL) and Chinese ancestry. Paired Fst (Weir and Cockerham) showed close relationships between CHB and several large East Asian population groups (CHB/KOR, 0.0019; CHB/JPT, 0.00651; CHB/VIET, 0.0065), with larger separation from FIL (CHB/FIL, 0.014). Low levels of differentiation were also observed between DAI and VIET (0.0045) and between VIET and CAMB (0.0062). Similarly, small Fst values were observed among different presumed Han Chinese populations originating in different regions of mainland China and Taiwan (Fst < 0.0025 with CHB). For PCA, the first two PCs showed a pattern of relationships that closely followed the geographic distribution of the different East Asian populations. For example, the four "corner" groups were JPT, FIL, CAMB and MGL, with CHB forming the center group and KOR falling between CHB and JPT. Other small ethnic groups were also in rough geographic correlation with their putative origins.
These studies have also enabled the selection of a subset of East Asian substructure ancestry informative markers (EASTASAIMS) that may be useful for future genetic association studies in reducing type 1 errors and in identifying homogeneous groups.
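For readers who want to experiment with the two statistics in the abstract, here is a minimal sketch in Python. It is not the authors' pipeline. The genotypes are synthetic, drawn from a Balding-Nichols model with a differentiation parameter I chose to be roughly the size of the Fst values quoted above. The function names `genotype_pca` and `hudson_fst` are my own, and the Fst function uses Hudson's estimator (a ratio of averages over SNPs) as a simpler stand-in for the Weir and Cockerham estimator used in the paper.

```python
import numpy as np

def genotype_pca(G, n_components=2):
    """Project individuals onto the top principal components.

    G is an (individuals x SNPs) array of 0/1/2 allele counts.
    Each SNP is centered and scaled by its binomial standard deviation.
    """
    p = G.mean(axis=0) / 2.0                 # sample allele frequency per SNP
    keep = (p > 0) & (p < 1)                 # drop monomorphic SNPs
    G, p = G[:, keep], p[keep]
    Z = (G - 2 * p) / np.sqrt(2 * p * (1 - p))
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def hudson_fst(G1, G2):
    """Hudson's Fst estimator between two genotype matrices (ratio of sums)."""
    n1, n2 = 2 * G1.shape[0], 2 * G2.shape[0]    # chromosomes sampled
    p1, p2 = G1.sum(axis=0) / n1, G2.sum(axis=0) / n2
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num.sum() / den.sum()

# --- Synthetic data (assumption: Balding-Nichols model, F = 0.01) ---
rng = np.random.default_rng(0)
n_per_pop, n_snps, F = 100, 5000, 0.01
ancestral = rng.uniform(0.1, 0.9, size=n_snps)

def simulate_population():
    # Population allele frequencies drift away from the ancestral ones.
    a = ancestral * (1 - F) / F
    b = (1 - ancestral) * (1 - F) / F
    freqs = rng.beta(a, b)
    return rng.binomial(2, freqs, size=(n_per_pop, n_snps))

pop_a, pop_b = simulate_population(), simulate_population()

scores = genotype_pca(np.vstack([pop_a, pop_b]))
fst_between = hudson_fst(pop_a, pop_b)
fst_within = hudson_fst(pop_a[:50], pop_a[50:])   # two halves of one population

print("PC1 mean, population A:", scores[:n_per_pop, 0].mean())
print("PC1 mean, population B:", scores[n_per_pop:, 0].mean())
print("Fst between populations:", fst_between)
print("Fst within one population:", fst_within)
```

Even at this low level of differentiation, the first principal component should pull the two simulated groups apart, while the Fst between two halves of the same group should sit near zero. The same qualitative behavior, with many more groups, is what produces the geographic layout in the figures.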
Related posts: "no scientific basis for race", metric on the space of genomes