Tuesday, August 24, 2010

Connect the dots

I thought I'd share this beautiful graphic of human genetic variation from the blog Gene Expression. The original paper appeared in Science.


For panel A, PC1 = 20% of the variance, PC2 = 5%, and PC3 = 3.5%. For panel B, PC1 = 11%, PC2 = 6%, PC3 = 5% and PC4 = 4%.

4 comments:

  1. There's a big difference between PC1 and PC2 in panel A. This would normally call for fixing the scale so that all three dimensions are shown in their true proportions, which would essentially leave a line, cramming the Eurasians on top of one another across the "race barrier" and reducing the Hadza difference to a mere bump.

    Following that graph, after due apportion, there can be only two different "races": Africans and Eurasians, with some rather thick cladistics at the frontier zone anyhow. I am assuming here that "race" would have to be a scientific concept with clear measures and no visualization tricks.

    If you've stumbled on this comment and it makes your head explode, just keep repeating to yourself: 'There is scientific basis for race ...'

  2. Beng Gnxp 3:09 AM

    Why does the difference between two groups have to be as big as between Africans and Eurasians to constitute valid and useful categories?

  3. Chripe_77 5:17 PM

    I am a fan of your blog. Can you provide a quick recap of the reasons you focus on the intersection of genetics and racial/ethnic background? I believe in freedom of inquiry and research, but how do you know that extremist political movements won't use this kind of analysis to justify some degree of racism or prejudice? I would be interested in your take on the movie Gattaca, which portrayed rather negatively a future (American or even world-wide) society where a genetically engineered elite (one that was multiethnic and gender-equal) had taken over. I myself am not trying to be an alarmist or extremist, but "with great power comes great responsibility" (the quote is from Spider-Man's Uncle Ben).

    Chris P.

  4. I think you're confused. The 20% versus 5% describe what you're already seeing; they're not scaling factors. They just reflect that the data points are already spread out more along the first principal component.

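For anyone who wants to check the point about the percentages, here is a minimal sketch in plain Python. It is not the paper's analysis: the data are made-up 2-D Gaussian points, and the helper names (`sample_variance`, `explained_variance_2d`) are hypothetical. It only illustrates that the "percent of variance" for a principal component is the share of the total spread that already lies along that axis, so plotting the scores on a common scale needs no further rescaling.

```python
import math
import random


def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def explained_variance_2d(points):
    """PCA for 2-D points using the closed form for a 2x2 covariance matrix.

    Returns (ratios, scores): the fraction of total variance on PC1 and PC2,
    and each point's coordinates along those two components.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    centered = [(x - mx, y - my) for x, y in points]

    # Sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix.
    mid = (a + c) / 2
    half_gap = math.hypot((a - c) / 2, b)
    lam1, lam2 = mid + half_gap, mid - half_gap

    # Direction of the major axis (PC1); PC2 is perpendicular to it.
    theta = 0.5 * math.atan2(2 * b, a - c)
    pc1 = (math.cos(theta), math.sin(theta))
    pc2 = (-math.sin(theta), math.cos(theta))

    scores = [
        (x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1])
        for x, y in centered
    ]
    total = lam1 + lam2
    return (lam1 / total, lam2 / total), scores


# Made-up data: more spread along one direction than the other.
random.seed(0)
points = [(random.gauss(0, 2.0), random.gauss(0, 1.0)) for _ in range(500)]

ratios, scores = explained_variance_2d(points)
var_pc1 = sample_variance([s[0] for s in scores])
var_pc2 = sample_variance([s[1] for s in scores])

print("explained variance ratios:", ratios)
print("share of spread along PC1:", var_pc1 / (var_pc1 + var_pc2))
```

The two printed figures for PC1 agree up to floating-point error: the percentage is a description of how the plotted scores are spread, not a factor to apply to them afterwards.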