Tuesday, August 24, 2010

Connect the dots

I thought I'd share this beautiful graphic of human genetic variation from the blog Gene Expression. The original paper is from Science.

For panel A, PC1 = 20% of the variance, PC2 = 5%, and PC3 = 3.5%. For panel B, PC1 = 11%, PC2 = 6%, PC3 = 5% and PC4 = 4%.
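For anyone wondering what those percentages mean, here is a minimal sketch in Python with NumPy. It runs a principal component analysis on made-up data (not the data from the paper) and checks that each component's share of the variance is simply how spread out the points already are along that axis. The function name `explained_variance_ratio`, the toy data set, and the per-feature spreads are all my own assumptions for illustration; the paper's actual pipeline is not reproduced here.

```python
import numpy as np


def explained_variance_ratio(data):
    """Return (ratios, scores) for a PCA of `data` (rows are samples).

    Hypothetical helper for illustration only.
    """
    centered = data - data.mean(axis=0)
    # SVD of the centered matrix: the rows of vt are the principal axes.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    # Coordinates of every sample along each principal component.
    scores = centered @ vt.T
    # Variance captured by each component, then its share of the total.
    variances = singular_values**2 / (len(data) - 1)
    return variances / variances.sum(), scores


# Synthetic data (an assumption, not real genotypes): 500 samples, 10 features,
# with the first few features given a larger spread than the rest.
rng = np.random.default_rng(0)
spread = np.array([4, 2, 1.5, 1, 1, 1, 1, 1, 1, 1])
toy = rng.normal(size=(500, 10)) * spread

ratios, scores = explained_variance_ratio(toy)

# The share reported for each PC equals the variance of the points along it,
# divided by the total variance across all PCs.
score_variances = scores.var(axis=0, ddof=1)
shares_from_scores = score_variances / score_variances.sum()
print(np.round(ratios[:3], 3))
print(np.allclose(ratios, shares_from_scores))
```

The point of the last check is that nothing gets rescaled: the percentages are read off the spread of the projected points themselves, so a plot drawn with equal axis units would show the cloud stretched along PC1.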


Maju said...

There's a big difference between PC1 and PC2 in panel A. Properly, the scale should be fixed so that all three dimensions are shown in proportion to the variance they explain. That would essentially leave a line, cramming the Eurasians on top of each other across the "race barrier" and reducing the Hadza difference to a mere bump.

Following that graph, after due apportionment, there can be only two different "races": Africans and Eurasians, with some rather thick cladistics at the frontier zone anyhow. I am assuming here that "race" would have to be a scientific concept with clear measures and no visualization tricks.

If you've stumbled on this comment and it makes your head explode, just keep repeating to yourself: 'There is scientific basis for race ...'

Beng Gnxp said...

Why does the difference between two groups have to be as big as between Africans and Eurasians to constitute valid and useful categories?

Chripe_77 said...

I am a fan of your blog. Can you provide a quick recap of the reasons you focus on the intersection of genetics and racial/ethnic background? I believe in freedom of inquiry and research, but how do you know that extremist political movements won't use this kind of analysis to justify some degree of racism or prejudice? I would be interested in your take on the movie Gattaca, which portrayed rather negatively a future (American or even worldwide) society where a genetically engineered elite (one that was multiethnic and gender-equal) had taken over. I myself am not trying to be an alarmist or extremist, but "with great power comes great responsibility" (the quote is from Spider-Man's Uncle Ben).

Chris P.

Warren Dew said...

I think you're confused. The 20% versus 5% describes what you're already seeing; these are not scaling factors. They just reflect that the data points are already spread out more along the first principal component.
