Sunday, August 03, 2014

It's all in the gene: cows

Some years ago a German driver took me from the Perimeter Institute to the Toronto airport. He was an immigrant to Canada and had a background in dairy farming. During the ride he told me all about driving German farmers to buy units of semen produced by highly prized Canadian bulls. The use of linear polygenic models in cattle breeding is already widespread, and the review article below gives some idea of how accurate their predictions are.

See also Genomic Prediction: No Bull and Plenty of room at the top.
Invited Review: Reliability of genomic predictions for North American Holstein bulls

Journal of Dairy Science Volume 92, Issue 1, Pages 16–24, January 2009.

Genetic progress will increase when breeders examine genotypes in addition to pedigrees and phenotypes. Genotypes for 38,416 markers and August 2003 genetic evaluations for 3,576 Holstein bulls born before 1999 were used to predict January 2008 daughter deviations for 1,759 bulls born from 1999 through 2002. Genotypes were generated using the Illumina BovineSNP50 BeadChip and DNA from semen contributed by US and Canadian artificial-insemination organizations to the Cooperative Dairy DNA Repository. Genomic predictions for 5 yield traits, 5 fitness traits, 16 conformation traits, and net merit were computed using a linear model with an assumed normal distribution for marker effects and also using a nonlinear model with a heavier tailed prior distribution to account for major genes. The official parent average from 2003 and a 2003 parent average computed from only the subset of genotyped ancestors were combined with genomic predictions using a selection index. Combined predictions were more accurate than official parent averages for all 27 traits. The coefficients of determination (R²) were 0.05 to 0.38 greater with nonlinear genomic predictions included compared with those from parent average alone. Linear genomic predictions had R² values similar to those from nonlinear predictions but averaged just 0.01 lower. The greatest benefits of genomic prediction were for fat percentage because of a known gene with a large effect. The R² values were converted to realized reliabilities by dividing by mean reliability of 2008 daughter deviations and then adding the difference between published and observed reliabilities of 2003 parent averages. When averaged across all traits, combined genomic predictions had realized reliabilities that were 23% greater than reliabilities of parent averages (50 vs. 27%), and gains in information were equivalent to 11 additional daughter records. Reliability increased more by doubling the number of bulls genotyped than the number of markers genotyped. Genomic prediction improves reliability by tracing the inheritance of genes even with small effects.
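To make the abstract's "linear model with an assumed normal distribution for marker effects" concrete, here is a minimal sketch on simulated data. It treats that model as ridge regression on marker genotypes, which is a common way to write it down but not necessarily how the paper's software solves it. The sample sizes, the shrinkage value lam, and every function name are assumptions made for illustration. The last function applies the conversion described in the abstract: R² divided by the mean reliability of daughter deviations, plus the difference between published and observed parent-average reliabilities.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_marker_effects(Z, y, lam):
    """Solve (Z'Z + lam*I) a = Z'y; lam shrinks every marker effect toward zero."""
    p = len(Z[0])
    ZtZ = [[sum(row[i] * row[j] for row in Z) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    Zty = [sum(row[i] * yi for row, yi in zip(Z, y)) for i in range(p)]
    return solve(ZtZ, Zty)

def r_squared(pred, obs):
    """Squared Pearson correlation between predictions and observations."""
    n = len(obs)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((a - mp) * (b - mo) for a, b in zip(pred, obs))
    vp = sum((a - mp) ** 2 for a in pred)
    vo = sum((b - mo) ** 2 for b in obs)
    return cov * cov / (vp * vo)

def realized_reliability(r2, mean_rel_dd, pa_published, pa_observed):
    """Conversion from the abstract: R2 / mean reliability of daughter
    deviations, plus (published - observed) parent-average reliability."""
    return r2 / mean_rel_dd + (pa_published - pa_observed)

# Simulated stand-in for the real data (all sizes are assumptions).
random.seed(1)
n_train, n_valid, p = 200, 100, 20
true_effects = [random.gauss(0, 1) for _ in range(p)]

def simulate(n):
    """Centered genotypes (-1, 0, 1) at allele frequency 0.5, plus noisy phenotypes."""
    Z = [[(random.random() < 0.5) + (random.random() < 0.5) - 1.0 for _ in range(p)]
         for _ in range(n)]
    y = [sum(z * a for z, a in zip(row, true_effects)) + random.gauss(0, 1) for row in Z]
    return Z, y

Z_train, y_train = simulate(n_train)   # "older bulls" used to estimate effects
Z_valid, y_valid = simulate(n_valid)   # "younger bulls" whose outcomes are predicted

effects = ridge_marker_effects(Z_train, y_train, lam=1.0)
pred = [sum(z * a for z, a in zip(row, effects)) for row in Z_valid]
r2_valid = r_squared(pred, y_valid)
print("validation R2:", round(r2_valid, 2))

# Hypothetical inputs, chosen only to show the arithmetic of the conversion.
print("realized reliability:", realized_reliability(0.45, 0.9, 0.30, 0.25))
```

The train/validate split mirrors the design in the abstract, where effects estimated from older bulls are judged by how well they predict later daughter deviations of younger bulls.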

Results and Discussion: ... Marker effects for most other traits were evenly distributed across all chromosomes with only a few regions having larger effects, which may explain why the infinitesimal model and standard quantitative genetic theories have worked well. The distribution of marker effects indicates primarily polygenic rather than simple inheritance and suggests that the favorable alleles will not become homozygous quickly, and genetic variation will remain even after intense selection. Thus, dairy cattle breeders may expect genetic progress to continue for many generations.

... Most animal breeders will conclude that these gains in reliability are sufficient to make genotyping profitable before breeders invest in progeny testing or embryo transfer. Rates of genetic progress should increase substantially as breeders take advantage of these new tools for improving animals (Schaeffer, 2008). Further increases in number of genotyped bulls, revisions to the statistical methods, and additional edits should increase the precision of future genomic predictions.

Table 3

                      Parent average         Genomic prediction
Trait                 Published  Observed    Expected  Linear  Nonlinear    Gain
Net merit                30         14          67       53       53         23
Milk yield               35         32          69       56       58         23
Fat yield                35         17          69       65       68         33
Protein yield            35         31          69       58       57         22
Fat percentage           35         29          69       69       78         43
Protein percentage       35         32          69       62       69         34
Productive life          27         28          55       42       45         18

Gain: from nonlinear genomic prediction compared with published parent average.
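The gain column can be checked directly against the others. The sketch below assumes each row splits into six two-digit values (two parent-average columns, three genomic-prediction columns, then the gain). The names given to the sub-columns are an assumed reading of the table, not something the post states. Under that reading, the gain in every row equals the nonlinear genomic value minus the published parent-average value.

```python
# Rows of Table 3, split into two-digit values (assumed column reading):
# (PA published, PA observed, genomic expected, genomic linear, genomic nonlinear, gain)
table3 = {
    "Net merit":          (30, 14, 67, 53, 53, 23),
    "Milk yield":         (35, 32, 69, 56, 58, 23),
    "Fat yield":          (35, 17, 69, 65, 68, 33),
    "Protein yield":      (35, 31, 69, 58, 57, 22),
    "Fat percentage":     (35, 29, 69, 69, 78, 43),
    "Protein percentage": (35, 32, 69, 62, 69, 34),
    "Productive life":    (27, 28, 55, 42, 45, 18),
}

listed_gains = [row[5] for row in table3.values()]
# Gain recomputed as nonlinear genomic value minus published parent average.
computed_gains = [row[4] - row[0] for row in table3.values()]

for trait, listed, computed in zip(table3, listed_gains, computed_gains):
    status = "ok" if listed == computed else "MISMATCH"
    print(f"{trait:20s} listed={listed:3d} computed={computed:3d} {status}")

mean_gain = sum(listed_gains) / len(listed_gains)
print("mean gain over these 7 traits:", mean_gain)
```

Note that this mean covers only the seven traits shown here, so it need not match the 23-point average over all 27 traits quoted in the abstract.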

"Horses ain't like people, man. They can't make themselves better than they're born. See, with a horse, it's all in the gene. It's the fucking gene that does the running. The horse has got absolutely nothing to do with it." --- Paulie (Eric Roberts) in The Pope of Greenwich Village


Stoolio said...

Yeah, that German driver would probably be Werner Friesendorf.

Richard Seiter said...

Interesting to see how linear models relate to parent midpoint and nonlinear models for accuracy. (any idea how a similar analysis applied to human height would look?) Similarly for how the number of SNPs affects prediction accuracy.

One concern seems to be that the Holstein population has some unusual characteristics that may affect the results (any thoughts about this? seems to me this is likely to increase the prediction accuracy observed):

"The genetic history of the Holstein population may help to explain the results. Many animals share common DNA segments from Round Oak Rag Apple Elevation, Pawnee Farm Arlinda Chief, To-Mar Blackstar, and other popular ancestors occurring 4 to 10 generations back in current pedigrees. Few common ancestors occur >10 generations back because individual bulls had limited influence before AI with frozen semen began (Young and Seykora, 1996). Lengths of the shared chromosome segments are thus 0.10 to 0.25 of the mean chromosome length, and a few hundred markers per chromosome are adequate to trace those segments shared within families.

In the next generation, the common ancestors will be 1 generation further back, and more crossovers will occur between their adjacent alleles. If the allele effects estimated from families in this study were applied to less-related animals from other populations, predictions could be much less reliable. Divergent populations may require greater SNP densities. As more bulls are genotyped, more phenotypes will be available to estimate each effect. This will increase the value of having more SNP, but will also require the expense of genotyping the predictor bulls again using a denser chip."
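One rough way to read the "0.10 to 0.25 of the mean chromosome length" figure in the quote: if crossovers fall at about one per Morgan per meiosis, a segment passed down through g meioses has an expected length of about 1/g Morgans. The sketch below assumes a mean chromosome length of about 1 Morgan and spreads the study's 38,416 markers evenly over 30 chromosomes. Both simplifications are mine and are not taken from the paper.

```python
# Back-of-envelope sketch; the 1 Morgan mean chromosome length and the
# even spread of markers over 30 chromosomes are assumptions.
MEAN_CHROMOSOME_MORGANS = 1.0
N_MARKERS = 38416        # markers used in the study, from the abstract
N_CHROMOSOMES = 30       # cattle: 29 autosomes plus the sex chromosome

def expected_fraction(generations):
    """Expected shared-segment length as a fraction of mean chromosome length,
    assuming about one crossover per Morgan per meiosis."""
    return (1.0 / generations) / MEAN_CHROMOSOME_MORGANS

markers_per_chromosome = N_MARKERS / N_CHROMOSOMES

for g in (4, 10):
    frac = expected_fraction(g)
    print(f"{g} generations back: fraction {frac:.2f}, "
          f"about {frac * markers_per_chromosome:.0f} markers per segment")
```

Under these assumptions, ancestors 4 to 10 generations back give fractions of 0.25 to 0.10, matching the range in the quote, and each shared segment still carries on the order of a hundred or more markers.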

Am I correct in interpreting this to mean that the genomic models also take the PA data as inputs? "Final genomic predictions for predicted bulls combined 3 terms by selection index: 1) direct genomic prediction; 2) PA computed from the subset of genotyped ancestors using traditional relationships; and 3) published PA or PI."
