This is a new paper on polygenic prediction for breast cancer by a large collaboration that has been working for many years on GWAS and, more recently, genomic risk prediction.
"The lifetime risk of overall breast cancer in the top centile of the PRSs was 32.6%"!
Polygenic Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes
Note the 10-25x (ER-positive and -negative) range of risk between the lowest and highest percentile PRS.
Stratification of women according to their risk of breast cancer based on polygenic risk scores (PRSs) could improve screening and prevention strategies. Our aim was to develop PRSs, optimized for prediction of estrogen receptor (ER)-specific disease, from the largest available genome-wide association dataset and to empirically validate the PRSs in prospective studies. The development dataset comprised 94,075 case subjects and 75,017 control subjects of European ancestry from 69 studies, divided into training and validation sets. Samples were genotyped using genome-wide arrays, and single-nucleotide polymorphisms (SNPs) were selected by stepwise regression or lasso penalized regression. The best performing PRSs were validated in an independent test set comprising 11,428 case subjects and 18,323 control subjects from 10 prospective studies and 190,040 women from UK Biobank (3,215 incident breast cancers). For the best PRSs (313 SNPs), the odds ratio for overall disease per 1 standard deviation in ten prospective studies was 1.61 (95%CI: 1.57–1.65) with area under receiver-operator curve (AUC) = 0.630 (95%CI: 0.628–0.651). The lifetime risk of overall breast cancer in the top centile of the PRSs was 32.6%. Compared with women in the middle quintile, those in the highest 1% of risk had 4.37- and 2.78-fold risks, and those in the lowest 1% of risk had 0.16- and 0.27-fold risks, of developing ER-positive and ER-negative disease, respectively. Goodness-of-fit tests indicated that this PRS was well calibrated and predicts disease risk accurately in the tails of the distribution. This PRS is a powerful and reliable predictor of breast cancer risk that may improve breast cancer prevention programs.
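As a rough sanity check on the abstract's numbers, the sketch below relates the reported odds ratio per standard deviation to the AUC, and computes the ratio of risk between the top and bottom centiles. It assumes (this is not taken from the paper) that the PRS is normally distributed with equal variance in case and control subjects, in which case AUC ≈ Φ(ln(OR)/√2). The function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def auc_from_or_per_sd(or_per_sd: float) -> float:
    """Approximate AUC implied by an odds ratio per 1 SD of the score.

    Assumption: the score is N(0, 1) in controls and N(beta, 1) in cases,
    with beta = ln(OR per SD). The AUC is then Phi(beta / sqrt(2)).
    """
    beta = log(or_per_sd)
    return NormalDist().cdf(beta / sqrt(2))


def fold_range(top_rr: float, bottom_rr: float) -> float:
    """Ratio of two relative risks measured against the same reference group."""
    return top_rr / bottom_rr


# Figures quoted in the abstract (relative to the middle quintile).
auc = auc_from_or_per_sd(1.61)          # overall disease, OR per 1 SD
er_positive = fold_range(4.37, 0.16)    # top 1% vs bottom 1%, ER-positive
er_negative = fold_range(2.78, 0.27)    # top 1% vs bottom 1%, ER-negative

print(f"approximate AUC: {auc:.3f}")
print(f"ER-positive top/bottom centile ratio: {er_positive:.1f}")
print(f"ER-negative top/bottom centile ratio: {er_negative:.1f}")
```

Under that normality assumption, an odds ratio of 1.61 per SD implies an AUC of roughly 0.63, close to the reported 0.630. The ER-positive and ER-negative ratios come out near 27 and 10, broadly in line with the 10-25x range noted above.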
One of the senior authors (Paul Pharoah of Cambridge) details the history of his work on the genetics of breast cancer in a tweet thread. He describes the historical progress from simple GWAS associations to full-blown genomic prediction:
1/n This paper has been many years in the making both conceptually and in terms of the time to generate the data. It has been part of almost all my scientific life (or at least since I started my PhD).
My small team of physicists has constructed a breast cancer predictor of similar power using UKBB data and our own automated ML pipeline :-)
11/n This PRS is now being used in an EU funded trial of risk stratified screening. It is the culmination of many years of many people working together on samples donated by hundreds of thousands of patients.
See earlier post Advances in Genomic Prediction.