Saturday, October 31, 2020

Precision Embryo Genotyping and CRISPR Chromosome Deletions (Genomic Prediction)

This recent Cell paper received a lot of attention as it suggests that CRISPR editing can result in chromosome loss. It was cited in the recent National Academies report Heritable Human Genome Editing (2020) as an example of unexpected consequences / side-effects from CRISPR.
Allele-Specific Chromosome Removal after Cas9 Cleavage in Human Embryos  
Correction of disease-causing mutations in human embryos holds the potential to reduce the burden of inherited genetic disorders and improve fertility treatments for couples with disease-causing mutations in lieu of embryo selection. Here, we evaluate repair outcomes of a Cas9-induced double-strand break (DSB) introduced on the paternal chromosome at the EYS locus, which carries a frameshift mutation causing blindness. We show that the most common repair outcome is microhomology-mediated end joining, which occurs during the first cell cycle in the zygote, leading to embryos with non-mosaic restoration of the reading frame. Notably, about half of the breaks remain unrepaired, resulting in an undetectable paternal allele and, after mitosis, loss of one or both chromosomal arms. Correspondingly, Cas9 off-target cleavage results in chromosomal losses and hemizygous indels because of cleavage of both alleles. These results demonstrate the ability to manipulate chromosome content and reveal significant challenges for mutation correction in human embryos. 
BioRxiv preprint 

My Genomic Prediction colleagues Jia Xu, Diego Marin, and Nathan Treff are co-authors of the paper. GP's precision embryo genotyping capabilities were necessary to determine that a paternal chromosome is sometimes deleted in the embryo due to CRISPR. In GP's standard embryo testing process both parents are genotyped as well as the embryo. The parental genotypes are used to error-correct the embryo genotype: DNA amplification starting from just a few biopsied cells introduces noise, but that noise can be removed. Because GP can determine whether specific alleles from a parent are present in the embryo, detecting the deletion of an entire chunk of chromosome is fairly straightforward.
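To make the allele-tracking idea concrete, here is a minimal sketch. It is not GP's actual pipeline, which is not described in this post. It assumes biallelic SNP genotypes coded as counts of the alternate allele (0, 1, 2) for mother, father, and embryo, and it looks only at sites where the parents are opposite homozygotes, so that a normal diploid embryo must be heterozygous. Embryo calls that look homozygous for the maternal allele at such sites are consistent with a missing paternal segment. The function name, the coding, and the toy data are all hypothetical.

```python
def paternal_allele_missing_fraction(mother, father, embryo):
    """Fraction of informative sites at which the paternal allele is not seen.

    Genotypes are alt-allele counts (0, 1, 2); None marks a no-call.
    Informative sites: parents are opposite homozygotes (0 vs 2), so a
    diploid embryo inheriting one allele from each parent must be 1.
    """
    informative = 0
    missing = 0
    for m, f, e in zip(mother, father, embryo):
        if e is None or {m, f} != {0, 2}:
            continue  # skip embryo no-calls and uninformative sites
        informative += 1
        if e == m:  # embryo looks homozygous for the maternal allele
            missing += 1
    return missing / informative if informative else float("nan")


# Toy data (hypothetical): six SNPs along one chromosome segment.
mother = [0, 2, 0, 2, 1, 0]
father = [2, 0, 2, 0, 1, 2]
normal_embryo = [1, 1, 1, 1, 1, 1]
deleted_embryo = [0, 2, 0, 2, 2, None]

print(paternal_allele_missing_fraction(mother, father, normal_embryo))
print(paternal_allele_missing_fraction(mother, father, deleted_embryo))
```

In practice one would want many informative sites per chromosome arm and some tolerance for amplification errors such as allele dropout; a single discordant site means little on its own.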

Thursday, October 29, 2020

Othram helps solve cold case: killer of Siobhan McGuinness (age 5) identified after 46 years

Othram, a DNA forensics company I co-founded, has helped to solve another cold case. 

Montana Girl, 5, Was Abducted Near Home and Found Dead in Drain — and Killer ID'd 46 Years Later 
For 46 years, the family of Siobhan McGuinness waited to find out who killed the spunky 5-year-old back in 1974 
On a frigid February afternoon in 1974, Siobhan McGuinness was walking the short distance home from a friend’s house in Missoula, Montana, when she vanished. Two days later, the 5-year-old’s body was found in a snow-covered drain culvert near the exit for Turah on I-90, just outside the city limits. She had been sexually assaulted. She also sustained trauma to her head and stab wounds to her chest, according to the FBI. 
Detectives at the time searched tirelessly for the little girl’s killer, but came up empty. The case went cold for decades. 
On Monday, authorities announced that after 46 years, the Missoula County Sheriff’s Office Cold Case Squad, detectives from the Missoula Police Department and others had finally identified the man who took the life of the spunky child who was always smiling. Richard William Davis was 32 when he was traveling through the area at the time of Siobhan’s murder, Missoula Police Chief Jason White said at a press conference on Monday. ... 
Using DNA left behind at the crime scene, specialists at private technology company Othram Inc. were able to create a genealogical profile of the suspect, which led them to Davis, the company says in a press release.

See Othram: the future of DNA forensics

The existing FBI standard (CODIS) for DNA identification uses only 20 markers (STRs -- previously only 13 loci were used!). By contrast, genome-wide sequencing can reliably call millions of genetic variants. 

For the first time, the cost curves for these two methods have crossed: modern sequencing costs no more than extracting CODIS markers using the now ~30-year-old technology. 

What can you do with millions of genetic markers? 

1. Determine relatedness of two individuals with high precision. This allows detectives to immediately identify a relative (ranging from distant cousin to sibling or parent) of the source of the DNA sample, simply by scanning through large DNA databases. ...
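As a rough illustration of why many markers help with relatedness, the sketch below computes two crude identity-by-state (IBS) statistics for a pair of samples. This is not Othram's method, which is not described here; production tools typically work with shared identity-by-descent segments. The genotype coding (alt-allele counts 0, 1, 2, with None for a no-call), the function name, and the toy data are assumptions for illustration. The useful property is that a parent and child always share at least one allele at every site, so their fraction of opposite-homozygote (IBS0) sites should be close to zero, while unrelated pairs show a clearly nonzero IBS0 fraction.

```python
def ibs_summary(g1, g2):
    """Return (mean alleles shared identical-by-state, fraction of IBS0 sites).

    g1 and g2 are equal-length lists of alt-allele counts (0, 1, 2);
    None marks a no-call and the site is skipped.
    """
    shared = 0
    ibs0 = 0
    n = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        n += 1
        diff = abs(a - b)
        shared += 2 - diff      # 2, 1, or 0 alleles shared at this site
        if diff == 2:           # opposite homozygotes: no allele shared
            ibs0 += 1
    if n == 0:
        raise ValueError("no sites called in both samples")
    return shared / n, ibs0 / n


# Toy data (hypothetical): six SNPs.
parent = [0, 1, 2, 1, 0, 2]
child = [1, 1, 1, 0, 0, 2]
stranger = [2, 1, 0, 1, 2, 0]

print(ibs_summary(parent, child))
print(ibs_summary(parent, stranger))
```

With only six sites these numbers are meaningless; with hundreds of thousands or millions of sites the sampling noise becomes small enough to separate even fairly distant degrees of relationship.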

If you have contacts in law enforcement, please alert them to the potential of this new technology.

Sunday, October 25, 2020

David Goldman (Spengler): China's Plan to Sino-Form the World

The latest from the always entertaining David Goldman, who writes (wrote?) the Spengler column at Asia Times.

In the lecture below, Goldman summarizes the main themes of his new book You Will Be Assimilated: China’s Plan to Sino-Form the World.


In this next interview (on the China-Iran deal of summer 2020) Goldman drops his guard a bit and waxes poetic with anti-Chinese rhetoric, as he discusses Israel, Iran, and China.

He refers to the Chinese (speaking broadly) as philo-semitic, but then jokes that this means anti-semites who like Jews! In light of that remark I wonder how one should characterize Goldman's views on China and the Chinese: philo-sinic or just plain anti-Chinese?

Saturday, October 24, 2020

Composite Polygenic Risk Score predicts longevity

The paper below (senior author at Johns Hopkins University) builds a composite polygenic risk score for mortality (longevity). Outliers (top vs bottom 5%) differ by about 5 years in life expectancy. 

I expect longevity prediction to improve considerably with more and better data to analyze. See also Live Long and Prosper: Genetic Architecture of Complex Traits and Disease Risk Predictors:
We found that genetic risks are largely uncorrelated for different conditions. This suggests that there can exist individuals with, e.g., low risk simultaneously in each of multiple conditions, for essentially any combination of conditions. There is no trade-off required between different disease risks ... One could speculate that a lucky individual with exceptionally low risk across multiple conditions might have an unusually long life expectancy.

If I read the graph below correctly, a positive outlier (male) in his late 70s has a ~90% chance of surviving (I am not sure of the timescale; it might be the next few years? See comments), whereas for a negative outlier the chance is only ~75%.
Combined Utility of 25 Disease and Risk Factor Polygenic Risk Scores for Stratifying Risk of All-Cause Mortality 
Allison Meisner, Prosenjit Kundu, Yan Dora Zhang, Lauren V. Lan, Sungwon Kim, Disha Ghandwani, Parichoy Pal Choudhury, Sonja I. Berndt, Neal D. Freedman, Montserrat Garcia-Closas, Nilanjan Chatterjee 
The American Journal of Human Genetics doi: 10.1016/j.ajhg.2020.07.002 
While genome-wide association studies have identified susceptibility variants for numerous traits, their combined utility for predicting broad measures of health, such as mortality, remains poorly understood. We used data from the UK Biobank to combine polygenic risk scores (PRS) for 13 diseases and 12 mortality risk factors into sex-specific composite PRS (cPRS). These cPRS were moderately associated with all-cause mortality in independent data: the estimated hazard ratios per standard deviation were 1.10 (95% confidence interval: 1.05, 1.16) and 1.15 (1.10, 1.19) for women and men, respectively. Differences in life expectancy between the top and bottom 5% of the cPRS were estimated to be 4.79 (1.76, 7.81) years and 6.75 (4.16, 9.35) years for women and men, respectively. These associations were substantially attenuated after adjusting for non-genetic mortality risk factors measured at study entry. The cPRS may be useful in counseling younger individuals at higher genetic risk of mortality on modification of non-genetic factors.
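A quick back-of-the-envelope check on how a modest hazard ratio per standard deviation turns into a large difference between outliers. The sketch assumes the cPRS is normally distributed and that the log hazard is linear in the score (both are my simplifying assumptions, not claims from the paper), and it does not reproduce the paper's life expectancy calculation. Under those assumptions the average person in the top 5% sits about 2.06 SD above the mean, so the top and bottom 5% groups are about 4.1 SD apart, and the implied hazard ratio between them is roughly 1.8 for men and 1.5 for women.

```python
from statistics import NormalDist


def tail_mean(p):
    """Mean of a standard normal variable, conditional on being in the top p."""
    z = NormalDist().inv_cdf(1 - p)     # cutoff for the top-p tail
    return NormalDist().pdf(z) / p      # E[Z | Z > z] = pdf(z) / P(Z > z)


def implied_hazard_ratio(hr_per_sd, p=0.05):
    """Hazard ratio between the average top-p and average bottom-p individual,
    assuming a normal score and a hazard that is log-linear in the score."""
    gap_sd = 2 * tail_mean(p)           # top and bottom tails are symmetric
    return hr_per_sd ** gap_sd


print(round(tail_mean(0.05), 2))
print(round(implied_hazard_ratio(1.15), 2))  # men, HR per SD from the abstract
print(round(implied_hazard_ratio(1.10), 2))  # women, HR per SD from the abstract
```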

Thursday, October 22, 2020

Replications of Height Genomic Prediction: Harvard, Stanford, 23andMe

These are two replications of our 2017 height prediction results (also recently validated using sibling data) that I neglected to blog about previously.

1. Senior author Liang is in the Departments of Epidemiology and Biostatistics at Harvard.
Efficient cross-trait penalized regression increases prediction accuracy in large cohorts using secondary phenotypes 
Wonil Chung, Jun Chen, Constance Turman, Sara Lindstrom, Zhaozhong Zhu, Po-Ru Loh, Peter Kraft and Liming Liang 
Nature Communications volume 10, Article number: 569 (2019) 
We introduce cross-trait penalized regression (CTPR), a powerful and practical approach for multi-trait polygenic risk prediction in large cohorts. Specifically, we propose a novel cross-trait penalty function with the Lasso and the minimax concave penalty (MCP) to incorporate the shared genetic effects across multiple traits for large-sample GWAS data. Our approach extracts information from the secondary traits that is beneficial for predicting the primary trait based on individual-level genotypes and/or summary statistics. Our novel implementation of a parallel computing algorithm makes it feasible to apply our method to biobank-scale GWAS data. We illustrate our method using large-scale GWAS data (~1M SNPs) from the UK Biobank (N = 456,837). We show that our multi-trait method outperforms the recently proposed multi-trait analysis of GWAS (MTAG) for predictive performance. The prediction accuracy for height by the aid of BMI improves from R2 = 35.8% (MTAG) to 42.5% (MCP + CTPR) or 42.8% (Lasso + CTPR) with UK Biobank data.

2. This is a 2019 Stanford paper. Tibshirani and Hastie are famous researchers in statistics and machine learning. Figure is from their paper.

A Fast and Flexible Algorithm for Solving the Lasso in Large-scale and Ultrahigh-dimensional Problems 
Junyang Qian, Wenfei Du, Yosuke Tanigawa, Matthew Aguirre, Robert Tibshirani, Manuel A. Rivas, Trevor Hastie 
1Department of Statistics, Stanford University 2Department of Biomedical Data Science, Stanford University 
Since its first proposal in statistics (Tibshirani, 1996), the lasso has been an effective method for simultaneous variable selection and estimation. A number of packages have been developed to solve the lasso efficiently. However as large datasets become more prevalent, many algorithms are constrained by efficiency or memory bounds. In this paper, we propose a meta algorithm batch screening iterative lasso (BASIL) that can take advantage of any existing lasso solver and build a scalable lasso solution for large datasets. We also introduce snpnet, an R package that implements the proposed algorithm on top of glmnet (Friedman et al., 2010a) for large-scale single nucleotide polymorphism (SNP) datasets that are widely studied in genetics. We demonstrate results on a large genotype-phenotype dataset from the UK Biobank, where we achieve state-of-the-art heritability estimation on quantitative and qualitative traits including height, body mass index, asthma and high cholesterol.
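For readers who have not seen the lasso written out, here is a minimal textbook sketch of cyclic coordinate descent with soft-thresholding. It is not BASIL, snpnet, or glmnet, and it would be hopeless at biobank scale; it only shows the core update that such solvers build on. The objective, function names, and toy data are my own choices for illustration.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become exactly 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of n rows, each a list of p predictor values (no intercept).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, kept up to date as beta changes
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero predictor carries no information
            # Covariance of predictor j with the residual that excludes j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Toy data (hypothetical): predictor 0 matters, predictor 1 barely does.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
print(lasso_coordinate_descent(X, y, lam=0.1))
```

The weak predictor is set exactly to zero rather than merely shrunk, which is the variable-selection property that makes the lasso attractive when most of the ~1M SNPs have no effect on the trait.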

The very first validation I heard about was soon after we posted our paper (2018 IIRC): I visited 23andMe to give a talk about genomic prediction and one of the PhD researchers there said that they had reproduced our results, presumably using their own data. At a meeting later in the day, one of the VPs from the business side who had missed my talk in the morning was shocked when I mentioned few cm accuracy for height. He turned to one of the 23andMe scientists in the room and exclaimed 

I thought WE were the best in the world at this stuff!?
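For scale, the R² figures quoted in these papers can be converted into a correlation and a typical prediction error. The sketch below does the arithmetic; the 6 cm population standard deviation is an assumed round number for height within one sex, not a figure from either paper, and the helper names are hypothetical.

```python
import math


def r2_to_correlation(r2):
    """Correlation between predicted and actual trait value for a given R^2."""
    return math.sqrt(r2)


def residual_sd(r2, trait_sd):
    """Standard deviation of the prediction error, given R^2 and the trait SD."""
    return trait_sd * math.sqrt(1.0 - r2)


ASSUMED_HEIGHT_SD_CM = 6.0  # assumption: rough within-sex SD of adult height

for label, r2 in [("MTAG", 0.358), ("MCP + CTPR", 0.425), ("Lasso + CTPR", 0.428)]:
    print(label, round(r2_to_correlation(r2), 2),
          round(residual_sd(r2, ASSUMED_HEIGHT_SD_CM), 1))
```

Under that assumption an R² in the low 40s corresponds to a correlation of about 0.65 and a residual error of roughly 4.5 cm, which is the sense in which a predictor of this kind is accurate to a few cm.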

Saturday, October 17, 2020

Down the Rabbit Hole: Mark Lane, the Zapruder film, and the JFK Conspiracy

Putting these here for future reference. From comments:
At minimum the evidence is strong for a CIA JFK coverup -- see last video, for example. It doesn't mean they did it, ofc. Johnson pressured Warren to lead the commission with the argument that if the public became convinced the Soviets/Cubans were behind it WW3 would result. This could have affected CIA actions post-Dallas as well. But I suspect something more sinister on the part of certain elements of CIA, and there is tons of evidence to that effect leaking out over the years. 
I enjoy listening to Mark Lane speak even if he turns out to be incorrect in some or many of his allegations. I think he destroys Buckley in their debate: Lane the Rationalist and Buckley a good example of motivated or biased reasoning. 
I've followed Spygate for 4 years now, with the media covering it up and FBI/CIA refusing to produce documents, Barr probably acting to protect the institutions, FISA court obviously corrupt, etc. The JFK matter has a very familiar feel to it. [ Should add the Epstein matter, which unfolded in plain sight over 20y, as another example. ]
Mark Lane, at the peak of his powers, discusses the Warren Commission report with William F. Buckley (1966):


Mark Lane, near the end of his life:


Astonishing 2014 claims about the Zapruder film in CIA hands in the days after Dallas: the creation of two different briefing boards, one seen by CIA director John McCone, the other given to the Warren Commission. The interview is remarkable.


Horne and Brugioni strike me as very credible. The main activity of the CIA's NPIC was interpreting U2 spy plane photographs, and it had some of the most advanced photographic technology of the era. NPIC was a logical choice to take a first look at the Zapruder film, but Brugioni did not learn until decades later that the CIA modified the film (removing certain frames, esp. near #313, which shows Kennedy's head exploding) and gave a full briefing only to McCone while withholding information from the Warren Commission. 
While serving as chief analyst of military records at the Assassination Records Review Board in 1997, Douglas Horne discovered that the Zapruder Film was examined by the CIA's National Photographic Interpretation Center two days after the assassination of President Kennedy. 
In this film, Horne interviews legendary NPIC photo interpreter Dino Brugioni, who speaks for the first time about another NPIC examination of the film the day after the assassination. Brugioni didn't know about the second examination and believes the Zapruder Film in the archives today is not the film he saw the day after the assassination.

Bonus: Interview with son of E. Howard Hunt (CIA, convicted Watergate Plumber), including audio of Hunt's confession. Note link to Cord Meyer, also see Mary Meyer 1964 execution in Georgetown...

James Jesus Angleton -- "A Wilderness of Mirrors"

Note Added in response to a question in the comments:
I suggest you invest an hour or two in 
1. Brugioni interview: establishes that a conspiracy at the highest level of CIA to alter the film evidence was in place within ~24h of the shooting. Hard to explain unless there was very strong motivation already... bureaucracies usually can't react that fast when *surprised* by events. 
2. Interview with Hunt's son, concerning his deathbed confession of being aware of (and playing a minor role in) the assassination conspiracy. Hunt is a well-known CIA figure who was involved in lots of covert ops including Watergate. You don't have to accept this as fully credible of course, but you can't say that conspiracy didn't happen because otherwise information would have leaked out. It may very well have leaked out! But few pay attention because of the groupthink against "conspiracy theory" (this term was literally invented and promulgated by CIA to discourage public attention to what it was doing during the Cold War).   
I would say I am very confident of an active cover-up after the assassination, less confident of a CIA role in the killing. 
Other facts that have leaked out (now confirmed by official documents and the official CIA historian) include the fact that CIA was very closely monitoring Oswald starting in 1959 and that his file was closely held by none other than CIA prince of darkness James Jesus Angleton. Now look into the unsolved killing of Angleton's friend Mary Meyer (who was having an affair with JFK when he was shot) in 1964 and you are off to the races... Their common friend Ben Bradlee (WaPo editor of Watergate fame) wrote in his memoirs of catching Angleton, having broken into Meyer's house, with her diary... 
BTW, over the years it was wrongly reported that RFK did NOT believe in a conspiracy against his brother. The evidence is pretty convincing now that he always believed in a conspiracy but didn't admit it in public.

Wednesday, October 14, 2020

Election 2020: quant analysis of new party registrations vs actual votes

I think we should ascribe very high uncertainty to polling results in this election, for a number of reasons including the shy Trump voter effect as well as the sampling corrections applied which depend heavily on assumptions about likely turnout. 

Graphs below are from a JP Morgan quant analysis of changes in the number of registered voters by party and state, and their correlation with actual votes in the subsequent election. Of course it is possible that negative covid impact has largely counteracted the effect discussed below (which is an integrated effect over the last 4 years) -- i.e., Trump was in a strong position at the beginning of 2020 but has declined since then. 

This is an unusual election for a number of reasons so it's quite hard to call the outcome. There's also a good chance the results on election night will be heavily contested.

The author of this analysis is Marko Kolanovic, Global Head of Macro Quantitative and Derivatives Strategy at J.P. Morgan. He graduated from New York University with a PhD in theoretical high-energy physics.

Anyone with high conviction about the election is welcome to post their analysis in the comments.

Sunday, October 11, 2020

US and China: A New Cold War (video interview with Lanxin Xiang)


This is an excellent discussion of the US-China geopolitical situation with Professor Lanxin Xiang. Xiang was trained at SAIS (JHU PhD), and currently holds an academic position in Geneva while directing a research institute in Shanghai.

He has a uniquely deep understanding of both Western and Chinese perspectives on globalization, economic development, and US-China competition. 

Interestingly, he recently translated Skidelsky's biography of Keynes.

Two related articles in Asia Times by the Brazilian journalist Pepe Escobar:

Bonus: Bill Owens interview. See comments about Huawei at ~50m.


Wikipedia: William A. Owens (born May 8, 1940) is a retired admiral of the United States Navy who served as Vice Chairman of the Joint Chiefs of Staff from 1994 to 1996. Since leaving the military in 1996, he has served as an executive or as a member of the board of directors of various companies, including Nortel Networks Corporation.

Sunday, October 04, 2020

Othram Helps Solve 1974 Cold Case: Carla Walker Murder

Othram, a DNA forensics company I co-founded, has solved another cold case. 

Carla Walker of Fort Worth TX was tortured, raped, and murdered in 1974. Finally the killer has been identified and arrested.

This was an open and high profile case just a few months ago. See this April 2020 episode of The DNA of Murder (Oxygen channel), hosted by Paul Holes, the detective who caught the Golden State Killer.

Who Killed Carla Walker? In 1974, 17-year-old Carla Walker’s reported abduction out of the arms of her boyfriend sent a Texas town on a massive manhunt. She was discovered murdered in a culvert three days later. Paul Holes interviews the only witness, Carla’s boyfriend. 
Walker and her boyfriend, Western Hills High School football quarterback Rodney McCoy, attended a Valentine’s dance on Feb. 16, 1974. After the dance, they met up with friends and then stopped by a Fort Worth bowling alley. 
McCoy has always maintained that a man approached the couple while they were sitting inside his car at the bowling alley parking lot and pointed a gun at him. He was beaten unconscious, and when he awoke, he found his cheerleader girlfriend missing.


See Othram: the future of DNA forensics

The existing FBI standard (CODIS) for DNA identification uses only 20 markers (STRs -- previously only 13 loci were used!). By contrast, genome-wide sequencing can reliably call millions of genetic variants. 

For the first time, the cost curves for these two methods have crossed: modern sequencing costs no more than extracting CODIS markers using the now ~30-year-old technology. 

What can you do with millions of genetic markers? 

1. Determine relatedness of two individuals with high precision. This allows detectives to immediately identify a relative (ranging from distant cousin to sibling or parent) of the source of the DNA sample, simply by scanning through large DNA databases. ...

If you have contacts in law enforcement, please alert them to the potential of this new technology.

Genomic Prediction and Embryo Selection (video panel discussion)


This is a recent panel discussion on genomic prediction, and applications in IVF and health systems (e.g., early screening of high risk individuals for breast cancer, heart disease). 

Jamie Metzl and Simon Fishel are my co-panelists. Metzl is the author of the best seller Hacking Darwin: Genetic Engineering and the Future of Humanity. Fishel was part of the team that produced the first IVF baby in 1978, and has been a leader in IVF research ever since. 

Today millions of babies are produced through IVF. In most developed countries roughly 3-5 percent of all births are through IVF, and in Denmark the fraction is about 10 percent! But when the technology was first introduced with the birth of Louise Brown in 1978, the pioneering scientists had to overcome significant resistance. There may be an alternate universe in which IVF was not allowed to develop, and those millions of children were never born.
Wikipedia: ...During these controversial early years of IVF, Fishel and his colleagues received extensive opposition from critics both outside of and within the medical and scientific communities, including a civil writ for murder. Fishel has since stated that "the whole establishment was outraged" by their early work and that people thought that he was "potentially a mad scientist".
I predict that within 5 years the use of polygenic risk scores will become common in some health systems and in IVF. Reasonable people will wonder why the technology was ever controversial at all, just as in the case of IVF.

Previous discussion: Sibling Validation of Polygenic Risk Scores and Complex Trait Prediction (Nature Scientific Reports)
