If I understand correctly the issue is how to realize something like backprop when most of the information flow is feed-forward (as in real neurons). How do you transport weights "non-locally"? The L2 optimization studied here doesn't actually transport weights. Rather, the optimized solution realizes the same set of weights in two places...
See earlier post Backpropagation in the Brain? Thanks for STS for the reference.
Center for Brains, Minds and Machines (CBMM)
Published on Apr 3, 2019
Speaker: Dr. Jon Bloom, Broad Institute
Abstract: When trained to minimize reconstruction error, a linear autoencoder (LAE) learns the subspace spanned by the top principal directions but cannot learn the principal directions themselves. In this talk, I'll explain how this observation became the focus of a project on representation learning of neurons using single-cell RNA data. I'll then share how this focus led us to a satisfying conversation between numerical analysis, algebraic topology, random matrix theory, deep learning, and computational neuroscience. We'll see that an L2-regularized LAE learns the principal directions as the left singular vectors of the decoder, providing a simple and scalable PCA algorithm related to Oja's rule. We'll use the lens of Morse theory to smoothly parameterize all LAE critical manifolds and the gradient trajectories between them; and see how algebra and probability theory provide principled foundations for ensemble learning in deep networks, while suggesting new algorithms. Finally, we'll come full circle to neuroscience via the "weight transport problem" (Grossberg 1987), proving that L2-regularized LAEs are symmetric at all critical points. This theorem provides local learning rules by which maximizing information flow and minimizing energy expenditure give rise to less-biologically-implausible analogues of backproprogation, which we are excited to explore in vivo and in silico. Joint learning with Daniel Kunin, Aleksandrina Goeva, and Cotton Seed.