Nonlinear computation in deep linear networks

OpenAI published research showing that deep linear networks, when implemented with floating-point arithmetic, are not strictly linear and can perform nonlinear computation. The researchers used evolution strategies to find network parameters that exploit this effect and solve non-trivial problems.

Published: Friday, September 29, 2017 at 9:00 AM

Story ID: #874

Original article excerpt

We’ve shown that deep linear networks, as implemented using floating-point arithmetic, are not actually linear and can perform nonlinear computation. We used evolution strategies to find parameters in linear networks that exploit this trait, letting us solve non-trivial problems.

Neural networks consist of stacks of a linear layer followed by a nonlinearity such as tanh or a rectified linear unit. Without the nonlinearity, consecutive linear layers would, in theory, be mathematically equivalent to a single linear layer. So it’s a surprise that floating-point arithmetic is nonlinear enough to yield trainable deep networks.
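To make the gap between theory and floating-point practice concrete, here is a minimal sketch. It is not the method from the original article, which searched for parameters with evolution strategies. It is a hand-picked toy. The four scalar "layers" and the names `WEIGHTS`, `forward`, and `collapsed_weight` are assumptions made for illustration, and the demo assumes standard IEEE 754 double-precision floats with subnormal numbers, which is what Python's `float` uses on common platforms. In exact arithmetic the four weights multiply to 1, so the stack should collapse to the identity map. Evaluated layer by layer, the intermediate values underflow and get rounded, and the stack stops being additive.

```python
from fractions import Fraction

# Hypothetical 4-layer scalar "linear network": each layer multiplies by one weight.
# The powers of two are chosen (an assumption for this demo) so that the
# activations after the second layer land in the float64 subnormal range.
WEIGHTS = [2.0**-537, 2.0**-537, 2.0**537, 2.0**537]


def forward(x, weights=WEIGHTS):
    """Apply the layers one after another in ordinary float64 arithmetic."""
    for w in weights:
        x = x * w
    return x


def collapsed_weight(weights=WEIGHTS):
    """Multiply the weights exactly; in theory this one weight replaces the stack."""
    product = Fraction(1)
    for w in weights:
        product *= Fraction(w)  # Fraction(float) converts the float exactly
    return product


if __name__ == "__main__":
    # Exact arithmetic: the stack collapses to a single weight equal to 1.
    print(collapsed_weight())  # prints 1

    # Float arithmetic: 0.4 underflows to zero, 0.8 rounds up to one subnormal step.
    print(forward(0.4), forward(0.8))  # prints 0.0 1.0

    # A linear map would satisfy f(0.4) + f(0.4) == f(0.8); this stack does not.
    print(forward(0.4) + forward(0.4) == forward(0.8))  # prints False
```

Under the stated assumptions, this particular stack maps small inputs to a nearby integer instead of returning them unchanged, which is a nonlinear operation built entirely from multiplications. Whether a given trained network relies on this specific underflow mechanism is not something the excerpt above states.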
