The AD community has known since the 1980s that optimal application of the chain rule---a process we call accumulation---can depend on the structure of the computational graph to which it is applied. Since that time, sophisticated heuristics have been developed, the most fine-grained of which operate within a framework called face elimination. The efficacy of these heuristics, however, is difficult to assess because we don't know how to find optimal face elimination sequences (or even whether it is NP-hard to do so!), and, moreover, we don't even know for sure that there is always a face elimination sequence that minimizes the number of operations performed in the accumulation process (though a long-standing conjecture asserts that there is). In this talk we outline a new approach for reasoning about the space of all possible accumulation procedures directly and use it to obtain the first definitive statements regarding the minimum cost of derivative accumulation (as well as algorithms for realizing it) for a wide range of computational graphs, including most of the key examples in the literature. The techniques we develop are also applicable to general graphs, and should always be applied to yield simplified instances to hand off to heuristics.
Bio: Andrew Lyons interned at Argonne National Laboratory (ANL) while an undergrad, then worked at ANL/CI (as "Senior Software Developer") from 2007-2010. He was a Givens Associate in the summer of 2011.
See all upcoming talks at https://www.anl.gov/mcs/lans-seminars