Abstract: While foundation models have gained considerable attention in core AI fields such as natural language processing (NLP) and computer vision (CV), their application to learning complex responses of physical systems from experimental measurements remains underexplored. In physical systems, learning problems are often characterized as discovering operators that map between function spaces, using only a few samples of corresponding function pairs. For instance, in the automated discovery of heterogeneous material models, the foundation model must be capable of identifying the mapping between applied loading fields and the resulting displacement fields, while also inferring the underlying microstructure that governs this mapping. While the former task can be seen as a PDE forward problem, the latter task frequently constitutes a severely ill-posed PDE inverse problem.

In this talk, we will consider the learning of heterogeneous material responses as an exemplar problem to explore the development of a foundation model for physical systems. Specifically, we show that the attention mechanism is mathematically equivalent to a double integral operator, enabling nonlocal interactions among spatial tokens through a data-dependent kernel that characterizes the inverse mapping from data to the hidden microstructure/parameter field of the underlying operator. Consequently, the attention mechanism captures global prior information from training data generated by multiple systems (i.e., specimens with different microstructures) and suggests an exploratory space in the form of a nonlinear kernel map. Based on this theoretical analysis, we introduce a novel neural operator architecture, the Nonlocal Attention Operator (NAO). By leveraging the attention mechanism, NAO can address ill-posedness and rank deficiency in inverse PDE problems by encoding regularization and enhancing generalizability. To demonstrate the applicability of NAO to material modeling problems, we apply it to the development of a foundation constitutive law across multiple materials, showcasing its generalizability to unseen data resolutions and system states. Furthermore, we investigate the potential of NAO in microstructure discovery and multiscale crack propagation problems. Our work not only suggests a novel neural operator architecture for learning an interpretable foundation model of physical systems, but also offers a new perspective towards understanding the attention mechanism.
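As an illustrative sketch of the attention-as-integral-operator view referenced above (not the exact construction used in NAO), one can take the continuum limit of standard dot-product attention over a spatial domain Ω, with query, key, and value functions q, k, v; these symbols and the softmax normalization are assumptions made here for illustration only:

\mathrm{Attn}[v](x) \;=\; \int_\Omega \frac{\exp\!\big(q(x)\cdot k(y)\big)}{\int_\Omega \exp\!\big(q(x)\cdot k(y')\big)\,\mathrm{d}y'}\, v(y)\,\mathrm{d}y \;=\; \int_\Omega K[q,k](x,y)\, v(y)\,\mathrm{d}y,

so evaluating the output at each point x involves two nested integrals over Ω, and the kernel K[q,k] is data-dependent because it is assembled from the query and key functions themselves.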