Physics-informed neural networks (PINNs) have recently attracted significant attention as a meshless framework for solving partial differential equations (PDEs) in AI-assisted scientific research (AI for Science). However, traditional PINNs have two limitations. First, their network architecture, typically a multilayer perceptron (MLP) with unidirectional information transfer, struggles to capture the key features embedded in sequential data, which weakens its information representation. Second, their loss function, a quadratic penalty function that embeds the physical constraints, has a penalty factor that is unconstrained and can grow without bound, which reduces the efficiency of training optimization. To address these challenges, this paper proposes an improved PINN based on information representation and loss optimization, termed allaPINNs, which aims to strengthen key-feature extraction and training optimization and thereby improve accuracy and generalization in the numerical solution of PDEs. For information representation, allaPINNs introduces efficient linear attention (LA) to improve the model's ability to identify key features while reducing the computational complexity of dynamic weighting. For loss optimization, allaPINNs reconstructs the objective loss function with an augmented Lagrangian (AL) function, using learnable Lagrangian multipliers and penalty factors to regulate the interaction among the loss residual terms. The feasibility of allaPINNs is validated on four benchmark equations: Helmholtz, Black-Scholes, Burgers, and nonlinear Schrödinger. The results show that allaPINNs can effectively solve PDEs of varying complexity and exhibits excellent prediction accuracy and generalization ability. Compared with current state-of-the-art PINNs, its predictive accuracy improves by one to two orders of magnitude.
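The abstract does not spell out the LA formulation used in allaPINNs, so the following is only a minimal sketch of a common kernel-based linear attention, not the authors' implementation. The feature map `elu(x) + 1` and the function names are assumptions introduced for illustration. The sketch shows why the cost drops: the keys and values are summarized once into a small matrix, so the n-by-n attention weights are never formed.

```python
import numpy as np

def feature_map(x):
    # Assumed positive feature map phi(x) = elu(x) + 1 (an illustrative choice).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v):
    # q, k: (n, d); v: (n, d_v). Cost scales linearly with sequence length n.
    q_f, k_f = feature_map(q), feature_map(k)
    kv = k_f.T @ v              # (d, d_v) summary of keys and values
    z = k_f.sum(axis=0)         # (d,) normalizer summary
    return (q_f @ kv) / (q_f @ z)[:, None]

def quadratic_attention(q, k, v):
    # Reference form with explicit (n, n) weights, quadratic in n.
    q_f, k_f = feature_map(q), feature_map(k)
    w = q_f @ k_f.T
    w = w / w.sum(axis=1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
q, k, v = rng.normal(size=(6, 4)), rng.normal(size=(6, 4)), rng.normal(size=(6, 3))
out = linear_attention(q, k, v)
```

Both functions compute the same weighted average of the rows of `v`; only the order of the matrix products differs, which is where the saving comes from when `n` is much larger than `d`.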
Keywords:
- physics-informed neural networks
- linear attention
- augmented Lagrangian functions
- solving partial differential equations
Figure 5. Frequency histograms of the optimal Lagrangian multipliers $ \lambda^{*} $: the Helmholtz equation has $ \lambda_{1}^{*} $; the Black-Scholes equation has $ \lambda_{1}^{*} $ to $ \lambda_{3}^{*} $; the Burgers equation has $ \lambda_{1}^{*} $ and $ \lambda_{2}^{*} $; and the nonlinear Schrödinger equation has $ \lambda_{1}^{*} $ to $ \lambda_{6}^{*} $.
Figure 6. Learning process of the penalty factor $ \sigma $. The optimal values $ \sigma^{*} $ in panels (a), (b), (c), and (d) are reached at iterations 48665, 49420, 43467, and 48933, with values of 2517.21, 35.45, 5.02, and 52.08, respectively.
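To make the roles of the multipliers $ \lambda $ and the penalty factor $ \sigma $ in these captions concrete, here is a minimal sketch of the classical method of multipliers on a toy problem. It is not the allaPINNs training procedure: the paper learns both $ \lambda $ and $ \sigma $ alongside the network, whereas this sketch keeps $ \sigma $ fixed and updates $ \lambda $ by the standard rule. The toy objective, the constraint, and the function name are assumptions chosen so that the inner minimization has a closed form. The augmented Lagrangian used here is $ \mathcal{L}(x, \lambda, \sigma) = f(x) + \lambda c(x) + \tfrac{\sigma}{2} c(x)^{2} $.

```python
def augmented_lagrangian_demo(sigma=10.0, steps=50):
    """Toy problem (assumed): minimize f(x) = x^2 subject to c(x) = x - 1 = 0."""
    lam = 0.0
    x = 0.0
    for _ in range(steps):
        # Inner step: minimize L(x) = x^2 + lam*(x - 1) + (sigma/2)*(x - 1)^2.
        # Setting dL/dx = 2x + lam + sigma*(x - 1) = 0 gives this closed form.
        x = (sigma - lam) / (2.0 + sigma)
        # Outer step: classical multiplier update with a fixed penalty factor.
        lam = lam + sigma * (x - 1.0)
    return x, lam

x_star, lam_star = augmented_lagrangian_demo()
```

For this toy problem the constrained minimizer is $ x = 1 $ with multiplier $ \lambda = -2 $ (from $ 2x + \lambda = 0 $), and the iteration reaches it with a moderate, finite $ \sigma $. A pure quadratic penalty would instead need $ \sigma $ to grow without bound to satisfy the constraint exactly, which is the limitation the abstract attributes to traditional PINN losses.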
Table 1. Comparison of $ L^{2} $ relative errors between allaPINNs and current state-of-the-art PINN models on the four benchmark equations.

| Equation | PINNs [23] | AL-PINNs [61] | f-PICNN [30] | KINN [31] | allaPINNs (this work) |
| --- | --- | --- | --- | --- | --- |
| Helmholtz | $ 5.63 \times 10^{-2} $ | $ 1.82 \times 10^{-3} $ | $ 2.51 \times 10^{-3} $ | $ 1.08 \times 10^{-3} $ | $ 8.06 \times 10^{-4} $ |
| Black-Scholes | $ 7.18 \times 10^{-2} $ | $ 7.41 \times 10^{-3} $ | $ 5.24 \times 10^{-3} $ | $ 4.35 \times 10^{-3} $ | $ 3.48 \times 10^{-4} $ |
| Burgers | $ 7.04 \times 10^{-2} $ | $ 3.39 \times 10^{-3} $ | $ 2.49 \times 10^{-3} $ | $ 3.02 \times 10^{-3} $ | $ 8.31 \times 10^{-4} $ |
| Nonlinear Schrödinger | $ 2.09 \times 10^{-2} $ | $ 1.55 \times 10^{-3} $ | $ 4.18 \times 10^{-3} $ | $ 1.37 \times 10^{-3} $ | $ 6.71 \times 10^{-4} $ |
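Table 2. Comparison of mean $ L^{2} $ relative errors of allaPINNs under different network structures and loss functions.

| Network structure | Mean $ L^{2} $ relative error over the benchmark equations | Loss function | Mean $ L^{2} $ relative error over the benchmark equations |
| --- | --- | --- | --- |
| MLPs | $ 7.26 \times 10^{-2} $ | Penalty function | $ 8.55 \times 10^{-3} $ |
| CNN | $ 3.09 \times 10^{-3} $ | Penalty function (Gaussian noise) | $ 1.63 \times 10^{-3} $ |
| LA | $ 6.61 \times 10^{-4} $ | AL | $ 6.61 \times 10^{-4} $ |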
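Table B1. Hyperparameter configuration and time taken to solve the benchmark equations for each compared model.

| Metric | PINNs [23] | AL-PINNs [61] | f-PICNN [30] | KINN [31] | allaPINNs (this work) |
| --- | --- | --- | --- | --- | --- |
| Number of network layers | 8 | 8 | 6 | 5 | 5 |
| Number of neurons | 200 | 256 | 128 | 80 | 64 |
| Learning rate | $ 1 \times 10^{-3} $ | $ 1 \times 10^{-4} $ | $ 1 \times 10^{-4} $ | $ 1 \times 10^{-4} $ | $ 1 \times 10^{-3} $ |
| Helmholtz solve time/s | 1476 | 1520 | 1529 | 1606 | 1513 |
| Black-Scholes solve time/s | 1538 | 1585 | 1604 | 1714 | 1624 |
| Burgers solve time/s | 1519 | 1574 | 1593 | 1636 | 1588 |
| Nonlinear Schrödinger solve time/s | 1545 | 1628 | 1650 | 1745 | 1661 |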
[1] Jin X, Cai S, Li H, Karniadakis G E 2021 J. Comput. Phys. 426 109951
[2] Roul P, Goura V P 2020 J. Comput. Appl. Math. 363 464
[3] Pu J C, Li J, Chen Y 2021 Chin. Phys. B 30 60202
[4] Cuomo S, Di Cola V S, Giampaolo F, Rozza G, Raissi M, Piccialli F 2022 J. Sci. Comput. 92 88
[5] Samaniego E, Anitescu C, Goswami S, Nguyen-Thanh V M, Guo H, Hamdia K, Zhuang X, Rabczuk T 2020 Comput. Methods Appl. Mech. Eng. 362 112790
[6] Taylor C A, Hughes T J, Zarins C K 1998 Comput. Methods Appl. Mech. Eng. 158 155
[7] Zhang Y 2009 Appl. Math. Comput. 215 524
[8] Van Hoecke L, Boeye D, Gonzalez-Quiroga A, Patience G S, Perreault P 2023 Can. J. Chem. Eng. 101 545
[9] Hasan F, Ali H, Arief H A 2025 Int. J. Appl. Comput. Math. 11 1
[10] Choo Y S, Choi N, Lee B C 2010 Appl. Math. Modell. 34 14
[11] Lawrence N D, Montgomery J 2024 R. Soc. Open Sci. 11 231130
[12] Si Z Z, Wang D L, Zhu B W, Ju Z T, Wang X P, Liu W, Malomed B A, Wang Y Y, Dai C Q 2024 Laser Photonics Rev. 18 2400097
[13] Fang Y, Han H B, Bo W B, Liu W, Wang B H, Wang Y Y, Dai C Q 2023 Opt. Lett. 48 779
[14] Li N, Xu S, Sun Y, Chen Q 2025 Nonlinear Dyn. 113 767
[15] Mouton L, Reiter F, Chen Y, Rebentrost P 2024 Phys. Rev. A 110 022612
[16] Zhu M, Feng S, Lin Y, Lu L 2023 Comput. Methods Appl. Mech. Eng. 416 116300
[17] Li X, Liu Z, Cui S, Luo C, Li C, Zhuang Z 2019 Comput. Methods Appl. Mech. Eng. 347 735
[18] Wang S, Teng Y, Perdikaris P 2021 SIAM J. Sci. Comput. 43 A3055
[19] Chew A W Z, He R, Zhang L 2025 Arch. Comput. Methods Eng. 32 399
[20] Bai J, Rabczuk T, Gupta A, Alzubaidi L, Gu Y 2023 Comput. Mech. 71 543
[21] Son S, Lee H, Jeong D, Oh K Y, Sun K H 2023 Adv. Eng. Inf. 57 102035
[22] Fang Z, Pan Y Q, Dai D, Zhang J B 2024 Acta Phys. Sin. 73 145201
[23] Raissi M, Perdikaris P, Karniadakis G E 2019 J. Comput. Phys. 378 686
[24] Hornik K 1991 Neural Netw. 4 251
[25] Baydin A G, Pearlmutter B A, Radul A A, Siskind J M 2018 J. Mach. Learn. Res. 18 1
[26] De Ryck T, Mishra S 2024 Acta Numer. 33 633
[27] Karniadakis G E, Kevrekidis I G, Lu L, Perdikaris P, Wang S, Yang L 2021 Nat. Rev. Phys. 3 422
[28] Ren P, Rao C, Liu Y, Wang J X, Sun H 2022 Comput. Methods Appl. Mech. Eng. 389 114399
[29] Lei L, He Y, Xing Z, Li Z, Zhou Y 2025 IEEE Trans. Ind. Inf. 21 5411
[30] Yuan B, Wang H, Heitor A, Chen X 2024 J. Comput. Phys. 515 113284
[31] Wang Y, Sun J, Bai J, Anitescu C, Eshaghi M S, Zhuang X, Rabczuk T, Liu Y 2025 Comput. Methods Appl. Mech. Eng. 433 117518
[32] Wang Y, Sun J, Bai J, Anitescu C, Eshaghi M S, Zhuang X, Rabczuk T, Liu Y 2025 Comput. Methods Appl. Mech. Eng. 433 117518
[33] Jahani-Nasab M, Bijarchi M A 2024 Sci. Rep. 14 23836
[34] Yu J, Lu L, Meng X, Karniadakis G E 2022 Comput. Methods Appl. Mech. Eng. 393 114823
[35] Jiao Y, Lai Y, Lo Y, Wang Y, Yang Y 2024 Anal. Appl. 22 57
[36] Yang A, Xu S, Liu H, Li N, Sun Y 2025 Nonlinear Dyn. 113 1523
[37] Li Y, Zhou Z, Ying S 2022 J. Comput. Phys. 451 110884
[38] Jacot A, Gabriel F, Hongler C 2018 Adv. Neural Inf. Process. Syst. 31 8570
[39] Xiang Z, Peng W, Liu X, Yao W 2022 Neurocomputing 496 11
[40] Tancik M, Srinivasan P, Mildenhall B, Fridovich-Keil S, Raghavan N, Singhal U, Ramamoorthi R, Barron J, Ng R 2020 Adv. Neural Inf. Process. Syst. 33 7537
[41] Zhang Z, Wang Y, Tan S, Xia B, Luo Y 2025 Neurocomputing 625 129429
[42] Zhang W, Li H, Tang L, Gu X, Wang L, Wang L 2022 Acta Geotech. 17 1367
[43] Zhang Z, Wang Q, Zhang Y, Shen T, Zhang W 2025 Sci. Rep. 15 10523
[44] Cybenko G 1989 Math. Control Signals Syst. 2 303
[45] Wang C, Ma C, Zhou J 2014 J. Global Optim. 58 51
[46] Yi K, Zhang Q, Fan W, Wang S, Wang P, He H, An N, Lian D, Cao L, Niu Z 2023 Adv. Neural Inf. Process. Syst. 36 76656
[47] Durstewitz D, Koppe G, Thurm M I 2023 Nat. Rev. Neurosci. 24 693
[48] Chang G, Hu S, Huang H 2023 J. Supercomput. 79 6991
[49] Ocal H 2025 Arabian J. Sci. Eng. 50 1097
[50] Wu H C 2009 Eur. J. Oper. Res. 196 49
[51] Curtis F E, Jiang H, Robinson D P 2015 Math. Program. 152 201
[52] Sun D, Sun J, Zhang L 2008 Math. Program. 114 349
[53] Kanzow C, Steck D 2019 Math. Program. 177 425
[54] Rockafellar R T 2023 Math. Program. 198 159
[55] Dampfhoffer M, Mesquida T, Valentian A, Anghel L 2023 IEEE Trans. Neural Networks Learn. Syst. 35 11906
[56] Humbird K D, Peterson J L, McClarren R G 2018 IEEE Trans. Neural Networks Learn. Syst. 30 1286
[57] Zhang Z, Wang Q, Zhang Y, Shen T 2025 Digital Signal Process. 156 104766
[58] Zhou P, Xie X, Lin Z, Yan S 2024 IEEE Trans. Pattern Anal. Mach. Intell. 46 6486
[59] Rather I H, Kumar S, Gandomi A H 2024 Artif. Intell. Rev. 57 226
[60] Thulasidharan K, Priya N V, Monisha S, Senthilvelan M 2024 Phys. Lett. A 511 129551
[61] Son H, Cho S W, Hwang H J 2023 Neurocomputing 548 126424
[62] Song Y, Wang H, Yang H, Taccari M L, Chen X 2024 J. Comput. Phys. 501 112781