Traditional computational methods in materials science, such as first-principles calculations and thermodynamic simulations, have accelerated the discovery of new materials. In the design of high entropy alloys, however, these methods are used mostly to predict alloy phases, they are difficult to adapt flexibly to different target properties, they consume large computational resources, and their prediction accuracy is limited. Over the last decade, data-driven machine learning techniques have gradually been applied to materials science, a field that has accumulated a large quantity of theoretical and experimental data. Machine learning can extract the information hidden in these data and help to predict the properties of materials.

In this work, the data are collected from published references, and machine learning algorithms are chosen with a property-oriented approach to build a prediction model for the hardness of high entropy alloys. A hardness dataset containing 19 candidate features is trained, tested, and evaluated by using an ensemble learning algorithm. Feature selection proceeds in two stages: a genetic algorithm first filters the 19 candidate features down to an optimized set of 8 features, and this set is then refined in combination with traditional solid solution strengthening theory, which yields the three most representative feature parameters. A random forest model built on these three features achieves $ R^2 = 0.9416 $ under 10-fold cross-validation. To better understand the prediction mechanism, the solid solution strengthening theory of the alloy is used to explain the differences in hardness.

Further, feature selection with the genetic algorithm shows that the atomic size, electronegativity, and modulus mismatch features have very important effects on the solid solution strengthening of high entropy alloys. The same machine learning algorithm and features are then used to predict solid solution strengthening properties, giving $ R^2 = 0.8811 $ under 10-fold cross-validation. The selected parameters transfer well across various high entropy alloy systems. Because the random forest algorithm is poorly interpretable, the SHAP interpretable machine learning method is used to uncover the internal reasoning logic of the established model and to clarify how each feature influences hardness. In particular, the valence electron concentration is found to have the most significant weakening effect on the hardness of high entropy alloys.
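The $ R^2 $ values quoted above are obtained under 10-fold cross-validation. The sketch below illustrates that evaluation protocol only. It is not the authors' code: the data are synthetic, a one-variable least-squares line stands in for the random forest, and every function name is a hypothetical helper introduced for illustration.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (stand-in model, assumption)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def cross_val_predict(xs, ys, k=10, seed=0):
    """Predict every sample with a model that never saw it during fitting."""
    preds = [0.0] * len(xs)
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a * xs[i] + b
    return preds


# Synthetic stand-in data (NOT the paper's dataset): one feature, noisy linear target.
rng = random.Random(42)
xs = [rng.uniform(4.0, 10.0) for _ in range(100)]
ys = [900.0 - 60.0 * x + rng.gauss(0.0, 20.0) for x in xs]

preds = cross_val_predict(xs, ys, k=10)
cv_r2 = r2_score(ys, preds)
cv_rmse = rmse(ys, preds)
print(f"10-fold R2 = {cv_r2:.4f}, RMSE = {cv_rmse:.2f}")
```

Here the out-of-fold predictions are pooled before scoring. Averaging per-fold scores is another common convention, and the text does not say which one was used.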
Keywords:
- high entropy alloys
- machine learning
- genetic algorithms
- solid solution strengthening
Fig. 2. RMSE under 10-fold cross-validation of the best feature subsets selected by the SBS, SFS, RF, and RFE algorithms, as a function of the number of features. The asterisk on each curve marks the number of features in the optimal feature set chosen by that feature selection method.
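As an illustration of the kind of wrapper method compared in Fig. 2, the sketch below implements sequential forward selection (SFS) in plain Python. It rests on several assumptions: the data are synthetic, a leave-one-out 3-nearest-neighbour regressor replaces the models used in the paper, and `loo_knn_rmse` and `forward_select` are hypothetical helper names.

```python
import math
import random


def loo_knn_rmse(X, y, cols, k=3):
    """Leave-one-out RMSE of a k-NN regressor restricted to the columns in `cols`."""
    sq_err = 0.0
    for i in range(len(X)):
        # Squared Euclidean distance from sample i to every other sample.
        dists = sorted(
            (sum((X[i][j] - X[m][j]) ** 2 for j in cols), m)
            for m in range(len(X)) if m != i
        )
        pred = sum(y[m] for _, m in dists[:k]) / k
        sq_err += (y[i] - pred) ** 2
    return math.sqrt(sq_err / len(X))


def forward_select(X, y):
    """Greedily add the feature that lowers the RMSE most; stop when none helps."""
    remaining = list(range(len(X[0])))
    selected, history = [], []
    best = float("inf")
    while remaining:
        score, feat = min((loo_knn_rmse(X, y, selected + [j]), j) for j in remaining)
        if score >= best:
            break  # no remaining feature improves the error
        selected.append(feat)
        remaining.remove(feat)
        best = score
        history.append(score)
    return selected, history


# Synthetic data (assumption): only features 0 and 2 drive the target, 1 and 3 are noise.
rng = random.Random(1)
X = [[rng.random() for _ in range(4)] for _ in range(120)]
y = [10.0 * row[0] + 5.0 * row[2] + rng.gauss(0.0, 0.2) for row in X]

selected, history = forward_select(X, y)
print("selected features:", selected)
print("RMSE after each addition:", [round(h, 3) for h in history])
```

Sequential backward selection (SBS) follows the same pattern in reverse: it starts from the full feature set and repeatedly drops the feature whose removal lowers the error most.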
Fig. 3. SHAP analysis of the eight features in the optimized feature set selected by the genetic algorithm. Feature importance decreases from top to bottom. Each point shows, through the sign of its SHAP value, whether the value of that feature raises or lowers the predicted hardness of the corresponding sample.
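The SHAP values plotted in Fig. 3 are Shapley values of the model prediction. The brute-force sketch below shows what that quantity is for a single sample. It is a toy under stated assumptions: the linear `toy_hardness` model, its coefficients, and the background samples are invented for illustration, and the SHAP library computes the same attributions with faster, model-specific algorithms rather than by enumerating coalitions.

```python
from itertools import combinations
from math import factorial


def coalition_value(model, x, background, subset):
    """Mean model output when features in `subset` take the values of x
    and the remaining features are taken from each background sample."""
    total = 0.0
    for b in background:
        z = [x[j] if j in subset else b[j] for j in range(len(x))]
        total += model(z)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition (small n only)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


def toy_hardness(z):
    """Hypothetical linear model of [VEC, G, delta]; coefficients are made up."""
    vec, g, delta = z
    return 800.0 - 50.0 * vec + 3.0 * g + 1000.0 * delta


# Hypothetical background samples and one sample to explain.
background = [[6.0, 60.0, 0.02], [8.0, 80.0, 0.04], [10.0, 100.0, 0.06]]
x = [9.0, 70.0, 0.05]

phi = shapley_values(toy_hardness, x, background)
base = sum(toy_hardness(b) for b in background) / len(background)
print("Shapley values:", [round(p, 6) for p in phi])
print("base value + sum of contributions:", round(base + sum(phi), 6))
```

For a linear model each contribution reduces to the coefficient times the deviation of the feature from its background mean, so a sample with above-average VEC receives a negative contribution in this toy. The contributions always sum to the difference between the prediction and the mean background prediction.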
Fig. 4. (a) Pearson correlation coefficient (PCC) heat map of the features selected by the genetic algorithm; the inset shows the RF importance ranking of these features. (b) Cumulative variance contribution for different numbers of principal components, calculated by principal component analysis for the optimized feature set $ [\gamma, \Delta\chi, \mathrm{VEC}, F, \varOmega, e/a, E, \delta G] $. (c) Iterative process of GA feature selection on the newly constructed feature set; the inset shows the SHAP importance ranking of the GA-selected features.
Fig. 5. Model fitting results under (a), (c) 10-fold cross-validation and (b), (d) LOCOCV. Panels (a) and (b) use the optimized feature set $ [\gamma, \Delta\chi, \mathrm{VEC}, F, \varOmega, e/a, E, \delta G] $ as RF input features; panels (c) and (d) use the reduced optimized feature set $ [\mathrm{VEC}, G, \mathrm{M.E}] $ as RF input features.
Fig. 6. Fitting results on the dataset after removing outliers, evaluated with (a) 10-fold cross-validation and (b) LOOCV. (c) Histogram of outlier scores; samples with scores < 0 (orange points) are treated as outliers. Outlier detection is carried out on 205 high entropy alloy samples by using an isolation forest. The inset 3D plot visualizes the outlier detection results after dimensionality reduction by principal component analysis.
Fig. 7. Evaluation results under 10-fold cross-validation with $ [\xi, G, \mathrm{M.E}] $ as the RF input features and $ \Delta\sigma_{\mathrm{SSC}} $ as the target value.
Table 1. 19 empirical feature parameters related to the hardness of high entropy alloys and their calculation formulae.
| Material feature | Formula |
| --- | --- |
| $ T_{\mathrm{m}} $, $ E_{\mathrm{c}} $, VEC, $ e/a $, $ E $, $ G $ (each denoted by $ \alpha $) | $ \displaystyle\sum_{i=1}^{n} c_i \alpha_i $ |
| $ \Delta S_{\mathrm{mix}} $ | $ -R\displaystyle\sum_{i=1}^{n} c_i \ln\left(c_i\right) $ |
| $ w^{6} $ | $ \left(\displaystyle\sum_{i=1}^{n} c_i w_i\right)^{6} $ |
| $ \delta G $ | $ \sqrt{\displaystyle\sum_{i=1}^{n} c_i \left(1-\frac{G_i}{G}\right)^{2}} $ |
| $ \Delta G_{\mathrm{mix}} $ | $ \Delta H_{\mathrm{mix}} - T_{\mathrm{m}} \Delta S_{\mathrm{mix}} $ |
| $ \mu $ | $ \dfrac{1}{2} E \delta r $ |
| $ \delta r $ | $ \sqrt{\displaystyle\sum_{i=1}^{n} c_i \left(1-\frac{r_i}{r}\right)^{2}} $ |
| $ \Delta\chi $ | $ \sqrt{\displaystyle\sum_{i=1}^{n} c_i \left(\chi-\chi_i\right)^{2}} $ |
| $ \varOmega $ | $ T_{\mathrm{m}} \dfrac{\Delta S_{\mathrm{mix}}}{\Delta H_{\mathrm{mix}}} $ |
| $ \gamma $ | $ \dfrac{1-\sqrt{\dfrac{\left(r+r_{\mathrm{min}}\right)^{2}-r^{2}}{\left(r+r_{\mathrm{min}}\right)^{2}}}}{1-\sqrt{\dfrac{\left(r+r_{\mathrm{max}}\right)^{2}-r^{2}}{\left(r+r_{\mathrm{max}}\right)^{2}}}} $ |
| $ A $ | $ G \delta r \dfrac{1+\mu}{1-\mu} $ |
| $ \varLambda $ | $ \dfrac{\Delta S_{\mathrm{mix}}}{\delta r} $ |
| $ \Delta H_{\mathrm{mix}} $ | $ 4\displaystyle\sum_{i=1, j>i}^{n} c_i c_j H_{i\text{-}j}^{\mathrm{mix}} $ |
| $ F $ | $ \dfrac{2G}{1-\mu} $ |
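A few of the Table 1 descriptors can be evaluated directly from a composition. The sketch below does this for an equiatomic CoCrFeNi alloy, reading the unsubscripted $ r $ and $ \chi $ in the formulae as composition-weighted averages (an assumption). The per-element radii, Pauling electronegativities, and VEC values are illustrative inputs that should be checked against the data source actually used, and all function names are hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

# Illustrative per-element data (assumption; verify against your own tables):
# metallic radius r in angstrom, Pauling electronegativity chi, valence electron count.
ELEMENTS = {
    "Co": {"r": 1.251, "chi": 1.88, "vec": 9},
    "Cr": {"r": 1.249, "chi": 1.66, "vec": 6},
    "Fe": {"r": 1.241, "chi": 1.83, "vec": 8},
    "Ni": {"r": 1.246, "chi": 1.91, "vec": 10},
}


def weighted_mean(comp, key):
    """Composition-weighted average: sum_i c_i * alpha_i."""
    return sum(c * ELEMENTS[el][key] for el, c in comp.items())


def mixing_entropy(comp):
    """Delta S_mix = -R * sum_i c_i * ln(c_i)."""
    return -R * sum(c * math.log(c) for c in comp.values() if c > 0)


def size_mismatch(comp):
    """delta r = sqrt(sum_i c_i * (1 - r_i / r_mean)^2)."""
    r_mean = weighted_mean(comp, "r")
    return math.sqrt(sum(c * (1 - ELEMENTS[el]["r"] / r_mean) ** 2
                         for el, c in comp.items()))


def electronegativity_difference(comp):
    """Delta chi = sqrt(sum_i c_i * (chi_mean - chi_i)^2)."""
    chi_mean = weighted_mean(comp, "chi")
    return math.sqrt(sum(c * (chi_mean - ELEMENTS[el]["chi"]) ** 2
                         for el, c in comp.items()))


# Equiatomic CoCrFeNi, atomic fractions summing to 1.
comp = {"Co": 0.25, "Cr": 0.25, "Fe": 0.25, "Ni": 0.25}

features = {
    "VEC": weighted_mean(comp, "vec"),
    "dS_mix": mixing_entropy(comp),
    "delta_r": size_mismatch(comp),
    "d_chi": electronegativity_difference(comp),
}
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

For an equiatomic alloy of four elements the mixing entropy reduces to $ R\ln 4 $, which gives a quick sanity check on the implementation.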
Table 2. Hyperparameter search results for different machine learning models.
| Algorithm | Hyperparameters |
| --- | --- |
| SVM-rbf | gamma = 1×10⁻⁷, C = 200 |
| RF | max_depth = 6, min_samples_leaf = 1, min_samples_split = 2, n_estimators = 50 |
| XGBoost | gamma = 0.1, learning_rate = 0.1, max_depth = 12, n_estimators = 100, reg_alpha = 0, reg_lambda = 0.5 |
| ANN | max_iter = 210000, hidden_layer_sizes = 16, solver = 'adam', activation = 'relu', alpha = 0.01 |
| Lasso | alpha = 1, max_iter = 10000 |
| Ridge | alpha = 0.1, max_iter = 10000 |
Table 3. Optimized feature sets screened by different feature selection algorithms and their RMSE values.
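| Algorithm | Optimized feature set | RMSE |
| --- | --- | --- |
| GA | $ \gamma $, $ \Delta\chi $, VEC, $ F $, $ \varOmega $, $ e/a $, $ E $, $ \delta G $ | 64.09 |
| SFS | $ \delta r $, $ E_{\mathrm{c}} $, VEC, $ \Delta H_{\mathrm{mix}} $, $ \varOmega $, $ E $, $ G $ | 67.32 |
| SBS | $ \Delta\chi $, $ E_{\mathrm{c}} $, VEC, $ \varLambda $, $ w $, $ F $, $ \delta G $ | 67.00 |
| RFE | $ \delta r $, $ E_{\mathrm{c}} $, VEC, $ \Delta S_{\mathrm{mix}} $, $ \varOmega $, $ \varLambda $, $ E $, $ \mu $, $ w $, $ G $, $ F $, $ A $, $ \delta G $ | 69.67 |
| RF | $ \delta r $, VEC, $ F $, $ \varLambda $, $ w $, $ \delta G $, $ \mu $, $ G $, $ A $, $ E_{\mathrm{c}} $, $ \varOmega $, $ \Delta S_{\mathrm{mix}} $, $ \Delta H_{\mathrm{mix}} $ | 68.99 |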
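The GA row of Table 3 comes from a genetic search over feature subsets. The sketch below shows the general shape of such a search with binary masks, tournament selection, one-point crossover, bit-flip mutation, and elitism. It is a minimal illustration under assumptions, not the authors' implementation: the data are synthetic, the fitness is the leave-one-out RMSE of a 3-nearest-neighbour regressor rather than a random forest, and the population size, mutation rate, and helper names are all hypothetical.

```python
import math
import random


def loo_knn_rmse(X, y, mask, k=3):
    """Leave-one-out RMSE of a k-NN regressor using only features whose mask bit is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty feature set is not a valid model
    sq_err = 0.0
    for i in range(len(X)):
        dists = sorted(
            (sum((X[i][j] - X[m][j]) ** 2 for j in cols), m)
            for m in range(len(X)) if m != i
        )
        pred = sum(y[m] for _, m in dists[:k]) / k
        sq_err += (y[i] - pred) ** 2
    return math.sqrt(sq_err / len(X))


def genetic_select(X, y, n_gen=15, pop_size=12, p_mut=0.1, seed=0):
    """Evolve binary feature masks toward a lower cross-validated RMSE."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    cache = {}

    def fitness(mask):
        if mask not in cache:
            cache[mask] = loo_knn_rmse(X, y, mask)
        return cache[mask]

    # Start from the full feature set plus random masks.
    pop = [tuple([1] * n_feat)]
    pop += [tuple(rng.randint(0, 1) for _ in range(n_feat)) for _ in range(pop_size - 1)]
    history = []
    for _ in range(n_gen):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        elite = pop[:2]  # elitism: the two best masks survive unchanged
        children = []
        while len(children) < pop_size - len(elite):
            p1 = min(rng.sample(pop, 3), key=fitness)  # tournament selection
            p2 = min(rng.sample(pop, 3), key=fitness)
            cut = rng.randint(1, n_feat - 1)           # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = tuple(b ^ 1 if rng.random() < p_mut else b for b in child)
            children.append(child)
        pop = elite + children
    best = min(pop, key=fitness)
    history.append(fitness(best))
    return best, fitness(best), history


# Synthetic data (assumption): six candidate features, only 0 and 2 are informative.
rng = random.Random(7)
X = [[rng.random() for _ in range(6)] for _ in range(60)]
y = [10.0 * row[0] + 5.0 * row[2] + rng.gauss(0.0, 0.2) for row in X]

best_mask, best_rmse, history = genetic_select(X, y)
print("best mask:", best_mask)
print("best RMSE:", round(best_rmse, 3))
```

Because the two best masks are carried over unchanged each generation, the best RMSE can never get worse from one generation to the next. Seeding the population with the full feature set also guarantees the result is at least as good as using every feature.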