Intelligent mineral image recognition based on improved ConvNeXt network
Abstract: Mineral identification is a fundamental task in geological research, yet identifying minerals accurately remains challenging. To address the morphological characteristics of minerals, this study proposes an intelligent mineral image recognition model based on an improved ConvNeXt network that uses a transfer learning strategy and introduces channel attention. First, a ConvNeXt model pre-trained on the ImageNet dataset is loaded into the mineral recognition model through transfer learning. Second, an attention mechanism is added after the ConvNeXt blocks to further strengthen the network's feature fusion capability. Finally, ore images of 26 mineral categories, 34,576 images in total, are divided into training, validation, and test sets in a 6:2:2 ratio. During training, the proposed model converges markedly faster than VGG19, GoogLeNet, ResNet50, ResNeXt50, and the original ConvNeXt network. The model achieves an accuracy, precision, and recall of 98.58%, 98.62%, and 98.73%, respectively. Ablation experiments show that the proposed optimizations improve model performance. In addition, a visual comparison of the feature maps produced by the different models confirms that the proposed model extracts mineral features accurately, which further supports its effectiveness.
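The 6:2:2 partition described in the abstract can be sketched as follows. This is a minimal illustration, assuming a plain random split over image indices with a fixed seed; the abstract does not state whether the split was stratified by class or how rounding was handled, so `split_indices`, its `seed` parameter, and the integer rounding rule are assumptions.

```python
import random

def split_indices(n_items, ratios=(6, 2, 2), seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test parts."""
    total = sum(ratios)
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_train = n_items * ratios[0] // total
    n_val = n_items * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder, so no image is dropped
    return train, val, test

train, val, test = split_indices(34576)
print(len(train), len(val), len(test))  # 20745 6915 6916
```

Because 34,576 is not divisible by 10, the three parts cannot be in an exact 6:2:2 ratio; here the leftover image goes to the test set.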
Keywords:
- mineral image
- ConvNeXt
- transfer learning
- attention mechanism
- mineral recognition
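Figure 5. Confusion matrix (the indices on the horizontal and vertical axes follow Table 2 and denote the true and predicted labels, respectively; color intensity indicates accuracy)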
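A confusion matrix of the kind shown in Figure 5 can be built from paired true and predicted labels. The sketch below is illustrative only: it uses 0-based class indices rather than the 1 to 26 numbering of Table 2, puts true labels on rows, and assumes the plotted "accuracy" is the row-normalized count (the share of each true class assigned to each predicted class). The toy labels are not data from the paper.

```python
def confusion_matrix(true_labels, pred_labels, n_classes):
    """Return an n_classes x n_classes count matrix; rows = true, columns = predicted."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, pred_labels):
        matrix[t][p] += 1
    return matrix

def row_normalize(matrix):
    """Divide each row by its total so the diagonal holds the per-class hit rate."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized

# Toy labels for three classes (illustrative only).
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred, 3)
print(cm)                 # [[1, 1, 0], [0, 2, 0], [1, 0, 3]]
print(row_normalize(cm))  # diagonal entries: 0.5, 1.0, 0.75
```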
Table 1 Structure of ConvNeXt-T

| Layer name | Input | ConvNeXt-T | Output |
|---|---|---|---|
| conv1 | 224×224×3 | 4×4, 96, stride 4; Layer Norm | 56×56×96 |
| conv2_x | 56×56×96 | [d7×7, 96; 1×1, 384; 1×1, 96] × 3 | 56×56×96 |
| conv3_x | 56×56×96 | Downsample; [d7×7, 192; 1×1, 768; 1×1, 192] × 3 | 28×28×192 |
| conv4_x | 28×28×192 | Downsample; [d7×7, 384; 1×1, 1536; 1×1, 384] × 9 | 14×14×384 |
| conv5_x | 14×14×384 | Downsample; [d7×7, 768; 1×1, 3072; 1×1, 768] × 3 | 7×7×768 |
| | 7×7×768 | Global Avg Pooling; Layer Normalization; Linear | 1000 |
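The input and output sizes in Table 1 follow from two rules: the stem's 4×4 convolution with stride 4 divides each spatial side by 4, and each downsample step between stages halves it. The sketch below traces those shapes; the function name `convnext_t_shapes` is hypothetical, and the halving rule is inferred from the table's own numbers rather than stated in it.

```python
def convnext_t_shapes(height=224, width=224):
    """Trace feature-map shapes (H, W, C) through the stages listed in Table 1."""
    stage_channels = [96, 192, 384, 768]  # output channels of conv2_x..conv5_x
    stage_depths = [3, 3, 9, 3]           # number of ConvNeXt blocks per stage
    # Stem: 4x4 convolution with stride 4 divides each spatial side by 4.
    h, w = height // 4, width // 4
    shapes = [("conv1", (h, w, stage_channels[0]))]
    for i, (channels, depth) in enumerate(zip(stage_channels, stage_depths)):
        if i > 0:
            # Downsample between stages: assumed to halve each spatial side.
            h, w = h // 2, w // 2
        shapes.append((f"conv{i + 2}_x (x{depth} blocks)", (h, w, channels)))
    return shapes

for name, shape in convnext_t_shapes():
    print(name, shape)
```

For a 224×224 input this reproduces the Output column of the table, ending at 7×7×768 before global average pooling.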
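Table 2 Mineral categories and image counts

| No. | Mineral category | Original images | Images after augmentation |
|---|---|---|---|
| 1 | Bornite | 194 | 776 |
| 2 | Cinnabar | 214 | 856 |
| 3 | Hematite | 258 | 1032 |
| 4 | Pyrrhotite | 146 | 584 |
| 5 | Magnetite | 286 | 1144 |
| 6 | Arsenopyrite | 216 | 864 |
| 7 | Calcite | 374 | 1496 |
| 8 | Galena | 516 | 2064 |
| 9 | Olivine | 142 | 568 |
| 10 | Chromite | 246 | 984 |
| 11 | Wolframite | 324 | 1296 |
| 12 | Limonite | 440 | 1760 |
| 13 | Pyrite | 399 | 1995 |
| 14 | Chalcopyrite | 360 | 1440 |
| 15 | Molybdenite | 278 | 2224 |
| 16 | Stibnite | 472 | 1888 |
| 17 | Chalcocite | 300 | 1200 |
| 18 | Malachite | 466 | 1864 |
| 19 | Azurite | 310 | 1240 |
| 20 | Bauxite | 298 | 1192 |
| 21 | Pyrolusite | 324 | 1296 |
| 22 | Sphalerite | 362 | 1448 |
| 23 | Quartz | 257 | 1285 |
| 24 | Iron-molybdenum ore (铁钼矿) | 121 | 968 |
| 25 | Realgar | 352 | 1408 |
| 26 | Fluorite | 426 | 1704 |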
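The counts in Table 2 can be cross-checked with a few lines of arithmetic: every augmented count is a whole multiple of its original count, and the augmented counts sum to the 34,576 images quoted in the abstract. The sketch below copies the (original, augmented) pairs from the table in order; why some classes use a larger multiplier is not stated in this section, so no reason is assumed.

```python
# (original, augmented) image counts for classes 1..26, copied from Table 2.
COUNTS = [
    (194, 776), (214, 856), (258, 1032), (146, 584), (286, 1144),
    (216, 864), (374, 1496), (516, 2064), (142, 568), (246, 984),
    (324, 1296), (440, 1760), (399, 1995), (360, 1440), (278, 2224),
    (472, 1888), (300, 1200), (466, 1864), (310, 1240), (298, 1192),
    (324, 1296), (362, 1448), (257, 1285), (121, 968), (352, 1408),
    (426, 1704),
]

def augmentation_factors(counts):
    """Return augmented // original per class, checking it divides exactly."""
    factors = []
    for original, augmented in counts:
        factor, remainder = divmod(augmented, original)
        assert remainder == 0, "augmented count is not a multiple of the original"
        factors.append(factor)
    return factors

factors = augmentation_factors(COUNTS)
total_augmented = sum(augmented for _, augmented in COUNTS)
print(sorted(set(factors)), total_augmented)  # [4, 5, 8] 34576
```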
Table 3 Classification index

| | Predicted positive | Predicted negative |
|---|---|---|
| Actual positive | TP | FN |
| Actual negative | FP | TN |
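Accuracy, precision, and recall follow directly from the four counts TP, FP, FN, and TN. The sketch below uses the standard definitions; the macro-averaging helper is an assumption about how per-class values might be combined for a 26-class problem, since this section does not say which averaging the reported figures use.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from the four confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # correct share of predicted positives
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # found share of actual positives
    return accuracy, precision, recall

def macro_average(per_class_counts):
    """Average precision and recall over classes, each treated one-vs-rest."""
    results = [binary_metrics(*counts) for counts in per_class_counts]
    n = len(results)
    precision = sum(r[1] for r in results) / n
    recall = sum(r[2] for r in results) / n
    return precision, recall

# Illustrative counts only, not results from the paper.
acc, prec, rec = binary_metrics(tp=8, fp=2, fn=1, tn=9)
print(round(acc, 4), round(prec, 4), round(rec, 4))  # 0.85 0.8 0.8889
```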
Table 4 Test evaluation results of each model

| Model | Accuracy | Precision | Recall |
|---|---|---|---|
| VGG19 | 91.25% | 91.62% | 91.70% |
| GoogLeNet | 92.66% | 93.36% | 92.33% |
| ResNet50 | 94.88% | 95.04% | 94.95% |
| ResNeXt50 | 95.07% | 95.45% | 95.03% |
| ConvNeXt | 96.60% | 96.62% | 96.73% |
| Proposed model | 98.58% | 98.62% | 98.73% |
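Table 5 Impact of different optimization methods on the ConvNeXt model

| Model | Parameters | Model size | Accuracy | Precision | Recall |
|---|---|---|---|---|---|
| ConvNeXt | 27.80M | 106.25M | 96.60% | 96.62% | 96.73% |
| ECA+ConvNeXt | 27.82M | 106.26M | 97.62% | 97.65% | 97.61% |
| Transfer learning+ConvNeXt | 27.82M | 106.26M | 97.87% | 97.90% | 98.02% |
| Proposed model | 27.84M | 106.27M | 98.58% | 98.62% | 98.73% |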
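The ECA rows refer to efficient channel attention, which gates each channel using a small 1D convolution over the pooled channel descriptors. The following is a dependency-free sketch of that forward pass on a nested-list feature map, not the authors' implementation: `eca_forward` is a hypothetical name, the kernel weights would be learned in a real network, and how the module is wired after each ConvNeXt block is not specified in this section.

```python
import math

def eca_forward(feature_map, kernel):
    """Apply ECA-style channel attention to a [C][H][W] nested list.

    kernel: odd-length list of 1D convolution weights (learned in a real
            network; supplied by the caller in this sketch).
    """
    k = len(kernel)
    assert k % 2 == 1, "kernel size must be odd"
    pad = k // 2
    n_channels = len(feature_map)
    # 1) Global average pooling: one descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # 2) Local cross-channel interaction: zero-padded 1D convolution.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(kernel[j] * padded[c + j] for j in range(k))
        for c in range(n_channels)
    ]
    # 3) Sigmoid gate in (0, 1) for each channel.
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    # 4) Rescale every channel by its gate; the shape is unchanged.
    return [
        [[value * gates[c] for value in row] for row in feature_map[c]]
        for c in range(n_channels)
    ]

# Two 2x2 channels; an all-zero kernel gives a gate of 0.5 for every channel.
fm = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]
print(eca_forward(fm, [0.0, 0.0, 0.0]))
```

The module holds only the `k` kernel weights per instance, and its output has the same shape as its input, so it can be placed after a block without altering the shapes in Table 1.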