Identification of Capsicum frutescens Origin Based on Untargeted Metabolomics and Machine Learning

Affiliation:

(1. School of Biological and Chemical Engineering, Zhejiang University of Science & Technology, Zhejiang Provincial Key Lab for Chem & Bio Processing Technology of Farm Product, Zhejiang Provincial Collaborative Innovation Center of Agricultural Biological Resources Biochemical Manufacturing, Hangzhou 310023; 2. China Biofermentation Industry Association, Beijing 100005)

Fund Project:

Zhejiang Provincial Key Research and Development Program (2017C02009)

    Abstract:

    To investigate quality differences among Chaotian peppers (Capsicum frutescens) from different regions and to enable tracing of their geographical origin, untargeted metabolomics was combined with machine learning to build classification models for pepper origin. A total of 108 Chaotian pepper samples from three regions (Sichuan, Hunan, and Hebei) were studied. Orthogonal partial least squares discriminant analysis (OPLS-DA) was used to compare peppers from the different regions, identifying 14 differential metabolites, including sugars, alcohols, organic acids, and amino acids. Pathway enrichment analysis of the differential metabolites against the Kyoto Encyclopedia of Genes and Genomes (KEGG) showed that three metabolic pathways differed significantly among peppers from different regions: the citric acid cycle, glyoxylate and dicarboxylate metabolism, and aminoacyl-tRNA biosynthesis. The 14 differential metabolites were then screened by Lasso regression, which selected 8 important differential metabolites: D-fructose, citric acid, D-tagatose, D-mannose, quinic acid, malic acid, aspartic acid, and L-threonine. Two classification models, a support vector machine (SVM) and a random forest (RF), were built on these metabolites, with an accuracy of 97% on the test set. The reliability and validity of both models were verified using receiver operating characteristic (ROC) curves. The results provide a theoretical basis for identifying the origin of chili peppers and for developing chili-related products.
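The modelling steps named in the abstract (Lasso screening, SVM and RF classification, ROC-based checking) can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' code: the data are synthetic stand-ins, and the equal class sizes, the 70/30 split, `alpha=0.05`, the RBF kernel, the number of trees, and the helper `macro_ovr_auc` are all assumptions made for the example. Lasso is applied here to one-hot origin indicators, which is one possible way to use it for a three-class problem; the abstract does not say how the study set this up.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Lasso
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, label_binarize
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the study's measurements):
# 108 samples x 14 metabolites, 3 origins; equal class sizes are assumed.
rng = np.random.default_rng(0)
classes = np.array([0, 1, 2])          # e.g. Sichuan, Hunan, Hebei
y = np.repeat(classes, 36)
X = rng.normal(size=(108, 14))
X[:, :8] += rng.normal(scale=2.0, size=(3, 8))[y]  # 8 informative features

# Hold out a stratified test set before any fitting (assumed 70/30 split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Lasso screening: regress one-hot origin indicators on the metabolites and
# keep every metabolite with a non-zero coefficient for at least one origin.
Y_train = label_binarize(y_train, classes=classes)
lasso = Lasso(alpha=0.05).fit(X_train_s, Y_train)
selected = np.flatnonzero(np.any(lasso.coef_ != 0, axis=0))


def macro_ovr_auc(y_true, scores):
    """Average of one-vs-rest ROC AUCs, one per origin (hypothetical helper)."""
    y_bin = label_binarize(y_true, classes=classes)
    aucs = [roc_auc_score(y_bin[:, k], scores[:, k]) for k in range(len(classes))]
    return float(np.mean(aucs))


models = {
    "SVM": SVC(kernel="rbf"),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}
results = {}
for name, model in models.items():
    model.fit(X_train_s[:, selected], y_train)
    pred = model.predict(X_test_s[:, selected])
    # SVC exposes per-class decision scores; RF exposes class probabilities.
    if hasattr(model, "decision_function"):
        scores = model.decision_function(X_test_s[:, selected])
    else:
        scores = model.predict_proba(X_test_s[:, selected])
    results[name] = (accuracy_score(y_test, pred), macro_ovr_auc(y_test, scores))

print("selected feature indices:", selected.tolist())
for name, (acc, auc) in results.items():
    print(f"{name}: accuracy={acc:.2f}, macro OvR AUC={auc:.2f}")
```

Fitting the scaler and the Lasso step on the training split only keeps the test set untouched by feature selection; whether the study screened metabolites before or after splitting is not stated in the abstract.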

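Cite this article

WANG Xiaoyu, WANG Zhenzhen, HU Mengya, DAI Jing, SHA Ruyi, MAO Jianwei. Identification of Capsicum frutescens Origin Based on Untargeted Metabolomics and Machine Learning[J]. Journal of Chinese Institute of Food Science and Technology (中国食品学报), 2025, 25(5): 418-427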

History
  • Received: 2024-05-26
  • Published online: 2025-06-24