INTRODUCTION
Common vetch, also known as the big nest vegetable or wilderness pea, is an annual or perennial species in the legume family and belongs to the genus Vicia [1]. Common vetch is noted for its resilience to cold, drought, and infertile soils. In northwest China, common vetch is widely integrated into wheat fields through intercropping and multiple cropping systems to enhance protein content and pasture yield [2]. As global living standards improve, the demand for ruminant meat rises, driving the need for high-quality forage. Recent domestic and international research has demonstrated that feeding common vetch or blending it with Gramineae grasses can significantly improve the palatability and digestibility of feed and enhance ruminant performance by adjusting rumen microbial communities [3]. As a result, common vetch has become a crucial source of premium forage in animal husbandry [4]. To address feed ingredient shortages, breeders and scientists have focused over recent decades on developing common vetch varieties with enhanced stress tolerance, high yields, disease resistance, and nutritional richness [5,6]. These efforts have facilitated the development of a diverse array of common vetch varieties in the market, which offers producers an expanded selection and allows feed production to be optimized in accordance with regional demands [7].
However, the determination of nutrient content in common vetch still relies on traditional wet chemical analysis, which not only consumes a significant amount of time and resources but also imposes specific requirements on experimental equipment and operators [8]. The absence of a rapid, convenient, and cost-effective method for quality evaluation of common vetch makes it difficult for producers to select the desired varieties accurately and promptly. Therefore, there is an urgent need to develop an efficient, accurate, simple, and affordable method to evaluate the nutritional value of common vetch. Such a method would be of great significance in promoting the rapid development of the common vetch industry.
Near-infrared reflectance spectroscopy (NIRS) is characterized by its simplicity, speed, efficiency, accuracy, and cost-effectiveness, and it eliminates the need for complex pre-treatment and chemical analysis processes. Additionally, it enables simultaneous analysis of multiple chemical indices [9]. In recent years, research has demonstrated the effective use of NIRS combined with chemometrics for the comprehensive analysis of plant chemical composition and for high-throughput phenotyping [10]. However, previous studies primarily selected samples from a single variety of a plant or feed, resulting in minimal variation in chemical indices among the samples. This could hinder the established model from accurately predicting nutritional indices for different varieties within the same species [11]. Consequently, the available literature on the use of NIRS models for predicting nutrient composition and mineral element content in common vetch of different varieties and regions remains limited.
In this context, the objective of this study was to develop a rapid, efficient, and non-destructive quantitative analysis model for nutrients and mineral elements in common vetch from diverse regions and varieties by integrating near-infrared spectroscopy with chemical composition analysis. We hypothesized that through extensive sampling of common vetch from various regions and varieties, a rapid, accurate, and reliable NIRS model could be established to effectively predict its nutritional quality. Such a model is expected to provide a solid theoretical foundation and advanced technical support for the selection of high-quality varieties, and thereby to advance the development of the common vetch-based forage industry.
DISCUSSION
Currently, breeding and genetic research on common vetch has primarily focused on the adaptability and yield of different varieties in various regions, while neglecting the significance of nutritional quality and mineral elements [19]. Research has indicated that common vetch is extensively cultivated worldwide and is rich in nutrients and mineral elements [20]. Rapid screening of the relationship between adaptation and nutritional quality in various regions is therefore increasingly important. Among the promising alternative methods, NIRS is widely employed due to its capability to rapidly and accurately determine target quality components [21]. The establishment of a NIRS prediction model can thus offer a more efficient and convenient approach for quality prediction and adaptation selection of common vetch in diverse regions. The accuracy and reliability of NIRS are widely believed to be primarily determined by the number of samples modeled and the variability of components [22]. To establish a more precise calibration model for quantitative analysis of nutrients and mineral elements in common vetch, we used the average spectrum from 190 different varieties of common vetch for modeling. The selected samples varied in nutritional quality, as reflected in their coefficients of variation, which can be attributed to differences among common vetch varieties. It is therefore essential to incorporate a certain level of variability in sample nutritional quality, not only to establish an accurate and comprehensive NIRS model but also to ensure its applicability for future predictions across various sample types and component contents [23].
In this study, the original spectrum and spectral pattern of common vetch were consistent with previous research findings [24]. The spectrum of common vetch exhibits five prominent absorption peaks across the entire wavelength range, located at approximately 1,450, 1,740, 1,910, 2,120, and 2,350 nm. Previous studies have indicated that C-H and O-H overtones originating from carbohydrates and sugars play a dominant role within the ranges of 1,150 to 1,250, 1,400 to 1,650, and 2,050 to 2,150 nm, as well as at wavelengths of 2,250 and 2,350 nm [25]. Cellulose C-H and O-H bonds exhibit signals at wavelengths around 1,470, 1,780, 1,845, and 2,085 nm. N-H overtone absorption bands are associated with proteins and can be observed at approximately 1,455, 1,555, 1,742.5, 2,087.5, and 2,182.5 nm [26]. The peak of the C-O stretching overtone signal is close to 2,303.75 nm, and the peaks of the O-H bonds in water molecules appear at approximately 1,432.5 and 1,932.5 nm [27].
To enhance the accuracy and reliability of the NIRS calibration model, most studies partition the sample set into a calibration set and a validation set [28]. In this study, the calibration and validation sets were randomly allocated at a ratio of 4:1 based on modified partial least squares (MPLS) scores. Principal component analysis (PCA) revealed that the validation and calibration samples exhibited similar distributions and comparable ranges of reference values, indicating their suitability for establishing and verifying NIRS equations. Subsequently, MPLS regression combined with different wavelength ranges, mathematical treatments, and scatter correction methods generated 30 equations for each parameter. The optimal model was selected by combining R²C and R²CV.
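As an illustration of the 4:1 partition described above, the following minimal Python sketch splits 190 sample identifiers at random into calibration and validation sets. It is a simplification under stated assumptions: the function name, the seed, and the use of plain random shuffling are hypothetical, and the MPLS-score-based allocation used in this study is not reproduced.

```python
import random


def split_calibration_validation(sample_ids, ratio=4, seed=0):
    """Randomly split samples into calibration and validation sets at ratio:1.

    Assumption: plain random shuffling; the study's MPLS-score-based
    allocation is not reproduced here.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_validation = len(ids) // (ratio + 1)
    validation = sorted(ids[:n_validation])
    calibration = sorted(ids[n_validation:])
    return calibration, validation


# 190 samples, matching the number of varieties reported in this study
calibration, validation = split_calibration_validation(range(190))
print(len(calibration), len(validation))
```

With 190 samples, a 4:1 ratio gives 152 calibration and 38 validation samples. In practice, the two sets should also be checked for comparable ranges of reference values, as was done here with PCA.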
R²C values indicate the extent to which the model fits the calibration set. In NIRS analysis, a high R²C value indicates that the model can effectively elucidate the relationship between the absorption peaks of the calibration spectra and the properties of the samples, which enables us to identify the most stable and reliable models [29]. In this study, the R²C values for K, P, Mg, and Fe were relatively low compared with those of the other indicators. This could be attributed to the narrow concentration range of these elements and their insufficient combination with organic compounds such as amino acids, proteins, and carbohydrates in the sample [30]. Additionally, a significant portion of these elements exists as free ions within the sample, which do not absorb in the near-infrared spectral region. Consequently, no absorption peak is generated, which hampers the establishment of a near-infrared prediction model [31]. In addition, modeling elements may require consideration of a variety of chemical forms and states (such as organic carbon versus inorganic carbon); these complex forms and states may challenge the accuracy of the model, so that the modeling of K, P, Mg, and Fe is strongly affected by the complexity of the sample [32]. In this study, the R²C values for DM, CP, NDF, ADF, ash, EE, and OM all exceeded 0.90. This high correlation can be attributed to the derivative and scatter correction treatment of the spectrum, which establishes a precise relationship between the spectral data and the nutrient content of common vetch at specific wavelengths and absorption intensities [33]. Studies have demonstrated that combining derivative treatment with scatter correction markedly increases the R²C value when predicting the nutrient composition of beans, a finding that aligns with the results of this study [34]. In NIRS, R²CV primarily serves to assess the model’s consistency and accuracy concerning spectral features and sample properties across various data subsets derived from the calibration set [35]. In this study, the R²CV values for DM, CP, NDF, ADF, ash, EE, and OM all exceeded 0.90. This high performance can be attributed to the selection of common vetch samples from diverse varieties and regions, ensuring a representative and varied dataset that enhances the model’s generalization capability [36]. Additionally, appropriate preprocessing methods were employed during modeling, and an adequate amount of data was used, which further contributed to the elevated R²CV values. The low R²CV values observed for P, Mg, K, and Fe in this study may be attributed to the relatively weak absorption signals of complex compounds formed by these mineral elements in the near-infrared spectrum. These weak signals are susceptible to interference and noise from other components, making it difficult for the model to establish an accurate quantitative relationship between these elements and the spectral data [37].
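The derivative and scatter correction treatments mentioned above can be illustrated with a short sketch. The example assumes standard normal variate (SNV) as the scatter correction and a simple gap difference as the first derivative. This section does not state which specific treatments were chosen, so both choices, the function names, and the toy absorbance values are illustrative assumptions only.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def first_derivative(spectrum, gap=1):
    """Gap-difference approximation of the first derivative along the wavelength axis."""
    return [spectrum[i + gap] - spectrum[i] for i in range(len(spectrum) - gap)]


# Hypothetical absorbance values; the second spectrum is the first one with an
# additive offset and a multiplicative scatter effect applied.
raw = [0.20, 0.40, 0.90, 0.50, 0.30]
scattered = [2.0 * x + 0.1 for x in raw]

max_difference = max(abs(a - b) for a, b in zip(snv(raw), snv(scattered)))
print(max_difference < 1e-9)        # SNV removes the offset and scaling
print(first_derivative(snv(raw)))   # derivative of the corrected spectrum
```

The point of the example is that two spectra differing only by baseline offset and multiplicative scatter become identical after SNV, so the subsequent regression sees chemical rather than physical variation.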
The validation set was used to assess the predictive capability of the NIRS model for determining nutrient and mineral element content in external common vetch samples. In validation, the RMSEP values for common vetch indicators ranged from 0.01 to 1.87, with CP, NDF, ADF, and ash exhibiting RMSEP values exceeding 1.0 while the other indicators had values below 1.0. The R²P values for P, K, and Mg were 0.78, 0.80, and 0.70, respectively, indicating relatively lower prediction accuracy, whereas the R²P values for the other nutrients and mineral elements ranged from 0.88 to 0.96. R²P is a critical metric for evaluating the accuracy of model predictions on independent test sets. In this study, the R²P values for DM, CP, NDF, ADF, ash, and OM were consistently high. Carbas et al. demonstrated that R²P values are positively correlated with R²C and R²CV when using near-infrared spectroscopy to assess nutritional and anti-nutritional parameters in common legumes, findings which are consistent with our results [38]. Furthermore, the RMSEP values obtained in this study are relatively low, signifying minimal prediction error and superior model accuracy, which contributes significantly to the elevated R²P values observed [39].
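For readers less familiar with these validation statistics, the sketch below shows one common way to compute RMSEP and R²P from reference and predicted values. It assumes R²P is calculated as 1 − SS_res/SS_tot; some software may instead report the squared correlation coefficient, and this section does not specify which was used. The function names and the five data points are hypothetical.

```python
from math import sqrt
from statistics import mean


def rmsep(reference, predicted):
    """Root mean square error of prediction on the validation set."""
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(mean(squared_errors))


def r2_prediction(reference, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    ref_mean = mean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - ref_mean) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Hypothetical crude protein values (% DM): wet chemistry vs. NIRS prediction
reference = [20.0, 22.0, 24.0, 26.0, 28.0]
predicted = [20.5, 21.5, 24.0, 26.5, 27.5]

rmsep_value = rmsep(reference, predicted)
r2p_value = r2_prediction(reference, predicted)
print(round(rmsep_value, 4), round(r2p_value, 4))
```

Because SS_res equals the number of samples times RMSEP squared, a lower RMSEP for the same spread of reference values necessarily gives a higher R²P under this definition.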
In terms of RPD, the values for P, K, and Mg are all below 2.5, whereas the RPD values of the other indicators exceed 2.5. When RPD≥2.5, the predictions of the constructed model are accurate enough for practical quantitative analysis. When 2.0≤RPD<2.5, the prediction accuracy of the constructed model is relatively lower and the model is suitable for preliminary screening only. If RPD<2.0, the effectiveness of the constructed model is significantly reduced and it cannot be applied in actual production practice [40]. In this study, the prediction models for P, K, and Mg performed relatively poorly, possibly due to the limited variability in their contents, weak correlation with certain major elements in the sample, inadequate sample size and range of variation, and alterations in their chemical bonds caused by fluctuations in temperature and humidity during sample storage [41]. It is plausible that the absorption characteristics of P, K, and Mg in the near-infrared spectral region are either weak or significantly overlap with the spectral signatures of other components [32]. This complicates the accurate extraction of characteristic information for these elements from the spectrum, thereby reducing the prediction accuracy of the model and leading to a lower RPD value [42]. In this study, the RPD value for CP exceeds 4.0, indicating a high level of prediction accuracy [18]. Protein is rich in N-H groups, which exhibit distinct characteristic absorption peaks in the near-infrared (NIR) spectral region. The position, intensity, and shape of these absorption peaks are closely associated with the chemical environment of the N-H groups and demonstrate a strong linear correlation with protein content [43]. By effectively capturing these features, the model establishes a robust correlation between spectral information and protein content, thereby minimizing the deviation between predicted and actual values and significantly enhancing the RPD value. In this study, the RPD values of NDF, ADF, and OM all exceeded 3. This can be attributed to the large number of C-H and O-H functional groups in these nutrients, which enable the model to effectively capture their characteristics and accurately correlate the spectral features with the nutrient contents, thereby reducing the deviation between predicted and actual values and increasing the RPD value [44]. Furthermore, the modeling samples encompass a broad spectrum of sources and conditions, enabling the model to capture the spectral characteristics and content relationships of NDF, ADF, and OM across diverse scenarios [45]. For instance, the samples include common vetch grown under different environmental conditions, which enhances the model’s adaptability to variations in NDF, ADF, and OM contents and improves prediction accuracy, which in turn increases the RPD value. In contrast, the RPD value for ash is relatively low, primarily because crude ash does not occur in isolation within actual samples. The surrounding organic components, such as proteins and carbohydrates, interfere with the weak spectral signals of mineral elements in ash [46]. This interference makes it challenging for the model to extract relevant spectral features associated with ash, thereby affecting the accurate prediction of ash content and consequently reducing the RPD value [47].
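The RPD interpretation used in this paragraph can be made concrete with a small sketch. It assumes the usual definition of RPD as the standard deviation of the validation reference values divided by RMSEP, and it applies the 2.5 and 2.0 thresholds quoted above. Assigning the boundary values 2.5 and 2.0 to the higher class is an assumption made for this example, as are the function names and the numbers.

```python
from statistics import stdev


def rpd(reference, rmsep_value):
    """Ratio of performance to deviation: SD of reference values / RMSEP."""
    return stdev(reference) / rmsep_value


def interpret_rpd(value):
    """Map an RPD value to the usage classes described in the text.

    Assumption: the boundary values 2.5 and 2.0 fall into the higher class.
    """
    if value >= 2.5:
        return "quantitative analysis"
    if value >= 2.0:
        return "preliminary screening"
    return "not suitable for practical use"


# Hypothetical validation reference values and RMSEP
reference = [20.0, 22.0, 24.0, 26.0, 28.0]
rpd_value = rpd(reference, 0.5)
print(round(rpd_value, 2), interpret_rpd(rpd_value))
```

Since RPD scales RMSEP by the spread of the reference data, a constituent with a narrow concentration range can have a small RMSEP and still a low RPD, which is in line with the explanation offered here for P, K, and Mg.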
The accuracy of the models for P, K, and Mg contents is sufficient for screening purposes, while the models for the other parameters predict their values accurately and can be applied in actual production. Some studies have evaluated the modeling efficacy of mineral element content in common raw beans. The findings indicate that the predictive performance for certain mineral elements is suboptimal, allowing only for preliminary screening [48]. This discrepancy may be attributed to interactions between mineral elements and nutrient components, leading to overlapping absorption peaks of different nutrients in the near-infrared spectral region [49]. Additionally, the formation of complexes between some mineral elements (such as Mg) and organic acids alters the original spectral characteristics of these elements, causing them to deviate from the typical absorption patterns observed in near-infrared spectra. Consequently, the overall prediction accuracy of the model is diminished [50]. The differences in results observed in this study may also be caused by the above factors. Numerous studies have demonstrated the immense potential of NIR in screening quality traits of diverse plants [51,52]. This approach will also facilitate future cultivation of different varieties of common vetch. In future research, near-infrared spectroscopy could be employed to predict genetic data of common vetch for enhanced screening of high-quality varieties.
The NIR model established in this study can be used to screen high-quality varieties of common vetch, including those with elevated protein and fiber contents, to facilitate the cultivation of mixed forage. This investigation not only presents a rapid, convenient, cost-effective, and accurate screening method for the efficient application of common vetch but also offers a robust strategy for its future large-scale cultivation and utilization.