INTRODUCTION
Over the long history of cattle breeding in Guizhou, a diverse range of local cattle breeds has developed, with Guanling cattle being a prominent example. These cattle primarily inhabit the vast mountainous regions of the Panjiang River basin, which spans Guizhou, Yunnan, and northern Guangxi, with Guanling County being their best-known habitat [1]. Guanling cattle are a superior local breed in Guizhou Province and are among the 78 nationally protected local livestock breeds. They are highly regarded for their culinary, economic, and developmental potential. Previous research has employed biochemical and molecular techniques to analyze the genetic characteristics of the Guanling breed [2]. These studies have revealed significant genetic variation and growth potential. However, the development and utilization of the breed have been limited by the introduction of foreign breeds, crossbreeding for improvement, environmental degradation, and various natural factors, which have led to a decline in its population.
Advances in sequencing technologies have greatly facilitated comprehensive and in-depth genome analysis. The bovine genome sequencing and HapMap projects have revealed a substantial number of genetic variations [3], with single nucleotide polymorphisms (SNPs) being the most extensively studied type of variant. SNPs serve as valuable tools for identifying genomic regions through association analysis because they are in linkage disequilibrium with quantitative trait loci influencing target traits, a phenomenon observed in various animal species. For instance, Eck et al [4] identified 2.4 million SNPs in Holstein cattle using the Illumina HiSeq platform. Stothard et al [5] employed SOLiD technology to map genomic variations between Black Angus and Holstein bulls, identifying approximately 7 million SNPs and 790 copy number variations. Using SNPs as selection criteria for meat traits in marker-assisted selection can significantly enhance cattle selection and breeding programs.
Recent studies on bovine genomic variation have been extensive; however, whole-genome studies on Guanling cattle have yet to be reported. Here, we resequenced Guanling cattle and compared them with the Limousin, Simmental, and Angus breeds to reveal their genomic characteristics and variations. The primary objective of this study was to characterize the genetic traits of Guanling cattle, pinpoint genes advantageous for muscle growth, and provide essential genomic data to support further analysis of the genetic mechanisms tied to economic traits and the preservation of cattle breed genetic diversity.
MATERIALS AND METHODS
Ethics approval
All animal experiments in the study were reviewed and approved by the Subcommittee of Experimental Animal Ethics of Guizhou Academy of Agricultural Sciences.
Animal samples
Blood samples were collected from 58 Guanling cattle in the central production area in Guanling County. To minimize relatedness among individuals, adult bulls were randomly selected from farmers in different areas. Three of them underwent whole-genome resequencing, while the remaining 55 cattle were analyzed via high-multiplex polymerase chain reaction (PCR) technology. Blood samples from one Simmental, one Angus, and one Limousin animal (Guizhou Breeding Bull Station, Qianxi, Guizhou, China) were also collected for whole-genome resequencing as experimental controls.
DNA library construction and sequencing
Genomic DNA was extracted from blood samples using the Blood Genomic DNA Extraction Kit (Tiangen Biotech Co., Ltd., Beijing, China) following the manufacturer’s instructions, and its quality was inspected with a Qubit 2.0 and 0.8% agarose gel electrophoresis. Only DNA that passed quality inspection was used for the subsequent sequencing experiments. The genomic DNA was randomly fragmented with a Covaris S2, DNA fragments of ~300 bp were recovered by electrophoresis, and adapters were added. Clusters were generated on the cBot according to the procedure in the cBot User Guide, sequencing reagents were prepared according to the Illumina User Guide, and the flow cell carrying the clusters was loaded onto the Illumina HiSeq sequencer. The paired-end procedure was selected for sequencing, which was controlled by the data collection software provided by Illumina and used for real-time data analysis. The paired-end sequencing length was 200 bp to give the final sequencing data. The DNA libraries were sequenced by Shanghai Biotechnology Corporation (Shanghai, China).
Quality control and data filtering
To ensure data quality, the raw sequencing data went through quality control and data filtering. The quality of the raw data (three Guanling cattle and one each of Simmental, Angus, and Limousin cattle) was assessed by analyzing the base composition and base quality values (Table 1). Raw reads containing adapters and low-quality reads were filtered with fastp [6] to obtain high-quality clean data, and all subsequent analysis was based on the clean data. Filtering consisted of removing paired reads containing adapters; removing paired reads in which N bases (N indicates uncertain base information) exceeded 10%; and removing paired reads in which low-quality bases (quality value Q 7) exceeded 30% of the read.
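The N-content and base-quality rules above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not how fastp implements them: the function names are hypothetical, adapter removal is not covered, and treating "low quality" as Q at or below 7 is an assumption because the comparison operator is not legible in the text.

```python
from typing import Sequence, Tuple

MAX_N_FRACTION = 0.10            # from the text: more than 10% N bases
LOW_QUALITY_THRESHOLD = 7        # Q value from the text; "<=" is an assumption
MAX_LOW_QUALITY_FRACTION = 0.30  # from the text: more than 30% low-quality bases

Read = Tuple[str, Sequence[int]]  # (bases, Phred quality scores)


def read_fails(seq: str, quals: Sequence[int]) -> bool:
    """Return True if one read breaks the N-content or base-quality rule."""
    if not seq:
        return True
    n_fraction = seq.upper().count("N") / len(seq)
    low_fraction = sum(q <= LOW_QUALITY_THRESHOLD for q in quals) / len(seq)
    return n_fraction > MAX_N_FRACTION or low_fraction > MAX_LOW_QUALITY_FRACTION


def keep_pair(read1: Read, read2: Read) -> bool:
    """Keep a pair only if both mates pass, since reads are removed as pairs."""
    return not (read_fails(*read1) or read_fails(*read2))
```

A pair is dropped as soon as either mate fails, which mirrors the description of removing paired reads rather than single reads.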
Sequencing data alignment
Clean reads from each sample were aligned to the bovine reference genome (GCA_002263795.2) using the Burrows–Wheeler Aligner (BWA) 0.7.13 [7] with the following parameters: “mem -t 4 -k 32 -M”. In this context, “-t” sets the number of threads, “-k” denotes the minimum seed length, and “-M” marks shorter split alignments as secondary. Sorting and deduplication were carried out using the SAMtools and Picard toolkits. SNP/indel detection was conducted using the Genome Analysis Toolkit (GATK) HaplotypeCaller [8].
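As a rough sketch of how the alignment, sorting, deduplication, and calling steps might be chained, the function below only builds the command lines and does not run them. Several things here are assumptions rather than details from the text: the quoted BWA parameter string is read as "-t 4 -k 32 -M", the read-group string, file names, and the `picard` wrapper invocation are illustrative, and the function name is hypothetical.

```python
from typing import List


def build_commands(ref: str, fq1: str, fq2: str, sample: str) -> List[List[str]]:
    """Return argument lists for alignment, sorting, deduplication and calling."""
    # Read group is an assumption; HaplotypeCaller needs one, the text gives none.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    return [
        # bwa mem writes SAM to stdout; redirect it to `sam` when executing.
        ["bwa", "mem", "-t", "4", "-k", "32", "-M", "-R", read_group, ref, fq1, fq2],
        ["samtools", "sort", "-o", sorted_bam, sam],
        ["picard", "MarkDuplicates", f"I={sorted_bam}", f"O={dedup_bam}",
         f"M={sample}.metrics.txt"],
        ["samtools", "index", dedup_bam],
        ["gatk", "HaplotypeCaller", "-R", ref, "-I", dedup_bam,
         "-O", f"{sample}.vcf.gz"],
    ]
```

Returning argument lists keeps the sketch free of side effects; each list could be passed to `subprocess.run`, with the first command's standard output redirected to the SAM file.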
After SNP information was obtained from the samples, genotypes that were polymorphic relative to the reference sequence were filtered using GATK’s variant filtration method with the following criteria: -Window 4, -filter “QD < 2.0 || FS > 60.0 || MQ < 40.0”, -G_filter “GQ < 20”. To establish a high-confidence SNP/indel dataset, the identified SNPs and indels were recorded in variant call format and cross-referenced with the dbSNP database to identify novel variants. Finally, the snpEff tool was used for mutation annotation and statistical analysis.
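The filter expressions quoted above can be read as simple predicates on each variant record. The sketch below applies the same thresholds to annotation values held in a dictionary; it is not GATK code, the function names are hypothetical, and it assumes that QD, FS, MQ, and GQ are all present for every record.

```python
from typing import Mapping


def fails_site_filter(info: Mapping[str, float]) -> bool:
    """Site-level expression from the text: QD < 2.0 || FS > 60.0 || MQ < 40.0."""
    return info["QD"] < 2.0 or info["FS"] > 60.0 or info["MQ"] < 40.0


def fails_genotype_filter(gq: float) -> bool:
    """Genotype-level expression from the text: GQ < 20."""
    return gq < 20


def passes_filters(info: Mapping[str, float], gq: float) -> bool:
    """A call is retained only if neither expression flags it."""
    return not fails_site_filter(info) and not fails_genotype_filter(gq)
```

Because the site-level terms are joined with a logical OR, a single failing annotation is enough to flag the site.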
Functional gene enrichment and single nucleotide polymorphism selection
For the purpose of this study, samples G26, G27, and G28 (Guanling cattle) were collectively referred to as Group 1, while AG, LM, and XM (Angus, Limousin, and Simmental cattle, respectively) constituted Group 2. Mutations shared by the two groups and mutations specific to each group were categorized separately. Subsequently, the gene loci associated with these mutations were subjected to gene ontology (GO) enrichment and pathway enrichment analysis via the DAVID database (https://david.ncifcrf.gov/).
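The split into shared and group-specific mutations can be pictured with set operations. In this minimal sketch each sample is a set of (chromosome, position, reference allele, alternate allele) tuples; defining a group's variants as the union over its samples is an assumption, and the function name and toy coordinates used with it are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


def partition_variants(
    group1_samples: Iterable[Set[Variant]],
    group2_samples: Iterable[Set[Variant]],
) -> Dict[str, Set[Variant]]:
    """Split variants into those shared by both groups and those unique to one."""
    group1 = set().union(*group1_samples)  # assumption: union over samples
    group2 = set().union(*group2_samples)
    return {
        "shared": group1 & group2,
        "group1_only": group1 - group2,
        "group2_only": group2 - group1,
    }
```

The gene loci behind each of the three resulting sets could then be submitted to an enrichment tool separately.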
Following enrichment analysis, SNP selection was initiated. First, group-specific SNP sites with a consistent homozygous genotype across the three samples of each group were chosen. Second, functional variant SNPs, including nonsynonymous mutations, premature terminations, intron-exon splice sites, and early starts, were selected. Third, variants with a significant impact on the proteins encoded by the enriched genes were chosen, excluding those located on sex chromosomes. In a fourth step, the group-specific SNPs obtained in the initial selection were filtered through a literature review to identify genes associated with meat quality traits [9–12]. Functional SNPs were then selected from those obtained in the fourth step, again excluding variants on sex chromosomes. Finally, SNPs with high impact and associations with meat quality traits were evaluated for technical feasibility using PCR-based Sanger sequencing.
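The literature-based branch of this selection can be sketched as a chain of filters. The version below is one possible reading, not the authors' procedure: it assumes VCF-style genotype strings, treats a site as group-specific when each group is fixed for a different homozygous allele, and maps the functional classes to snpEff-style effect terms. The class, function names, and example records are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple

# Assumed mapping of the functional classes in the text to snpEff effect terms.
FUNCTIONAL_EFFECTS = {
    "missense_variant",
    "stop_gained",
    "splice_acceptor_variant",
    "splice_donor_variant",
    "5_prime_UTR_premature_start_codon_gain_variant",
}
SEX_CHROMOSOMES = {"X", "Y"}


@dataclass(frozen=True)
class Snp:
    chrom: str
    pos: int
    gene: str
    effect: str
    group1: Tuple[str, str, str]  # genotypes of the three Group 1 samples
    group2: Tuple[str, str, str]  # genotypes of the three Group 2 samples


def homozygous_allele(genotype: str) -> Optional[str]:
    """Return the allele of a homozygous call such as '1/1', else None."""
    parts = genotype.replace("|", "/").split("/")
    if len(parts) != 2 or parts[0] != parts[1] or parts[0] == ".":
        return None
    return parts[0]


def group_allele(genotypes: Iterable[str]) -> Optional[str]:
    """Return the shared allele if every sample is homozygous for it."""
    alleles = {homozygous_allele(gt) for gt in genotypes}
    return alleles.pop() if len(alleles) == 1 and None not in alleles else None


def select_snps(snps: Iterable[Snp], candidate_genes: Set[str]) -> List[Snp]:
    """Apply the group-specific, functional, autosomal and gene-list filters."""
    chosen = []
    for snp in snps:
        a1, a2 = group_allele(snp.group1), group_allele(snp.group2)
        if a1 is None or a2 is None or a1 == a2:
            continue  # not consistently homozygous, or not group-specific
        if snp.effect not in FUNCTIONAL_EFFECTS:
            continue
        if snp.chrom in SEX_CHROMOSOMES:
            continue
        if snp.gene not in candidate_genes:
            continue  # gene not on the literature-derived list
        chosen.append(snp)
    return chosen
```

Ordering the checks from cheapest to most specific makes no difference to the result, since a SNP must satisfy all of them to be retained.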
Single nucleotide polymorphism detection
High-multiplex PCR technology was employed to detect high-quality SNP sites within a population of 55 Guanling cattle. Specific capture primers were designed, and the detection process was conducted by Shanghai Biowing Applied Biotechnology Co., Ltd. (Shanghai, China).
DISCUSSION
Because the Guanling cattle population is small, it is particularly important to select individuals that represent the breed for sequencing. To limit the influence of individual differences, more individuals were included at low cost to reflect the population genetic diversity of the breed. Therefore, three Guanling cattle samples were used for resequencing, and 55 Guanling cattle samples were used for further SNP detection. We obtained an average of 26 Gb of raw data, with 99.5% of reads aligned to the reference genome and high single-base accuracy, similar to previous sequencing results for ordinary cattle [4,5]. Given the high sequencing depth, the detected variants are credible [15].
Here, we found 598,688 SNPs and 42,423 indels across the 29 autosomes and the X chromosome. Of the total SNPs, 574,082 (95.89%) were heterozygous and 24,606 (4.11%) were homozygous, giving a heterozygous/homozygous ratio of 23.33, considerably higher than that of Japanese cattle (1.24) [16] and Korean cattle (1.63) [17]. From a sequencing perspective, homozygous and heterozygous sites exhibit distinct characteristics. In mixed samples, a homozygous site shows a single base type, whereas a heterozygous SNP indicates multiple base types at the same site across all mixed samples. Guanling cattle display a low proportion of homozygous SNPs, suggesting high heterozygosity. This observation may reflect sequence variation among individuals and could also be attributed to a low level of selective breeding. Additionally, there appears to be increased gene flow between Guanling cattle and other breeds, potentially resulting in the loss of specific functional genes. Protecting the genetic diversity of Guanling cattle is therefore of paramount importance. Comparison of the SNPs and indels with the database identified 13,769 new SNPs and 2,206 new indels, accounting for 2.3% of the total SNPs and 5.2% of the total indels, respectively. With the development of genome sequencing in recent years, more SNPs and indels have been catalogued and the database has become increasingly complete. As a result, the proportion of variants matching the database has risen markedly, and the number of new discoveries is gradually falling. Most indels were short: deletions ranged from 1 to 29 kb and insertions from 1 to 44 kb, but the lengths of deletions and insertions were concentrated in the range of 1 to 10 bp, with 1 to 3 bp being the most common, and similar patterns are observed in human genome data [18]. In the Guanling cattle data, nearly 85.6% of insertions and 77.9% of deletions were shorter than 5 kb. The numbers of SNPs and indels detected on the 29 autosomes were proportional to chromosome length. As expected, the X chromosome had the lowest mutation rate (4.61%); in small-population studies, the X chromosome has likewise shown a lower mutation rate than the autosomes [19].
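The proportions quoted in this paragraph follow directly from the reported counts, as the short check below shows. The counts are taken from the text; the function and key names are only illustrative.

```python
from typing import Dict

TOTAL_SNPS = 598_688
HETEROZYGOUS_SNPS = 574_082
HOMOZYGOUS_SNPS = 24_606
NOVEL_SNPS = 13_769
TOTAL_INDELS = 42_423
NOVEL_INDELS = 2_206


def summarize() -> Dict[str, float]:
    """Recompute the percentages and the het/hom ratio from the raw counts."""
    return {
        "het_pct": round(100 * HETEROZYGOUS_SNPS / TOTAL_SNPS, 2),
        "hom_pct": round(100 * HOMOZYGOUS_SNPS / TOTAL_SNPS, 2),
        "het_hom_ratio": round(HETEROZYGOUS_SNPS / HOMOZYGOUS_SNPS, 2),
        "novel_snp_pct": round(100 * NOVEL_SNPS / TOTAL_SNPS, 1),
        "novel_indel_pct": round(100 * NOVEL_INDELS / TOTAL_INDELS, 1),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

The heterozygous and homozygous counts also sum exactly to the reported total, so the figures are internally consistent.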
Guanling cattle have mainly been used as draft animals for cultivating land and have gradually been developed toward meat production. The LEPR gene is located on bovine chromosome 3 and encodes a protein belonging to the cytokine receptor gp130 family. It plays a crucial role in regulating fat metabolism and is involved in a novel hematopoietic pathway essential for normal lymphocyte production [20]. Raza et al [21] reported that three SNPs of the LEPR gene were associated with backfat thickness and intramuscular fat content in Chinese beef cattle breeds. Moreover, the expression level of LEPR differed significantly in the longissimus dorsi muscle between Yunling and Simmental cattle [22]. On bovine chromosome 4, the AKAP9 gene, a member of the AKAP family, encodes a kinase anchoring protein. The findings of Raza et al [10] suggest that bovine AKAP9 may be involved in regulating fat formation, growth traits, differentiation of adipose tissue, regeneration of skeletal muscle, and metabolism.
Located on bovine chromosome 10, the SIX4 gene, which is part of the Sine Oculis/Six gene family, plays a crucial role in skeletal muscle development. Genetic variations or deletions in SIX4 can have implications for pituitary function [23]. Wang et al [24] found significant correlations between three SNPs in the SIX4 gene of Qinchuan cattle and body measurement traits, suggesting that SIX4 is a candidate gene influencing cattle body size traits. Huang et al [25] demonstrated that SPIDR regulates the assembly or stability of RAD51/DMC1, vital recombination factors in mammalian meiotic recombination, on single-stranded DNA. Zhang et al [26] demonstrated significant effects of the SPIDR gene on growth parameters and carcass traits in 1,173 Chinese Simmental beef cattle.
On bovine chromosome 16, the PRG4 gene encodes a proteoglycan-like glycoprotein synthesized by various tissues, including joint cartilage, the meniscus, the synovial lining, and tendon cells [27]. Abubacker et al [28] emphasized the importance of the intermolecular disulfide-bonded polymers of PRG4 for its adsorption onto cartilage surfaces and its function as a boundary lubricant. These findings suggest that PRG4 may play a role in reducing friction in the tissues and spaces of bovine knee joints, contributing to effective weight-bearing systems, particularly during weight gain in Guanling cattle.
The fatty acid synthase (FASN) gene encodes a multifunctional protein primarily responsible for catalyzing the synthesis of palmitic acid from acetyl-CoA and malonyl-CoA in the presence of nicotinamide adenine dinucleotide phosphate (NADPH), resulting in the formation of long-chain saturated fatty acids [29]. Previous studies have identified FASN as a pivotal candidate gene influencing the composition of fat in both milk and meat [30,31]. Chu et al [32] observed significant variation in intramuscular fat content among Datong yaks with different FASN genotypes: individuals with the HH and HG genotypes exhibited notably greater intramuscular fat content than those with the GG genotype [32]. FASN mRNA expression levels in subcutaneous fat and abdominal fat in Yan yellow cattle were significantly higher than those in Yanbian yellow cattle [33]. Genome-wide association analysis revealed that the g.841G>C SNP of the FASN gene was significantly associated with the percentages of C14:0, C14:1, C16:1, and C18:1 at the 5% genome-wide significance level in Japanese Black cattle [34]. In our study, we screened seven SNP sites spanning six genes within the Guanling cattle population. The dominant alleles in this population were found to be A at rs43347904 within the LEPR gene, A at rs133120166, T at rs109170670 within the SIX4 gene, C at rs208094969 within the SPIDR gene, and A at rs715140536 within the FASN gene. However, the biological functions of these mutation sites require further investigation.
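A dominant allele at a site can be determined by counting alleles across the genotyped animals. The sketch below assumes each genotype is written as a two-letter string such as "AG"; the function names are hypothetical, and any genotype counts used with it are invented for illustration rather than taken from the 55 cattle in this study.

```python
from collections import Counter
from typing import Dict, Sequence


def allele_frequencies(genotypes: Sequence[str]) -> Dict[str, float]:
    """Count every allele in diploid genotype strings and return frequencies."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: count / total for allele, count in counts.items()}


def major_allele(genotypes: Sequence[str]) -> str:
    """Return the allele with the highest frequency at one site."""
    frequencies = allele_frequencies(genotypes)
    return max(frequencies, key=frequencies.get)
```

For example, a made-up site with 30 "AA", 20 "AG", and 5 "GG" animals carries 80 A alleles out of 110, so `major_allele` would return "A" for it.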