DISCUSSION
The GWAS is performed to screen high-density molecular markers (most of which are SNPs) at the genome-wide level, use statistical methods to perform association analysis between the obtained information about molecular markers and phenotypic traits, combine the results with the genetics theory to screen and validate genetic variations related to targeted traits, and then select the candidate genes relevant to targeted traits. Compared with the traditional candidate gene method, the GWAS method can locate the mutation regions more accurately and has become a research hotspot in the field of genetics and breeding at home and abroad. Therefore, GWAS has become an effective method for identifying candidate genes relevant to the complex traits of animals and plants as well as human diseases. Among the various traits of livestock, the reproductive trait has significant economic benefits but involves extremely complex genetic mechanisms and greater heritability, and the phenotype of the reproductive trait could be revealed only after birth. Hence, genetic advances could be achieved very slowly through traditional breeding methods. Even for livestock such as pigs with great reproductive ability, slower improvements have been observed in their reproductive traits. By contrast, the GWAS method enables the QTL regions and candidate genes to be localized more accurately. It can also select genetic markers relevant to breeding and reproduction more precisely and enable its direct application in breeding, thus accelerating the breeding process. As for singleton animals of cows, sheep, and goats, the GWAS method could be used to select and localize candidate genes or genome regions associated with reproductive traits. This is more remarkable for applying genomic selection breeding techniques and enhancing the reproduction efficiency of the corresponding species.
In the GWAS of 200 ewes of Qianhua mutton merino, 151 SNP loci exhibiting significance at the chromosome level were identified. Among them, 3 showed significance at the genome-wide level, 2 of which were located on chromosomes 8 and 14. Additionally, the annotation results revealed that two loci were located on the SYNE1 and SLC12A4 genes, which are genes associated with birth weight. One locus was located on chromosome 27, and the annotation results revealed that it was located on LOC101121916 (related to weaning weight). Based on these analytical results, we can preliminarily deduce that these genes are vital candidate genes influencing the reproductive traits of Qianhua mutton merino.
Spectrin repeat-containing nuclear envelope protein (
SYNE) 1, also called the
Nesprin-1 gene, encodes a spectrin repeat sequence that encodes proteins expressed in the skeletal muscle, smooth muscle, and peripheral blood lymphocytes. The protein is expressed and localized in the karyotheca, Golgi apparatus, and cytoskeleton, including the central nervous system [
8]. This protein is characterized by multiple repeat sequences of various spectrins highly expressed in the striated muscle [
9]. It participates in the processes of cell proliferation and division (including cytokinesis and localization of the nucleus), and the related pathways include meiosis, cell cycle, and mitosis. GO annotations related to this gene are involved in binding with nucleotides, and
SYNE2 is an important paralogue of this gene. The most prominent characteristic of SYNE1 is its size, and it encodes a protein comprising approximately 8,797 amino acid residues (41,000 kDa). The encoded protein contains two actin-binding domains at the N-terminal, which consist of a tandem pair of calmodulin homologous domains, a transmembrane domain, several spectrin repeat sequences, and a C-terminal Klarsicht structural domain (KASH). Syne-1 is involved in the anchoring of the special muscle nucleus at the neuromuscular junction (NMJ) [
10]. Higher vertebrates may have additional nuclear migration compensation mechanisms. Noteworthy, the muscle nucleus of muscle tissues from affected individuals exhibited abnormal localization at the NMJ. However, clinically or electro-physiologically detectable muscle phenotypes were not detected in the affected individuals. Abnormal aggregation of the postsynaptic nucleus detected in affected individuals presented no critical effects on the maturity or maintenance of the NMJs. All members of the spectrin family, including spectrin, dystrophin, and utrophin, seem to share a common function: they enable linkages between the plasmalemma and actin cytoskeleton. The GWAS findings in this study showed that SYNE1 was significantly correlated (at the genome-wide level) with the trait of birth weight in Qianhua mutton merino. Nevertheless, the specific functioning mechanisms remain to be confirmed in subsequent studies.
Solute carrier family 12 (
SLC12) member 4, also called potassium/chloride transporter protein, is an important ion transporter protein–encoding gene. It also encodes proteins that mediate the coupled movement of potassium and chloride ions across the plasmalemma. Members of the SLC family have shown modulatory effects on exocytosis. Nonetheless, the functions of the SLC12 family have not been ascertained [
11]. Hanzawa et al [
12] reported the localization of SLC12A4 and SLC7A10 homologs in horses, features of molecular polymorphisms, and relationships between these genes as well as
SLC7A9 and variability of erythrocyte osmotic fragility in horses. However, only few studies have focused on SLC12A4 in sheep.
Bone morphogenetic protein 2 inducible kinase (
BMP2K), that is, BMP2-induced kinase, is protein-encoding gene. Bone morphogenetic proteins (BMPs) are a type of secretory protein participating in the regulation of organisms’ functions such as cell proliferation and differentiation, apoptosis, morphogenesis, and organofaction. Among these functions, BMP2 is closely related to bone growth and differentiation [
13]. The
BMP2 gene also promotes the differentiation of preadipocytes in sheep, thereby participating in fat deposition in sheep tail and further influencing the development of sheep tail shape [
14]. In addition to cell cycle modulation, BMP2K may participate in the differentiation of megakaryocytes through other modulating mechanisms.
Inner mitochondrial membrane peptidase subunit 2 (IMMP2L) is a type of mitochondrial protein located in the mitochondria. This protein is essential for the catalytic activity of the mitochondrial intimal peptidase (IMP) complexes. It can cleave the signal peptide sequence of mitochondrial cytochrome C1 (cyc1) and glycerophosphate dehydrogenase 2 (GDP2). Mutations in IMMP2L can increase the risk of cerebral injury induced by transient cerebral ischemia, inhibit and decrease cellular activity, promote ROS generation, and reduce mitochondrial membrane potential [
15]. Thus, it could lead to a decrease in food intake and weight loss in animals [
16]. KEGG annotations related to this gene include ovarian follicle development, female gamete generation, ovulation, and so on.
Zinc finger protein 326 (
ZNF326), a member of the protein family AKAP95, is a gene encoding zinc finger transcription factor proteins. It is a new partner for interactions and substrate of the PRMT5/WDR77 nuclear complexes. ZNF326 interacts with DBC1 and promotes the expression of matrix metalloproteins (MMPs), cyclins, and factors associated with epithelial-mesenchymal transition (EMT) [
17].
Sortilin-related receptor 1 (SORL1), also named LR11 or SORLA, is a membrane-bound protein. The encodes heterozygous receptors with several domains involved in endosomal intracellular sorting and transport of the proteins to the corresponding subcellular compartments, mainly located in endosomes and Golgi apparatus [
18]. The encoded proteins include repeat sequences of fibronectin (type III) and epidermal growth factors. The proprotein is converted to mature receptors after protein hydrolysis, and these mature receptors may play a role in endocytosis and sorting. It is mainly located in the lysosomal system of cells, trans-Golgi network, and mitochondria-associated membrane [
19] of the middle endoplasmic reticulum. It is also expressed in other tissues, such as testis, ovary, thyroid, and lymph nodes [
20].
Acetyl-CoA carboxylase alpha (ACACA) mainly converts acetyl-CoA into malonyl-CoA through carboxylation and acts as a rate-limiting enzyme for the synthesis of fatty acids. Therefore, it is usually used as a marker to evaluate the differentiation degree of adipocytes of intramuscular fat (IMF) in animals and candidate genes [
21]. ACACA participates in the development of fat tissues and the metabolism of lipids [
22]. Based on this theory, it has been used to study whether it could be used as molecular markers for traits of pork quality [
23].
LASP1 (LIM and SH3 protein 1) is a scaffolding protein that mediates cell migration, proliferation, and survival in several human breast cancer cell lines [
24]. This gene encodes members in the protein subfamily of LIM characterized by the sequence motif of LIM and the structural domain of Src homologous region 3. It is also a member of the actin-binding protein family, and the encoded proteins are dependent on the signaling proteins of cAMP and cGMP. The elongated regions of the encoded proteins are combined with the actin cytoskeleton of the cell membrane. In human macrophages, LASP1 is associated with cyclic-F-actin in corpuscles, which promotes protein hydrolysis in the extracellular matrix [
25]. At present, studies on LASP1 have focused mostly on human cancers, and the underlying mechanisms remain unclear.
Secretagogin (SCGN) is a calcium-binding protein found in the cytoplasm, and this protein is involved in KClstimulated calcium flux, cell proliferation, and insulin release from the pancreas. It is expressed in the interneurons of the glomerulus and granular cell layers, thyroid, gastrointestinal tract, adrenal medulla, paranephros, and brain [
26]. The exact mechanisms for the effects of CGN on insulin release have not been known until recent years. It involves SGN-interacting proteins, which are actin-binding proteins participating in the transport and exocytosis of insulin granules [
27] or modulating actin cytoskeleton by promoting the transport of vesicles toward the periphery during insulin release [
28].
A disintegrin and metalloprotease 12 (ADAM12) is a member of the multi-domain metalloproteinases-bisintegrin family. It has characteristics of cell binding and metalloproteinases and participates in several biological processes related to the cell-cell and cell-extracellular matrix interactions (including fertilization, muscle development, and neurogenesis). Diseases associated with ADAM12 include lung cancer and ectopic gestation, and the related pathways include RET signal transduction and G-protein–coupled receptor signal transduction. During the early stage of mammalian development, ADAM12 mRNA expression is significant in mesenchymal cells, while mesenchymal cells would develop into skeletal muscle, bones, and visceral organs [
29]. In adult mammals, ADAM12 has shown high expression in tissues, and its expression is the highest in bones [
30].
In addition to analyses for annotated candidate genes of the aforementioned significant SNP loci, we performed gene function annotation and clustering analyses for the other 139 SNP loci that exhibited significance at the chromosome level. In the past two decades, research on the application of the GWAS method for the association analysis of growth, development, and reproduction traits of sheep has gained wide focus for selecting and localizing corresponding candidate genes. Zhang et al [
31] used 50KSNP chips of sheep to conduct GWAS on Sunit sheep, German merino, and Dorper sheep and concluded that 36 SNPs were significantly correlated with seven traits (birth weight, weight at the age of 6 months, daily weight gain before weaning, daily weight gain, chest circumference, and tube circumference). Among these SNPs, 10 were significantly correlated (at the genome-wide level) with daily weight gain before weaning; based on gene annotations, five candidate genes associated with daily weight gain before weaning were
MEF2B,
RFXANK,
CAMKMT,
TRHDE, and
RIPK2. Gholizadeh et al [
32] conducted GWAS for birth weight, weaning weight, weight at the age of 6 months, and weight upon the first birthday in Baluchi sheep. Concomitantly, the authors found that 13 SNPs were significant at the chromosome level; gene annotations indicated that the
STRBP and
TRAMIL genes were candidate genes for birth weight, while
APIP and
DAAMI were candidate genes for weight at the age of 3 months. By contrast, Al-Mamun et al [
33] found that 39 SNPs were associated with body weight in Australian merino; among these SNPs, 13 related to body weight were located in the zone of 36.15 to 38.56 Mb on chromosome 6, and this zone involved candidate genes for body weight (including
LAP3,
NCAPG, and
LCORL). Demars et al [
34] used the GWAS method and analyzed the lambing traits of French Grivette sheep and Polish Olkuska sheep and confirmed that
BMP15 was an important candidate gene determining the number of lambs birthed. Based on 50KSNP chips of sheep, Våge et al [
35] applied GWAS for the lambing numbers of female generations of Norwegian white sheep rams and found that the best candidate gene that could effectively increase the ovulation rate or lambing birthing was located in chromosome 5 of sheep (near the
GDF9 gene). Gholizadeh et al [
36] applied GWAS for 96 individuals from two groups of Baluchi sheep according to the rate of twin production in the first four generations and concluded that SNPs influencing the total number of produced lambs were located on chromosomes 10 and 15.
In this study, GWAS was used, and the results revealed that 151 SNPs in Qianhua mutton merino were correlated (at the chromosome level) with five traits (i.e., traits of singleton or twins, birth weight, age [in days] at sexual maturity, weaning weight, and daily weight gain from birth to weaning) of its ewe groups. Among the SNPs, two were located on chromosomes 8 and 14 (associated with birth weight) and 1 was located on chromosome 27 (associated with weaning weight), exhibiting significance at the genome-wide level. All the other 148 SNPs presented significant correlations (at the chromosome level) with the five traits, which were located on the other chromosomes. These results were similar to those of other relevant traits of sheep [
37], pig [
38], and cow [
39], suggesting that GWAS was efficient in exploring significant loci of the targeted traits at the genome-wide level, which cannot be replaced by traditional methods.
In this study, we performed GO and KEGG functional annotation and enrichment analysis for 151 SNPs of five reproduction-related traits of Qianhua mutton merino. Genes associated with these traits were
SYNE1,
SLC12A4,
BMP2K,
CAMK2D,
MMP2L,
ZNF326,
PELO,
SORL1,
ACACA,
LASP1,
SCGN,
ADAM12, and other genes near the 151 SNP loci related to reproduction traits. Among these genes, ACACA mainly converts acetyl-CoA into malonyl-CoA by carboxylation, functioning as a rate-limiting enzyme during the synthesis of fatty acids. Therefore, this gene is usually used as a marker for evaluating the differentiating degree of adipocytes of IMF in animals and candidate genes [
21]. It also participates in the development of fat tissues and lipid metabolism [
22]. ZNF326 could interact with DBC1, thus promoting the expression of MMPs, cyclins, and factors associated with EMT [
17]. Proteins encoded by IMMP2L are located in the mitochondria. IMMP2L is essential for the catalytic activity of IMP complexes. IMMP2L is involved in ovarian follicle development, generation of the female gamete, ovulation, and so on. CAMK2D belongs to the family of serine/threonine protein kinases and the subfamily of calmodulin-dependent protein kinase. GO annotations for this gene include serine/threonine protein kinase activity, purine ribonucleotide binding, homologous dimerization of proteins, and protein kinase activity. Relevant KEGG annotations include ErbB signaling pathway, HIF-1 signaling pathway, meiosis of oocytes, and oxytocin signaling pathway. The encoded protein contains two actin-binding domains at the N-terminal, which consist of a tandem pair of calmodulin homologous domains, a transmembrane domain, several spectrin repeat sequences, and a C-terminal KASH. This protein is characterized by highly expressed repeat sequences of various spectrins in the striated muscle [
9] and participates in cell proliferation and division (including cytokinesis and localization of nucleus), and the related pathways include meiosis, cell cycle, and mitosis. Moreover, it is involved in the anchoring of the special muscle nucleus at the NMJ and provides linkages between the plasmalemma and actin cytoskeleton [
10].
The SNP loci selected in this study are mainly enriched in nucleotide metabolism, metal ion binding, circadian rhythm, oxytocin signaling pathway, and neurotrophin signaling pathway. Therefore, genes associated with these signaling pathways are presumed to be candidate genes correlated with reproduction-related traits in sheep. However, reproduction-related traits in sheep are complex quantitative traits controlled by various genes. As such, it remains unclear whether candidate genes selected in this study could directly affect the targeted traits, which needs to be further confirmed.