Genome-wide association study for intramuscular fat content in Chinese Lulai black pigs
Article information
Abstract
Objective
Intramuscular fat (IMF) content plays an important role in meat quality. Identification of single nucleotide polymorphisms (SNPs) and genes related to pig IMF, especially using pig populations with high IMF content variation, can help to establish novel molecular breeding tools for optimizing IMF in pork and unveil the mechanisms that underlie fat metabolism.
Methods
We collected muscle samples of 453 Chinese Lulai black pigs, measured IMF content by Soxhlet petroleum-ether extraction method, and genotyped genome-wide SNPs using GeneSeek Genomic Profiler Porcine HD BeadChip. Then a genome-wide association study was performed using a linear mixed model implemented in the GEMMA software.
Results
A total of 43 SNPs were identified to be significantly associated with IMF content by the cutoff p<0.001. Among these significant SNPs, the greatest number of SNPs (n = 19) were detected on Chr.9, and two linkage disequilibrium blocks were formed among them. Additionally, 17 significant SNPs are mapped to previously reported quantitative trait loci (QTLs) of IMF and confirmed previous QTLs studies. Forty-two annotated genes centering these significant SNPs were obtained from Ensembl database. Overrepresentation test of pathways and gene ontology (GO) terms revealed some enriched reactome pathways and GO terms, which mainly involved regulation of basic material transport, energy metabolic process and signaling pathway.
Conclusion
These findings improve our understanding of the genetic architecture of IMF content in pork and facilitate the follow-up study of fine-mapping genes that influence fat deposition in muscle.
INTRODUCTION
Pork is an important dietary source for humans, accounting for about 40% of the world’s meat consumption. Its quality has a great influence on consumer preferences. It is generally accepted that a high level of marbling or intramuscular fat (IMF) content has a positive influence on the eating quality of pork [1]. Over the last decades, pig selection has mainly focused on growth traits and lean meat rate. Due to the reverse correlation between lean meat rate and IMF content, current commercial pigs deposit less fat in backfat as well as muscle at market weights than native breeds. Therefore, the pork industry is greatly interested in augmenting the IMF content to satisfy the eating experience of the consumer.
IMF content displays substantial genetic variations between breeds and even lines within breed [2,3]. Although large differences exist, improvement by normal breeding programs is challenging due to the difficulty of their phenotypic measurement. The measurement of the trait is not only time-consuming and costly, but samples are collected only after slaughtering the pigs. Genetic selection is generally believed to be a promising approach to improve the trait. Thus, it is worthwhile to study the genetic mechanisms underlying IMF content, which would allow us to establish novel molecular breeding tools for its optimization in pork. Furthermore, since the pig is an excellent animal model to study various human diseases [4], dissection of the genetic architecture of fat deposition in porcine muscle can also provide information for unveiling the mechanisms that underlie fat metabolism in humans.
Over the past two decades, using microsatellite markers, many quantitative trait loci (QTLs) that contribute to the IMF content [5,6] have been detected in pigs. However, due to the low density of microsatellite markers, these QTLs represent large chromosomal regions, and further fine-mapping studies are necessary to find the causal mutation responsible for these effects. With the development of high-throughput single nucleotide polymorphism (SNP) genotyping methods, pig-specific high-density genotyping chips have become available. Genome-wide association studies (GWAS), using the chips, makes it possible to detect the genetic variants underlying the traits of interest. Recently, several GWAS have made substantial progresses in identifying genetic factors associated with or underlying a variety of traits, including IMF content [7–10]. However, compared with other traits, studies dissecting variations for IMF content are still rare. Therefore, research should further investigate the genetic mechanism of IMF content, especially using pig populations with high IMF content variation.
Lulai black pigs are a newly developed breed which originated from a cross between Chinese Laiwu pigs (a typical Chinese indigenous breed) and Yorkshire pigs, and then artificially selected for more than eight generations. This breed, similar as Laiwu pigs, is known for its high IMF content [11,12]. Here, to unveil the genetic architecture of IMF content of pork, we measured IMF content of Lulai black pigs, and conducted GWAS for the trait.
MATERIALS AND METHODS
Ethics statement
The whole study protocols for animal rearing and slaughter were reviewed and approved by the guideline (IACC20060101, 1 January 2006) of the Institutional Animal Care and Use Committee of Institute of Animal Science and Veterinary Medicine, Shandong Academy of Agricultural Sciences.
Animal resource
Four hundred and eighty Lulai black pigs were purchased from nucleus farms of the breed in Laiwu, Shandong province. These pigs originated from 12 sires and 95 dams, representing offspring of all boars and most sows in the farm. Boars were castrated before day 60. All pigs had ad libitum access to a corn-soybean based diet containing 13.3% crude protein, 12.43 MJ/kg digestible energy and 0.80% lysine under standard management conditions. Out of the 480 pigs, 453 pigs, including 289 males and 164 females, grew well and were slaughtered when their weights were in the range of 70 to 100 kg. About 200 g longissimus dorsi muscle was collected from the last rib of each pig and stored at −20°C for further measurement.
Measurement of intramuscular fat content
After removing adipose and connective tissue, the longissimus dorsi muscle was ground and analyzed for moisture by routine oven-drying method. Then, samples that had already been analyzed for moisture were ground into powder, which were further used for measuring IMF content. IMF content was determined using the Soxhlet petroleum-ether extraction method and expressed as the weight percentage of wet muscle tissue.
Single nucleotide polymorphism array genotyping and quality control
A standard phenol/chloroform method was used to extract genomic DNA from muscle samples for all the pigs. The quality and concentration of genomic DNA fulfilled the requirements for the Illumina SNP genotyping platform. Samples were genotyped with GeneSeek Genomic Profiler Porcine HD BeadChip (Neogen Corporation, Lansing, MI, USA) according to the manufacturer’s protocol and genotypes were called using GenomeStudio (version 2011.1; Illumina Inc., San Diego, CA, USA).
Quality controls were implemented by Plink v1.07 [13] according to the following filtering. Firstly, SNPs with GC score below 0.2 were considered failed genotypes, and Fimpute [14] was used to impute the failed loci. Then, SNPs were excluded from the data set if i) SNPs without genome location based on the pig genome assembly Sus scrofa Build 10.2 or located on Y chromosome, ii) its minor allele frequency was <5%, or iii) it departed severely from Hardy–Weinberg equilibrium with a p-value lower than 106.
Statistical analysis
The association analysis was performed using the software GEMMA [15]. SNPs were individually tested for association with IMF content using the following linear mixed model:
where Y is the vector of phenotypic values of IMF content; X is the incidence matrices of fixed effects including population mean, sex and weight, and f is the vector for these fixed effects; Z is the identity matrix, and p is the vector for polygenic effects with distribution of
Firstly, permutation was adopted to adjust for multiple testing for the number of SNPs tested through constructing an empirical distribution of the test statistic under the null hypothesis. Specifically, the phenotypic observations of each trait were randomly shuffled 10,000 times, and the empirical critical values for chromosome-wise and genome-wise significance were determined by the 95th percentile of the highest test statistic over the 10,000 permutation replicates. Then, to detect significantly associated SNPs in a study with low power, a liberal cutoff (raw p-value and p<0.001) was used as the cutoff for statistical significance of SNPs. The Q-Q plot and linkage disequilibrium (LD) analysis of the significant SNPs were constructed using the R software and Haploview [16], respectively.
Characterization of candidate genes
To identify plausible candidate genes, we retrieved annotated genes within a 50 kb region centering each significant SNP from Ensembl Genes 89 Database using BioMart (http://asia.ensembl.org/biomart/martview/). Functional analyses, including gene ontology (GO) and reactome pathway enrichment, were performed using PANTHER 13.1 (Feb. 2018) (http://www.pantherdb.org/) to reveal the potential biological function of annotated genes. Furthermore, QTLs of IMF were downloaded from the AnimalQTL database (www.animalgenome.org, Release 34, Dec 21, 2017), and compared with those significant SNPs based on the putative location of these QTLs.
RESULTS
Summary of phenotypic and single nucleotide polymorphism data
The mean and variation coefficient of the phenotypic observations of the IMF content were 5.18% and 65.07%, respectively. The IMF content increased significantly with the weight increment (p<0.05), with the average of 4.80% for weight range 70 to 80 kg, 4.94% for 80 to 90 kg and 5.93% for 90 to 100 kg, respectively. On the other hand, out of the 68,516 SNPs genotyped, a final set of 49,383 SNPs passed the filters and was retained in the data set for further statistical analyses. The distribution of SNP markers and marker density on autosomes are shown in Figure 1. The number of SNPs on every chromosome varies from 1,403 on Chr.18 to 4,728 on Chr.1, and the adjacent distance ranges from 32.00 Kb on Chr.12 to 66.65 Kb on Chr.1. Compared with the common used Porcine SNP60 BeadChip in previous study [17,18], this porcine BeadChip has more usable, and evenly distributed SNPs.
Significantly associated single nucleotide polymorphisms identified in genome-wide association study
Linear mixed models implemented in GEMMA [15] were used in the study to perform the association analysis of IMF content and SNPs. To adjust the effect of inbreeding, a genetic relationship matrix was constructed and used during the analysis. The profile plot of the p values (in terms of −log10(p)) for IMF content is shown in Figure 2. Q-Q plot of the tested SNPs is provided in Supplementary Figure S1, and the SNPs tested show no evidence of overall systematic bias.
By the cutoff p<0.001, a total of 43 SNPs from 18 autosomes were identified to be significantly associated with IMF content, and the details of the significant SNPs are given in Table 1. Among these significant SNPs, the greatest number (n = 19) was detected on Chr.9. The LD block analysis (Figure 3) by Haploview 4.2 [16] indicated that there are two LD blocks formed among these significant SNPs on Chr.9. Especially in the larger LD, five SNPs are in the complete linkage.
Functional characterization of significant single nucleotide polymorphisms
Among the 43 significant SNPs identified in the study, there are 15 significant SNPs located in the protein coding sequences, while the others are mapped in the non-coding sequences. It is consistent with the previous studies that the clear majority of significantly associated SNPs reside outside the protein-coding regions, underscoring the important function of non-coding sequences in the genome [19].
Association studies essentially identify a genomic location related to phenotypic traits but provide little functional significance, especially those significant SNPs occurring in non-coding sequences. To identify possible candidate genes related with IMF content, a total of 42 annotated genes (Supplementary Table S1) centering the significant SNPs detected were obtained from Ensembl database. Overrepresentation test of reactome pathways and GO terms using PANTHER revealed some enriched reactome pathways and GO terms, which are provided in Supplementary Table S2 and S3. Most of the terms and pathways were involved regulation of basic material transport, energy metabolic process and signaling pathway.
To test whether the significant SNPs were mapping into QTLs of IMF detected in previous studies, we downloaded QTLs of IMF from the pig QTL database and compared by their physical positions. Consequently, these are 17 significant SNPs (Supplementary Table S4) mapped to 6 QTLs of IMF.
DISCUSSION
In the present study, we measured IMF content of 453 Lulai black pigs. Consistent with previous reports [11,20], our results suggested that Lulai black pigs had high capability in depositing IMF, with the average of 5.18%. The IMF content increased significantly with the weight increment, which was also reported in the previous studies [20,21], indicating IMF is deposited at a greater rate when pigs get older. Furthermore, the population variation of IMF content was rather large, with variation coefficient of 65.07%, suggesting genotypes of SNPs affecting IMF content are still segregating in the population.
One of the major challenges in GWAS is multiple hypothesis testing. Permutation test is the gold standard in multiple testing corrections [22]. In our study, permutation was firstly used to adjust multiple testing. However, no significant SNPs were identified to be associated with IMF content at the genome-wide significance level, and only three SNPs (green dots in Figure 1) were found at chromosome-wide significance. The specific samples used in this study may be one possible reason leading to the few significant SNPs detected. As mentioned above, Lulai black pigs originated from a cross between typical Chinese Laiwu pigs and Yorkshire pigs. So, they have a complicated genetic background, and strict adjustments were implemented to correct potential effects of population structure in the analyses using GEMMA, which may decrease the significant SNPs identified. The other reason may be the smaller sample size used in our study. In our study, the sample size used (453 Lulai pigs) is small, which reduces the power to detect SNPs related to IMF contents. Consequently, to detect significantly associated SNPs in the study, a liberal cutoff (raw p value and p<0.001) was taken instead of generally used multiple testing correction method. This might cause some significant SNPs to be false positives. But, as proved by mapping with previously reported QTLs and functional analyses of annotated genes centering significant SNP, the SNPs detected by liberal p value in this study can be suggested as SNPs affecting IMF content.
Furthermore, part (n = 17) of the significant SNPs identified are mapped to previously reported QTLs of IMF. However, compared with the previous GWAS, our results do not replicate the significant SNPs identified by them. The different SNP chips and genome assembly used may be possible reasons. However, even using the same SNP chips, a lack of reproducibility of genetic associations has been frequently observed in GWAS [23]. For instance, using Illumina PorcineSNP60 Beadchip, Luo et al [8] identified 40 significant SNPs for IMF in a porcine Large White×Minzhu intercross population, while Ma et al [24] detected 7 and 25 significant SNPs for IMF in the Duroc×Erhualian F2 animals and Chinese Sutai pigs. Of note, no common SNPs for IMF were found in the two populations, Duroc×Erhualian F2 animals identified in Large White×Minzhu intercross population was confirmed by Ma et al [24]. One reason for the lower reproducibility is that GWAS can generate some false positive and false negative associations, although sophisticated statistical tests have been proposed to reduce false positives. Another possible reason is that IMF content is complex trait, for which gene-gene and gene-environment interactions possibly have different genetic effects across population. Therefore, further confirmation of these associations in larger independent populations would be warranted to test if the significant SNPs identified in this study are population-specific before their incorporation into breeding programs.
Supplementary Data
Notes
AUTHOR CONTRIBUTIONS
YW and JW conceived and designed the research. YaW and CN carried out computational analysis and wrote the manuscript. JG and CW contributed to the sample collecting, DNA extraction, genotyping, and interpretation of data. All authors read and approved the final manuscript.
CONFLICT OF INTEREST
We certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript.
ACKNOWLEDGMENTS
We would like to thank National Natural Science Foundation of China (31372293), Natural Science Foundations of Shandong Province of China (ZR2017MC043), Key Research and Development Plan of Shandong Province (2018GNC110004), Shandong Swine Industry Technology System Innovation (SDAIT-08-03) supported this study.