Screening for candidate genes related with histological microstructure, meat quality and carcass characteristic in pig based on RNA-seq data

Objective The aim of the present study was to identify genetic variants based on RNA-seq data, obtained via transcriptome sequencing of muscle tissue of pigs differing in muscle histological structure, and to verify the variants’ effect on histological microstructure and production traits in a larger pig population. Methods RNA-seq data was used to identify the panel of single nucleotide polymorphisms (SNPs) significantly related with percentage and diameter of each fiber type (I, IIA, IIB). Detected polymorphisms were mapped to quantitative trait loci (QTLs) regions. Next, the association study was performed on 944 animals representing five breeds (Landrace, Large White, Pietrain, Duroc, and native Puławska breed) in order to evaluate the relationship of selected SNPs and histological characteristics, meat quality and carcasses traits. Results Mapping of detected genetic variants to QTL regions showed that chromosome 14 was the most overrepresented with the identification of four QTLs related to percentage of fiber types I and IIA. The association study performed on a 293 longissimus muscle samples confirmed a significant positive effect of transforming acidic coiled-coil-containing protein 2 (TACC2) polymorphisms on fiber diameter, while SNP within forkhead box O1 (FOXO1) locus was associated with decrease of diameter of fiber types IIA and IIB. Moreover, subsequent general linear model analysis showed significant relationship of FOXO1, delta 4-desaturase, sphingolipid 1 (DEGS1), and troponin T2 (TNNT2) genes with loin ‘eye’ area, FOXO1 with loin weight, as well as FOXO1 and TACC2 with lean meat percentage. Furthermore, the intramuscular fat content was positively associated (p<0.01) with occurrence of polymorphisms within DEGS1, TNNT2 genes and negatively with occurrence of TACC2 polymorphism. Conclusion This study’s results indicate that the SNP calling analysis based on RNA-seq data can be used to search candidate genes and establish the genetic basis of phenotypic traits. The presented results can be used for future studies evaluating the use of selected SNPs as genetic markers related to muscle histological profile and production traits in pig breeding.


INTRODUCTION
Analyzing transcriptomes using next generation sequencing methods (RNA-seq) has been useful in various applications: estimation of gene expression; identification of novel genes or transcripts; and detection of genomic variants, gene fusion and RNA editing [1,2]. According to numerous studies, calling variants from RNA-seq data has many advantages. First, RNA sequencing is less cost-intensive compared to genome sequencing, and often, genomic variants can be identified from the existing RNA-seq data [3]. Furthermore, the comparison of single nucleotide polymorphism (SNP) panels obtained from animals differing in the levels of given traits can be useful in searching for the genetic background of analyzed traits. To date, the most efficient and reliable methods of SNP calling from RNA-seq data have been validated [4,5].
In pigs, an RNA-seq approach enabled Jung et al [6] to detect nonsynonymous single nucleotide variations potentially related to enzyme activity in porcine liver. Moreover, the authors showed that selected SNPs were associated with pork quality and could be used in marker-assisted selection in future pig breeding programs. Transcriptome sequencing of liver tissue collected from four different pig breeds enabled the identification of non-synonymous SNPs within genes coding for enzymes that are critical during post-mortem metabolic process in muscle [7]. Using RNA-seq data, Piórkowska et al [8] compared distributions of genetic variants detected in two breeds with extreme fatness parameters and identified genes involved in fatty acid metabolism and lipid storage, which can be potential genetic markers associated with meat quality in pigs. Conversely, Fisher et al [9] detected non-synonymous mutations with a possible influence on porcine reproduction traits using transcriptome data obtained after RNA sequence from oviduct and testis tissue.
It has been well-established that percentage number of fiber types in muscle determines metabolic properties and water holding capacity of muscle tissue, as well as marbling, intramuscular fat (IMF) level, meat color and texture [10][11][12]. In our previous study performed on Large White pigs, we identified genes differentially expressed in muscle contingent on histological microstructure [13]. According to the differential expression and biological function, several genes were selected with possible effect on variation of fiber-type distribution. Thus, the aim of the present study was to identify genetic variants based on RNA-seq data, obtained via transcriptome sequencing of muscle tissue of pigs differing in muscle histological structure. The second aim was to select the most interesting variants, based on transcriptome-wide association study, and to verify the theirs effect on production traits and histological microstructure in a larger pig population.

RNA-seq data
The SNPs in coding and untranslated regions (UTR) were identified based on RNA-seq data obtained previously by Ropka-Molik et al [13]. Whole transcriptome sequencing was performed on longissimus lumborum samples collected from 11 Large White pigs that differed in the histological structure of this muscle. Pigs (unrelated sows) were maintained under the same housing and feeding conditions and slaughtered at final body weight of 100 kg (±2.5 kg). Immediately after slaughter, muscle samples were collected and stored at -20°C in RNAlater solution (Ambion Inc., Austin, TX, USA) for RNA-seq application and at -80°C for histological analysis. The details of the animals used, the RNA-seq experiment and histological analysis were described in a previous study focused on differential expression analysis [13].
SNPs identification and association study based on RNA-seq data Quality control of the raw RNA-seq data was performed using FastQC (v0.11.4; Illumina, San Diego, CA, USA). Adapters, poly-A sequences, and reads with a quality score lower than 20 and a fragment size reads lower than 36 bp were removed using the Flexbar tool (v.2.5GNU General Public License version 3.0 [GPLv3]). The cleaned reads were mapped to a Sus scrofa reference genome (Sscrofa10.2) using STAR aligner [14]. Next, reads containing Ns (N cigar elements) were split (CIGAR string), and base quality score recalibration, INDEL realignment, duplicate removal, and identification of SNPs and INDELs (small insertions and deletions) were performed using Picard and GATK software packages [15]. The filtering parameters were as follows: FisherStrand value>50, Qual-ByDepth<2.0, RMS mapping quality<40, quality<40 and a SnpCluster filter. The genome annotation of identified polymorphisms was established using SnpEff 4.1b and Variant Effect Predictor (EMBL-European Bioinformatics Institute, Hinxton, Cambridge, UK) [16], and for downstream association analysis, only SNPs located within coding regions were selected (5'prime UTR variants; missense variants; synonymous variants; 3'prime UTR variants; noncoding transcript exon variants).
The association analysis for additive (fixed) effect of SNP was carried out for percentage and diameter of the I, IIA, and IIB types of fibers. The whole procedure was implemented in R-project. SNPs were considered significant if they met the following criteria: frequency (chisq p<0.05), a minimum of 9 observations, and a significant association analysis (p<0.05).

QTLs identification
The differential expressed genes (DEGs) identified by RNAseq, and genes with SNPs (based on whole coding sequence analysis) significantly associated with percentage and/or dia meter of I, IIA, and IIB types of fibers were mapped to quantitative trait loci (QTL) regions. The QTLs positions were taken from Animal Quantitative Trait Loci Database (Animal QTLdb) [17], which corresponds to the genome release used in this study (Sscrofta10.2, Ensemble database). We used R script [18] prepared by us to conduit the whole procedure. Based on RNA-seq results, 355 DEGs were mapped to QTLs regions. In contrast, based on the transcriptome-wide association study, 222 polymorphisms correlated with at least two histological traits and located within 149 genes were selected to the subsequent analysis. Candidate genes -association study on a larger population For an association study, four genes were selected based on high gene expression differences during RNA-seq analysis and their location within QTLs related to percentage of fibers I, IIA, and IIB (troponin T2 [TNNT2]; forkhead box O1 [FOXO1]). Two additional polymorphisms within two genes related to analyzed histological properties (transforming acidic coiled-coilcontaining protein 2 [TACC2]; delta 4-desaturase, sphin golipid 1 [DEGS1]) were selected as significant based on the transcriptome-wide association study. For all genes, genotypes were assessed using polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) method (TNNT2; TACC2; DEGS1) and polymerase chain reaction-artificially created restriction site (PCR-ACRS) method (FOXO1). The details of the analyses were presented in Table 1.
For the downstream association study, a larger pig population comprising 944 animals representing five breeds (Landrace n = 306, Large White n = 345, Pietrain n = 80, Duroc n = 61, native Puławska breed n = 152) was used. All pigs (sows) were unrelated at least three generations back and were maintained under the same housing and feeding conditions according to Pig Test Station procedures. In the Test Station pigs are slaughtered, dissected and after carcass evaluation, meat is standard intended for consumption, thus research does not require the approval of Animal Experimentation committee. Pigs were fattened from 30 kg up to 100 (±2.5) kg according to test procedures described previously by Ropka-Molik et al [19] and were slaughtered and dissected. For all animals, the following carcass characteristic traits were measured: weight of loin (kg), loin "eye" area (cm) and lean meat percentage (%). Furthermore, several meat quality traits were measured for selected animals: IMF content (the Soxhlet method) [20] and meat color (redness, a*; yellowness, b*; and lightness, L* parameters; Minolta Chroma Meter CR-310; Osaka, Japan) (699 pigs). The determination of meat color was performed using Chroma Meter CR-310 with a pulse xenon arc lamp D65 and 50 mm orifice. The a*, b*, and L* parame-ters were assessed 24 h post mortem on muscle tissue samples (longissimus lumborum; 150 to 200 g). The analysis was performed in two measurement points for each sample.
For selected animals, in total 293 (Landrace n = 110, Large White n = 70, Duroc n = 20, Puławska pigs n = 93), longissimus muscle samples were collected immediately after slaughter and histological profile-percentage and diameter of the I, IIA, and IIB types of fibers-were estimated according to the procedure described previously by Ropka-Molik et al [13]. The animals used to histological analysis were randomly selected from population included in association study. Tissue samples (frozen at -80°C) were cut on 10 μm thick slides using a cryostat (Slee MEV, Mainz, Germany) and then fibers were detected using NADPH dehydrogenase (diaphorase) method. Myosin heavy chains (MHCs) were detected immunohistochemically with NCL-MHC antibodies (Leica, Novocasta, Leider Lane, Buffalo Grove, IL, USA) and NovoLink Polymer Detection Systems (Zalesie Gorne, Poland) (Leica, Novocasta, USA) [21]. The obtained preparations were analyzed in terms of the percentage content of types of muscle fibers (I, IIA, IIB) and their diameter using the Nikon Eclipse 50i light microscope and MultiScan image analysis software (ver. 18.03; Sony, Tokyo, Japan). In each histological sample, the analysis covered a minimum of 300 fibers of each types. The percentage of fiber types was calculated based on 10 bundles measurements.
The statistical analysis for all selected genes was performed using the general linear model (GLM) procedure in SAS software (SASv. 8.02; Wadowice, Poland) and the model used was: Where, Y ijk , the trait measured; μ, the overall mean for the trait; f i , the fixed effect of j genotype group of analyzed gene; h j , the fixed effect of k breed; α(x ij ), covariate with weight of right half-carcass for selected slaughter traits; e ijk , random error.
The correlation coefficients between histological features and carcass characteristics and meat quality traits were calculated using Pearson correlation method (SAS software, SASv.

SNP identification and transcriptome-wide association analysis
Based on RNA-seq data, 65,535 SNPs were identified. After filtering according to the criteria outlined above, 51,590 SNPs were selected for use in the association study. The significant association of SNPs and percentage of I, IIA, and IIB fiber types were identified (p<0.05; 560, 251, and 837; respectively) ( Table 2). We identified the panel of SNPs significantly related to the diameter of each fiber type. Few polymorphisms affected fiber I and IIB diameters compared to those that were associated with percentage of the same fiber types ( Several polymorphisms/genes influenced more than one analyzed histological trait, simultaneously affecting the percentage and the diameter of the fibers. The identified panel of genes was the basis for selecting significant SNPs for subsequent GLM analysis performed on a larger population. From such genes, we selected the TACC2, DEGS1, and FOXO1 genes, which were related to multiple histological properties.

Genes mapping to QTLs
Based on the pig QTL database (pigQTLdb), DEGs (n = 355) identified by the RNA-seq, were located within QTLs related with fiber percentage and diameter (78 and 65, respectively). The most interesting locus seems to be on chromosome 14, at which 23 genes located in regions affecting percentage of I and IIA fibers were identified. DEGs localized in QTLs associated with percentage of fiber type IIB were identified mainly on chromosome 2 (SSC2) and SSC10 (Supplementary Table  S2). The most represented QTLs related with muscle fiber diameter were identified on SSC12 and SSC15; SSC2, SSC14 and SSC15; SSC2 and SSC14 (diameter of fibers I, IIA, and IIB; respectively) (Supplementary Table S3). On chromosome 14, 13 DEGs associated with diameter of both fibers IIB and IIA were mapped to four QTL regions (Supplementary Table  S3).
After the transcriptome-wide association study, 149 genes were selected (222 SNPs) and mapped to QTL regions. Results showed that chromosome 14, on which four QTLs were identified (related with percentage of fiber type I: QTLs #2822 and #2824; and IIB fiber type: #2823 and #7044), was the most overrepresented (Table 4). Also, on SSC10, four QTLs were identified, but these were represented by low numbers of genes (#7012 and #7026, percentage of fiber type I; #7031 and #7034, percentage of fiber type IIA) ( Table 4). In the case of diameter of fiber, the higher numbers of genes were detected on SSC14 and SSC3 within QTL regions associated with diameter of fiber types IIA and IIB ( Table 5).

Association of selected SNPs with carcass characteristic and pork quality traits
Based on findings of other authors indicating relations of histological structure and meta quality and carcass traits [10][11][12], we also performed an association study on a large number of pigs. For this analysis four genes were selected: TNNT2; FOXO1; TACC2; DEGS1. These genes were previously identified as differentially expressed between muscle characterized by different histological properties; were localized within QTL regions associated with percentage of fiber types (FOXO1, IIB and IIA; TNNT2 and DEGS1, I and IIA), and contained SNPs  potentially related with muscle microstructure. Moreover, subsequent GLM analysis showed significant relationships of FOXO1, DEGS1, and TNNT2 genes with loin 'eye' area, FOXO1 with loin weight, as well as FOXO1 and TACC2 with lean meat percentage. Furthermore, the significant (p<0.01) effect of polymorphisms within DEGS1, TACC2, and TNNT2 genes on IMF content was detected (Table 6). Only a small extent of SNPs was associated with pork color. TACC2 polymorphism showed an effect on value of a* color parameter and TNNT2 polymorphism influenced the b* color parameter ( Table 6).

Association of selected SNPs with histological microstructure of muscle tissue
The association study performed on a large number of animals (293 samples of longissimus muscle) confirmed significant effect of FOXO1 and TACC2 genes on histological characteristics in pigs. FOXO1 was associated with diameter of fiber types IIA and IIB (p<0.01) ( Table 7). The lowest diameter of both fiber types was detected in pigs with GG genotype, while the highest diameter was observed in heterozygote pigs. Conversely, a missense variant analyzed in the TACC2 locus showed a significant effect on diameter of fibers I and IIA and percentage of fiber type IIB. Our results showed that pigs with the AA genotype were characterized by the lowest diameter of fibers I and IIA, as well as the lowest percentage of fiber type IIB (Table 7). Furthermore, our result confirmed also significant correlation between selected histological traits and carcass characteristic. The diameters of all fiber types were positively, significant correlated with weight of loin. Diameters of fibers I and IIA was also correlated with lean meat percentage, while diameter of fibers I and IIB with meat color parameter, yellowness (b*) ( Table 8). Percentage of fiber types was only corelated with loin 'eye' area (percentage of fiber I and IIB) ( Table 8). For the other analyzed traits, the significant correlations were not obtained.

DISCUSSION
Currently, much focus is placed on improving meat quality traits in pig breeding programs while maintaining satisfactory levels of meat and fatness characteristics [22]. Research has been performed on refining pork quality through different methods including environmental conditions (maintenance, nutrition), animal transport, slaughter procedures, and genetics. One of the factors determining pork quality is fiber types distribution in muscle tissue [23,24], which can be affected by genetic background [25,26]. However, the genetic basis of Table 5. Genes obtained after transcriptome-wide association study mapped to QTL regions related to percentage of fiber types I, IIA, and IIB   I fiber type  SSC  QTL ID  IIA fiber type  SSC  QTL ID  IIB fiber type  SSC  QTL ID  SCRN2  12  2819  2821  SLF1  2  2797  RGP1  1  2795  UTP18  2819  2821  CAST  2797  IKBKAP  2795  TOM1L1  2819  2821  ANAPC7  14  7018  TTL  3    muscle fiber structure has not been well understood. The present study is a continuation of previous research concerning identification of genes determining muscle fiber structure in pigs [13]. The differential expression analysis was performed on pigs representing one breed (Large White) to avoid an influence of breed factor on gene expression profile. The comparison of global gene expression profiles enabled us to detect the panel of genes that can play a role in formation of muscle fiber such as TNNT2, FOXO1, myosin XVIIIB, FERM domain containing 4B, and scavenger receptor class a member 5 [13]. The SNP calling analysis is a next step facilitating the search for candidate genes and establishing the exact genetic basis of phenotypic traits. In our research, the widetranscriptome association study, performed as a preliminary analysis, allowed us to select the most important genetic variants potentially related with histological profile in muscle, the effects of which were validated on a large pig population.
Mapping both DEGs and SNPs significantly correlated with muscle microstructure to the known QTLs showed that a high number of genes were localized on SSC14. The most overrepresented were QTL regions associated with percentage of fiber type I (ID#2822 and #2824), IIB (ID#2823 and #7024), and diameter of fibers IIA (ID#7038) and IIB (ID#7018). These results were consistent with previous findings of Wimmers et al [26] (QTLs #2822; #2823; #2824) and Estelle et al [27] (QTLs #7018; #7024; #7038), who mapped QTLs associated with size and percentage of fibers on SSC14 and SSC15. These genome regions also span QTLs related with meat quality traits: pork color-redness and lightness [28], as well as fatness traits-abdominal fat weight and IMF content [29]. This may indicate that identified genes can affect some production traits in pigs.
Genome location was one of the selection criteria of genes for the subsequent GLM analysis. The association study carried out on a larger group of animals confirmed the significant effect of scrutinized polymorphisms in FOXO1 and TACC2 genes on some histological characteristics. A prior study [13] indicated differential expression of the FOXO1 gene between pigs characterized by various percentages of fiber types I and IIB. The results obtained by transcriptome-wide association analysis, supplemented with information on genes localization in QTLs regions, were confirmed for both FOXO1 and TACC2. The most interesting gene seems to be TACC2 coding for transforming acidic coiled-coil protein 2. The RNAseq approach enabled us to detect several polymorphisms within the TACC2 gene. For the association study, the missense variant within TACC2 locus was selected (p.Arg2613Cys) due to the highly potential results on protein structure, as well as biological function.
Our results show that investigated SNP in TACC2 gene affected diameter of both fiber types I and IIA and percentage amount of fiber IIB. The same polymorphism was also significantly related with lean meat percentage as well as IMF content. In the cell cycle, TACC2 protein is involved in G2/M progression [30] by playing a key role in microtubule dynamics. It has been established that the TACC2 gene is a critical factor controlling mitotic spindle dynamics during the normal course of the mitosis process [30]. To date, scarce research has investigated the involvement of the TACC2 gene in fiber type specification or muscle growth, even though this gene is highly expressed in skeletal and cardiac muscle. Our study suggested that transforming acidic coiled-coil protein 2 could play an important role during muscle fiber differentiation, resulting in modification of the number and size of muscle fibers. In turn, the differences in histological features are related to muscle mass and/or meat quality traits. The fiber composition in muscle is related with fatty acid and glycogen content due to the type of energy sources. The fiber type I is characterized by high fatty acid storage and low glycogen content, whereas the opposite trend is observed for the fiber type IIB. The observed missense variant changing arginine to cysteine in two different TACC2 isoforms (ENSSSCP00000029 474.1:p.Arg795Cys; ENSSSCP00000025150.1:p.Arg2613Cys) may affect function of synthesized proteins. Our results confirmed previous findings that percentage amount of type fiber IIB is inversely related with IMF [11]. According to the TACC2 missense variant, pigs with the AA genotype had significantly lower numbers of fiber type IIB and higher IMF content.
Conversely, our results indicated significant association between a FOXO1 mutation and diameter of fibers IIA and IIB and carcass characteristics such as weight of loin, loin 'eye' area and lean meat percentage. Furthermore, the significant correlation was observed between diameter of all fiber types and weight of loin. A previous study [13] showed differential expression of the FOXO1 gene depends on histological microstructure of muscle tissue. Moreover, our current research, which confirmed a significant relationship of a 3'UTR variant in the FOXO1 gene with fiber diameter, suggests that the analyzed SNP can affect FOXO1 transcript levels and histological profile as a result. Schiaffino and Mammucari [31] showed that FOXO proteins can regulate skeletal muscle growth via interaction with the IGF1-Akt/PKB pathway. Moreover, the research performed on transgenic mice confirmed that overexpression of the FOXO1 gene leads to muscle atrophy and increases transcript level of cathepsin L [32]. Researchers demonstrated that skeletal muscle of mice overexpressing FOXO1 was characterized by decrease size of both fiber types I and II, likewise reducing the number of fiber type I. Subsequently, the up-regulation of the cathepsin L gene can indicate increasing protein degradation processes in muscle tissues. The FOXO1 gene has been considered as a main genetic factor closely associated with muscle fiber type specification and skeletal muscle differentiation [33]. Our results confirmed that in pigs, the FOXO1 gene is related to muscle mass, as well as size of fiber types IIA and IIB, and should be investigated in the future for relation with important production traits. Our results did not show the significant association of DEGS1 and TNNT2 genes with histological traits. However, polymorphisms identified in both genes affected loin 'eye' area and lean meat content in carcasses. This suggests that detected SNPs can be related with muscle growth processes but do not affect histological structure of muscle tissue at the same time. Furthermore, there is also the possibility of occurrence of another polymorphism within the DEGS1 and TNNT2 loci (in promoter or intron regions), which are associated with gene expression, and as a result, with cell size and fiber type distribution.

CONCLUSION
The use of RNA-seq data enabled us to identify a panel of SNPs within coding and UTRs regions with potential effects on histological microstructure. The genes detected by differential expression and the transcriptome-wide association analyses enabled us to select candidate genes potentially related to distribution of fiber types in porcine muscle tissue. These results can inform future studies evaluating the use of selected SNPs as genetic markers related to muscle histological profile and production traits in pig breeding.

CONFLICT OF INTEREST
We certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript.