Genome-wide scans for detecting the selection signature of the Jeju-island native pig in Korea

Objective: The Jeju native pig (JNP) found on the Jeju Island of Korea is a unique black pig known for high-quality meat. To investigate the genetic uniqueness of JNP, we analyzed the selection signature of the JNP in comparison to commercial pigs such as Berkshire and Yorkshire pigs.
Methods: We surveyed the genetic diversity to identify the genetic stability of the JNP, using the linkage disequilibrium method. A selective sweep of the JNP was performed to identify the selection signatures. To do so, the population differentiation measure, Weir-Cockerham’s Fst was utilized. This statistic directly measures the population differentiation at the variant level. Additionally, we investigated the gene ontologies (GOs) and genetic features.
Results: Compared to the Berkshire and Yorkshire pigs, the JNP had lower genetic diversity in terms of linkage disequilibrium decays. We summarized the selection signatures of the JNP as GO. In the JNP and Berkshire pigs, the most enriched GO terms were epithelium development and neuron-related. Considering the JNP and Yorkshire pigs, cellular response to oxygen-containing compound and generation of neurons were the most enriched GO.
Conclusion: The selection signatures of the JNP were identified through the population differentiation statistic. The genes with possible selection signatures are expected to play a role in JNP’s unique pork quality.


INTRODUCTION
Of the various pig breeds, the Jeju native pig (JNP) can be considered one of the representatives of the Korean native black pig (KNBP) [1]. The JNP has black skin and erect, unfolded ears, and is well known for its tender and juicy meat. Consumers prefer JNP meat because of its superior taste, tenderness and marbling compared to the meat from western breeds [2], and it is regarded as some of the best pork in Korea. The uniqueness of JNP meat may come from the distinct climate of Jeju Island and the way of life of its people. Notably, the winter temperature on Jeju Island remains above zero and the JNP are used to dispose of human waste. These factors might have changed the allele patterns in the JNP genome compared to other pig breeds. However, the JNP has lower feed efficiency and smaller litter sizes than the Berkshire breed [2]. Thus, to understand the uniqueness of the JNP and its pork quality, a comparative genomic study between the JNP and commercial pig breeds such as Berkshire and Yorkshire is needed. In a previous study, Ghosh et al [3] used RNA-seq to identify pork quality-related genes.
Population differentiation (subdivision) is a fundamental process of evolution, and its inference is required in genetics, phylogeography and conservation biology. It can be regarded as "fundamental" because every species unavoidably undergoes population differentiation, which may lead to speciation or extinction under certain conditions. Additionally, genetic differentiation of populations can result from the uneven (nonrandom) spatial distribution of genetic variation and allele frequencies in a species. The fixation index, Fst, is a measure of population differentiation [4].
In a previous study, Kim et al [5] used the cross-population extended haplotype homozygosity (XP-EHH) and the cross-population composite likelihood ratio (XP-CLR) to identify putative selection signatures underlying the meat quality of the JNP. The XP-EHH statistic examines the haplotype difference between two populations and detects alleles that have increased in frequency to fixation or near-fixation. Haplotypes that are frequent and longer than expected are regarded as being positively selected. In contrast, XP-CLR evaluates the allele frequency differentiation between two populations to identify candidate regions of selective sweeps, which can be considered to be under strong positive selection. Oh et al [6] used the Fst of microsatellite markers of the JNP to assess meat quality. Our approach was to use the Fst of single nucleotide polymorphism (SNP) markers.
In this study, we aimed to reveal the genomic differences and the selective sweep regions of the JNP in comparison to the Berkshire and Yorkshire breeds using the fixation index. Selection signatures can be identified through selective sweep analysis of genomic regions, for example with a population differentiation statistic. Here, the selection signature of the JNP was revealed through the population differentiation statistic.

Ethical statement
The experimental procedure followed Institutional Animal Care and Use Committee regulations (CRONEX-IACUC 201810005).

Data preparation
We randomly sampled 50 JNP from the Jeju Livestock Promotion Institute (Jeju, Korea), 151 Yorkshire pigs from GGP farms, and 67 Berkshire pigs from Dasan Breeding Farm (Namwon, Korea). The genomic DNA of each individual was genotyped using an Illumina Porcine 60K SNP BeadChip (Illumina, San Diego, CA, USA) following the standard protocol. We merged the data of the three pig breeds using vcftools (vcftools.sourceforge.net) and obtained 62,551 SNPs [7]. For quality control, we excluded SNPs based on minor allele frequency (<0.05), Hardy-Weinberg equilibrium (p<0.0001) and genotyping call rate (<0.05).
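The three quality-control filters above can be sketched per SNP as follows. This is a minimal illustration, not the vcftools procedure used in the study: the function names, the 0/1/2 genotype coding, the use of `None` for missing calls, and the chi-square form of the Hardy-Weinberg test are all assumptions (dedicated tools typically use an exact test). The call-rate cutoff is left as a required parameter rather than given a default.

```python
from math import erfc, sqrt

def call_rate(genotypes):
    """Fraction of samples with a non-missing call (None marks a missing call)."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def minor_allele_frequency(genotypes):
    """MAF for genotypes coded as 0, 1 or 2 copies of one allele."""
    called = [g for g in genotypes if g is not None]
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)

def hwe_p_value(genotypes):
    """Chi-square (1 df) test of Hardy-Weinberg proportions (an approximation)."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)
    q = 1 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return erfc(sqrt(chi2 / 2))  # survival function of chi-square with 1 df

def passes_qc(genotypes, min_call_rate, min_maf=0.05, min_hwe_p=1e-4):
    """Keep a SNP only if it passes the call-rate, MAF and HWE filters."""
    rate = call_rate(genotypes)
    if rate == 0 or rate < min_call_rate:
        return False
    if minor_allele_frequency(genotypes) < min_maf:
        return False
    return hwe_p_value(genotypes) >= min_hwe_p

# Hypothetical genotypes for one SNP across ten animals
snp = [0, 1, 2, 1, None, 1, 0, 2, 1, 1]
print(passes_qc(snp, min_call_rate=0.8))
```

The call rate is checked first so that a SNP with no called genotypes is rejected before any allele frequency is computed.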

Structure analysis and linkage disequilibrium
To understand the characteristics of the JNP, Berkshire and Yorkshire pig populations, structure analysis was performed with the STRUCTURE program [8]. We ran 5,000 iterations after a burn-in of 5,000 iterations, with K = 3, 4, and 5. We evaluated the purity of the JNP, and the structure analysis showed that some JNP were partly admixed with Berkshire. Thus, we excluded the impure JNP individuals from the subsequent analyses.
The correlation coefficient of the linkage disequilibrium ($r^2$) was computed to assess the genetic diversity of the JNP, Berkshire and Yorkshire pigs. Linkage disequilibrium is the nonrandom association of alleles at two loci, A and B. It is usually used to assess the genetic diversity, effective population size and genetic stability of a given population. Using the R package "LDcorSV", we examined all pairwise $r^2$ decays with 100 SNP bins along the physical distance [9]. The formula is as follows:

$$r^2 = \frac{D_{AB}^2}{P_A P_a P_B P_b}$$

where $D_{AB}$ is the linkage disequilibrium of the two alleles A and B, $P_X$ is the frequency of the major allele X and $P_x$ is the frequency of the minor allele x.
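As a rough illustration of the pairwise statistic and its decay with distance, the sketch below computes r² for two loci and averages it within distance bins. It assumes phased haplotypes coded 1 for the major allele and 0 otherwise, and uses the standard definition D = P(AB) − P(A)P(B). The function names and bin size are hypothetical, and this is not the LDcorSV implementation, which works from genotype data in R.

```python
from collections import defaultdict

def r_squared(hap_a, hap_b):
    """r^2 between two loci from phased haplotypes (1 = major allele, 0 = minor)."""
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele A
    p_b = sum(hap_b) / n                      # frequency of allele B
    p_ab = sum(1 for x, y in zip(hap_a, hap_b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b                      # linkage disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:                            # a monomorphic locus has no defined r^2
        return float("nan")
    return d * d / denom

def ld_decay(pairs, bin_size):
    """Mean r^2 per distance bin; pairs are (distance_bp, r2) tuples."""
    bins = defaultdict(list)
    for distance, r2 in pairs:
        bins[(distance // bin_size) * bin_size].append(r2)
    return {start: sum(v) / len(v) for start, v in sorted(bins.items())}

# Two loci in complete linkage disequilibrium, then two independent loci
print(r_squared([1, 1, 0, 0], [1, 1, 0, 0]))
print(r_squared([1, 1, 0, 0], [1, 0, 1, 0]))
```

A slower drop of the binned mean r² with distance indicates longer haplotype blocks, which is how the decay curves are read as lower genetic diversity.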

Population differentiation analysis
We used the population differentiation statistic (Fst) to find selection signatures of the JNP. The reference pig breeds were Berkshire and Yorkshire, which are worldwide commercial pig breeds that we felt provided an adequate comparison to the JNP. Among the various population differentiation statistics, we used the Weir-Cockerham Fst implemented in vcftools [10][11][12]. The Weir-Cockerham Fst formula is as follows:

$$F_{st} = \frac{s^2}{\bar{p}(1-\bar{p})}$$

where $s^2$ is the sample variance of the allele frequency of A over subpopulations and $\bar{p}$ is the average of the allele frequency of A in the total populations of the biallelic system [11,13].
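The variance-ratio form of Fst for a biallelic SNP can be sketched as below. Two assumptions are made for illustration: subpopulations are weighted equally, and the variance is taken with a divisor equal to the number of subpopulations, so that two populations fixed for different alleles give exactly 1. The Weir-Cockerham estimator in vcftools additionally corrects for sample size and observed heterozygosity, which this sketch omits; the function name is hypothetical.

```python
def fst_variance_ratio(freqs):
    """Fst = s^2 / (p_bar * (1 - p_bar)) from per-subpopulation frequencies of allele A."""
    r = len(freqs)
    p_bar = sum(freqs) / r                            # mean allele frequency
    s2 = sum((p - p_bar) ** 2 for p in freqs) / r     # variance across subpopulations
    denom = p_bar * (1 - p_bar)
    if denom == 0:                                    # allele fixed or absent everywhere
        return float("nan")
    return s2 / denom

# Hypothetical allele frequencies of one SNP in two breeds
print(fst_variance_ratio([0.9, 0.1]))   # strongly differentiated
print(fst_variance_ratio([0.3, 0.3]))   # no differentiation
```

Values near 0 indicate similar allele frequencies in the compared breeds, and values near 1 indicate near-fixation of different alleles.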

Genotype data
Among the 62,551 autosomal SNPs genotyped in this analysis, 46,505 in JNP-Berkshire and 44,306 in JNP-Yorkshire remained after quality control. After filtering and imputation, the number of SNPs per autosome ranged from 1,085 to 6,000 in the JNP-Berkshire data and from 1,030 to 5,864 in the JNP-Yorkshire data, and this value was closely related to the chromosome length and total number of SNPs, as shown in Supplementary Figure S1. The minor allele frequency of the remaining SNPs exhibited a uniform distribution, with an average of 0.42±0.22 (standard deviation [SD]). The mean distance between adjacent SNP pairs from this analysis was 50,134±206,871 (SD).

Lee et al (2020) Asian-Australas J Anim Sci 33:539-546 (www.ajas.info)

Linkage disequilibrium of Jeju native pig
We evaluated the structural distinctness of the JNP, Berkshire and Yorkshire pigs, and examined the linkage disequilibrium decays to assess the genetic diversity. We used the R package "LDcorSV" to examine the linkage disequilibrium. All pairwise linkage disequilibrium values were calculated within windows of 100 adjacent SNPs. Figure 1 shows the decay of $r^2$ with physical distance in the JNP, Yorkshire and Berkshire pigs. According to Figure 1, the genetic diversity of the JNP was considerably lower than that of the Yorkshire and Berkshire pigs. This lower genetic diversity likely reflects the decrease in the number of JNPs as South Korea has modernized since the 20th century. We analyzed the population differentiation to examine the population structure of the JNP, Berkshire and Yorkshire pig breeds. Figure 2 shows the result of the structure analysis for K = 3, after a preliminary structure analysis was used to identify and remove impure JNP and other pigs.

Population differentiation analysis
To detect and characterize the selection signatures of the JNP, we used the population differentiation statistic Fst, which can reveal selection signatures between two compared breeds. Fst is a widely used population differentiation statistic in selective sweep analysis; in practice, we used the Weir-Cockerham Fst. The mean and standard deviation of Fst were 0.27±0.25 for JNP vs Berkshire and 0.26±0.25 for JNP vs Yorkshire. Figure 3 shows the distribution of Fst for the JNP, Berkshire and Yorkshire pigs. The frequency of Fst values decreased with increasing Fst, and the gray line represents the Fst cutoff (top 5% of genes) used for gene ontology (GO) analysis. Supplementary file 1 lists the SNPs of the significant genes and their allele frequencies. This file also shows the differences in allele frequency between JNP and Berkshire pigs and between JNP and Yorkshire pigs.
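The top 5% cutoff described above amounts to ranking the Fst values and keeping the highest fraction. A minimal sketch, with hypothetical function names and made-up SNP identifiers, could look like this; how ties and the per-gene aggregation were handled in the study is not specified, so both are assumptions here.

```python
def top_fraction_cutoff(values, fraction=0.05):
    """Smallest value that still falls within the top `fraction` of the ranking."""
    ranked = sorted(values, reverse=True)
    k = max(1, round(len(ranked) * fraction))   # number of items to keep
    return ranked[k - 1]

def select_top(fst_by_snp, fraction=0.05):
    """Return the SNPs whose Fst is at or above the cutoff (ties are kept)."""
    cutoff = top_fraction_cutoff(list(fst_by_snp.values()), fraction)
    return {snp: fst for snp, fst in fst_by_snp.items() if fst >= cutoff}

# Hypothetical Fst values for 100 SNPs: 0.01, 0.02, ..., 1.00
fst_values = {f"snp{i}": i / 100 for i in range(1, 101)}
selected = select_top(fst_values)
print(sorted(selected, key=selected.get))
```

The cutoff returned by `top_fraction_cutoff` corresponds to the position of the gray line in a histogram of Fst values.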

Functional classification and biological pathway analysis
We matched the high-Fst SNPs to genes. After gene matching of the SNPs using the Ensembl gene catalogue (www.ensembl.org), the top 5% of genes were used in DAVID GO analysis (The Database for Annotation, Visualization and Integrated Discovery v6.8; https://david.ncifcrf.gov/). The DAVID analysis includes three types of classification (biological process, molecular function, and cellular component); in practice, we used the biological process category to query the GO terms of the selective sweep genes. For these GO terms, multiple-testing correction with a p-value cutoff (0.01 or 0.05) was not adequate in our analysis because it was too stringent.

Table 1

... of selective sweep regions but rather comprehensively scanning the genome-wide regions. However, under the current cost restrictions, sample size is a primary problem. Therefore, we instead used the Fst statistic, which can identify population differentiation directly at the variant level. In this study, we used SNP data from a porcine 60K HD chip.
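To illustrate why a multiple-testing correction can be stringent for GO terms, the sketch below computes a plain hypergeometric enrichment p-value and its Bonferroni-adjusted value. This is an assumption-laden illustration: DAVID reports a modified Fisher exact score (EASE), not this exact test, and the gene counts and number of tested terms are invented for the example.

```python
from math import comb

def enrichment_p_value(k, n, K, N):
    """P(X >= k): chance of at least k annotated genes among n selected genes,
    when K of the N background genes carry the GO term."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical numbers: 12 of 300 selected genes carry a term that
# annotates 200 of 15,000 background genes; 2,000 terms are tested.
raw = enrichment_p_value(12, 300, 200, 15000)
print(raw, bonferroni(raw, 2000))
```

With thousands of GO terms tested, a term needs a very small raw p-value to remain below 0.05 after adjustment, which is the sense in which the correction is harsh.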

Gene ontology and pathway analysis of the genes in the selective sweeps of the Jeju native pig
The significant genes in the selective sweep analysis differed considerably from those of Kim et al [5]. Kim et al [5] used whole genome sequencing data, whereas SNP chip data were used in our analysis. Because of the long distance between SNPs in our dataset, we used another statistic, Fst, which assesses population differentiation at the exact position of each SNP and has the advantage of detecting selection signatures of SNPs based on the fixation index between two populations. The major terms in the GO analysis were "GO:0048729~tissue morphogenesis", "GO:0060429~epithelium development", and "GO:0048589~developmental growth" in JNP vs Berkshire, and "GO:1901701~cellular response to oxygen-containing compound", "GO:0071396~cellular response to lipid", and "GO:0022008~neurogenesis" in JNP vs Yorkshire. Fifty-seven genes overlapped between the JNP vs Berkshire and JNP vs Yorkshire comparisons. These genes seem to be JNP-specific genes with selective sweeps. It is of interest that some genes of "GO:0022008~neurogenesis", i.e., zinc finger protein 536 (ZNF536), FRY like transcription coactivator (FRYL), protein tyrosine kinase 7 (PTK7), protein kinase cGMP-dependent 1 (PRKG1), patched 1 (PTCH1), and intraflagellar transport 140 (IFT140), overlapped in the JNP vs Berkshire and JNP vs Yorkshire cases. The ZNF536 protein is a highly conserved zinc finger protein that is most abundant in the brain, where it negatively regulates neuronal differentiation. FRYL plays a key role in maintaining the integrity of polarized cell extensions during morphogenesis. The PTK7 gene encodes a member of the receptor tyrosine kinase family that is involved in the Wnt signaling pathway and plays a role in multiple cellular processes including polarity and adhesion. PRKG1 isoforms act as key mediators of the nitric oxide/cGMP signaling pathway.
PTCH1 encodes a member of the patched family of proteins and is a component of the hedgehog signaling pathway. Hedgehog signaling is important in embryonic development and tumorigenesis. The encoded protein is the receptor for the secreted hedgehog ligands, which include sonic hedgehog, Indian hedgehog and desert hedgehog. Following binding by one of the hedgehog ligands, the encoded protein is trafficked away from the primary cilium, relieving inhibition of the smoothened G-protein-coupled receptor, which results in activation of downstream signaling. Mutations of this gene have been associated with basal cell nevus syndrome and holoprosencephaly. The IFT140 gene encodes one of the subunits of the IFT complex; intraflagellar transport is involved in the genesis, resorption and signaling of primary cilia.
The JNP exhibits slower growth, thicker fat and smaller litters than other pigs. According to the Kyoto Encyclopedia of Genes and Genomes (KEGG), the mechanistic target of rapamycin signaling pathway was reported to regulate adiposity [16]. Additionally, the regulation of lipolysis in adipocytes is related to adipose accumulation. Wnt family member 7B (WNT7B; GO:1901701~cellular response to oxygen-containing compound in JNP vs Yorkshire pigs, ssc05205: proteoglycans in cancer in JNP vs Yorkshire pigs) is required for epithelial progenitor growth and is involved in pancreatic development [17]. Estrogen receptor 1 (ESR1; GO:1901701~cellular response to oxygen-containing compound in JNP vs Yorkshire pigs, ssc04917: prolactin signaling in KEGG pathway of JNP vs Berkshire pigs) is reported to be associated with litter size in the Chinese-European pig line [18]. Transforming growth factor beta-2 (TGF-beta2; GO:0022008~neurogenesis in JNP vs Berkshire pigs) enhances connective tissue formation and wound strength in the guinea pig [19]. AKT serine/threonine kinase 3 (AKT3; ubiquitous in KEGG pathway of JNP vs Berkshire pigs) is related to reduced proliferation and facilitated differentiation of myoblasts in skeletal muscle development, whereas phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) induces multipotency [20] (Tables 3, 4).

High F st genes' description
TJP1 encodes a member of the membrane-associated guanylate kinase family of proteins, and it acts as a tight junction adaptor protein. Tight junctions play a role in regulating the movement of ions and macromolecules between endothelial and epithelial cells. ABAT is responsible for the catabolism of gamma-aminobutyric acid, an inhibitory neurotransmitter in the central nervous system. MMP25 is involved in the breakdown of the extracellular matrix in normal physiological processes, such as embryonic development, tissue remodeling, and reproduction, as well as in disease processes, such as arthritis and metastasis. MSRB1 is a member of the MSRB family of repair enzymes, which protect proteins from oxidative stress by catalyzing the reduction of methionine-R-sulfoxides to methionines. This protein is highly expressed in the liver and kidney, and it also has the highest methionine-R-sulfoxide reductase activity. OGN encodes a protein that induces ectopic bone formation and may regulate osteoblast differentiation. High expression of the protein may be associated with elevated heart left ventricular mass. Plexins are transmembrane receptors for semaphorins, a large family of proteins that regulate axon guidance, cell motility and migration, and the immune response. Modification of proteins with ubiquitin is an important cellular mechanism for targeting abnormal or short-lived proteins for degradation, and the protein encoded by UBE3B is involved in ubiquitination. IFI44 and IFI44L are interferon induced protein 44 and its paralog. Diseases associated with DDX59 include orofaciodigital syndrome V and orofaciodigital syndrome. The GO annotations related to this gene include nucleic acid binding and helicase activity. The cathepsin O enzyme is involved in cellular protein degradation and turnover. The PTGER3 protein is a member of the G-protein coupled receptor family. This protein is one of four receptors identified for prostaglandin E2, and this receptor may have many biological functions, which involve digestion, the nervous system, uterine contraction activities and kidney reabsorption (www.genecards.org).

Comparison with a previous study
Kim et al [5] used XP-EHH and XP-CLR to identify the selection signatures of the KNBP, including the JNP. Several genes overlapped between our study and that of Kim et al: hydroxysteroid 17-beta dehydrogenase 12 (HSD17B12), CUB and sushi multiple domains 3, BTB domain containing 11 (BTBD11), ATP binding cassette subfamily B member 11, transforming growth factor beta 2 (TGFB2), otoancorin, solute carrier family 4 member 10, thyroid hormone receptor beta, and WNT7B. The HSD17B12 gene product converts estrone into estradiol in ovarian tissue and is involved in fatty acid elongation. The unique meat quality of the JNP may therefore be related to fatty acids. Notably, BTBD11 was identified in all four cases: JNP vs Berkshire (with XP-EHH and XP-CLR of Kim et al [5]) and JNP vs Yorkshire (with XP-EHH and XP-CLR of Kim et al [5]).

CONCLUSION
First, we examined the JNP's genetic diversity using linkage disequilibrium decays. The JNP's genetic diversity was lower than that of Berkshire and Yorkshire pigs. Second, we compared the selection signatures of JNPs to those of Berkshire and Yorkshire pigs and performed GO analysis. Some genes such as ESR1, TGFB2, and AKT3 were related to growth, litter size and reproduction. We expect that these genes and GO genes contribute to the quality of JNP's pork.

CONFLICT OF INTEREST
We certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript. Lee SC is an employee of Cronex Inc.