Genetic diversity and divergence among Korean cattle breeds assessed using a BovineHD single-nucleotide polymorphism chip

Objective In Korea, there are three main cattle breeds, which are distinguished by coat color: Brown Hanwoo (BH), Brindle Hanwoo (BRH), and Jeju Black (JB). In this study, we sought to compare the genetic diversity and divergence among there Korean cattle breeds using a BovineHD chip genotyping array. Methods Sample data were collected from 168 cattle in three populations of BH (48 cattle), BRH (96 cattle), and JB (24 cattle). The single-nucleotide polymorphism (SNP) genotyping was performed using the Illumina BovineHD SNP 777K Bead chip. Results Heterozygosity, used as a measure of within-breed genetic diversity, was higher in BH (0.293) and BRH (0.296) than in JB (0.266). Linkage disequilibrium decay was more rapid in BH and BRH than in JB, reaching an average r2 value of 0.2 before 26 kb in BH and BRH, whereas the corresponding value was reached before 32 kb in JB. Intra-population, inter-population, and Fst analyses were used to identify candidate signatures of positive selection in the genome of a domestic Korean cattle population and 48, 11, and 11 loci were detected in the genomic region of the BRH breed, respectively. A Neighbor-Joining phylogenetic tree showed two main groups: a group comprising BH and BRH on one side and a group containing JB on the other. The runs of homozygosity analysis between Korean breeds indicated that the BRH and JB breeds have high inbreeding within breeds compared with BH. An analysis of differentiation based on a high-density SNP chip showed differences between Korean cattle breeds and the closeness of breeds corresponding to the geographic regions where they are evolving. Conclusion Our results indicate that although the Korean cattle breeds have common features, they also show reliable breed diversity.


INTRODUCTION
Recent developments in genomics technology have enabled the analysis of genome-wide genetic structures [1]. The availability of single-nucleotide polymorphism (SNP) chips for massive genotyping has proven to be useful in genetically characterizing populations of animals and in assessing their degree of divergence [2]. Using these chips, it has become possible to study parameters such as linkage disequilibrium (LD) patterns, genetic diversity, and genome-wide association. In this regard, whole-genome SNP arrays are becoming widely used to study genetic diversity and are now considered among the tools routinely applied in animal breeding [3][4][5]. Studies of genetic relationships between cattle breeds provide useful information on livestock breeding programs and useful gene resource maintaining [6][7][8]. Breeds with a unique evolutionary history could potentially have a value not only in the maintenance of genetic diversity at the species level [9], but also for selection strategies aimed at genetic improvement through the application of genetics technologies [10].
In Korea, there are three main cattle breeds, which are distinguished by coat color: Brown Hanwoo (BH), Brindle Hanwoo (BRH), and Jeju Black (JB) [11]. The BH breed was established by a breeding and selection program as a major breeding stock. In Korea, there are over a million BH, whereas population sizes of the other two breeds number only a few thousand. Through strong artificial selection, the BH breed has become mainly specialized for meat traits [12]. Since the 1930s, when the Han woo breeding program began, the BH breed has been intensively selected based on carcass traits, such as carcass weight, eye muscle area, and intramuscular fat, through progeny tests. Hanwoo progeny tests have facilitated annual genetic gains in carcass traits that have increased dramatically in BH populations, and may have affected genomic regions associated with these carcass traits in this breed [13]. Understanding the genetic mechanisms that lead to phenotypic changes requires identification of the genomic regions that have been under long-term artificial selection. Strong artificial selection increases the frequency of favorable alleles at the loci that affect meat quality traits in meat production breeds. In this process, a small region of the genome surrounding mutations is also selected, resulting in a small genomic region that shows reduced variation [14]. This region of reduced variation is referred to as a selection signature, which is identified by nucleotide distributions around favorable mutations that differ statistically from those expected purely by chance [15]. Such selection signatures have revealed many genes that have been important in cattle selection.
The aim of the present study was to compare the genetic diversity and divergence of Korean cattle breeds using a Bo-vineHD chip. Understanding the diversity in Korean cattle populations will enable researchers to develop better strategies for the breeding and conservation of these cattle.

Animals and genotype assays
Sample data were collected from 168 cattle in three populations of BH (48 cattle), BRH (96 cattle), and JB (24 cattle).
Blood of BH was collected by Animal Genetic Resources Research Center of National Institute of Animal Science. Blood of BRH was sampled from two local institutes (Gangwon Provincial Livestock Research Center and Chungbuk Veterinary Service Center). And, blood of JB was collected from Jeju special Self-Governing Provincial Livestock Institute. Blood samples of these three breeds were randomly collected, while avoiding parent-offspring or sib pairs where possible according to pedigree information of each institute. Genomic DNA for genotyping assays was extracted from a blood sample. The quantity and quality of the extracted DNA were evaluated using a spectrophotometer (NanoDrop 1000; Thermo Scientific, Waltham, MA, USA). The SNP genotyping was performed using the Illumina BovineHD SNP 777K Bead chip (Illumina, San Diego, CA, USA) [16]. This protocol was approved by the Committee on the Ethics of Animal Experiments of the National Institute of Animal Science.

Quality control of the genotypes
The initial analysis of the images and the genotypes were carried out using GenomeStudio V2011.1 software (Illumina, USA). We excluded 3,290 SNPs because their genomic position was unknown and genotype clusters were not separated clearly in intensity data plotting. The SNPs with a call rate of less than 98%, and SNPs on the X chromosomes were also excluded from the analysis. A total of 735,182 SNPs were included in further statistical analysis. For selection signature discovery, autosome SNPs (n = 570,371) with a minor allele frequency (MAF) ≥0.05 were filtered out from the initial dataset. The LD coefficients (r 2 ) of all pairs of SNPs were calculated using SVS8 software (Golden Helix SNP and Variation Suite; Bozeman, MT, USA), and then we excluded SNPs with an r 2 value >0.5. A total of 226,694 SNPs were used for LD decay analysis. The final number of SNPs and individuals included in three Korean cattle breeds is shown in Table 1.

Selection signature analysis
The cross population extended haplotype homozygosity (Rsb) analyses were conducted among BH, BRH, and JB using the rehh package [17] for R software. Candidate SNPs were defined as passing the Bonferroni correction threshold of p-value 8.77×10 -8 . Integrated haplotype score (iHS) analyses were con- ducted on BH, BRH, and JB respectively. This statistic was calculated for SNPs that passed the quality control criteria and exhibited a MAF of at least 0.05, since the algorithm of iHS has a limited power to calculate the statistic for fixed SNPs. Candidate regions were defined as in Rsb. As a prerequisite to the Rsb and iHS analyses, Beagle 3.3 [18] was used to phase the genotyped SNPs into the corresponding haplotypes. The fixation index (Fst) was estimated on the basis of the Wright F statistic [19] with use of SVS8. Candidate SNPs were defined as passing the Fst threshold 0.03.

Population structure and genetic diversity analysis
The Fst were analyzed for the estimation of genetic distance and genomic relationship between samples using SVS8 software. The neighbor program of PHYLIP (phylogeny inference package) Version 3.695 (http://evolution.genetics.washington. edu/phylip.html) was used to construct a phylogenetic tree from Fst values.

Linkage disequilibrium analysis
In the present study, we used the squared correlation coefficient between two loci (r 2 ) as a measure of LD. This value denotes the ability of alleles at one marker to predict the alleles at a second marker [20,21]. For pairwise comparisons of all SNPs separated by a maximum distance of 100 Kb, the r 2 value based on haplotype frequencies estimated via the expectation-maximization algorithm was predicted using SVS8 software. Subsequently, we analyzed the decay of LD between SNP pairs. For the purpose of graphical display, the average r 2 values of each 2-kb distance were plotted. Effective population size (Ne) of each breed was estimated through SNPbased LD analysis with SNeP [22].

Runs of homozygosity
Runs of homozygosity (ROH) were defined in each of BH, BRH, and JB using a sliding window approach of 50 SNPs in SVS8, as previously described for cattle by Purfield et al [23]. The ROH were estimated for each individual separately. Each ROH was categorized based on their physical length into 0.5 to 1 Mb, 1 to <5 Mb, 5 to <10 Mb, 10 to <15 Mb, 15 to <20 Mb and 20 Mb. For each of the aforementioned ROH length categories, the mean sum of ROH per population was calculated by summing all ROH per animal in that category and aver-aging this per breed population.

Genetic diversity
Using the Illumina BovineHD SNP 777K Bead chip, 99.4% of markers were genotyped in 95% of the samples in the three breeds, which indicates the suitability of the chip for genotyping the breeds studied (  (Table 1). A phylogenetic tree was constructed from combining estimates of Fst values across overall SNPs ( Figure 1). The tree shows two main groups. One of these groups includes the BH and BRH breeds, which are both raised on the Korean mainland. The other group comprises the JB breed, which is farmed on Jeju Island. This tree reveals a significant correlation between genetic and geographical distances. Our results are consistent with the classification of Korean cattle based on various morphological features proposed by Han et al [24].

Selection signatures
We conducted a haplotype-based iHS analysis and Rsb analysis for selection signatures analysis. There were no significant loci in the genomic region for iHS analysis of BH and JB. Therefore, in order to clarify the genetic characteristics of BRH, we analyzed the selection signatures using Fst of single SNPs. The SNP criteria showing significant difference between two populations were considered to be a significant SNP when it had a Bonferroni correction p-value threshold (8.77×10 -8 ) of iHS and Rsb analyses, and Fst threshold level is ≥0.25. A total of 184, 86 and 403 autosomal loci were potentially subjected to selection in BRH. Among these, 48, 11, and 11 loci were located in genomic regions. The list of genes is summarized in Table 2. There were many significant iHS across a broad range of chromosomes: four on BTA2, one on BTA4, three on BTA5, Figure 1. Neighbor-joining phylogenetic tree inferred using Fst values for the three cattle breeds. Scale bar indicates the distance (Fst value). The phylogenetic tree showed two main groups. two on BTA6, four on BTA7, three on BTA8, four on BTA10, one on BTA11, two on BTA12, three on BTA13, one on BTA18, four on BTA19, two on BTA20, 11 on BTA22, and three on BTA22. Furthermore, the Rsb were discovered at significant loci on five chromosomes (one on BTA10, three on BTA19, one on BTA23, one on BTA26, and three on BTA28). Fst analysis identified significant loci on one chromosome (11 on BTA18).

Linkage disequilibrium decay
The results of LD decay analysis up to 100 kb are shown in Figure 2. The analysis of decay corresponding to the HD chip shows that for the three breeds, r 2 levels start at 0.568, 0.551, and 0.557 for BH, BRH, and JB, respectively, when using the 2 kb bin of SNPs. The decay of LD was more rapid in BH and BRH than in JB, reaching an average r 2 value of 0.2 before 26 kb in BH and BRH, whereas the corresponding value was reached at 32 kb in JB. After these declines, the decay in each breed was considerably less steep, and at a distance of 100 kb, BH, and BRH reached an r 2 value of 0.067 and JB reached an r 2 value of 0.116.

Effective population size
Effective population size is strongly associated with genetic variability and adaptation. To understand the size and structure of these cattle populations, effective population size was calculated using LD estimates (r 2 ). The three cattle breeds showed a declining trend in their effective population size ( Figure 3). Among the breeds, BH and BRH had a high Ne of 260 and 202, respectively, until 13 generations ago, whereas JB had the lowest Ne of 55. This is, however, considerably higher than that reported by Sharma et al [25], who reported the Ne in BH and BRH to be 83 and 59 until 13 generations ago, whereas the Ne of JB was 67. The trends in LD and Ne are the same as those reported by Sharma et al [25]; however, the difference in estimated number could be attributed to differences in the number of samples used in the studies and the number of SNPs (50K vs 500K).

Runs of homozygosity
The ROH have different lengths and frequency depending on breeds. ROH were identified in all animals. Analysis of the distribution of ROH according to their size ( Figure 4) showed that, for BH, ROH shorter than 5 Mb predominated. In contrast, a bimodal distribution with extremely long (>20 Mb) and relatively short (<5 Mb) ROH was found in the BRH and JB breeds. The mean ROH length was 23.

DISCUSSION
The BH breed has been subjected to long-term artificial selection as part of a national breeding program aimed at improving meat quality and quantity. The JB and BRH breeds have both been excluded from this national breeding program, and consequently quality traits in these breeds have not been artificially selected. This has resulted in decreases in the population sizes of these breeds due to a lack of interest within the cattle industry. The JB and BRH varieties show some level of breed divergence from the BH breed, although this is distinctly less than between other well-characterized cattle breeds. From a purely genetic perspective, there is limited value in managing these populations independently; however, given their high social value in Korea, a separate breeding program aimed at maximizing diversity and improving fitness is warranted [26].  The Illumina BovineHD chip was designed using a large taurine reference population, and the results of the present study are consistent with those previously reported by O'Brien [27]. In addition, the overall call rate of our 168 animals showed the high performance of genotyping (>99%), indicating the strategic SNP selection of the HD chip for Korean cattle analysis. Evidence for positive selection was obtained by calculating the value of iHS for a population [28], and Rsb [29], and Fst [19] between populations. Ancestral alleles were derived from BovineHD genotypes determined in a previous study [30]. We used three different selection signature discovery tools (iHS, Rsb, and Fst) to increase the power of detecting genomic signatures of positive selection. However, we failed to find overlapped loci among iHS, Rsb, and Fst analysis. Moreover, there were no significant loci in the genomic regions in the iHS analysis of BH and JB, and Rsb and Fst analyses of cross population BH-JB and BRH-JB pairs. However, a total of 184, 86, and 403 autosomal loci are potentially subjected to selection in BRH. Among these 48, 11, and 11 loci were located in genomic regions. There were many significant iHS across a broad range of chromosomes. In contrast, specific loci on Chr 10,18,19,23,26,27, and 28 were discovered by Rsb and Fst analyses. We can assume that many of these genes are involved in functions that are associated with phenotypic changes in BRH. For example, we identified genes in cattle quantitative trait loci regions for feed efficiency and feeding behavior traits (aquaporin 4 [31]), eating behavior trait (calcium dependent secretion activator [32]), osteochondrosis (polypeptide N-acetylgalactosaminyltransferase 13 [33]), skin color trait (leucyl and cystinyl aminopeptidase [34]), pigmentation trait (melanocortin 1 receptor [35]), muscle trait (oxidative stress responsive 1 [36]), and hematological trait (tripartite motif containing 26 [37]).
JB showed lower heterozygosity and slower LD decay than BH and BRH. These differences could be attributed to population history events, including selection pressure, effective population sizes, and admixture with wild-type ancestors [25]. In contrast, BH and BRH showed lower levels of r 2 compared with JB, which might be a reflection of the larger recent effective population sizes and relatively lower levels of inbreeding in the former two breeds. The patterns of LD and Ne are considerably affected by population historic events. In this study, we observed a sharp decline in the effective population size of all the cattle breeds. The sharp decline was observed at ~50 generations ago. This was the time of the formation of the current breeds, when selection and development of breeding programs had just begun. In the genetic structure of modernday cattle, LD and Ne reflect various historic events and extensive artificial and natural selection.
The sum of an individual' s ROH coverage was used to infer the inbreeding level of an individual [23] Consanguinity may be indicated from the presence of long ROH; the longer such segments are, the more likely that recent inbreeding occurred within a pedigree [38]. The JB breed shows the highest inbreeding and the BRH breed shows a higher degree of inbreeding compared with the BH breed. The low genetic diversity in the JB and BRH breeds would inflate homozygosity and estimates of autozygosity in these populations. ROH were also mapped using their genetic positions and the abundance of ROH in different length classes was used to qualitatively evaluate the historical demography of each of the breeds.
To our Knowledge this study is the first to investigate the genome of a BRH cattle population for signatures of positive selection using high density genome-wide SNPs. Here, we have shown that a coat color-related gene (MC1-R) region is one of the selection signatures. The BH and BRH breeds showed similar overall genetic characteristics. Historically, these two breeds have been crossed without distinction on the mainland for long periods and share many genes. In particular, when BRH cattle are crossed with each other, cattle of brown, brindle, and black color all appear in the F 1 generation. The BRH breed is thus considered to be the prototype of Korean cattle [39]. They have their own characteristics, but share many features. Compared with the BH and BRH breeds, the JB breed, which has been maintained on Jeju Island isolated from the mainland, shows a certain degree of genetic difference. Seo et al [40] reported that JB cattle form a population separate from the BH, BRH, and Korean black breeds. In addition, Yoon et al [11] concluded that the JB breed is genetically more distantly related to the BH breed than to the BRH breed. According to the theory that the cattle of Northeast Asia moved inland after they became domesticated, it can be assumed that the BH, BRH, and Korean black breeds originated in Korea, and that some Korean black cattle were subsequently transported to Jeju Island, where they have evolved as a separate population [41].
An analysis of differentiation based on high-density SNP chip analysis showed the various differences between Korean cattle breeds and the closeness of breeds corresponding to the geographic region in which they are evolving. Further studies will be necessary in order to determine the association between these genetic variations and various phenotypic traits.

CONFLICT OF INTEREST
We certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript. Cheong HS and Shin HD are employees of SNP Genetics, Inc.