Identification of Superior Single Nucleotide Polymorphisms (SNP) Combinations Related to Economic Traits by Genotype Matrix Mapping (GMM) in Hanwoo (Korean Cattle)

It is important to identify genetic interactions related to human diseases or animal traits. Many linear statistical models have been reported, but they did not consider genetic interactions. Genotype matrix mapping (GMM) has been developed to identify genetic interactions. This study uses the GMM method to detect superior SNP combinations of the CCDC158 gene that influence average daily gain, marbling score, cold carcass weight and longissimus muscle dorsi area traits in Hanwoo. We evaluated the statistical significance of the major SNP combinations selected by implementing the permutation test of the F-measure. The g.34425+102A>T (AA), g.8778G>A (GG) and g.4102+36T>G (GT) SNP combination produced higher performance for average daily gain, marbling score, cold carcass weight and longissimus muscle dorsi area than any single SNP. GMM is a fast and reliable method for multiple SNP analysis with potential application in marker-assisted selection. GMM may prospectively be used for genetic assessment of quantitative traits after further development.


INTRODUCTION
Genetic interactions are important factors in complex biological traits, because most biological phenotypes result from the complex interplay of many genes and environmental factors. Detection of genes responsible for economic traits in livestock has been widely practiced. Several methods have been proposed to analyze genetic interactions in beef cattle (Casas et al., 2006). A typical approach is to fit a multiple regression model, thus relating the trait values to marker genotypes. So far, identification of presumptive genetic interactions using multiple regression models has been demonstrated only with inter-crossed populations. The multifactor dimensionality reduction method is also a commonly used approach. It uses nonparametric methods and is most efficiently used with case-control data (Ritchie et al., 2001; Chung et al., 2006). Most traits of economic importance in livestock are multi-factorial, i.e., influenced by multiple genes and their interactions with environmental factors. Generally, models used to test the effects of genes on traits have been based on parametric methods, such as general linear models or the animal model (Henderson, 1976). However, model building may be cumbersome, and over-parameterization problems can arise when multiple factors, e.g., multiple genes and their interaction effects, are taken into account (Ritchie et al., 2001). These complex interactions make it difficult to identify individual genetic effects and genetic interactions. A new genetic interaction approach, termed genotype matrix mapping (GMM), has been introduced to reveal genetic interactions and interaction-interaction relationships in complex traits, in family data and in various genetic backgrounds (Sachiko et al., 2007).
In this study, we utilize the GMM method to detect superior SNP combinations that influence economic traits of Hanwoo. The CCDC158 gene is significantly associated with growth and carcass traits (Lee et al., 2008; 2009; 2010). Thus, we try to identify interaction effects of SNPs of the CCDC158 gene on four economic traits: average daily gain (ADG), marbling score (MS), cold carcass weight (CWT) and longissimus muscle dorsi area (LMA).

MATERIALS AND METHODS

Animals and phenotypes
The Hanwoo population (n = 476) was reared under the progeny-testing program of the National Livestock Research Institute (NLRI) of Korea. The pedigree records of the 476 steers, which were produced from 50 sires, were collected by the Korean Animal Improvement Association (Seoul, Korea). Steers were fed under the tightly controlled conditions of the feeding program in the Daekwanryeong and Namwon branches. The animals were born between the spring of 1998 and the autumn of 2002. All steers were slaughtered between the spring of 2002 and the autumn of 2004. They were castrated at six months of age, and four animals were raised per pen (4 m×8 m). They were fed concentrates consisting of 15% crude protein (CP)/71% total digestible nutrients (TDN) for a period of 60 to 90 days after six months of age; 15% CP/71% TDN for a period of 180 days; and 13% CP/72% TDN for a period of 90 to 120 days of self-feeding. Roughage was offered ad libitum, and steers had free access to fresh water throughout the entire period. Average daily gain was calculated as the difference between the 6-month and 24-month weights divided by 18 (i.e., the difference between 6 and 24 months). Cold carcass weight was measured following a 24-h chill. The means and standard deviations of average daily gain, cold carcass weight, longissimus muscle dorsi area and marbling score were 0.75±0.089 kg, 316.75±34.459 kg, 75.30±8.114 cm² and 5.61±4.176, respectively.

SNP genotyping
Genomic DNA was extracted from white blood cells using the phenol-chloroform method (Sambrook et al., 2001). In this study, we genotyped 19 polymorphic SNPs of the coiled-coil domain containing 158 gene (CCDC158; Gene ID 534614), as described by Lee et al. (2010). Primers for amplification and extension were designed for single-base extension (SBE) genotyping of the polymorphic sites (Vreeland et al., 2002). Primer extension reactions were conducted using the SNaPshot ddNTP Primer Extension Kit (Applied Biosystems, Foster City, CA, USA). To clean up the primer extension reaction, one unit of shrimp alkaline phosphatase (SAP) was added to the reaction mixture, which was incubated for 1 h at 37°C, followed by 15 min at 72°C for enzyme inactivation. DNA samples containing extension products and Genescan 120 LIZ size standard solution were added to HiDi formamide (Applied Biosystems, Foster City, CA, USA) in accordance with the manufacturer's recommendations. The mixture was incubated for 5 min at 95°C, followed by 5 min on ice, after which electrophoresis was conducted using the ABI PRISM 3130XL Genetic Analyzer. Results were analyzed using GeneMapper v4.0 software (Applied Biosystems, Foster City, CA, USA).

Genotype matrix mapping
In GMM, each marker is given a matrix in which each of the alleles observed for that marker in the tested population is represented by intersecting columns and rows. Genetic interactions are estimated and compared via virtual networks generated among the locus matrices. Any type of population, including unrelated individuals, family data and mapping populations for linkage analysis, can be used for GMM, provided that there is no population structure within the tested data set. The number of alleles in the population should be determined for every marker before analysis.
The GMM procedure (Sachiko et al., 2007) is as follows. First, a list of locus combinations is constructed. Each locus combination in the list is depicted on the genotype matrix of the marker.
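The construction of the locus-combination list can be illustrated with a minimal sketch. It assumes a simple data layout (a dictionary mapping each locus name to the genotype calls of all animals) and an exhaustive enumeration up to a fixed number of loci; the function names, the `max_loci` parameter and the toy data are hypothetical and are not taken from the GMM software.

```python
from itertools import combinations, product


def genotype_patterns(genotypes, max_loci=3):
    """Yield candidate patterns as tuples of (locus, genotype) pairs.

    `genotypes` maps a locus name to the list of genotype calls of all
    animals at that locus (assumed layout, for illustration only).
    """
    loci = sorted(genotypes)
    for k in range(1, max_loci + 1):
        for locus_set in combinations(loci, k):
            # Genotypes actually observed at each locus in the set.
            observed = [sorted(set(genotypes[locus])) for locus in locus_set]
            for calls in product(*observed):
                yield tuple(zip(locus_set, calls))


def carriers(genotypes, pattern):
    """Return the indices of animals matching every (locus, genotype) pair."""
    n_animals = len(next(iter(genotypes.values())))
    return [
        i for i in range(n_animals)
        if all(genotypes[locus][i] == call for locus, call in pattern)
    ]


# Toy data: four animals typed at two hypothetical loci.
toy = {
    "snpA": ["AA", "AT", "AA", "TT"],
    "snpB": ["GG", "GG", "GA", "GG"],
}
patterns = list(genotype_patterns(toy, max_loci=2))
```

Each pattern defines the subclass of animals carrying it (via `carriers`), which is the input needed to score the pattern against a phenotype.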
We used the F-measure to evaluate the significance of the locus combinations. The total set $S$, which consists of $N$ individuals, is divided into two non-overlapping subclasses, $S_0$ and $S_1$, based on their marker genotypes. Here, $S_1$ is defined as the samples that have the specific genotype pattern to be evaluated, and $S_0$ as the samples that do not.
Next, the mean square among classes (MSA) is calculated as follows:

$$\mathrm{MSA} = |S_0|(\mu_0 - \mu)^2 + |S_1|(\mu_1 - \mu)^2$$

Here, $\mu$, $\mu_0$, and $\mu_1$ are the means of the phenotype values in $S$, $S_0$, and $S_1$, respectively. $|X|$ is the number of individuals in $X$.
The mean square within each class (MSW) is defined as follows:

$$\mathrm{MSW} = \frac{\sum_{S_i \in S_0}(P_i - \mu_0)^2 + \sum_{S_i \in S_1}(P_i - \mu_1)^2}{|S| - 2}$$

Here, $P_i$ is the phenotype value of the $i$-th individual $S_i$. The F-measure is obtained by dividing MSA by MSW:

$$F = \frac{\mathrm{MSA}}{\mathrm{MSW}}$$

$F$ indicates the bias of the distribution of phenotype values in the two subclasses. If the distribution of the phenotype differs between the two subclasses, one can conclude that the condition used for sample division (i.e., the pattern of marker genotypes) is associated with the phenotype of interest. In such cases, the F-measure value is large.
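As an illustration of the F-measure, the following minimal sketch computes it as a standard two-class analysis of variance (one degree of freedom among classes, $|S| - 2$ within classes). The function name, the boolean membership list and the toy values are assumptions made for this example only.

```python
from statistics import fmean


def f_measure(phenotypes, in_s1):
    """F-measure for one genotype pattern.

    phenotypes: phenotype values P_i of all individuals in S.
    in_s1: booleans, True if the individual carries the pattern (S1).
    """
    s1 = [p for p, flag in zip(phenotypes, in_s1) if flag]
    s0 = [p for p, flag in zip(phenotypes, in_s1) if not flag]
    # An empty subclass or fewer than 3 animals gives no usable contrast.
    if not s0 or not s1 or len(phenotypes) < 3:
        return 0.0
    mu, mu0, mu1 = fmean(phenotypes), fmean(s0), fmean(s1)
    # Mean square among classes (two classes, so one degree of freedom).
    msa = len(s0) * (mu0 - mu) ** 2 + len(s1) * (mu1 - mu) ** 2
    # Mean square within classes (|S| - 2 degrees of freedom).
    ssw = sum((p - mu0) ** 2 for p in s0) + sum((p - mu1) ** 2 for p in s1)
    msw = ssw / (len(phenotypes) - 2)
    return msa / msw if msw > 0 else float("inf")


# Toy example: the last three animals carry the pattern.
values = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
flags = [False, False, False, True, True, True]
print(f_measure(values, flags))
```

For the toy values, the subclass means are 2 and 6 around an overall mean of 4, so MSA is 24, MSW is 1, and the F-measure is 24.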
Superior locus combinations with large F-measure values are searched incrementally. During the search, the maximum F-measure obtained so far is kept as the current F, and it is updated whenever a combination with a higher F-measure is found. After the F-measures are obtained from the phenotypes, a critical value of F is needed to determine statistical significance. The empirical 100(1-P) percentile obtained from 10,000 repetitions of the permutation process was used as the estimated critical value at the whole-genome significance level P. Thus, a permutation test was performed to determine the empirical significance of the F-measures obtained by the GMM method. Using this procedure, we obtained the major combinations with the top 15 F-measures.
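The permutation step can be sketched as follows. This is a minimal illustration, not the GMM implementation: it assumes that the phenotype values are shuffled across animals while genotypes stay fixed, and that the whole-genome critical value is taken from the distribution of the maximum F over all candidate patterns in each permutation. All function and parameter names are hypothetical, and `f_measure` is the two-class F described above.

```python
import random
from statistics import fmean


def f_measure(phenotypes, in_s1):
    """Two-class F-measure (MSA / MSW) for one genotype pattern."""
    s1 = [p for p, flag in zip(phenotypes, in_s1) if flag]
    s0 = [p for p, flag in zip(phenotypes, in_s1) if not flag]
    if not s0 or not s1 or len(phenotypes) < 3:
        return 0.0
    mu, mu0, mu1 = fmean(phenotypes), fmean(s0), fmean(s1)
    msa = len(s0) * (mu0 - mu) ** 2 + len(s1) * (mu1 - mu) ** 2
    ssw = sum((p - mu0) ** 2 for p in s0) + sum((p - mu1) ** 2 for p in s1)
    msw = ssw / (len(phenotypes) - 2)
    return msa / msw if msw > 0 else float("inf")


def permutation_critical_value(phenotypes, pattern_flags, p_level=0.05,
                               n_perm=10_000, seed=0):
    """Empirical 100(1-P) percentile of the maximum permuted F-measure.

    pattern_flags: one boolean membership list per candidate pattern.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_f = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype link
        max_f.append(max(f_measure(shuffled, flags) for flags in pattern_flags))
    max_f.sort()
    index = min(n_perm - 1, int((1 - p_level) * n_perm))
    return max_f[index]


def permutation_p_value(phenotypes, flags, n_perm=10_000, seed=0):
    """Empirical p-value of the observed F-measure of a single pattern."""
    rng = random.Random(seed)
    observed = f_measure(phenotypes, flags)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_measure(shuffled, flags) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

The add-one correction in `permutation_p_value` is a common convention and an assumption here; with 10,000 repetitions, as used in this study, the smallest attainable p-value under that convention is about 0.0001.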
Associations between individual SNPs and ADG, CWT, LMA and MS were determined with a mixed effect model in the SPSS statistics v19.1 package, treating "sire" as a random effect and including "age" at slaughter as a covariate. We first used a single SNP model. Then, the genotype combination effects were tested in the mixed effect model. In addition, the statistical significance of the superior SNP combinations was evaluated by the permutation test (Good et al., 2000), because GMM does not provide theoretical significance levels (critical values or p-values) for its F-measures. Ten thousand repetitions of the permutation process were used to obtain the critical value. We examined the significance levels (p-values) of the major SNP combinations selected by implementing the permutation test of the F-measure. These had not been obtained by Sachiko et al. (2007).

RESULTS
The five major SNP combination groups for the four economic traits are presented in Table 5, with the least squares means, standard errors (SE) and statistical significance (p-values) of the SNP genotypes for ADG, CWT, LMA and MS. These superior SNP combinations for the combined economic traits were applied to the large-scale Hanwoo population. The five superior SNP combinations selected for each economic trait were tested for statistical significance (Table 6). Table 5 shows the tests of the influence of these SNP combinations on the economic traits of beef. All five selected SNP combinations were statistically significant for ADG and CWT (p<0.01), but three genotype combinations were either not significant or negatively significant for MS: g.66995-169insdelC (CC), g.32330-48A>G (AA), g.8778G>A (GG); g.66995-169insdelC (CC), g.34425+102A>T (AA), g.8778G>A (GG); and g.66995-169insdelC (CC), g.34425+102A>T (AA), g.32330-48A>G (AA). Their t statistics (p-values) were -2.30 (0.022), -1.29 (0.198) and -1.74 (0.084), respectively. In contrast, the g.34425+102A>T (AA), g.8778G>A (GG) and g.4102+36T>G (GT) SNP genotype combination was highly significant (p<0.001). Table 6 presents the least squares means and standard errors of the g.34425+102A>T (AA), g.8778G>A (GG) and g.4102+36T>G (GT) SNP combination, which also gave the best t-test and permutation test results among the SNP combinations. For this combination, the single and combination effects were obtained by applying the GMM method to ADG, CWT, LMA and MS. The SNP genotype combination produced superior performance for the four economic traits compared to the overall means of the Hanwoo population. In Table 6, the mean±SE of the AAGGGT genotype group was 0.83±0.013 for ADG, 348.72±0.367 for CWT, 77.78±0.514 for LMA and 7.35±0.526 for MS. These values are superior to the means and standard deviations of the 476 samples (see the materials and methods section), which were 0.75±0.089 kg, 316.75±34.459 kg, 75.30±8.114 cm² and 5.61±4.176 for ADG, CWT, LMA and MS, respectively. The statistical tests were also significant (p<0.001). In addition, the three-SNP combination genotype group showed higher performance than any single SNP genotype in Table 6. Therefore, we concluded that the g.34425+102A>T (AA), g.8778G>A (GG) and g.4102+36T>G (GT) SNP combination was the best SNP combination for the complex economic traits of beef.

DISCUSSION
A novel method, GMM, was applied to reveal interactions between various SNP genotypes in complex traits. This study used the bootstrap sampling method (Efron et al., 1993) and generated 4,190 animals. The top 15 SNP combinations were selected by the GMM method for ADG, CWT, LMA and MS. The permutation test was added to this GMM method to assess statistical significance (Tables 1-4).
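The bootstrap step is not described in detail here, so the following is only a minimal sketch under a stated assumption: whole animal records (genotypes together with phenotypes) are resampled with replacement from the 476 genotyped steers until the target size of 4,190 is reached. The function name and the record layout are hypothetical.

```python
import random


def bootstrap_sample(records, size, seed=0):
    """Draw `size` records with replacement from the observed animals."""
    rng = random.Random(seed)
    return rng.choices(records, k=size)


# Toy records standing in for the 476 genotyped steers (hypothetical layout).
observed = [{"animal": i, "genotype": "AA" if i % 2 else "AT"} for i in range(476)]
resampled = bootstrap_sample(observed, size=4190)
```

Resampling whole records keeps each animal's genotype paired with its own phenotype, so the genotype-phenotype associations present in the original data are carried over into the enlarged data set.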
The five superior SNP combinations common to the complex economic traits of beef were selected, and statistical tests were conducted (Table 5). In addition, we found that the g.34425+102A>T (AA), g.8778G>A (GG) and g.4102+36T>G (GT) SNP combination within the CCDC158 gene had the best common effect on ADG, CWT, LMA and MS, and it was screened in a large-scale Hanwoo population (n = 4,190). The other four SNP combinations in Table 5 were not suitable for selection on the complex economic traits. The average cold carcass weight for the GG genotype of the g.8778G>A SNP was 319.88 kg, as described by Lee et al. (2010); in this study, however, the average for the g.34425+102A>T (AA), g.8778G>A (GG) and g.4102+36T>G (GT) genotype combination was 348.72 kg. This was 28.84 kg higher than that of the GG genotype of the g.8778G>A SNP. The SNP combination produced much higher performance for the ADG, CWT, LMA and MS traits than any single SNP (Table 6). Therefore, we suggest that the g.34425+102A>T (AA), g.8778G>A (GG) and g.4102+36T>G (GT) SNP combination is the best SNP combination for growth and carcass traits in Hanwoo. GMM is a fast and reliable method for multiple SNP analysis with potential application in marker-assisted selection. However, the effective data size is an essential issue that must be evaluated. One of the most probable reasons for insufficient output from the GMM analysis is the limited size of the data set. Increasing the number of individuals for data acquisition would improve the accuracy of the entire analysis. GMM may prospectively be used for genetic assessment of quantitative traits after further development.

Table 1. Significant SNP combinations of top 15 F-measures, least square means and standard errors from average daily gain using GMM in Hanwoo

Table 2. Significant SNP combinations of top 15 F-measures, least square means and standard errors from carcass cold weight using GMM in Hanwoo

Table 3. Significant SNP combinations of top 15 F-measures, least square means and standard errors from longissimus muscle dorsi area using GMM in Hanwoo

Table 4. Significant SNP combinations of top 15 F-measures, least square means and standard errors from marbling score using GMM in Hanwoo

Table 5. Level of t- and permutation test, p value, least square means and standard errors of 5 SNP combinations for 4 economic traits in Hanwoo

Table 6. Least square means and standard errors (SE), single and combination effects of the 3 superior SNPs for average daily gain and carcass traits in the large-scale Hanwoo population
1 ADG = Average daily gain, CWT = Carcass cold weight, LMA = Longissimus muscle dorsi area, MS = Marbling score.
2 Number of animals.
3 Permutation p value.
*** p<0.001.