Complex Segregation Analysis of Total Milk Yield in Churra Dairy Ewes

The mode of inheritance of total milk yield and its genetic parameters were investigated in Churra dairy sheep through segregation analyses using a Markov chain Monte Carlo (MCMC) method. The data consisted of 7,126 lactations from 5,154 ewes, collected between 1999 and 2002 in 15 Spanish Churra dairy flocks. A postulated major gene was assumed to be additive, and uniform priors were used for the variance components. Based on 50,000 Gibbs samples from ten replicate chains of 100,000 cycles, the estimated marginal posterior means±posterior standard deviations of the variance components of milk yield were 23.17±18.42, 65.20±25.05, 120.40±42.12 and 420.83±40.26 for the major gene variance ($\sigma_G^2$), polygenic variance ($\sigma_u^2$), permanent environmental variance ($\sigma_{pe}^2$) and error variance ($\sigma_e^2$), respectively. The postulated major locus was not significant: the 95% highest posterior density regions (HPDs95%) of most major gene parameters, and in particular of the major gene variance, included 0. The HPDs95% of the estimated transmission probabilities overlapped. These results indicated that segregation of a major gene was unlikely and that the mode of inheritance of total milk yield in Churra dairy sheep is purely polygenic. From the same Gibbs samples, the estimated polygenic heritability and repeatability were $h^2$ = 0.20±0.05 and r = 0.34±0.06, respectively. (Key Words: Segregation Analysis, Bayesian Method, Major Gene, Genetic Parameters, Milk Yield, Dairy Sheep)


INTRODUCTION
A genetic improvement program for milk yield and composition in dairy sheep is an important component in the development of a viable industry. To remain competitive, the industry needs to increase its productivity, and a genetic program is still the best alternative for improving the biological efficiency of producing milk for cheese.
Marker-assisted selection (MAS) has been shown to increase genetic improvement by reducing the generation interval or by increasing the selection intensity in an "outbred" population (Schrooten et al., 2005). Before applying MAS, or starting to search for quantitative trait loci (QTL), identifying a major gene through statistical segregation analysis of phenotypic data alone would be informative and useful, and is inexpensive compared with MAS. The existence of major genes has therefore been investigated in several studies in livestock species: by Janss et al. (1995) for various traits of Dutch Meishan crossbreds; by Ilahi (1999) and Ilahi et al. (2000) for milking speed in dairy goats; by Pan et al. (2001) for somatic cell scores in dairy cattle; by Hagger et al. (2004) for selection response in laying hens; and by Ilahi and Kadarmideen (2004) for milk flow in dairy cattle.
Segregation analysis in pedigreed animal populations is intractable by analytical approaches because of the many (inbreeding) loops and the family sizes, which make it infeasible to sum and integrate genotypes and polygenic effects out of the likelihood or posterior density. This problem has been made tractable by the development of Gibbs sampling, a Markov chain Monte Carlo (MCMC) methodology (Guo and Thompson, 1992), and by its applications to livestock populations by Sorensen et al. (1994), Janss et al. (1995), Janss et al. (1997) and Ilahi and Kadarmideen (2004).
The aim of this study was to investigate whether a segregating major gene affects the total milk yield trait, and to estimate its genetic parameters in Churra dairy sheep using Bayesian segregation analysis methodology.


Data
The Churra sheep is an autochthonous breed raised in Castile and Leon, in north-western Spain. It is a hardy dairy breed, well suited to the continental climate of Castile and Leon, with long, severe winters, very short springs, and hot, dry summers.
The data available for this study consisted of 7,126 records of total milk yield (mean±standard deviation: 94.1±35.6 kg) from 5,154 ewes, collected between 1999 and 2002 in 15 Spanish Churra dairy flocks. All available pedigree information was included in the analyses. The pedigree thus comprised 8,053 animals, with 448 different sires.

Statistical model
To analyze the presence of a major gene for milk yield in Churra dairy sheep, the following mixed inheritance model was used:

$$y = X\beta + Zu + Q\,pe + ZWm + e$$

where y is the vector of observations; β is a vector of non-genetic fixed effects (flock-year-season, age at first lambing, lactation number (parity) and type of lambing); u is a random vector of individual polygenic effects; pe is a random vector of permanent environmental effects; W is a design matrix that contains the genotype of each individual (i.e., AA, AB, BB); m is the vector of genotype means (i.e., -a, 0, a); e is a random vector of residual effects; and X, Z and Q are incidence matrices relating the observations to their respective effects. In the term modelling the single gene, both W and m are unknown and have to be estimated from the data by segregation analysis.
The number of levels of all effects included in the model and the number of animals in the pedigree are summarized in Table 1.
The major gene was modelled as an additive autosomal biallelic (A and B) locus with Mendelian transmission probabilities. Allele A is defined to decrease the phenotypic value and allele B (the favourable allele) to increase it. The alleles A and B have frequencies p and q = 1-p, where p is the estimated frequency of allele A in the founder population, in which Hardy-Weinberg equilibrium was assumed. Three genotypes, AA, AB (or BA) and BB, can be encountered, with genotype means m = (-a, 0, a), where a is the additive major gene effect.
The polygenic effects were assumed to be distributed as u ~ N(0, A$\sigma_u^2$), where A is the numerator relationship matrix computed from the full pedigree.
The permanent environmental effects were assumed to be distributed as pe ~ N(0, I$\sigma_{pe}^2$), and the residual effects as e ~ N(0, I$\sigma_e^2$), where $\sigma_u^2$, $\sigma_{pe}^2$ and $\sigma_e^2$ are the polygenic, permanent environmental and residual variances, respectively. The variance attributable to the major gene ($\sigma_G^2$) was calculated from the allele frequencies and the additive effect; for an additive biallelic locus in Hardy-Weinberg equilibrium this is $\sigma_G^2 = 2pq\,a^2$. Uniform prior distributions were assumed in the range (-∞, +∞) for non-genetic effects and effects at the major locus, in the range (0, +∞) for variance components, and in the range [0, 1] for allele frequencies (Janss et al., 1995).
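The major gene variance for an additive biallelic locus in Hardy-Weinberg equilibrium can be checked numerically. The following is a minimal sketch, not the iBay implementation: it computes the variance of the genotype means (-a, 0, a) weighted by the Hardy-Weinberg genotype frequencies and compares it with the closed form 2pq·a². The function names and the value a = 5.0 are illustrative assumptions; p = 0.56 is the allele frequency reported in the Results.

```python
def hwe_genotype_frequencies(p):
    """Frequencies of (AA, AB, BB) for A-allele frequency p under Hardy-Weinberg equilibrium."""
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)


def major_gene_variance(p, a):
    """Variance of the genotype means (-a, 0, a), weighted by the HWE genotype frequencies."""
    freqs = hwe_genotype_frequencies(p)
    values = (-a, 0.0, a)
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))


p = 0.56   # allele frequency of A reported in the text
a = 5.0    # arbitrary illustrative additive effect (assumption, not from the study)

direct = major_gene_variance(p, a)
closed_form = 2.0 * p * (1.0 - p) * a ** 2
print(direct, closed_form)
```

The two numbers agree up to floating-point rounding, which is the algebraic identity a²(p² + q²) - a²(q - p)² = 2pq·a².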
A Gibbs sampling algorithm with blocked sampling of genotypes W was used for inference in the mixed inheritance model, implemented in the 'iBay' software package version 1.46 (Janss, 2008).
Ten replicate Markov chain Monte Carlo (MCMC) chains of 100,000 Gibbs cycles each were run. With a spacing of 20 cycles, each chain yielded 5,000 Gibbs samples, for 50,000 samples in total. A burn-in period of 20,000 cycles was used to allow the Gibbs chains to reach equilibrium.
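The sample counts follow from simple thinning arithmetic, sketched below. The sketch assumes that the 20,000 burn-in cycles were run in addition to the 100,000 cycles per chain; if the burn-in were instead taken out of the 100,000 cycles, only 4,000 samples per chain would remain. The function names are illustrative and are not part of iBay.

```python
def retained_samples(cycles_per_chain, spacing, n_chains):
    """Number of thinned samples kept per chain and over all chains."""
    per_chain = cycles_per_chain // spacing
    return per_chain, per_chain * n_chains


def thin(chain, spacing):
    """Keep every `spacing`-th draw of a chain given as a list of sampled values."""
    return chain[spacing - 1::spacing]


per_chain, total = retained_samples(cycles_per_chain=100_000, spacing=20, n_chains=10)
print(per_chain, total)

# Toy chain of cycle indices 1..100, thinned with the same spacing of 20.
print(thin(list(range(1, 101)), 20))
```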
From the mixed general model, marginal posterior densities of the following parameters were estimated directly in each Gibbs cycle: the variance components $\sigma_G^2$, $\sigma_u^2$, $\sigma_{pe}^2$ and $\sigma_e^2$, the additive effect at the major gene a, the allele frequency p, and the Mendelian transmission probabilities. Heritabilities and repeatabilities were calculated from the variance components for polygenes and major genes, following Janss (2008).
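The exact heritability and repeatability formulas are not reproduced in this text, so the following sketch rests on an assumption: it uses the common polygenic definitions h² = σ²u / (σ²u + σ²pe + σ²e) and r = (σ²u + σ²pe) / (σ²u + σ²pe + σ²e), which may differ from those of Janss (2008). It illustrates how such ratios are usually summarised in a Gibbs analysis: the ratio is computed in every retained sample, and the posterior mean and standard deviation are then taken over the samples. The draws below are synthetic values for illustration only, not results from the study.

```python
from statistics import mean, stdev


def polygenic_ratios(var_u, var_pe, var_e):
    """Heritability and repeatability under the assumed polygenic definitions."""
    total = var_u + var_pe + var_e
    return var_u / total, (var_u + var_pe) / total


# Synthetic draws of (var_u, var_pe, var_e); NOT data from the study.
draws = [
    (60.0, 110.0, 430.0),
    (75.0, 125.0, 410.0),
    (50.0, 130.0, 420.0),
    (80.0, 115.0, 405.0),
]

# Compute the ratios per draw, then summarise across draws.
h2_draws, r_draws = zip(*(polygenic_ratios(*d) for d in draws))
h2_mean, h2_sd = mean(h2_draws), stdev(h2_draws)
r_mean, r_sd = mean(r_draws), stdev(r_draws)
print(h2_mean, h2_sd, r_mean, r_sd)
```

Because a ratio is a non-linear function of the variance components, its posterior mean generally differs from the ratio of the posterior means of those components.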

RESULTS AND DISCUSSION
In a preliminary analysis based on the same statistical model as the Bayesian segregation analysis described above, but without the major gene term, the variance components of total milk yield in Churra dairy sheep under a polygenic model were estimated by Bayesian analysis, also using the 'iBay' software package version 1.46 (Janss, 2008). These estimates were based on 50,000 Gibbs samples from 10 replicate chains of 100,000 cycles. The effective sample size was 33,602 Gibbs samples. The estimates are presented in Table 2.
Marginal posterior means and standard deviations of the parameter estimates for total milk yield in Churra dairy sheep, obtained by Bayesian segregation analysis implemented by Gibbs sampling, are presented in Tables 3 and 4. These estimates were based on 50,000 Gibbs samples from 10 replicate chains. The effective sample size was 27,319 Gibbs samples. Marginal posterior distributions of all variance components of total milk yield are shown in Figure 1.
Following Box and Tiao (1973), highest posterior density (HPD) regions were obtained for all model parameters from a non-parametric density estimate computed with the averaged shifted histogram technique (Scott, 1992). Each region was constructed as the smallest possible region containing the stated share of the posterior probability for the sampled parameter. In our analyses, the 95% highest posterior density regions (HPDs95%) of the additive gene effect (a) and of the variance at the major locus ($\sigma_G^2$) included zero (Table 3). The allele frequencies in the analysed population were intermediate (p = 0.56 and q = 1-p = 0.44). The polygenic variance ($\sigma_u^2$ = 65.20±25.05) was significantly higher than the major gene variance ($\sigma_G^2$ = 23.17±18.42). Janss et al. (1997) and Miyake et al. (1999) also suggested using the magnitude of the major gene variance as an indicator of the existence of a segregating major gene. Following Elston (1980), evidence of a segregating major gene for a quantitative trait requires three conditions: statistical significance of the major gene component in the model, statistical differences among the transmission probabilities, and transmission probabilities that differ significantly from those of an environmental model.
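The idea of a highest posterior density region can be sketched directly on Gibbs samples. Note that this is a swapped-in technique: the study derived its regions from an averaged shifted histogram density estimate, whereas the sketch below uses the simpler empirical approach of taking the shortest interval that contains the requested fraction of sorted samples, which is appropriate for a unimodal posterior. The function name and the toy samples are assumptions for illustration.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples (unimodal case)."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of draws the interval must contain
    best_lo, best_hi = xs[0], xs[k - 1]
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


# Toy right-skewed sample (NOT data from the study): the shortest 50% interval
# sits in the dense lower part of the distribution.
toy = [0, 1, 2, 3, 10, 20, 30, 40]
print(hpd_interval(toy, mass=0.5))
```

For a parameter bounded at zero, such as a variance, an interval built from raw samples can only approach zero, so whether a region "includes zero" in practice depends on the density estimate used at the boundary.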
To check the statistical significance of the major gene component in the model, Janss (1998) proposed examining the 95% highest posterior density region (HPD95%) of the postulated major gene variance: if the region does not include zero, the postulated major gene is statistically significant; if it includes zero, it is not. Mendelian transmission (probabilities 1, 1/2, and 0; Elston and Stewart, 1971) was tested by checking whether the HPDs95% of the transmission probabilities overlapped. Mendelian transmission probabilities for the 3 genotypes were estimated (Table 4) as suggested by Elston and Stewart (1971). These probabilities were parameterised to indicate the Mendelian transmission of the favourable allele, with probabilities of B allele transmission of 1, 1/2, and 0 for genotypes BB, BA, and AA, respectively.
Table 4 shows that the three estimated posterior means of the Mendelian transmission probabilities were not significantly different, and that their HPDs95% for the three genotypes overlapped.
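The overlap check on the transmission probabilities amounts to a pairwise interval comparison, sketched below. The interval bounds are hypothetical placeholders chosen for illustration; they are not the values of Table 4, and the function names are assumptions.

```python
from itertools import combinations


def intervals_overlap(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def all_pairs_overlap(intervals):
    """True if every pair of intervals in the mapping overlaps."""
    return all(intervals_overlap(x, y) for x, y in combinations(intervals.values(), 2))


# Hypothetical 95% HPD regions for the probability of transmitting allele B
# (placeholders, NOT the values reported in Table 4).
hpd_by_genotype = {
    "BB": (0.30, 0.95),
    "BA": (0.20, 0.85),
    "AA": (0.10, 0.80),
}

print(all_pairs_overlap(hpd_by_genotype))
```

Under Mendelian segregation of a major gene, the three regions would be expected to concentrate around 1, 1/2 and 0 and to be clearly separated; fully overlapping regions give no support for distinct transmission probabilities.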
Based on these results obtained from Bayesian segregation analysis using only phenotypic data sets, we can conclude that the postulated major gene was not significant and the genetic determinism of total milk yield in Churra dairy sheep is polygenic.
Estimated heritability and repeatability of total milk yield obtained by Bayesian segregation analysis under the mixed inheritance model were lower than those obtained by Bayesian analysis under the polygenic model (Tables 2 and 3). Two explanations can be offered. First, a proportion of the total genetic variance is attributed to the major gene, but because this variance was not statistically significant it was not taken into account in the estimation of heritability and repeatability. Second, our results showed that the genetic determinism of total milk yield in Churra dairy sheep is polygenic, so polygenic models should be used to estimate its genetic parameters.

CONCLUSIONS
Bayesian segregation analysis based only on phenotypic data was applied to investigate the mode of inheritance and to estimate the genetic parameters of total milk yield in Churra dairy sheep, using standard statistical significance testing based on the 95% highest posterior density regions (HPDs95%) and checking Mendelian transmission probabilities. The results of this study showed no evidence of a major gene, and the mode of inheritance of total milk yield in the Churra breed is polygenic. The estimated polygenic heritability and repeatability of this trait in Churra are similar to those reported in the literature for other dairy sheep breeds.
Further genetic analyses that combine molecular information with phenotypic data to search for evidence of segregating major genes affecting total milk yield in Churra dairy sheep are warranted, since segregation analysis using molecular information is much more powerful and can also separate the effects of multiple major genes.

Figure 1. Marginal posterior distributions of error variance, polygenic variance, permanent environmental variance and major gene variance from a mixed general model of total milk yield in Churra dairy sheep.

Table 1. Description of data sets used for the analysis

Table 2. Estimated marginal posterior means and marginal posterior standard deviations for variance components from a polygenic model, and left and right 95% highest posterior density regions (HPDs95%), for total milk yield in Churra dairy sheep, based on 50,000 Gibbs samples

Table 3. Estimated marginal posterior means and marginal posterior standard deviations for fitted parameters from a mixed general model, and left and right 95% highest posterior density regions (HPDs95%), for total milk yield in Churra dairy sheep, based on 50,000 Gibbs samples from 10 replicate chains

Table 4. Estimated marginal posterior means and left and right 95% highest posterior density regions (HPDs95%) for transmission probabilities using a mixed general model for total milk yield in Churra dairy sheep, based on 50,000 Gibbs samples from 10 replicate chains
* Transmission probabilities are presented as the probabilities of inheriting a B allele from the BB, BA, and AA genotypes.