Potential of the Quantitative Trait Loci Mapping Using Crossbred Population

During crossbreeding, the linkage disequilibria between quantitative trait loci (QTL) and their linked markers decline gradually as the number of generations increases. To study the potential of QTL mapping in a crossbred population, we presented a mixed effect model that treated the mean allelic value of each founder population as a fixed effect and the allelic deviation from the population mean as a random effect. Fifty QTLs were assumed to affect the trait, and in the founder generation the population mean and variance were partitioned among the individual QTLs. Only additive effects were considered in the simulation. Six crossbreeding schemes (S1-S6) were studied. In schemes S2, S4 and S6, a selection index was used to evaluate the aggregate breeding value of each individual for two traits, and the individuals with the highest index were chosen as parents of the next generation. Random selection was used in schemes S1, S3 and S5. We assumed that a QTL explaining 40% of the genetic variance had previously been located within a 20 cM region by linkage analysis. The log likelihood ratio (log LR) was calculated to test for the presence of a QTL at each chromosomal position in every generation from the fourth to the twentieth. The log LR profiles and the number of replicates in which the highest log LR fell within regions of 5, 10 and 20 cM were compared among generations and schemes. Both the profiles and the number of correctly located peaks declined gradually with increasing generations in schemes S2, S4 and S6, but both increased in schemes S1, S3 and S5. We concluded that a crossbred population undergoing random selection is suitable for improving the resolution of QTL mapping. Even under index selection, enough variation remained within the crossbred population before the fourteenth generation to refine the location of the QTL within the chromosome region. (Asian-Aust. J. Anim. Sci. 2005. Vol 18, No. 12 : 1675-1683)


INTRODUCTION
The genetic variation of economically important traits in animals and plants is controlled at the molecular level by multiple genes, whose loci are termed quantitative trait loci (QTL). To find these genes, many kinds of experimental populations have been designed, such as the F2, the backcross (BC), the daughter design and the granddaughter design. At the same time, linear regression analysis, maximum-likelihood analysis, Bayesian analysis and residual maximum-likelihood analysis based on one or multiple markers (Haley and Knott, 1992; Zeng, 1993, 1994; Kao, 1999; Yi and Xu, 2001; Lee and Wu, 2003) were developed. With these methods, many QTLs affecting traits have been located (Zuo, 2004), but the resolution has not been sufficient to isolate the genes of interest or to select for them efficiently in breeding with the help of linked markers. Recently, the transmission disequilibrium test (TDT) (Spielman and Ewens, 1996), identical by descent (IBD) analysis (Pritchard et al., 1991; Riquet et al., 1999; Farnir et al., 2002) and linkage analysis (LA) combined with linkage disequilibrium mapping (LDM) (Meuwissen et al., 2000, 2002; Mott et al., 2002a, b) were proposed to improve the resolution of QTL mapping by exploiting the additional recombination information in deep pedigrees.
The method combining linkage analysis with linkage disequilibrium uses the recombination information between the QTL and markers in a multi-generation pedigree. As generations accumulate after the linkage disequilibrium arose, the chromosome region around the QTL that is identical by descent (IBD) shrinks gradually. By analyzing the recorded pedigree, a QTL affecting the trait can be located to a specific chromosome region by genome scanning. The linkage disequilibrium between the QTL and closely linked markers that built up over the unrecorded generations can then be used to narrow the region harbouring the QTL, by analyzing the IBD status of the chromosome region containing the QTL among individuals with similar phenotypes. Meuwissen and Goddard (2000) developed a multi-marker method to fine map a QTL when the linkage disequilibrium arose from a mutation at the QTL one hundred years ago. Other researchers studied experimentally designed populations in which the linkage disequilibrium arose from the mixture of lines or populations that differed greatly in phenotype, such as advanced intercross lines (AIL) (Darvasi and Soller, 1995) and recurrent selection and backcross or intercross lines (RSB, RSBI) (Hill, 1997, 1998; Luo, 2002; Lee, 2005).
In animal breeding practice, crossbreeding is an efficient scheme for improving the performance of a breed by crossing it with other complementary breeds. To speed up genetic improvement, high-performing individuals are selected as parents of the next generation using a selection index of two or more traits. This process consists of the cross of two or more breeds with different performance, followed by intercrossing within the closed crossbred population for many generations. It not only accumulates recombination between chromosomes, which narrows the IBD region containing the QTL, but also accelerates the increase in homozygosity and reduces the variation within the population, because individuals with high breeding value are selected. A large number of crossbred populations have undergone this process. Are such populations suitable for improving the resolution of QTL mapping? What resolution can be expected from QTL mapping in these populations? How does selection affect the power of QTL mapping in each generation? Many questions remain about QTL mapping in crossbred populations, but no reports have addressed them. In this study, we tried to answer some of these questions by simulation. To simulate the crossbred population, we established a mixed linear model in which the difference between breeds lies in the population mean, taking the mean of each founder population as a fixed effect and the variation within populations as a random effect. We assumed that the whole pedigree and the marker information of all individuals in the pedigree were available, and that the postulated QTL had previously been mapped to a 20 cM region by linkage analysis. From the second generation after the cross of the founder populations to the twentieth generation, we estimated the QTL location in each generation with the restricted maximum likelihood (REML) method.

The mixed effect model for population simulation
To simulate the cross of outbred populations, Yi and Xu (2001) developed a mixed effect model that treats the mean allelic value of each source population as a fixed effect and the allelic deviation from the population mean as a random effect, so that the total genetic variance can be partitioned into between- and within-population variances. In this study, we likewise took the mean allelic value of each founder population as a fixed effect and the allelic deviation from the population mean as a random effect. We extended the mixed effect model of Yi and Xu by partitioning the mean allelic value and variance of each founder population among the individual QTLs. Hayes and Goddard (2001) estimated the total number of QTLs affecting the variation of a trait and showed that the leading 17% and 35% of QTLs accounted for 90% of the total genetic variance in dairy cattle and pigs, respectively. We therefore assumed that there were a few QTLs with major effects and a large number of QTLs with minor effects.
In our model, the numbers of QTLs with major and minor effects were n and m, respectively. A cross of two populations was considered. For a given trait, the expectation and variance of the two founder populations were $b_1$, $\sigma_1^2$ and $b_2$, $\sigma_2^2$, respectively. The expectation and variance of the ith QTL were assumed to be $b_{1i}$ and $\sigma_{1i}^2$ in the first population and $b_{2i}$ and $\sigma_{2i}^2$ in the second population. Only the additive effect of each QTL was considered in this model. In this way, the expectation and variance of each founder population can be expressed as $b_1 = \sum_i 2b_{1i}$, $\sigma_1^2 = \sum_i 2\sigma_{1i}^2$, $b_2 = \sum_i 2b_{2i}$ and $\sigma_2^2 = \sum_i 2\sigma_{2i}^2$, where i = 1, 2, ..., n+m. Because animals are diploid organisms, there are two alleles at each QTL. The allelic value at the ith QTL was sampled from the normal distribution $v_{1i} \sim N(b_{1i}, \sigma_{1i}^2)$ in the first population and $v_{2i} \sim N(b_{2i}, \sigma_{2i}^2)$ in the second. Let the sizes of the two founder populations be $N_1$ and $N_2$. The phenotype of individual j in the first population can be described by the following linear model: $y_{1j} = \sum_{i=1}^{n+m} (v_{1ij1} + v_{1ij2}) + \varepsilon_{1j}$, where j = 1, 2, ..., $N_1$, $v_{1ij1}$ and $v_{1ij2}$ are the effects of the first and second alleles at the ith QTL, and $\varepsilon_{1j}$ is the residual error with a $N(0, \sigma_{1\varepsilon}^2)$ distribution. Similarly, the phenotype of individual j in the second population can be described as $y_{2j} = \sum_{i=1}^{n+m} (v_{2ij1} + v_{2ij2}) + \varepsilon_{2j}$, where j = 1, 2, ..., $N_2$.
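The founder-population model described above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that every QTL carries an equal share of the population mean and variance, and the function name `simulate_founders`, the genetic and residual variances (7.0 and 3.0, chosen to give a heritability of 0.7) and the random seed are assumptions rather than values taken from the tables.

```python
import random


def simulate_founders(qtl_means, qtl_sds, n_individuals, residual_sd, rng):
    """Sample two alleles per QTL for each individual and build phenotypes.

    qtl_means[i] and qtl_sds[i] are the allelic mean and standard deviation
    of the i-th QTL in one founder population.
    """
    population = []
    for _ in range(n_individuals):
        # Each diploid individual carries two alleles at every QTL.
        alleles = [(rng.gauss(b, s), rng.gauss(b, s))
                   for b, s in zip(qtl_means, qtl_sds)]
        # Additive model: the genetic value is the sum of all allelic values.
        genetic_value = sum(a1 + a2 for a1, a2 in alleles)
        phenotype = genetic_value + rng.gauss(0.0, residual_sd)
        population.append({"alleles": alleles, "phenotype": phenotype})
    return population


# Hypothetical settings: 50 QTLs sharing a population mean of 65 equally.
rng = random.Random(1)
n_qtl = 50
pop_mean, genetic_var, residual_var = 65.0, 7.0, 3.0  # variances are assumed
qtl_means = [pop_mean / (2 * n_qtl)] * n_qtl            # mean = sum of 2 * b_i
qtl_sds = [(genetic_var / (2 * n_qtl)) ** 0.5] * n_qtl  # var = sum of 2 * var_i
founders = simulate_founders(qtl_means, qtl_sds, 200, residual_var ** 0.5, rng)
mean_phenotype = sum(ind["phenotype"] for ind in founders) / len(founders)
```

With 200 individuals the sample mean of the simulated phenotypes should lie close to the assumed population mean of 65.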

The process of crossbreeding
Any individual of the first founder population was assumed to have an equal chance of mating with any individual of the second founder population. The mating of the two founder populations produced K offspring, which formed the F1 population. In a manner similar to Yi and Xu (2001), we denoted the effects of the paternal and maternal alleles at the ith QTL of individual k in the F1 population by $v^p_{F_1 ik}$ and $v^m_{F_1 ik}$. The phenotype of individual k in the F1 population was described as $y_{F_1 k} = \sum_{i=1}^{n+m} (v^p_{F_1 ik} + v^m_{F_1 ik}) + \varepsilon_{F_1 k}$, where $\varepsilon_{F_1 k}$ was the residual effect with a $N(0, \sigma_{F_1}^2)$ distribution. In crossbreeding practice, a selection index is used to evaluate the aggregate breeding value of an individual, with two or more traits of different heritability and economic weight considered simultaneously. In our simulation, the selection index was used to evaluate the aggregate breeding value of two traits for each individual from generation F3 onwards. The selection index I was a linear function of the phenotypes of the two traits, $I = w_1 h_1^2 (P_1 - \bar{P}_1) + w_2 h_2^2 (P_2 - \bar{P}_2)$, where $w_1$, $h_1^2$, $P_1$ and $\bar{P}_1$ were the economic weight, heritability, individual phenotypic value and population mean of the first trait, respectively, and $w_2$, $h_2^2$, $P_2$ and $\bar{P}_2$ were the corresponding parameters for the second trait. The $N_M$ males and the $N_F$ females with the highest selection index were selected as parents of the next generation. Random mating was then applied between the selected males and females. The crossbreeding process was continued for twenty generations.
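As a minimal sketch of the index selection step, the Python below ranks the individuals of one sex by a two-trait index and keeps the best ones. It assumes that the index weights each trait's deviation from the population mean by its economic weight times its heritability; the default weights (0.5, 0.3) and heritabilities (0.7, 0.15) are the values used for T1 and T2 in this study, while the function names and the four example animals are hypothetical.

```python
def selection_index(p1, p2, mean1, mean2,
                    w1=0.5, w2=0.3, h2_1=0.7, h2_2=0.15):
    """Two-trait linear index (assumed form): sum of w * h^2 * deviation."""
    return w1 * h2_1 * (p1 - mean1) + w2 * h2_2 * (p2 - mean2)


def select_parents(individuals, n_select):
    """Return the n_select individuals of one sex with the highest index.

    individuals is a list of (identifier, trait1, trait2) tuples.
    """
    mean1 = sum(ind[1] for ind in individuals) / len(individuals)
    mean2 = sum(ind[2] for ind in individuals) / len(individuals)
    ranked = sorted(
        individuals,
        key=lambda ind: selection_index(ind[1], ind[2], mean1, mean2),
        reverse=True,
    )
    return ranked[:n_select]


# Hypothetical males with phenotypes for T1 and T2.
males = [("a", 70.0, 10.0), ("b", 60.0, 12.0),
         ("c", 65.0, 8.0), ("d", 55.0, 14.0)]
selected_males = select_parents(males, 2)
```

The same function would be applied separately to males and females, with `n_select` set from the proportions of each sex retained in a given scheme.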

QTL mapping
We assumed that complete pedigree records were available from the founders to their offspring and that phenotypes and marker genotypes were available for every individual in the pedigree. From the pedigree, the origin of each marker allele in the offspring could be traced back to a founder individual. In the chromosome region of interest, only one major QTL was considered in the mixed model; the other QTLs in the genome were treated as part of the residual effect. The linear mixed model including the postulated QTL and the polygenic effect is $y = Xb + Zu + Z_1 v_1 + Z_2 v_2 + \varepsilon$, where y is the vector of records, X is the design matrix, b is a vector of fixed effects, Z is an incidence matrix relating records to individuals, u is a vector of polygenic effects, $Z_1$ and $Z_2$ are the incidence matrices relating each individual to its two alleles at the postulated QTL, $v_1$ and $v_2$ are the effects of the two alleles, and $\varepsilon$ is the vector of residuals.
The random effects u, $v_1$, $v_2$ and $\varepsilon$ are assumed to be uncorrelated and to follow multivariate normal distributions: $u \sim N(0, A\sigma_u^2)$, $v \sim N(0, G\sigma_v^2)$ and $\varepsilon \sim N(0, R\sigma_\varepsilon^2)$, where A is the known additive genetic relationship matrix, $\sigma_u^2$ is the polygenic variance, $G\sigma_v^2$ is the variance-covariance matrix of the two allelic effects at the postulated QTL, G is the IBD matrix between relatives for a marked QTL, $\sigma_v^2$ is the variance of the QTL alleles, R is a known diagonal matrix, and $\sigma_\varepsilon^2$ is the residual variance. Given complete pedigree and marker data, the computation of the IBD matrix of a marked QTL was discussed by Fernando (1989), Wang (1995) and Abdel-Azim (2001). We calculated the IBD matrix and its inverse with the method presented by Wang and Abdel-Azim.
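Because the random effects are taken to be mutually uncorrelated, the covariance matrix of the records follows directly from the model. In the sketch below, the stacked matrix $Z_v = [Z_1 \; Z_2]$, the stacked vector $v = (v_1', v_2')'$ and the symbol V are shorthand introduced here for illustration; they are not notation taken from the original text.

```latex
\begin{aligned}
y &= Xb + Zu + Z_v v + \varepsilon, \\
\operatorname{Var}(y) = V &= Z A Z' \sigma_u^2 + Z_v G Z_v' \sigma_v^2 + R \sigma_\varepsilon^2 .
\end{aligned}
```

Setting $\sigma_v^2 = 0$ removes the QTL term and gives the covariance matrix of the reduced model without a QTL.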
The restricted maximum likelihood method (Patterson and Thompson, 1971; Grignola et al., 1996a, b, 1997) was used to detect the presence of a QTL on the chromosome. Assuming multivariate normality, the restricted likelihood $L(\sigma_u^2, \sigma_v^2, \sigma_\varepsilon^2)$ under the mixed model was evaluated at $\hat{b}$, the generalized least-squares estimate of b. The restricted likelihood was maximized using the derivative-free (Simplex) restricted maximum likelihood algorithm (Graser et al., 1987). To test for the presence of a QTL at a particular chromosomal position, the log likelihood ratio (log LR) statistic was constructed by comparing this maximized value with $L(\sigma_u^2, \sigma_\varepsilon^2)$, the restricted likelihood value without the QTL. The log LR profiles were calculated in each generation. For each generation, the location with the highest log LR in the 20 cM region was counted over 30 replications.
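For reference, a commonly used form of the restricted log-likelihood for a linear mixed model, and of the corresponding test statistic, is sketched below. It is given as an assumption about the general form rather than as the exact expression used in this study: V denotes the covariance matrix of y implied by the model, $X'V^{-1}X$ is taken to be of full rank, and the factor of 2 in the test statistic is a convention adopted here.

```latex
\begin{aligned}
\ln L(\sigma_u^2, \sigma_v^2, \sigma_\varepsilon^2)
  &= -\tfrac{1}{2}\left[ \ln|V| + \ln|X'V^{-1}X|
     + (y - X\hat{b})' V^{-1} (y - X\hat{b}) \right] + \text{const}, \\
\hat{b} &= (X'V^{-1}X)^{-1} X'V^{-1} y, \\
\log LR &= 2\left[ \ln L(\hat\sigma_u^2, \hat\sigma_v^2, \hat\sigma_\varepsilon^2)
     - \ln L(\tilde\sigma_u^2, \tilde\sigma_\varepsilon^2) \right].
\end{aligned}
```

Here hats denote the estimates under the model with a QTL at the tested position and tildes the estimates under the model without a QTL, so the statistic is non-negative and is largest where the QTL model fits best.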

POPULATION SIMULATION
Following the investigation of Hayes and Goddard (2001), fifty QTLs were assumed to affect the variation of the quantitative trait, comprising 3 major-effect QTLs and 47 minor-effect QTLs distributed over four chromosomes. The major-effect QTLs of T1 were located on chromosomes 1 and 4 and those of T2 on chromosomes 2 and 3; the minor-effect QTLs of both traits were distributed randomly over the four chromosomes. The position of each of the three major-effect QTLs and its proportion of the population mean and genetic variance are shown in Table 2. All minor-effect QTLs had equal proportions of the population mean and genetic variance.
In Table 2, QTL1, QTL2 and QTL3 are the three major QTLs affecting the trait, PM is the proportion of the population mean, PV is the proportion of the genetic variance, Ch is the chromosome on which the QTL is located, and L is the location of the QTL from the beginning of the chromosome (cM).
The allelic value at each QTL was sampled for each individual in the two founder populations from the normal distributions $v_{1i} \sim N(b_{1i}, \sigma_{1i}^2)$ and $v_{2i} \sim N(b_{2i}, \sigma_{2i}^2)$ described above. Only the additive effects of alleles were considered in our model. Codominant markers were generated for all individuals. These markers were placed on the four chromosomes at intervals of 2 cM. The same allele symbols were used at each marker in the two founder populations. Four alleles, with frequencies that differed between the two populations, were assumed for each marker.
The crossbreeding process began with the cross of two outbred populations and continued with intercrossing among the selected offspring. To evaluate the breeding value of each individual, two or more traits are normally combined in a selection index. In this simulation, two traits, denoted T1 and T2, were included in the selection index. Their economic weights (EW) were 0.5 and 0.3 and their heritabilities were 0.7 and 0.15, respectively, in the selection index described above. In our mixed model for simulation, the difference between the two founder populations lay in the fixed effects of the QTL alleles. The population means of T1 were 65 and 45 in the first and second founder populations, denoted MF1 and MF2, and those of T2 were 8 and 14, respectively. The two populations had the same within-population variance for both traits. The details of the two traits are shown in Table 1.
The parents of the F1 were selected at random from the founder population, which included 200 individuals. These individuals were mated at random with individuals from the other founder population. Four hundred F1 offspring were generated with an equal ratio of males to females. The marker and QTL genotypes of the offspring were generated from the haplotypes of their parents. The recombination rate between loci was computed from their genetic distance with the Haldane mapping function, $r = 0.5(1 - e^{-2x})$, where x is the genetic distance in Morgans (Haldane, 1919). The F1 individuals were also selected at random as parents of the F2 generation. From generation F3 onwards, the type of selection (random or index) and the proportion of individuals selected as parents of the next generation were set according to the six designed schemes (S1-S6). In schemes S1, S3 and S5, random selection was applied, but the proportion of individuals selected differed among schemes. In schemes S2, S4 and S6, the selection index was used, and the parents of the next generation were the males and the females with the highest selection index. Mating between the selected parents was random in every generation for all schemes. The parameters of the breeding schemes are shown in Table 3. The resolution of QTL mapping was computed from the fourth to the twentieth generation in each of the six crossbreeding schemes initiated from two outbred populations.
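The way offspring haplotypes are generated from parental haplotypes can be illustrated with the Haldane mapping function. The Python sketch below is one simple way to do this under stated assumptions: loci are ordered along a single chromosome, crossovers between adjacent loci occur independently, and the function names, the marker positions and the allele labels "A" and "B" are hypothetical.

```python
import math
import random


def haldane_recombination(distance_cm):
    """Recombination rate from map distance (Haldane): r = 0.5 * (1 - e^(-2x))."""
    x = distance_cm / 100.0  # convert centimorgans to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * x))


def make_gamete(hap_a, hap_b, positions_cm, rng):
    """Form one gamete from the two haplotypes of a parent on one chromosome."""
    haplotypes = (hap_a, hap_b)
    current = rng.choice((0, 1))  # the starting strand is chosen at random
    gamete = [haplotypes[current][0]]
    for k in range(1, len(positions_cm)):
        r = haldane_recombination(positions_cm[k] - positions_cm[k - 1])
        if rng.random() < r:      # a recombination switches the strand
            current = 1 - current
        gamete.append(haplotypes[current][k])
    return gamete


# Hypothetical example: markers every 2 cM across a 20 cM region.
rng = random.Random(7)
positions = list(range(0, 22, 2))
gamete = make_gamete(["A"] * len(positions), ["B"] * len(positions),
                     positions, rng)
```

For adjacent markers 2 cM apart, the function gives a recombination rate of about 0.0196, so most gametes pass through such a short region without switching strands.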

RESULTS
To study the relationship of the population mean and variance with the QTLs, we designed two methods (M and P) for partitioning the population mean and variance among the QTLs. Method M assumed that all QTLs underlying the trait had equal proportions of the population mean and variance, whereas method P assumed that the QTLs had different proportions. We compared the changes in the population mean and variance when the trait was simulated with each method. All parameters were set as in Tables 1 and 2. Figure 1 presents the changes in the population mean (Figure 1(A)) and variance (Figure 1(B)) of T1 from the founder generation to the twentieth generation after the cross of the two outbred populations.
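The two partitioning methods can be sketched as follows. This Python example is an illustration under stated assumptions: method P is taken to give each major QTL a stated share and to split the remainder equally among the minor QTLs, the major shares 0.2 and 0.1 are hypothetical placeholders (only the 40% share is mentioned in the text), and the genetic variance of 7.0 and the function names are likewise assumed.

```python
def partition_equal(total, n_qtl):
    """Method M: every QTL receives the same share of the total."""
    return [total / n_qtl] * n_qtl


def partition_unequal(total, major_shares, n_minor):
    """Method P (assumed form): major QTLs take the stated shares and the
    remainder is divided equally among the minor QTLs."""
    remaining = 1.0 - sum(major_shares)
    if remaining < 0.0:
        raise ValueError("major shares must not sum to more than 1")
    shares = list(major_shares) + [remaining / n_minor] * n_minor
    return [total * share for share in shares]


# Fifty QTLs in total: 3 major and 47 minor.
variance_m = partition_equal(7.0, 50)                     # 7.0 is assumed
variance_p = partition_unequal(7.0, [0.4, 0.2, 0.1], 47)  # 0.2, 0.1 assumed
mean_m = partition_equal(65.0, 50)                        # T1 mean of MF1
```

Both methods return per-QTL values that add up to the same population total, which is why they are expected to agree in the founder generation and to differ only in how the variance changes afterwards.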
We assumed that a QTL explaining 40% of the genetic variance had previously been located in a 20 cM chromosome region with a conventional experimental F2 design. Because the analyses were computationally demanding, the presence of the QTL was tested only at every 2 cM in this region. The genetic distance between the marker and the postulated QTL was 1 cM. The profiles of the mean log LR over 30 replications are shown in Figure 2 for each generation. The number of replications, out of 30, in which the highest log LR (NHL) was located within regions of 5, 10 and 20 cM is compared in Figure 3.
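The NHL summary can be illustrated with a short Python sketch. It assumes that each region of 5, 10 or 20 cM is centred on the true QTL position, which is an assumption about how the regions were defined; the function names, the tested positions and the peak locations in the example are hypothetical.

```python
def peak_position(positions_cm, log_lr):
    """Return the tested position with the highest log LR in one replicate."""
    best = max(range(len(log_lr)), key=lambda k: log_lr[k])
    return positions_cm[best]


def count_peaks_within(peaks_cm, qtl_cm, window_cm):
    """Count replicates whose peak lies in a window centred on the QTL."""
    half = window_cm / 2.0
    return sum(1 for p in peaks_cm if abs(p - qtl_cm) <= half)


# Hypothetical peaks from five replicates, with the true QTL at 11 cM.
peaks = [9, 11, 13, 3, 19]
nhl = {window: count_peaks_within(peaks, 11, window) for window in (5, 10, 20)}
```

In the study itself the count would run over the 30 replicates of each generation and scheme, giving one NHL value per region size.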
Figures 2 and 3 show that the precision of QTL mapping changed from generation to generation and from scheme to scheme. In schemes S2, S4 and S6, the log LR profiles decreased gradually with increasing generations, and decreased faster under higher selection intensity, as shown in panels A, B and C of Figure 2. In the last several generations the profiles were very low, and it was difficult to determine the precise location of the QTL in A and B, although it remained clear in C. Panels A, B and C of Figure 3 show that the precision of QTL mapping did not decline quickly in the early generations. In Figure 3(A), the NHL did not decrease much from generation four to eleven for the 5, 10 and 20 cM regions. In Figure 3(B) and (C), the corresponding generations were four to thirteen and four to seventeen, respectively. In schemes S1, S3 and S5, in which individuals were selected at random as parents of the next generation, the log LR profiles and the precision of QTL mapping improved gradually with increasing generations. From generation four to twenty, the profiles of the mean log LR became progressively sharper, but the maximum log LR did not change much between generations. Across the crossbreeding schemes, the proportion of individuals selected affected the log LR profiles. In scheme S5, in which the proportions selected (PM = 20%, PF = 40%) were the highest among the random selection schemes, the QTL was narrowed to the smallest region. This is clearly seen in Figure 3.

The mixed model for simulation
The genetic models used in current QTL mapping methods treat the QTL either as a fixed effect with a finite number of alleles or as a random effect with an infinite number of alleles. The fixed model presumes that the linkage phase and the number of alleles at the postulated QTL in the parents are known and that the origin of the QTL alleles of each individual in the mapping population can be traced. QTL mapping methods based on the fixed model include linear regression on single or multiple linked markers (Haley and Knott, 1992) and maximum likelihood analysis for interval mapping or composite interval mapping (Zeng, 1993, 1994). The random effect model makes the simple assumption that the QTL effects are normally distributed in the parental populations, and has long been used by human geneticists to map QTL (Pritchard et al., 1991). Plant and animal geneticists have also recently used this model to partition the genetic variance of quantitative traits to specific marked chromosome regions in outbred populations (Fernando and Grossman, 1989; Grignola et al., 1996a, b, 1997). To study crosses of outbred populations, Yi and Xu (2001) developed a mixed model to map QTL in hybrid populations derived from crosses of two or more distinct outbred populations. In their model, the mean allelic value of each source population is treated as a fixed effect and the allelic deviation from the population mean as a random effect, so that the total genetic variance can be partitioned into between- and within-population variances. The phenotype of the mapping population can be sampled from the normal distribution, and the path of gene flow can be inferred by a recursive method.
We extended their method by partitioning the population mean and variance among the QTLs, under the assumption that the trait was affected by a fixed number of QTLs. With this method, more than one trait can be simulated flexibly for any number of generations. The results (Figure 1) showed no difference between methods M and P in the population mean they produced. For the population variance, there was likewise no difference between the two methods in the founder populations, but the variance changed more in the following generations under method P than under method M. With method P, the major QTLs had a larger effect on the phenotype of an individual. From these results, we concluded that both methods M and P were suitable for the simulation of crossbreeding.

Mapping QTL with crossbred population
To exploit the historically accumulated recombination between markers and QTL, Darvasi and Soller (1995) presented the AIL, and Hill (1997, 1998) and Luo (2002) proposed breeding schemes with RSB, to improve the resolution of QTL mapping. An advanced intercross line is produced by randomly and sequentially intercrossing a population that originated from a cross between two inbred lines or some variant thereof. The mapping resolution in an AIL improves because of the breakdown of linkage disequilibria between the QTL and their linked marker loci. Wright (1992) initially proposed repeated backcrossing and selection for isolating quantitative trait genes of large effect. Hill and Luo conducted a series of theoretical studies and simulations to demonstrate that the RSB scheme is efficient at accumulating a sufficient amount of recombination both between closely linked QTLs and between a QTL and nearby markers. However, AIL and RSB do not exist in natural populations and are not commonly used in the plant and animal breeding industry. Establishing such line-cross populations solely for QTL mapping is not economical. It is particularly difficult to develop such populations in animals, because animal breeding takes longer, managing large animal populations is costly, and animals tolerate inbreeding less well than plants.
Our study shows that a crossbred population can be used to improve the resolution of QTL mapping. In crossbreeding practice, a selection index is used to estimate the breeding value of individuals. This process accelerated the increase in homozygosity of the chromosome region carrying the major QTL and reduced the differences among individuals in the population. The increased homozygosity reduced the power of QTL mapping, especially in the schemes with high selection intensity. Nevertheless, phenotypic variance still existed within the population in the earlier generations. This variation can be used to improve the resolution of QTL mapping. In this study, the QTL accounted for 40% of the genetic variance and the heritability of the trait was 0.7, both of which are quite high. If the trait had a lower heritability and the QTL accounted for a smaller proportion of the genetic variance, more variation would remain within the population after the same number of generations of selection. A crossbred population undergoing random selection did not lose power for QTL mapping as the generations increased. Populations of this kind were similar to the AIL in their power for QTL mapping. Recently, Evans et al. (2003) and Nagamine et al. (2003) studied the variation within modern domestic animals and demonstrated that a considerable amount of the phenotypic variance observed in commercial populations can be explained by segregation at major QTLs that have not yet reached fixation under artificial selection. They also suggested that crossbred populations under artificial selection can be used to identify QTLs for production traits.
In this study, we assumed that the whole pedigree and the marker genotypes of the individuals in the pedigree were available from the founder generation onwards. In practice, the pedigree is often recorded in commercial lines and is easily available, but the marker information is difficult to obtain. Gene dropping, Gibbs sampling, Markov chain Monte Carlo and hidden Markov model methods have been used to infer the marker genotypes and linkage phases of founder individuals, after which the QTL variance-covariance matrix of the individuals in the mapping population can be computed, provided that the population has undergone random mating without any selection. In crossbred populations, selection gives greater weight to the markers closely linked to the major QTL in high-performing individuals, so not all genes have an equal chance of being passed to the offspring. Methods therefore need to be designed to infer the marker genotypes of the founders and to calculate the variance-covariance matrix under selection. To exploit the linkage disequilibria accumulated historically in crossbred populations, Riquet et al. (1999) presented a two-tiered approach to refine the map position of a QTL: they identified seven sires heterozygous ("Qq") at the QTL and then searched for a shared haplotype on a high-density map. Farnir et al. (2002) extended this method to exploit linkage and linkage disequilibrium simultaneously and to allow the inclusion of individuals with an ambiguous QTL genotype.

Figure 1. Changes in the population mean and variance of trait 1 (T1) from the founder generation to the twentieth generation under two methods of partitioning the population mean and variance among the QTLs. In method M the QTLs affecting the trait have equal proportions of the population mean and variance; in method P they have different proportions. (A) Change in the population mean from generation to generation. (B) Change in the population variance from generation to generation.

Figure 2. Profiles of the mean log LR over 30 replicated simulations for the crossbred populations from the fourth to the twentieth generation under random selection (RS) or index selection (IS), with the stated proportions of males and females (PM and PF) selected as parents of the next generation. (A) IS, PM = 5% and PF = 10%; (B) IS, PM = 10% and PF = 20%; (C) IS, PM = 20% and PF = 40%; (D) RS, PM = 5% and PF = 10%; (E) RS, PM = 10% and PF = 20%; (F) RS, PM = 20% and PF = 40%.

Figure 3. Number of replicates in which the highest log LR was located within chromosome regions of 20, 10 and 5 cM, over 30 replicated simulations, for the crossbred populations from the fourth to the twentieth generation under random selection (RS) or index selection (IS), with the stated proportions of males and females (PM and PF) selected as parents of the next generation. (A) IS, PM = 5% and PF = 10%; (B) IS, PM = 10% and PF = 20%; (C) IS, PM = 20% and PF = 40%; (D) RS, PM = 5% and PF = 10%; (E) RS, PM = 10% and PF = 20%; (F) RS, PM = 20% and PF = 40%.

Table 1. Heritability and economic weight of the two traits, number of QTLs affecting each trait, and their population mean and variance

Table 2. The location, allelic mean and variance of the major QTL of the two traits

Table 3. The parameters for the 6 simulated breeding schemes