Association of Marker Loci and QTL from Crosses of Inbred Parental Lines *

The objectives of this study were to examine, by simulation, problems with using F1 data; to study the association of marker loci and QTL from crosses of inbred parental lines; and to give a preliminary characterization of genetic superiority within inbred parental lines. The association between markers for QTL used as covariates and estimates of variance components due to effects of lines was investigated through computer simulation. The effects of the size of the population used to develop inbred lines and of the initial frequencies and magnitudes of QTL effects were also considered. Results show that estimates of variance components due to line effects are influenced by including marker information as covariates in the model for analysis. Estimates of line variance increased when marker information was added to the analysis because negative covariances existed between effects associated with the markers and the remaining effects associated with other loci. Nevertheless, the fit of the model, as indicated by the log likelihood, improved as more markers were added as covariates. Marker-assisted selection will be beneficial when markers account for otherwise unexplained genetic differences during selection. Markers can be used to identify QTLs affecting traits and to select for favorable QTL alleles. To use genetic markers efficiently, the location of markers in the genome must be identified. Estimates of variance due to effects of lines were compared with and without marker information used as covariates in the analysis. The estimates of line variance always increased when markers were included as covariates because a negative covariance existed. (Asian-Aust. J. Anim. Sci. 2005. Vol 18,


INTRODUCTION
Advances in molecular genetics could provide quantitative geneticists and animal breeders with better knowledge of the effects and locations of major loci, and an understanding of the actions and interactions that contribute to variation in quantitative traits.
Marker-QTL associations may be identified from crosses of inbred lines or in segregating populations. Using genetic markers in conjunction with phenotypic observations would provide more information on the genetic merit of an animal than phenotypic information alone. Many quantitative traits of economic importance are likely to be under the control of several genes, each with a relatively small effect. Most of the important traits may be controlled by more than one locus. Often, these traits are also influenced by the environment. Therefore, many traits exhibit quantitative (continuous) variation. Spelman and Garrick (1997) proposed that the majority of the genetic variance may be controlled by many genes with small effects even though there may be a few loci with large effects.
Numerous studies have shown that individual loci controlling quantitative traits can be detected through linkage to genetic markers. Several methods have been developed for detecting polymorphisms at the DNA level in the last decade (Beckmann and Soller, 1983; Soller and Beckmann, 1983; Kashi et al., 1990). Many studies have estimated the importance of marker-linked QTL effects on the variance of traits (Zhuchenko et al., 1979; Edwards et al., 1987; Stuber et al., 1987; Weller, 1987; Weller et al., 1988).
The objectives of this study were to examine, by simulation, problems with using F1 data; to study the association of marker loci and QTL from crosses of inbred parental lines; and to give a preliminary characterization of genetic superiority within inbred parental lines.

MATERIALS AND METHODS
Stochastic genetic models were simulated with a Fortran program, as described below.

Theory
Simulation of the stochastic genetic model was based on an imaginary genome. The genome was composed of a set of 60 diploid loci, with two alternate alleles (m and 0) at each major QTL and two alternate alleles (1 and 0) at each of the other loci. Each individual had one complete genome. An individual is defined by specifying the alleles that it carries at every locus in the genome. The genome was filled by randomly assigning alleles according to specific constraints. Markers were defined by specifying the loci that had alleles with additive genetic effects only. The sum of these allelic effects was equal to the genotypic value. The phenotypic value was generated by adding to the genotypic value an environmental effect chosen randomly from a normal distribution with mean (M) and variance (V).
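The genome model described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' Fortran program: the function names, the QTL positions, the equal allele frequency at non-QTL loci, and the environmental mean and standard deviation are all assumptions made here for the example.

```python
import random

N_LOCI = 60  # loci per chromosome, as stated in the text


def make_chromosome(qtl_effects, qtl_freq, rng):
    """Build one chromosome as a list of N_LOCI allele values.

    qtl_effects maps a locus position to the value m of the favorable
    QTL allele (positions and values here are hypothetical).
    Non-QTL loci carry allele 1 or 0 with equal probability (assumed).
    """
    chromosome = []
    for locus in range(N_LOCI):
        if locus in qtl_effects:
            # Major QTL: favorable allele m with probability qtl_freq, else 0.
            favorable = rng.random() < qtl_freq
            chromosome.append(qtl_effects[locus] if favorable else 0)
        else:
            chromosome.append(rng.randint(0, 1))
    return chromosome


def genotypic_value(chrom_a, chrom_b):
    """Additive model: the genotypic value is the sum of all allelic effects."""
    return sum(chrom_a) + sum(chrom_b)


def phenotypic_value(genotype, env_mean, env_sd, rng):
    """Add a normally distributed environmental effect to the genotypic value."""
    return genotype + rng.gauss(env_mean, env_sd)


rng = random.Random(1)
qtl_effects = {3: 50, 17: 55, 28: 60, 41: 65, 55: 70}  # hypothetical positions
a = make_chromosome(qtl_effects, 0.05, rng)
b = make_chromosome(qtl_effects, 0.05, rng)
g = genotypic_value(a, b)
p = phenotypic_value(g, 0.0, 10.0, rng)  # env_mean and env_sd are assumed
```

Keeping the two chromosomes as separate lists mirrors the text's chromosomes A and B and makes it straightforward to pass one chromosome from each parent to an offspring.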
The QTL effect was assumed to be known perfectly and to be completely linked to a marker. In practice, genetic markers may not be tightly linked to QTLs, and selection for the apparently favorable marker allele will not always result in selection of the favorable QTL allele. Because there was no recombination between marker and QTL, the marker-QTL association was the same for all families, whereas in reality it would be family specific.
The stochastic genetic model considered additive allelic effects only.

Simulation program
The simulation program was written in Fortran. The program was set up to simulate five generations of animals or plants and to generate 30 different populations in each group.
To obtain the initial population for the development of inbred lines, the base generation was created by randomly assigning allele values at each of the 60 loci on each of two chromosomes (A and B; 120 loci in total for one complete genome) for every individual in the base population: major QTLs received values that depended on their assigned positions, and other loci received values of 0 or 1. The following parameters were read from an external file to generate genotypes, including QTL, for the base population: size of the base population (3,000, 1,500 or 750); number of QTL in a population (1, 2, 5 or 10); positions of the QTL, which depended on the number of QTL in the population; magnitude of QTL values (5 to 50 as small effects, 30 to 75 as medium effects and 50 to 95 as large effects in Group 1; 5 to 23 as small effects, 30 to 48 as medium effects and 50 to 68 as large effects in Group 2); and the allelic frequency of the positive allele at each major QTL (0.05). For example, small QTL values ranged from 5 to 50 in steps of 5 in Group 1 and from 5 to 23 in steps of 2 in Group 2, and the number of values used depended on the number of QTL assigned; if 5 QTL were assigned, then 5, 10, 15, 20 and 25 were used in Group 1 and 5, 7, 9, 11 and 13 in Group 2 as magnitudes of QTL values, with randomly assigned QTL positions.
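The assignment of QTL magnitudes can be illustrated as follows. This is a sketch under the reading that each effect class starts at its lowest value and increases by 5 in Group 1 and by 2 in Group 2; the names `SMALLEST`, `STEP`, `qtl_magnitudes` and `qtl_positions` are hypothetical and are not taken from the original program.

```python
import random

# Lowest QTL value of each effect class and the per-group increment,
# as read from the ranges given in the text.
SMALLEST = {"small": 5, "medium": 30, "large": 50}
STEP = {1: 5, 2: 2}  # group number -> increment between successive QTL values


def qtl_magnitudes(n_qtl, effect_class, group):
    """Return the first n_qtl values of the series for this class and group."""
    start = SMALLEST[effect_class]
    step = STEP[group]
    return [start + step * i for i in range(n_qtl)]


def qtl_positions(n_qtl, rng, n_loci=60):
    """Draw n_qtl distinct random locus positions on a chromosome."""
    return sorted(rng.sample(range(n_loci), n_qtl))


print(qtl_magnitudes(5, "small", 1))   # [5, 10, 15, 20, 25]
print(qtl_magnitudes(5, "small", 2))   # [5, 7, 9, 11, 13]
print(qtl_positions(5, random.Random(7)))
```

With 10 QTL the series reach the upper bounds quoted in the text (for example, 95 for large effects in Group 1 and 68 in Group 2), which is a useful check on this reading of the parameter ranges.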
Next, the numbers of individuals to be used as selected parents during selection were assigned. The remaining generations were generated by inbreeding the selected parents of the previous generation. Parents were selected by choosing the superior individuals based on their own phenotypes.
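Selection on own phenotype amounts to truncation selection. A minimal sketch, with the hypothetical function name `select_parents` and made-up phenotypes:

```python
def select_parents(phenotypes, n_selected):
    """Return indices of the n_selected individuals with the highest phenotypes."""
    ranked = sorted(range(len(phenotypes)),
                    key=lambda i: phenotypes[i],
                    reverse=True)
    return ranked[:n_selected]


# Hypothetical phenotypes for four individuals; the best two are kept.
chosen = select_parents([3.0, 9.5, 1.2, 7.7], 2)
print(chosen)  # [1, 3]
```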
To identify markers in the selected population, a locus with a value greater than 1 was taken to carry a major favorable allele (major QTL) and its marker was denoted 2; these markers were later used as covariates in the analysis. A locus with a value of 1 or 0 was taken to carry an unfavorable allele and its marker was denoted 1.
The flowchart for the simulation program is shown in Figure 1.
Step 1: Read parameters. Initially, the following parameters were read from an external file by the simulation program to create genotypic and phenotypic values, including QTL, for the base population: size of the base population, number of loci per chromosome (fixed at 60), initial frequency of QTL in the base population, magnitude of QTL effects, positions of QTL, number of populations in each group (fixed at 30), and numbers of selected individuals in each generation. Two sets of parameters were needed because two groups, later used as sire and dam, were simulated in this study.
Step 2: Create genotypes including QTL for the base population. Two different groups were used as base parental lines and denoted G1 and G2, where G1 was later used as the sire group and G2 as the dam group. Each group was assumed to have a different genetic background, so the two distinct original groups were assumed to be random mating populations with many different loci, with the number of positive alleles at each major QTL depending on the allelic frequencies. The allelic frequency of the positive allele at each major QTL was 0.05.
As a general rule, the G1 group corresponds to the high group with respect to the trait of interest; that is, it has genotypic values larger than those of the G2, or low, group. Each group was composed of thirty populations (or lines). The thirty populations (or lines) within each group were started from base populations of three different sizes (3,000, 1,500 and 750 individuals) before selection. Five replicates of each combination of factors were generated for this study. For example, with an allelic frequency of 0.05 for the positive allele at each major QTL and 5 QTLs, 30 different populations (or lines), each with a base population of 3,000, were generated in Group 1 with QTL effects of 50, 55, 60, 65 and 70. The positions of the major QTLs differed among the 30 populations (or lines) in each group.
After the parameters were read by the Fortran program, genotypic and phenotypic values, including QTL values, were generated. The population structure for a base population of 3,000 is shown as an example in Figure 2.
Step 3: Identify markers in the selected population. After selection culminated in one individual from each base population, marker information was identified in the selected lines (or populations). The two alternate alleles at a major QTL (m, dependent on the magnitude of the QTL effect, and 0) and the two alternate alleles at other loci (1 and 0) were coded for the marker phenotypes as 2 for major QTL and 1 for other loci, respectively (Figure 3). These markers were used as covariates to estimate variance components and the amount of variance accounted for by marker information.
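The marker coding rule can be sketched as below. The sketch follows the rule stated earlier in the text (an allele value greater than 1 is coded 2, a value of 1 or 0 is coded 1); the function name `code_markers` and the example chromosome are assumptions for illustration.

```python
def code_markers(chromosome):
    """Code each locus as 2 (major favorable allele, value > 1) or 1 (otherwise)."""
    return [2 if allele > 1 else 1 for allele in chromosome]


# Hypothetical stretch of a selected line: QTL alleles 50 and 7 are retained.
line = [0, 1, 50, 0, 7]
print(code_markers(line))  # [1, 1, 2, 1, 2]
```

Because the smallest QTL value used in the study is 5, a threshold of 1 separates retained QTL alleles from all other allele values.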
Step 4: Generate F1 progeny. Offspring were generated from parents that were about 98% homozygous. Two different inbred parental lines were crossed to produce F1 progeny that had average values including environmental effects. The phenotypic value of each F1 individual was the sum of the genotypic value of the two parental lines and the associated environmental effect. The F1 generations were obtained by crossing six different sets of five lines from the two different groups. Each group was composed of thirty inbred lines, resulting in twenty-five single crosses per set of matings of five lines with five lines of the other group. Each final data set contained 150 F1 individuals. The genotype of each individual was determined by the pair of chromosomes, one from each parent. The mating scheme to generate F1 progeny from crosses of 30 inbred lines by 30 different inbred lines is shown in Figure 5. Data were analysed with the MTDFREML program (Boldman et al., 1995).
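The mating design can be checked with a short enumeration. This is a sketch only: it assumes that the six sets are formed from consecutive blocks of five lines in each group, which the text does not state, and `f1_crosses` is a hypothetical name.

```python
def f1_crosses(n_lines=30, set_size=5):
    """List (sire_line, dam_line) pairs for block-wise 5 x 5 crossing.

    Lines 0-4 of group 1 are crossed with lines 0-4 of group 2,
    lines 5-9 with lines 5-9, and so on (assumed grouping into sets).
    """
    crosses = []
    for start in range(0, n_lines, set_size):
        block = range(start, start + set_size)
        crosses.extend((sire, dam) for sire in block for dam in block)
    return crosses


crosses = f1_crosses()
print(len(crosses))  # 150 = 6 sets x 25 single crosses
```

The count of 150 matches the size of each final data set reported above, with one F1 record per single cross.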

Estimates of variance components
Estimates of variance components due to line effects were obtained with derivative-free restricted maximum likelihood (MTDFREML). This set of programs estimates variance and covariance components using an animal model and a derivative-free algorithm, and obtains solutions for fixed effects, breeding values and uncorrelated random effects, as well as sampling variances and expectations of solutions, based on Henderson's mixed model equations (Boldman et al., 1995).
Markers were used as fixed covariates when estimating variances due to effects of inbred lines. Effects of lines within each group were considered random effects, and variances were compared with and without marker information in the model as covariates. The numbers of markers used as covariates were 2, 10, 20, 30, 50 and 120 (all).
The use of markers as covariates is expected to reduce the unexplained variance. Selection of important markers should be beneficial. However, the use of redundant markers as covariates can lead to a loss of power to detect QTL (Jansen, 1994).
The program was restarted with the estimates at apparent convergence as initial values until a global minimum was found, i.e., until the -2 log likelihood did not change to the third decimal place after consecutive restarts.
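The restart rule can be sketched generically. The code below is not MTDFREML; `run_once` is a hypothetical callable standing in for one run of any estimation program, and the toy objective is invented purely to exercise the loop. Comparing values rounded to three decimals is one possible reading of "not changing to the third decimal".

```python
def restart_until_stable(run_once, start_values, decimals=3, max_restarts=50):
    """Restart an estimation run from its own converged values.

    run_once(values) must return (new_values, neg2_log_likelihood).
    Stops when -2 log likelihood is unchanged to `decimals` places
    between two consecutive restarts.
    """
    values, previous = start_values, None
    for _ in range(max_restarts):
        values, neg2ll = run_once(values)
        if previous is not None and round(neg2ll, decimals) == round(previous, decimals):
            return values, neg2ll
        previous = neg2ll
    raise RuntimeError("-2 log likelihood did not stabilise")


def toy_run(value, target=4.0):
    """Toy stand-in: move halfway to the optimum and report a fake -2 log L."""
    new_value = value + (target - value) / 2.0
    return new_value, 100.0 + (new_value - target) ** 2


estimate, neg2ll = restart_until_stable(toy_run, 0.0)
```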
Analyses for a single trait were based on the model y = Xβ + Zu + e, where y = vector of records; β = vector of fixed effects associated with records in y by X, with marker information, when included, as covariates; u = vector of random effects associated with records in y by Z, the random effects being effects of parental lines; and e = vector of random residual effects. E[y] = Xβ and
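The expression for the variance structure is cut off at this point. As a hedged sketch only, assuming the usual structure for a mixed model with uncorrelated random line effects for each group and uncorrelated residuals, it might be written as follows. The symbols u_1, u_2, Z_1, Z_2, \sigma^2_{l1}, \sigma^2_{l2} and \sigma^2_e are introduced here for illustration and are not taken from the source.

```latex
\operatorname{E}[\mathbf{y}] = \mathbf{X}\boldsymbol{\beta}, \qquad
\operatorname{Var}\begin{bmatrix}\mathbf{u}_1\\ \mathbf{u}_2\\ \mathbf{e}\end{bmatrix}
= \begin{bmatrix}
\mathbf{I}\sigma^2_{l1} & \mathbf{0} & \mathbf{0}\\
\mathbf{0} & \mathbf{I}\sigma^2_{l2} & \mathbf{0}\\
\mathbf{0} & \mathbf{0} & \mathbf{I}\sigma^2_{e}
\end{bmatrix},
\qquad
\operatorname{Var}[\mathbf{y}] =
\mathbf{Z}_1\mathbf{Z}_1'\sigma^2_{l1} + \mathbf{Z}_2\mathbf{Z}_2'\sigma^2_{l2} + \mathbf{I}\sigma^2_{e}.
```

Here u_1 and u_2 would hold the line effects of group 1 and group 2, with Z_1 and Z_2 the corresponding columns of Z.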

Negative covariances between effects associated with markers and other loci
To investigate the (co)variance between effects associated with the markers and effects of other loci associated with line effects, the following approach was used.
The genotypic values were composed of values associated with the markers and the remaining values associated with other loci:
Genotype for a cross = markers in Group 1 + other genotypes in Group 1 + markers in Group 2 + other genotypes in Group 2.
For the variance of genotypes in each line, denoted V(marker + other),
V(marker + other) = V(marker) + V(other) + 2COV(marker, other),
where V(marker + other) = variance of genotype for a cross in each line, V(marker) = variance of effects associated with markers, V(other) = variance of effects associated with other loci, and COV(marker, other) = covariance between effects associated with markers and effects associated with other loci. The marker values in each line were summed to calculate the variance of effects associated with the markers, and the remaining values in each line were summed to calculate the variance of effects associated with other loci. However, when markers with large effects were assumed to be linked to major QTLs, there was no variance between lines in each group with small numbers of QTL (1 or 2) because all QTLs in each population were fixed during selection. Some cases with large numbers of QTLs (5 or 10) had line variance because some major QTLs were lost during selection in each line. Thus, the covariance between effects associated with the markers and other loci could be calculated only with large numbers of QTLs.
We assumed two kinds of marker effects: large and small. The situation described above corresponds to markers with large effects.
It was then assumed that markers with small effects were linked to major QTLs. If the markers had small effects (Figure 4), variance between lines in each group existed more frequently because the variance of markers among lines depended on the sum of marker values in each line. Thus, the covariance between effects associated with the markers and other loci could be calculated. The calculation of covariances between effects of markers, with large and with small effects, and other loci is shown in Figure 5.
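The decomposition of the genotypic variance into marker, other-loci and covariance terms can be verified numerically. The per-line sums below are hypothetical values chosen for illustration, not results from the study, and `pcov` is a helper defined here.

```python
from statistics import mean, pvariance


def pcov(x, y):
    """Population covariance of two equally long sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


# Hypothetical per-line sums of marker-associated and other-loci values.
marker = [120, 60, 180, 120, 60]
other = [50, 62, 41, 53, 64]
total = [m + o for m, o in zip(marker, other)]

v_total = pvariance(total)
v_marker = pvariance(marker)
v_other = pvariance(other)
cov = pcov(marker, other)

# V(marker + other) = V(marker) + V(other) + 2 COV(marker, other)
assert abs(v_total - (v_marker + v_other + 2 * cov)) < 1e-9
print(v_total, v_marker, v_other, cov)
```

In this made-up example, lines with larger marker sums have smaller sums at other loci, so the covariance is negative and the total variance is smaller than the sum of the two component variances.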

RESULTS
This study was conducted to estimate variance components using F1 records of progeny generated from two completely inbred parental lines while jointly identifying marker-QTL associations.

Effects of QTLs with an original frequency of 0.05
With large QTL effects (50 to 95 for group 1 and 50 to 68 for group 2): For combinations of parameters with large QTL effects, three base population sizes (BPS) were compared.
With a BPS of 3,000, estimates of variance due to effects of lines in group 1 changed only slightly between models with and without marker information. When 30 markers were included as covariates with 1 QTL, the variance of line effects in group 1 increased sharply compared with the model without marker information. With 5 QTLs, line variance in group 2 increased significantly when 50 markers were used as covariates compared with the model without marker information. In both group 1 and group 2, variances of line effects increased when marker information was added as covariates. However, when all 120 markers were included as covariates, variances of line effects decreased to about zero in both group 1 and group 2 (Figure 6). With a BPS of 1,500 and 10 QTLs, line variances increased as more markers were added to the model. The estimate of line variance peaked with 50 markers in the model in both group 1 and group 2. Again, when all 120 markers were included as covariates, variances of line effects decreased to about zero in both groups (Figure 7).
With a BPS of 750 and large QTL effects, variances of lines with 10 markers as covariates and 5 QTLs in the populations increased significantly in both group 1 and group 2. Line variance with marker information also increased with 10 QTLs. Variance due to effects associated with lines was fairly constant for lines derived from populations starting with 1 or 2 QTLs. Again, when all 120 markers were included as covariates, variances of line effects decreased to about zero in both groups (Figure 8).
The overall line variance with 10 QTLs with large effects in the population was larger than with other numbers of QTLs with large effects. Variance due to effects associated with lines increased after marker covariates were included in the analysis with large QTL effects for all three base population sizes. However, when all 120 markers were used as covariates in the analysis, variance due to effects associated with lines always decreased to about zero.
This result did not agree with the previous study of Jansen and Stam (1994). According to their results, use of informative markers as covariates can reduce the unexplained variance.
With medium QTL effects (30 to 75 for group 1 and 30 to 48 for group 2): With medium QTL effects, the results were similar to those with large QTL effects. Variance due to effects associated with lines increased after marker covariates were included in the analysis. Further, when all 120 markers were used as covariates in the analysis, variance due to effects associated with lines again decreased to about zero in both group 1 and group 2 for all three base population sizes.
With small QTL effects (5 to 50 for group 1 and 5 to 23 for group 2): With small QTL effects, the results were similar to those with large and with medium QTL effects. Variance due to effects associated with lines increased after marker covariates were included in the analysis. Further, when all 120 markers were used as covariates in the analysis, variance due to effects associated with lines again decreased to about zero in both group 1 and group 2 for all three base population sizes. In total, nine combinations of base population size and QTL effects were investigated with an initial QTL frequency of 0.05. After marker information was added to the analysis, estimates of line variance increased in each situation. When all 120 markers were used as covariates in the analysis, variance due to effects associated with lines always decreased to about zero for all three QTL effect sizes and base population sizes in both group 1 and group 2.
Figure 7. Estimates of variance components due to effects of lines in group 1 and group 2 with various numbers of markers used as covariates (BPS: 1,500, QTL effects for group 1: 50 to 95, group 2: 50 to 68, frequency of QTL: 0.05).
Further, line variance with the small BPS (750) was larger than with the larger BPS (1,500 and 3,000) because not all favorable alleles may be fixed at each locus in a base population of 750 animals. Some major QTL may be lost during the selection procedure.

The log likelihood
The -2 log likelihoods for the model with and without marker information as covariates were examined in all analyses. The -2 log likelihoods decreased gradually after markers were included as covariates in the analysis. A smaller -2 log likelihood means a better fit of the model; minimizing the -2 log likelihood is equivalent to maximizing the log likelihood.
As more marker information was included in the analysis, the log likelihood improved gradually. As an example, the -2 log likelihoods for the combination of line effects for group 1 and group 2 with various numbers of markers used as covariates (BPS: 3,000; QTL values for group 1: 50 to 95, group 2: 50 to 68; frequency of QTL: 0.05) are shown in Figure 9.
Van Zyl (1998) found, from analyses of several traits from crosses of inbred lines of corn, that the -2 log likelihood decreased when marker alleles were added to the model.
With BPS of 1,500 and 750 and an initial QTL frequency of 0.05, the -2 log likelihoods for the different categories of QTL effects showed the same trend.

Negative covariances between effects associated with markers and other loci
With markers included as covariates, variances due to effects associated with lines within groups can be expected to decrease. However, in this study the estimates of line variance always increased when markers were included as covariates because a negative covariance existed between effects associated with markers and effects of other loci.
To calculate covariances between effects associated with markers and effects of other loci, QTLs were assumed to be linked to markers with large effects and with small effects. When QTLs were linked to markers with large effects, a negative covariance between effects associated with markers and effects of other loci existed with an original QTL frequency of 0.05, a BPS of 1,500, large QTL effects (50 to 95 for group 1 and 50 to 68 for group 2) and 10 QTLs in both group 1 and group 2. A negative covariance also existed with an original QTL frequency of 0.05, a BPS of 750, large QTL effects (50 to 95 for group 1 and 50 to 68 for group 2) and 5 or 10 QTLs in both group 1 and group 2.
With medium QTL effects and an initial QTL frequency of 0.05, a negative covariance between effects associated with markers and effects of other loci existed with a BPS of 1,500 and 5 QTLs in group 1, and with a BPS of 750 and 10 QTLs in both group 1 and group 2. With small QTL effects, a negative covariance existed for all three base population sizes and the various numbers of QTLs in group 1 and group 2, except with a BPS of 3,000 and 1 QTL in group 1.
Although markers were perfectly linked to QTLs, the size of the marker effects may also be considered. We now consider the assumption that QTLs are linked to markers with small effects. In most cases, covariances between effects of markers and other loci were found when markers were associated with small effects, for combinations of parameters with various QTL effects, base population sizes and numbers of QTLs in both group 1 and group 2.
Van Zyl (1998) found that line variance increased when marker information was added as covariates to the model and that the log likelihood improved, and suggested that a negative covariance might exist between effects associated with markers and genetic effects associated with other loci.
As an illustration, Van Zyl (1998) suggested that the line variance was composed as follows: V(L i ) = V(M i ) + V(O i ) + 2COV(M i , O i ), where V(L i ) is the variance due to effects of lines, V(M i ) is the variance associated with effects of markers and V(O i ) is the variance due to effects of other loci. If a negative covariance between effects associated with markers and genetic effects associated with other loci existed, then the magnitude of twice that covariance might be greater than V(M i ), so that the variance due to effects of other loci would be greater than the variance due to effects of lines, i.e., the total variance.
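The argument can be written out as a short worked inequality. The symbols L_i, M_i and O_i (line, marker and other-loci effects) are labels assumed here for illustration, and the numbers in the example are hypothetical rather than taken from the study:

```latex
V(L_i) = V(M_i) + V(O_i) + 2\,\mathrm{Cov}(M_i, O_i)
\;\Longrightarrow\;
V(O_i) - V(L_i) = -\left[\,V(M_i) + 2\,\mathrm{Cov}(M_i, O_i)\,\right]

\text{Hence}\quad V(O_i) > V(L_i) \iff -2\,\mathrm{Cov}(M_i, O_i) > V(M_i).

\text{Example: } V(M_i) = 4,\; V(O_i) = 10,\; \mathrm{Cov}(M_i, O_i) = -3
\;\Rightarrow\; V(L_i) = 4 + 10 - 6 = 8 < 10 = V(O_i).
```

In such a case, fitting the markers as covariates would leave the other-loci variance as the line variance to be estimated, and that variance would exceed the line variance estimated without markers.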

DISCUSSION
Recently, associations between quantitative trait loci and marker loci have been studied.
In this study, the association between markers for QTL used as covariates and estimates of variance components due to effects of lines was investigated through computer simulation. The effects of the size of the population used to develop inbred lines and of the initial frequencies and magnitudes of QTL effects were also considered.
Results show that estimates of variance components due to line effects are influenced by including marker information as covariates in the model for analysis. Estimates of line variance increased when marker information was added to the analysis because negative covariances existed between effects associated with the markers and the remaining effects associated with other loci. Nevertheless, the fit of the model, as indicated by the log likelihood, improved as more markers were added as covariates.
Marker-assisted selection will be beneficial when markers account for otherwise unexplained genetic differences during selection. Markers can be used to identify QTLs affecting traits and to select for favorable QTL alleles. To use genetic markers efficiently, the location of markers in the genome must be identified.
Estimates of variance due to effects of lines were compared with and without marker information used as covariates in the analysis. The estimates of line variance always increased when markers were included as covariates because a negative covariance existed.

Figure 1. The flowchart of the simulation program.

Figure 2. The population structure with base population of 3,000.
of records, I = identity matrix of appropriate order, and

Figure 4. Coding of marker information with small effects.

Figure 5. Calculation for covariances between effects of markers and other loci.

Figure 6. Estimates of variance components due to effects of lines in group 1 and group 2 with various numbers of markers used as covariates (BPS: 3,000, QTL effects for group 1: 50 to 95, group 2: 50 to 68, frequency of QTL: 0.05).