Effects of preselection of genotyped animals on reliability and bias of genomic prediction in dairy cattle

Objective: Models for genomic selection assume that the reference population is an unselected population. In practice, however, genotyped individuals, such as progeny-tested bulls, are highly selected, and the reference population is created after preselection. In dairy cattle, the intensity of selection is higher in males than in females, suggesting that cows can be added to the reference population with less bias and less loss of accuracy. The objective was to develop formulas applicable to any genomic prediction study or practice in which preselected animals form the reference population.
Methods: We developed formulas for calculating the reliability and bias of genomically enhanced breeding values (GEBV) in a reference population in which individuals are preselected on estimated breeding values. Based on these formulas, a deterministic simulation was conducted by varying heritability, preselection percentage, and reference population size.
Results: The number of bulls equal to a cow regarding the reliability of GEBV was expressed through a simple formula for a reference population consisting of preselected animals. The bull population was vastly superior to the cow population regarding the reliability of GEBV for low-heritability traits. However, the superiority of the bull reference population over the cow population decreased as heritability increased. Bias was greater for bulls than for cows. Bias and the reduction in reliability of GEBV due to preselection were alleviated by expanding the reference population.
Conclusion: The reference population is easier to expand with cows than with bulls. Expanding the cow reference population alleviates the bias and the reduction in reliability of GEBV that arise because bulls are more intensely preselected than cows.


INTRODUCTION
Genomic prediction (GP) is used to predict the genomic breeding values of genotyped individuals [1]. GP models usually do not account for selection. However, the reference population used for estimating marker effects with GP models usually consists of progeny-tested bulls, which are highly selected. The prediction models are therefore unable to incorporate past selection based on pedigree and phenotypes, which may lead to bias as well as decreased accuracy.
A formula for approximating the reliability and bias of the genomically enhanced breeding values (GEBV) that accounted for the prior selection of genotyped test bulls from among all test bull candidates was proposed [2]. In that method, the differences between the means and standard deviations of the estimated breeding values (EBV) of all of the test bull candidates are used to estimate the proportion of selective genotyping. Then, the selection differential or intensity of selection is calculated from quantitative genetics textbooks [3].

The variance of the true genetic value (G) after selection on EBV can be expressed as:
$$\sigma^{*2}_{G} = (1 - k\,r^2_{EBV})\,\sigma^2_{G}$$
where $\sigma^2_{G}$ is the variance of G before selection, k is the proportional reduction in the variance of the selection criterion due to selection, and $r^2_{EBV}$ is the reliability of EBV. With truncation selection on a normally distributed selection criterion, k is determined entirely by the intensity of selection, $k = i(i - x)$, where i is the intensity of selection, and x is the standardized truncation point [3]. Similarly, the variance of the GEBV of the animals selected on the basis of EBV can be expressed as:
$$\sigma^{*2}_{GEBV} = (1 - k\,r^2_{EBV,GEBV})\,\sigma^2_{GEBV}$$
where $r^2_{EBV,GEBV}$ is the squared correlation between EBV and GEBV.
Assuming prediction error cov(EBV, GEBV) is zero, the correlation between EBV and GEBV can be shown as:
$$r_{EBV,GEBV} = r_{EBV}\,r_{GEBV}$$
Therefore, the variance of the GEBV after selection is:
$$\sigma^{*2}_{GEBV} = (1 - k\,r^2_{EBV}\,r^2_{GEBV})\,\sigma^2_{GEBV}$$
In addition, with $\sigma^2_{EBV} = r^2_{EBV}\sigma^2_{G}$ and $\sigma^2_{GEBV} = r^2_{GEBV}\sigma^2_{G}$, the cov(EBV, GEBV) can be written as:
$$cov(EBV, GEBV) = r_{EBV,GEBV}\,\sigma_{EBV}\,\sigma_{GEBV} = r^2_{EBV}\,r^2_{GEBV}\,\sigma^2_{G}$$
A general expression for a covariance after selection [13] is:
$$\sigma^{*}_{jk} = \sigma_{jk} - k\,\frac{\sigma_{ij}\,\sigma_{ik}}{\sigma^2_{i}}$$
where $\sigma^{*}_{jk}$ is the covariance between j and k selected on i, and $\sigma_{jk}$ is the covariance before selection. Therefore the cov(GEBV, G) after selection on EBV is:
$$cov^{*}(GEBV, G) = cov(GEBV, G) - k\,\frac{cov(EBV, GEBV)\,cov(EBV, G)}{\sigma^2_{EBV}} = (1 - k\,r^2_{EBV})\,r^2_{GEBV}\,\sigma^2_{G}$$
In summary, the reliability of GEBV after accounting for selection on EBV can be expressed based on the reliability of GEBV under random selection ($r^2_{GEBV}$), the intensity of selection, and the reliability of EBV ($r^2_{EBV}$) of the preselected individuals both genotyped and phenotyped in the reference population:
$$r^{*2}_{GEBV} = \frac{[cov^{*}(GEBV, G)]^2}{\sigma^{*2}_{GEBV}\,\sigma^{*2}_{G}} = \frac{(1 - k\,r^2_{EBV})\,r^2_{GEBV}}{1 - k\,r^2_{EBV}\,r^2_{GEBV}} \tag{1}$$
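The factor k = i(i - x) depends only on the preselected fraction, so it can be tabulated once for the percentages of interest. The following is a minimal sketch under the stated normality assumption; the function name `truncation_selection` and the choice of percentages are illustrative, not taken from the source.

```python
from statistics import NormalDist


def truncation_selection(p):
    """Return (i, x, k) for truncation selection that keeps the top fraction p.

    x is the standardized truncation point, i the selection intensity
    (mean of the selected tail of a standard normal), and k = i * (i - x).
    A fraction of 1.0 means no selection, so i = 0 and k = 0.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return 0.0, float("-inf"), 0.0
    std_normal = NormalDist()
    x = std_normal.inv_cdf(1.0 - p)   # truncation point
    i = std_normal.pdf(x) / p         # intensity of selection
    return i, x, i * (i - x)


# Illustrative percentages only.
for pct in (5, 30, 70, 90, 100):
    i, x, k = truncation_selection(pct / 100)
    print(f"{pct:>3}% selected: i = {i:.3f}, x = {x:.3f}, k = {k:.3f}")
```

Stronger preselection (a smaller retained fraction) gives a larger k, that is, a larger proportional loss of variance in the selection criterion.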

When $r^2_{EBV} = 1$, equation (1) yields the same formula as that in [2].
The regression coefficient of G or deregressed proofs on GEBV is a criterion for bias in GEBV [2,14,15]. Accordingly, the regression coefficient of G on GEBV after selection by using EBV is written as:
$$b^{*}_{G,GEBV} = \frac{1 - k\,r^2_{EBV}}{1 - k\,r^2_{EBV}\,r^2_{GEBV}} \tag{2}$$
where k is the proportional reduction in the variance of the selection criterion, $r^2_{EBV}$ is the reliability of the EBV used for preselection, and $r^2_{GEBV}$ is the reliability of GEBV under random selection. When $r^2_{EBV} = 1$, equation (2) results in the same regression coefficient as in [2]. That is, we extended the formula for the reliability and bias of GEBV from [2] by accounting for the preselection on EBV of the animals used to create the reference population.
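To make the roles of k, the reliability of the preselection EBV, and the reliability under random selection concrete, the sketch below evaluates equations (1) and (2). It assumes the closed forms r*² = (1 - k r²_EBV) r²_GEBV / (1 - k r²_EBV r²_GEBV) and b* = (1 - k r²_EBV) / (1 - k r²_EBV r²_GEBV), which are reconstructed from the definitions in this section rather than quoted from it. The function names and the input values are hypothetical.

```python
def reliability_after_preselection(r2_gebv, r2_ebv, k):
    """Reliability of GEBV after preselection on EBV (assumed form of equation 1)."""
    return (1.0 - k * r2_ebv) * r2_gebv / (1.0 - k * r2_ebv * r2_gebv)


def regression_g_on_gebv(r2_gebv, r2_ebv, k):
    """Regression of G on GEBV after preselection (assumed form of equation 2).

    A value below 1 indicates that GEBV are overestimated (inflated).
    """
    return (1.0 - k * r2_ebv) / (1.0 - k * r2_ebv * r2_gebv)


# Hypothetical inputs: reliability 0.6 under random selection,
# preselection on an EBV with reliability 0.3, and k = 0.86.
r2_random, r2_ebv, k = 0.6, 0.3, 0.86
print(round(reliability_after_preselection(r2_random, r2_ebv, k), 3))
print(round(regression_g_on_gebv(r2_random, r2_ebv, k), 3))
```

Under these assumed forms the regression coefficient equals the ratio of the reliability after preselection to the reliability under random selection, so both quantities move toward their unselected values together as k approaches zero.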

Estimating the number of bulls equal to a cow in the reliability of the GEBV of preselected animals in the reference population
Reliability in a reference population after selection was expressed as (1). The reliability of GEBV without selection ($r^2_{GEBV}$) or under random selection is obtained after transformation of (1):
$$r^2_{GEBV} = \frac{r^{*2}_{GEBV}}{1 - k\,r^2_{EBV}\,(1 - r^{*2}_{GEBV})} \tag{3}$$
The reliabilities of GEBVs depend on the size of the reference population (nP), the effective number of loci for which effects have to be estimated (nG), and the correlation of the G of a genotyped individual with its phenotypic record (r). In a random sample of the population, the reliability of GEBV, i.e., the squared correlation between GEBV and G ($r^2_{G,GEBV}$), can be calculated as described in [16]:
$$r^2_{G,GEBV} = \frac{\lambda r^2}{\lambda r^2 + 1} \tag{4}$$
where $\lambda = nP/nG$. Parameter nG depends on the historical effective size of the unselected population ($N_E$) and on the size of the genome, L (in Morgans), and can be estimated as shown in [17]:
$$nG = \frac{2 N_E L}{\ln(4 N_E L)}$$
When an individual in the reference population is both genotyped and phenotyped, r is equal to the square root of heritability of the trait. Then, the reliability of cows from their own records is:
$$r^2_{f} = h^2$$
When the reference population is based on progeny-tested sires, i.e., when sires are genotyped but their offspring are phenotyped, r equals the accuracy of the EBV obtained from progeny testing [5]:
$$r_{m} = \sqrt{\frac{N h^2}{4 + (N - 1)\,h^2}}$$
where N is the number of half-sibling progeny on which the EBV is based.
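The quantities in this passage chain together: nG from the effective population size and genome length, r from either an own record or a progeny test, and the reliability from λ = nP/nG. The sketch below strings them together under two assumptions that are not confirmed by the source text: that nG follows the commonly used form 2·N_E·L / ln(4·N_E·L), and that the reliability under random selection is λr² / (λr² + 1). All function names are hypothetical.

```python
import math


def effective_loci(n_e, genome_morgans):
    """Effective number of loci nG (assumed form: 2*Ne*L / ln(4*Ne*L))."""
    return 2.0 * n_e * genome_morgans / math.log(4.0 * n_e * genome_morgans)


def accuracy_own_record(h2):
    """r for an animal that is both genotyped and phenotyped: sqrt(h2)."""
    return math.sqrt(h2)


def accuracy_progeny_test(h2, n_progeny):
    """r for a sire whose EBV is based on n_progeny half-sib daughters."""
    return math.sqrt(n_progeny * h2 / (4.0 + (n_progeny - 1) * h2))


def reliability_random(n_p, n_g, r):
    """Reliability of GEBV in an unselected reference population of size n_p."""
    lam = n_p / n_g
    return lam * r ** 2 / (lam * r ** 2 + 1.0)


# Parameter values taken from the simulation settings described in the paper.
n_g = effective_loci(n_e=100, genome_morgans=30)
for h2 in (0.1, 0.3, 0.5):
    cows = reliability_random(10_000, n_g, accuracy_own_record(h2))
    bulls = reliability_random(10_000, n_g, accuracy_progeny_test(h2, 50))
    print(f"h2 = {h2}: cows {cows:.3f}, bulls {bulls:.3f}")
```

With the same number of genotyped animals, a bull contributes more than a cow because a progeny-test EBV is a more accurate record of G than a single own record, and the gap narrows as heritability rises.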
Parameter nP can be transformed from (4):
$$nP = \frac{nG\,r^2_{G,GEBV}}{r^2\,(1 - r^2_{G,GEBV})} \tag{5}$$
When the reference population is composed of either bulls (m) or cows (f), parameter nP in (5) can be written as:
$$nP_{m} = \frac{nG\,r^2_{GEBV_m}}{r^2_{m}\,(1 - r^2_{GEBV_m})}, \qquad nP_{f} = \frac{nG\,r^2_{GEBV_f}}{r^2_{f}\,(1 - r^2_{GEBV_f})}$$
so that
$$\frac{nP_{m}}{nP_{f}} = \frac{r^2_{GEBV_m}\,r^2_{f}\,(1 - r^2_{GEBV_f})}{r^2_{GEBV_f}\,r^2_{m}\,(1 - r^2_{GEBV_m})} \tag{6}$$
Alternatively, the reliability of GEBV under random selection can be written as (3) by using that of GEBV after preselection; therefore, using the subscript letters m and f as defined earlier,
$$r^2_{GEBV_m} = \frac{r^{*2}_{GEBV_m}}{1 - k_m\,r^2_{EBV_m}\,(1 - r^{*2}_{GEBV_m})}, \qquad r^2_{GEBV_f} = \frac{r^{*2}_{GEBV_f}}{1 - k_f\,r^2_{EBV_f}\,(1 - r^{*2}_{GEBV_f})} \tag{7}$$
Substituting (7) into (6) yields:
$$\frac{nP_{m}}{nP_{f}} = \frac{r^{*2}_{GEBV_m}\,r^2_{f}\,(1 - r^{*2}_{GEBV_f})\,(1 - k_f\,r^2_{EBV_f})}{r^{*2}_{GEBV_f}\,r^2_{m}\,(1 - r^{*2}_{GEBV_m})\,(1 - k_m\,r^2_{EBV_m})} \tag{8}$$
The numbers of bulls equal to a cow in regard to the specific reliabilities of the GEBV of animals in the reference population with and without preselection are calculated by using (8) and (6), respectively. Note that $r^2_{EBV_f}$ and $r^2_{EBV_m}$ are the reliabilities of EBV for preselection used to create the females-only and males-only reference population, respectively. From the standpoint of bringing about the same reliability of the GEBV, the number of bulls equal to a cow must be compared under the same reliability of GEBV after preselection, i.e., $r^{*2}_{GEBV_m} = r^{*2}_{GEBV_f}$, which gives:
$$\frac{nP_{m}}{nP_{f}} = \frac{r^2_{f}\,(1 - k_f\,r^2_{EBV_f})}{r^2_{m}\,(1 - k_m\,r^2_{EBV_m})} \tag{9}$$
Note that the number of bulls equal to a cow in terms of the reliability of GEBV without preselection, i.e., k = 0, depends only on the reliability of EBV of the individuals both genotyped and phenotyped in the reference population.
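The number of bulls equal to a cow can be evaluated directly once the reliabilities and the k factors are known. The sketch below assumes that equation (9) has the form r²_f (1 - k_f r²_EBV,f) / (r²_m (1 - k_m r²_EBV,m)), which is a reconstruction rather than a quotation. The example additionally assumes that the reliability of the parent average is (sire reliability + dam reliability) / 4 with a dam reliability equal to h²; that choice and all function names are illustrative assumptions.

```python
from statistics import NormalDist


def truncation_k(p):
    """k = i * (i - x) for truncation selection keeping the top fraction p."""
    if p >= 1.0:
        return 0.0
    std_normal = NormalDist()
    x = std_normal.inv_cdf(1.0 - p)
    i = std_normal.pdf(x) / p
    return i * (i - x)


def progeny_test_reliability(h2, n_daughters):
    """Reliability of a sire EBV based on n_daughters half-sib daughters."""
    return n_daughters * h2 / (4.0 + (n_daughters - 1) * h2)


def cow_value(r2_f, r2_m, k_f=0.0, r2_ebv_f=0.0, k_m=0.0, r2_ebv_m=0.0):
    """Number of bulls equal to one cow (assumed form of equation 9)."""
    return (r2_f * (1.0 - k_f * r2_ebv_f)) / (r2_m * (1.0 - k_m * r2_ebv_m))


def example(h2, n_daughters=50, male_fraction=0.05):
    """Bulls preselected on parent average, cows taken at random."""
    r2_m = progeny_test_reliability(h2, n_daughters)
    r2_pa = (r2_m + h2) / 4.0  # assumption: dam reliability equals h2
    return cow_value(h2, r2_m, k_m=truncation_k(male_fraction), r2_ebv_m=r2_pa)


for h2 in (0.1, 0.5):
    print(f"h2 = {h2}: bulls per cow = {example(h2):.3f}")
```

Under these assumptions the sketch gives roughly 0.21 bulls per cow at a heritability of 0.1 and roughly 0.81 at 0.5, close to the values of 0.208 and 0.811 reported in the Results for 5% male preselection and random female preselection.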

Reliability of GEBV in the reference population consisting of preselected bulls and cows
Using selection index theory, the reliability of GEBV was derived by [10], which is explained by markers, in a reference population consisting of multiple groups of animals whose phenotypes differ in their information content. We extended selection index theory to a reference population consisting of preselected bulls and cows.
From selection index theory, the reliability of GEBV of preselected animals in the reference population is:
[Equation (10)]
where $r^{*2}_{GEBV_{m+f}}$ is the reliability of GEBV in the reference population consisting of preselected bulls and cows.
The increase in reliability from including bulls only to including both bulls and cows in a reference population is expressed as the difference between the reliability of GEBV in the reference population consisting of preselected bulls and cows and that of the reference population including preselected bulls only, i.e., $r^{*2}_{GEBV_{m+f}} - r^{*2}_{GEBV_m}$. This increase in reliability corresponds to the increase after preselection and therefore can be converted to the increase under random selection. That is, the increase ($\Delta$) under random selection is:
[Equation (11)]
The number of bulls corresponding to this increase ($nP_{m-\Delta}$) can be computed by applying (5).
We designated the number of cows in the reference population as $nP_{f}$. The increase in reliability is derived from adding cows into the reference population that consists of bulls only. That is, the number of bulls equal to a cow in regard to the reliability of GEBV in the reference population consisting of preselected bulls and cows (Cow value$_{m+f}$) is:
$$\text{Cow value}_{m+f} = \frac{nP_{m-\Delta}}{nP_{f}} \tag{12}$$

Simulation data
We preselected animals in the reference population according to the EBV of the trait of interest rather than selecting them randomly, to obtain more realistic reference populations [14,18,19]. The animals in the reference population came from several generations in the past but were approximated and simplified to come from a single generation, i.e., they were preselected on EBV, and the reference population was created from the phenotypic data of the preselected bulls' daughters' records or the preselected cows' own data. When the reference population was based on progeny-tested bulls, the number of daughters per test bull was set to 50 or 100. All test bull candidates were assumed to be preselected according to the parent average (PA). PA was computed by using the EBVs of the sire (from 50 or 100 daughters) and dam. When the number of daughters per progeny-tested bull was set to 50, PA was calculated from the EBV of a sire with 50 daughters; that is, the sire in PA and the progeny-tested bull were assigned the same number of daughters. Note that bulls for progeny testing were preselected from all test bull candidates; they became test bulls after preselection and progeny-tested sires with their daughters' records (50 or 100) after progeny testing. Heifers were preselected using their PA, and cows were preselected according to the EBV from their own records. After preselection, test bulls, heifers, and cows were used to create the reference population. The reliability of the EBV from a cow's own records was calculated by selection index theory, where the PA and the individual record constituted the index, similar to equation (10), and the number of daughters of the sire in PA was set to 50.
We assumed that the length of the genome was 30 Morgans and that the heritability of the trait of interest was 0.1, 0.3, or 0.5. The historical effective population size was set to 100 animals [5,20]. The preselection percentage on EBV of animals used to create the reference population was set to 5%, 30%, and 100% for males and to 70%, 90%, and 100% for females. When animals were selected randomly, the proportional reduction (k) in the variance of G was set to zero. The reference population size was set to 5,000, 10,000, 20,000, and 40,000.
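The parameter grid described above can be written down compactly, which makes the number of evaluated scenarios explicit. This is a minimal sketch of one way to enumerate them; the dictionary layout and the names are hypothetical and do not reflect how the authors organized their computation.

```python
from itertools import product

# Values stated in the text.
GENOME_MORGANS = 30
EFFECTIVE_POPULATION_SIZE = 100
HERITABILITIES = (0.1, 0.3, 0.5)
REFERENCE_SIZES = (5_000, 10_000, 20_000, 40_000)
PRESELECTION_PCT = {"bulls": (5, 30, 100), "cows": (70, 90, 100)}


def scenarios(sex):
    """Yield every heritability x reference size x preselection combination."""
    for h2, n_ref, pct in product(
        HERITABILITIES, REFERENCE_SIZES, PRESELECTION_PCT[sex]
    ):
        yield {"sex": sex, "h2": h2, "n_ref": n_ref, "preselected_pct": pct}


for sex in PRESELECTION_PCT:
    print(sex, sum(1 for _ in scenarios(sex)))
```

A preselection percentage of 100 corresponds to random selection, for which k is zero.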

Reliability of GEBV of preselected animals in the reference population
We calculated the reliability of the GEBV of non-preselected animals and the ratio of the reliability of preselected animals to that of non-preselected animals for reference populations composed solely of proven bulls preselected on PA computed by using EBVs of sire (from 50 or 100 daughters) and dam (Table 1). The reliability of the GEBV of cows is shown in Table 2. The reliability of preselection on PA, on EBV from a cow's own record, or on a bull's progeny test based on 50 daughters was calculated at three levels of heritability (Table 3). The reliability of GEBV in the bulls-only reference population was the highest among the three reference populations (bulls preselected on PA, heifers preselected on PA, and cows preselected on EBV from their own records). The bulls-only population was particularly superior to the cow population for low-heritability traits ($h^2 = 0.1$), especially when progeny testing was based on 100 daughters. However, the superiority of the reliability associated with the bull reference population decreased as heritability increased, regardless of whether the animals in the reference population were preselected or not.
In addition, the reliability of GEBV decreased as the intensity of preselection increased (i.e., a decrease in the preselection percentage), and this trend became more conspicuous as heritability increased. This change occurs because the effect of preselection on the reduction of the variance of G increases as heritability increases. The decrease in the reliability of preselected animals compared with that of non-preselected animals became more conspicuous as the reference population size decreased. That is, the effect of preselection on the decrease in reliability became more deleterious as the reference population became smaller.
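Both trends in this paragraph can be read off the ratio of the reliability after preselection to the reliability under random selection. The short derivation below assumes that equation (1) has the closed form $r^{*2}_{GEBV} = (1 - k r^2_{EBV}) r^2_{GEBV} / (1 - k r^2_{EBV} r^2_{GEBV})$, which is a reconstruction from the definitions in the Methods rather than a quotation, and writes $a = k\,r^2_{EBV}$ with $0 \le a < 1$.

```latex
\[
\frac{r^{*2}_{GEBV}}{r^{2}_{GEBV}} = \frac{1 - a}{1 - a\,r^{2}_{GEBV}} \le 1,
\qquad a = k\,r^{2}_{EBV},
\]
\[
\frac{\partial}{\partial r^{2}_{GEBV}}
  \left( \frac{1 - a}{1 - a\,r^{2}_{GEBV}} \right)
  = \frac{a\,(1 - a)}{\left(1 - a\,r^{2}_{GEBV}\right)^{2}} \ge 0,
\qquad
\frac{\partial}{\partial a}
  \left( \frac{1 - a}{1 - a\,r^{2}_{GEBV}} \right)
  = \frac{r^{2}_{GEBV} - 1}{\left(1 - a\,r^{2}_{GEBV}\right)^{2}} \le 0 .
\]
```

The first derivative shows that a larger reference population, which raises $r^2_{GEBV}$, moves the ratio toward 1. The second shows that a larger $a$, whether from more intense preselection or from a more reliable preselection criterion as heritability rises, lowers the ratio.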

Bias of GEBV
Regression coefficients of G on GEBV for animals in the reference populations composed solely of proven bulls preselected on PA calculated by using EBVs of sire (from 50 and 100 daughters) and dam are shown in Table 4. The regression coefficients of G on GEBV for cows preselected on EBV were also calculated (Table 5). Regression coefficients of G on GEBV deviated more from 1 as the intensity of preselection increased, thus indicating that overestimation of GEBV became more prominent as the intensity of preselection increased. Because the intensity of preselection is higher in bulls than in cows, bias was more problematic in bulls than in cows. When the cow reference population preselected on PA was compared with that preselected by using the EBV from their own records, bias or overestimation of GEBV was greater for cows preselected on the EBV from their own records than for those preselected on PA because the reliability of EBV from their individual record was greater than that of PA (Table 3). In the same way, bias or overestimation of GEBV was greater for bulls progeny-tested on 100 daughters than for those tested on 50 daughters. That is, bias became more pronounced with an increase in the reliability of preselection of animals used to create the reference population. Bias or overestimation of GEBV was alleviated by increasing reference population size (Tables 4, 5).
1) In all cases when the preselection percentage was 100%, the regression coefficient was 1.0.
2) PA was calculated by using EBVs from sire (from 50 daughters) and dam.
3) PA was calculated by using EBVs from sire (from 100 daughters) and dam.
Table 1. The reliability of GEBV of non-preselected bulls (100% preselection) and the ratio of the reliability of preselected bulls (<100% preselection) to that of non-preselected bulls

The contribution to the same reliability of the number of bulls to a cow
The number of bulls equal to a cow in terms of bringing about the same reliability of the GEBV of preselected animals was calculated by (9) (Table 6). This parameter is related solely to the reliability ($r^2_{EBV_f}$, $r^2_{EBV_m}$) of the preselection of heifers/cows and bulls to create the reference population, the intensity of preselection ($k_f$, $k_m$), and the reliability of the EBV of cow's record or bull's progeny testing ($r^2_{f}$, $r^2_{m}$). The number of bulls equal to a cow in regard to the reliability increased with increases in the intensity of preselection for males and with decreases in the intensity of preselection for females. The number increased three to four times with an increase in heritability from 0.1 to 0.5 under the same preselection percentage. For example, the number of bulls equal to a cow increased approximately four times, from 0.208 to 0.811, as heritability increased from 0.1 to 0.5 under the 5% male preselection percentage and random female preselection.

Reliability of the GEBV in the reference population comprising both bulls and cows
The combined reliabilities in the reference population composed of 10,000 preselected bulls and 10,000 or 20,000 preselected cows were calculated by (10) and are shown together with the reliabilities of reference populations composed solely of bulls or cows (Table 7). The cows in Table 7 are only those preselected on EBV from their individual records, because the combined reliability was almost equivalent whether heifers were preselected on PA or cows were preselected on the EBV from their own records. The combined reliability increased as the number of cows increased from 10,000 to 20,000, and this trend was more conspicuous for high-heritability traits (h² = 0.5) than for low-heritability traits (h² = 0.1). As in Tables 2 and 6, this result again confirmed the favorable properties of cows regarding the reliability of high-heritability traits. The contribution of cows to the reliability of the combined population, i.e., the difference between the combined reliability due to bulls and cows and the reliability due to bulls only, ranged from 0.03 to 0.22. The reliability for a reference population composed solely of bulls or solely of cows was computed by using (1). The number of bulls equal to a cow in terms of the reliability of the reference population created from both preselected bulls and cows, computed by using (12), coincided with the number computed by using (9). That is, the number of bulls equal to a cow in regard to the reliability of GEBV in the reference population comprising both bulls and cows agreed with the number of bulls equal to a cow in the reference population created solely from bulls or cows in Table 6.

1) In all cases when the preselection percentage was 100%, the regression coefficient was 1.0.
2) PA was calculated by using EBVs from sire (from 50 daughters) and dam.

Benefit of cows regarding the reliability of GEBV for high-heritability traits
The superiority of the reliability of the GEBV from a bulls-only reference population over that from a cows-only reference population decreased as heritability increased, regardless of whether animals in the reference population were preselected (Tables 1, 2). To improve GP, the same individuals should be both genotyped and phenotyped instead of genotyping parents and phenotyping their progeny [5]. In the current study, cows are both genotyped and phenotyped, whereas bulls are genotyped and their progeny are phenotyped. Because the reliability of a cow's EBV was based on her own record, the increase in reliability concurrent with an increase in heritability was greater for cows than for bulls (Table 3). For example, the reliability of a cow's EBV based on her own record and that of a bull's EBV based on 50 of his daughters' records are 0.236 and 0.562, respectively, at a heritability of 0.1; at a heritability of 0.5, they are 0.604 and 0.877. Consequently, we consider that genotyping cows with phenotypes is advantageous for high-heritability traits from the standpoint of increasing the reliability of GEBV. The value of genotyping cows with phenotypes was reduced by increasing the number of daughters per test bull (results not shown), because the reliability of bulls increased with the number of daughters per test bull (Table 1).
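The EBV reliabilities quoted above can be reproduced with a short sketch. The text does not restate the formulas, so everything here is an assumption: the bull's reliability uses the standard half-sib progeny formula n / (n + k) with k = (4 - h²) / h²; the cow's reliability combines her own record with a parent average (sire with 50 daughters, dam with a single own record) by adding the odds rel / (1 - rel) of the two sources. The function names are hypothetical, and the calculation used in the study may differ even though the figures agree to three decimals.

```python
def bull_reliability(h2: float, n_daughters: int) -> float:
    """Reliability of a sire EBV from n half-sib daughter records: n / (n + k)."""
    k = (4.0 - h2) / h2
    return n_daughters / (n_daughters + k)


def combine(rel_a: float, rel_b: float) -> float:
    """Combine two information sources, assumed independent, by adding odds rel / (1 - rel)."""
    odds = rel_a / (1.0 - rel_a) + rel_b / (1.0 - rel_b)
    return odds / (1.0 + odds)


def cow_reliability(h2: float, n_daughters_of_sire: int = 50) -> float:
    """Reliability of a cow EBV from her own record plus parent average (PA)."""
    rel_sire = bull_reliability(h2, n_daughters_of_sire)
    rel_dam = h2  # assumption: the dam's EBV rests on one own record only
    rel_pa = (rel_sire + rel_dam) / 4.0  # reliability of the parent average
    return combine(rel_pa, h2)  # own record contributes reliability h2


for h2 in (0.1, 0.5):
    print(h2, round(cow_reliability(h2), 3), round(bull_reliability(h2, 50), 3))
# 0.1 0.236 0.562
# 0.5 0.604 0.877
```

Under these assumptions, raising heritability from 0.1 to 0.5 lifts the cow's reliability by about 0.37 but the bull's by only about 0.32, which is the pattern described in the text.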

The effects of preselection on reliability and bias of GEBV
The effect of preselection on reducing the variance of G increased as heritability increased, thereby decreasing the reliability of the GEBV of preselected animals. However, the reliability of GEBV in the reference population increased as heritability increased even if preselection had been practiced (Tables 1, 2). This result indicates that, as heritability increased, the gain in the reliability of the EBV of animals obtained from a cow's own record or progeny testing of bulls outweighed the loss in reliability due to the reduction of the variance of G from preselection.
The decrease in the reliability of GEBV due to preselection became more deleterious for smaller reference populations, as explained by (1). That is, the reliability of GEBV after preselection is written as [...], where r_GEBV is the accuracy of GEBV in the absence of preselection or under [...]. The ratio of reliability after preselection ([...]) [...]. [...] reference population was under selection, but it decreased with increase in the size of the cow [...]. The inclusion of cows in the reference population slightly reduced the bias in GEBV [17]. [...] or yearling heifers can be a cost-effective strategy for enhancing the genetic value of [...] dairy farms [21]. That is, saying that a replacement decision was based on GEBV implies that [...] yearling heifers were included in the reference population. In general, it is easier to increase the number of cows than the number of bulls. Expanding the reference population alleviated bias when heritability and [...] were held constant (Tables 4, 5). This effect occurs because the regression coefficient as a criterion of [...] as shown in (2) and because the reliability of GEBV [...]. Consequently, the bias and the reduction in reliability of GEBV due to preselection were alleviated by expanding the reference population. Therefore, cows can contribute to reducing bias and increasing reliability, both because the cow reference population is easy to expand and because cows are more recent animals than bulls. Cows were found to contribute to improving GP in breeding schemes where few bulls with traditional evaluations were added annually [22,23]. Older bulls may contribute only slightly to increasing genomic reliability because of linkage decay between the validation and ancestral populations, resulting in r_g < 1.0 between bulls and cows and lowering reliability [7]. The young selection candidates are more closely related to the animals in the reference population when the reference population consists of cows or a combination of bulls and cows instead of bulls only [7].
GP is more reliable when juvenile animals share their recent pedigree with animals in the reference population [24,25]. The cow reference population is easier to expand than the bull reference population, and expanding it alleviates the bias and the reduction in reliability of bulls' GEBV caused by the more intense preselection of bulls.

The value of cows compared with that of bulls in terms of the reliability of GEBV
The overall reliability in the reference population comprising both bulls and cows was not the sum of the reliabilities from those containing bulls or cows only (Table 7); this result indicated that marker information between bulls and cows was not independent, which was in agreement with the reliability results obtained from Danish cows and US bulls [26]. That is, the off-diagonal elements in (10) derived from index selection theory for the reference populations containing both bulls and cows were not zero.
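As a generic illustration of why reliabilities from two sources need not add up, the sketch below combines two predictors of the same breeding value with a two-source selection index. It is not formula (10) of this study. The scaling (unit genetic variance, and each predictor's variance equal to its covariance with the breeding value, as for BLUP-type predictors), the function name, and the input reliabilities 0.5 and 0.3 are all assumptions chosen for illustration.

```python
def index_reliability(rel_a: float, rel_b: float, cov_ab: float) -> float:
    """Reliability of the optimal index of two predictors of one breeding value.

    Assumed scaling (not from the source): var(g) = 1 and, for each predictor x,
    var(x) = cov(x, g) = its reliability. cov_ab is the off-diagonal element
    cov(x_a, x_b). Returns c' P^-1 c for the 2x2 system, written out by hand.
    """
    det = rel_a * rel_b - cov_ab ** 2
    if det <= 0.0:
        raise ValueError("predictor covariance matrix must be positive definite")
    num = rel_a ** 2 * rel_b - 2.0 * rel_a * rel_b * cov_ab + rel_b ** 2 * rel_a
    return num / det


rel_a, rel_b = 0.5, 0.3  # hypothetical reliabilities of the two sources

# Zero off-diagonal: the index reliability is simply the sum.
print(round(index_reliability(rel_a, rel_b, 0.0), 3))            # 0.8

# Non-zero off-diagonal (here rel_a * rel_b, i.e. independent prediction
# errors around the same breeding value): less than the sum.
print(round(index_reliability(rel_a, rel_b, rel_a * rel_b), 3))  # 0.588
```

With a zero off-diagonal the two reliabilities add to 0.8, whereas with the non-zero off-diagonal the combined value falls to about 0.59, mirroring the qualitative pattern of non-additivity described for Table 7.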
The number of bulls equal to a cow in terms of the reliability of GEBV, as calculated from (12) in the reference population containing both bulls and cows, agreed with the number computed from (9) in a reference population created solely from bulls or cows. This agreement arises because the increase in reliability due to the addition of cows to a bulls-only population was converted to the increase per head in the bulls-only population, and the numbers of bulls only and of cows only needed to yield that increase were then compared. Consequently, the number of bulls equal to a cow in terms of the reliability of the combined reference population can be computed by using the simple formula of (9) applied to reference populations created solely from bulls or cows. Cows are, in general, selected much closer to randomly than bulls; consequently, the effect of preselection on the decrease in reliability and on the bias of GEBV would be much smaller for cows than for bulls.