Whole Genome Association Study to Detect Single Nucleotide Polymorphisms for Behavior in Sapsaree Dog (Canis familiaris)
Abstract
The purpose of this study was to characterize the genetic architecture of behavior patterns in Sapsaree dogs. The breed population (n = 8,256) has been built up since 1990 over 12 generations and is managed at the Sapsaree Breeding Research Institute, Gyeongsan, Korea. Seven behavioral traits were investigated in 882 individuals. The traits were classified into a quantitative or a categorical group, and heritabilities (h2) and variance components were estimated under the Animal model using the ASReml 2.0 software. In general, the h2 estimates of the traits ranged between 0.00 and 0.16. Strong genetic (rG) and phenotypic (rP) correlations were observed among nerve stability, affability and adaptability, i.e. 0.9 to 0.94 and 0.46 to 0.68, respectively. To detect significant single nucleotide polymorphisms (SNPs) for the behavioral traits, a total of 134 and 60 samples were genotyped using the Illumina 22K CanineSNP20 and 170K CanineHD bead chips, respectively. Two data sets comprising 60 (Sap60) and 183 (Sap183) samples were analyzed, of which the latter was based on the SNPs that were embedded on both the 22K and 170K chips. For the genome-wide association analysis, each SNP was tested against the residuals of each phenotype after adjustment for sex and year of birth as fixed effects. A least squares based single marker regression analysis was followed by a stepwise regression procedure for the significant SNPs (p<0.01) to determine a best set of SNPs for each trait. A total of 41 SNPs were detected with the Sap183 samples for the behavior traits. The significant SNPs need to be verified using other samples before they can be used to improve behavior traits via marker-assisted selection in the Sapsaree population.
INTRODUCTION
Among domestic animals, the dog holds a significant position in human society because of its close behavioral relationship with humans. Moreover, the dog is one of the most diverse domestic species in terms of morphology and behavior (Wayne and Ostrander, 1999). Although each breed is generally uniform in behavior and morphology, individuals within a breed have diverse heritages because of different haplotypes related to the traits (Vila et al., 1997), as supported by analyses of protein alleles (Ferrell et al., 1978) and of hypervariable microsatellite loci (Fredholm and Wintero, 1995).
Currently, there are more than 400 registered dog breeds around the world that have been bred for various purposes, e.g. hunting, guarding, guiding the blind, pets, etc. The Sapsaree is one of the aboriginal breeds of Korea, of medium body size with body height ranging from 49 to 55 cm (Kim et al., 2001). The adult coat is long and abundant, with two typical color variations, i.e. blue and yellow. Sapsaree dogs are very gentle, protective and loyal to their owners. Generally, the dogs are not aggressive, but they express aggression if other dogs enter their territory. The Sapsaree population was close to extinction during the Japanese colonization. Afterwards, in 1986, eight individuals with color and body conformation similar to the Sapsaree breed were collected across the country by local Sapsaree lovers in Daegu, Korea. Subsequently, systematic mating and reproduction generated the current population of about 3,000 individuals, including five hundred at the Sapsaree Breeding Research Institute in Gyeongsan, Gyeongbuk province.
Recent sequencing technologies allowed the sequencing of the entire canine genome (Eggen, 2012), which provides abundant genetic resources to search for genetic variants underlying diseases and phenotypes such as body size (Patterson et al., 1982; Ostrander et al., 1997; Galibert et al., 1998). Whole genome association (WGA) studies are now routinely practiced owing to the availability of high density canine single nucleotide polymorphism (SNP) chips such as the Illumina SNP arrays (Karlsson and Lindblad-Toh, 2008; Spady and Ostrander, 2008). Further, most dog breeds are less than 200 years old and thus have high linkage disequilibrium (LD) and long haplotype blocks (Sutter et al., 2004), which enables WGA studies with lower density marker maps than in humans (Sutter and Ostrander, 2004; Karlsson and Lindblad-Toh, 2008). According to Karlsson and Lindblad-Toh (2008), about 15,000 SNPs proved to be sufficient for WGA mapping. Since the first report of a WGA study in dogs by Karlsson et al. (2007), numerous WGA studies in dogs have been carried out for disease traits. However, there have been few reports of WGA analyses of morphology and behavior.
The objectives of this study were to genetically characterize the Sapsaree breed and to detect SNPs or quantitative trait loci (QTL) for various behavior traits by WGA analysis in the Sapsaree population.
MATERIALS AND METHODS
Pedigree and phenotypes
The pedigree of the Sapsaree population (n = 8,256) was constructed from 1989 to 2007. A set of 1,014 individuals was recorded for morphology and behavior traits. The behavior tests were first introduced in 1998 and were carried out once a year. Each animal was tested only once for nerve stability (NST), affability (AFB), wariness (WRN), adaptability (ADP), sharpness (SRP), activity (ACT; also termed energy level or temperament), and reactions during blood drawing (RBD). Behavior definitions and measurements are described in Tables 1 and 2. All individuals went through a sufficient adjustment period before they underwent the behavior tests, and the behavior measures were recorded mainly by one experimenter. The dogs' behaviors were graded by intensity, with the lowest to the greatest scores reflecting the least to the most desired expression, respectively.
Variance component estimation
The behavior traits, which were recorded on hedonic scales, were analyzed as either quantitative or categorical measurements. For some behavior traits, the records were transformed to a binary pattern, i.e. absence (0) or presence (1), by merging two or more categories (Table 2). Sex, season of birth (summer and winter), birth year group (10 levels) and age at testing (<15, 20, 25, and >25 months) were considered as fixed factors. The SAS general linear model (GLM) procedure (Release 9.1, SAS Institute Inc., NC, USA, 1999) was then applied to determine, at the α = 0.1 level, which fixed effects to fit in the Animal model.
A restricted maximum likelihood approach was applied under the Animal model using the ASReml 2.0 software (Gilmour et al., 2001). A univariate analysis of each trait with quantitative measurements was conducted to estimate variance components and direct and maternal heritabilities (Equation 1):
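The display for Equation 1 does not appear in this text. A form consistent with the term definitions given for it, offered as a reconstruction and not as the authors' exact typesetting, would be:

```latex
Y_{ijklmn} = \mu + s_i + sb_j + yb_k + ta_l + a_m + m_n + e_{ijklmn}
```

In this reading, only the fixed effects that passed the α = 0.1 screening would be retained for a given trait.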
Where, $Y$ = phenotype, $\mu$ = overall mean, $s_i$ = effect of sex, $sb_j$ = effect of season of birth, $yb_k$ = effect of year of birth, $ta_l$ = effect of age at testing, $a_m$ = animal effect with $N(0, A\sigma^2_d)$, where $A$ = the additive relationship matrix; $m_n$ = maternal genetic effect with $N(0, A\sigma^2_m)$; $e_{ijklmn}$ = the random error term with $N(0, I\sigma^2_e)$, where $I$ = the identity matrix. $\sigma^2_d$, $\sigma^2_m$, and $\sigma^2_e$ are the direct genetic, maternal genetic, and residual (error) variances, respectively. No covariance between maternal and direct genetic effects was assumed. Direct and maternal heritabilities were defined as
in which the overall heritability was:
The behavior traits with binary data formats were also modeled with a generalized linear mixed model (GLMM) using the ‘!logit’ option and then fitted under an Animal model (Equation 1). The heritabilities for categorical traits on a binary scale (0 or 1) were estimated by Equation 3 (Lynch and Walsh, 1998):
Where,
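The heritability ratios used in this subsection can be illustrated with a short sketch. It assumes the conventional definitions, in which each genetic variance is divided by a phenotypic variance taken as the sum of the direct genetic, maternal genetic and residual variances (no direct-maternal covariance, as stated for the model). The function name and the numeric inputs are hypothetical and are not estimates from this study. The sketch covers only the quantitative-scale ratios, not the binary-trait formula of Equation 3.

```python
def heritabilities(var_direct, var_maternal, var_residual):
    """Return (direct h2, maternal h2) as ratios to the phenotypic variance.

    Assumption: the phenotypic variance is the plain sum of the three
    components, i.e. no covariance between direct and maternal effects.
    """
    var_phenotypic = var_direct + var_maternal + var_residual
    if var_phenotypic <= 0:
        raise ValueError("phenotypic variance must be positive")
    return var_direct / var_phenotypic, var_maternal / var_phenotypic


# Hypothetical variance components, for illustration only (not study results).
h2_direct, h2_maternal = heritabilities(0.12, 0.03, 0.85)
print(f"direct h2 = {h2_direct:.2f}, maternal h2 = {h2_maternal:.2f}")
```

In practice the variance components would come from the REML fit, and each ratio would be reported with its standard error.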
Genome-wide association test
Genotyping was performed using two types of canine SNP chips, i.e. the CanineSNP20 BeadChip and the CanineHD BeadChip, which contain more than 22,000 and 170,000 evenly spaced and validated SNPs, respectively, derived from the CanFam2.0 assembly. Initially, a total of 134 individuals were genotyped with the CanineSNP20 BeadChip using Illumina’s Infinium Assay. Another 60 individuals were genotyped with the CanineHD BeadChip using the Infinium HD Assay Protocol (Illumina Inc., San Diego, CA, USA). The genotyping was performed at GeneSeek (Ltd.), from which the SNP data were obtained.
The SNPs were screened in three data sets, Sap134, Sap60, and Sap183, of which the first two were based on the CanineSNP20 and CanineHD BeadChip panels, respectively, and the last was derived from the SNPs that were embedded on both SNP chips. Genotyped animals were excluded if their genotype calls were entirely missing. SNPs with a minor allele frequency <0.05 or with more than 10% missing genotypes were excluded from the subsequent association tests. The genome-wide association tests were carried out mainly on the 38 autosomes with the Sap183 data set, because of its greater sample size compared with Sap60 and Sap134.
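The two SNP-level filters described above (minor allele frequency and missing-genotype rate) can be sketched as follows. This is a minimal illustration using only the Python standard library. The function names, the genotype encoding ('AA', 'AB', 'BB', with None for a missing call) and the toy data are assumptions made for the example and do not represent the pipeline used in the study.

```python
def minor_allele_frequency(genotypes):
    """MAF from genotype codes 'AA', 'AB', 'BB'; None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    b_count = sum(g.count("B") for g in called)
    freq_b = b_count / (2 * len(called))
    return min(freq_b, 1.0 - freq_b)


def missing_rate(genotypes):
    """Fraction of samples with no genotype call for this SNP."""
    return sum(g is None for g in genotypes) / len(genotypes)


def passes_qc(genotypes, min_maf=0.05, max_missing=0.10):
    """Keep a SNP only if MAF >= min_maf and missing rate <= max_missing."""
    return (minor_allele_frequency(genotypes) >= min_maf
            and missing_rate(genotypes) <= max_missing)


# Toy data: hypothetical SNP name -> genotype calls across ten dogs.
snps = {
    "snp_common": ["AA", "AB", "BB", "AB", "AA", "AB", "BB", "AA", "AB", "AB"],
    "snp_monomorphic": ["AA"] * 10,
    "snp_sparse": ["AA", "AB", None, None, "BB", "AB", "AA", "AB", "BB", "AB"],
}
kept = [name for name, calls in snps.items() if passes_qc(calls)]
print(kept)
```

Here only the first SNP is retained: the second has no minor allele and the third has 20% missing calls.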
For association studies with quantitative traits, sex and year of birth were fitted as fixed factors, and the residuals for each trait were obtained in the process. A general linear model was fitted using PROC GLM in SAS (version 9.1, SAS Inst., Inc., Cary, NC, USA) to calculate the error variances.
In a random mating population with no population structure, the association between a marker and a trait can be tested with single marker regression as:
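The display for this model does not appear in this text. A form consistent with the symbol definitions given for it would be the following, where the intercept symbol $\mu$ is an assumption (the definitions mention only the vector of 1s that multiplies it):

```latex
Y = 1_n \mu + X_a \beta_a + X_d \beta_d + e
```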
Where, $Y$ is a vector of phenotypic residuals for any given trait, $1_n$ is a vector of 1s, $X_a$ and $X_d$ are coefficient matrices allocating records to the additive and dominance effects of the SNP, respectively, $\beta_a$ and $\beta_d$ are the respective coefficients of the marker effect, and $e$ is a vector of random deviates, $N(0, \sigma^2_e)$, where $\sigma^2_e$ is the error variance. For $X_a$ ($X_d$), 1, 0, and −1 (0, 1, and 0) were assigned to the SNP genotypes AA, AB, and BB, respectively. The null hypothesis is that the marker has no effect on the trait, while the alternative hypothesis is that the marker does affect the trait, owing to LD of the SNP with a QTL.
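The genotype coding and least squares fit described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the SAS procedure used in the study. The function names, the small Gaussian-elimination solver and the toy residuals are hypothetical. The sketch returns the F statistic for the joint additive and dominance effect (2 numerator degrees of freedom); converting it to a point-wise p value would need an F-distribution routine, which is left out here.

```python
ADDITIVE = {"AA": 1.0, "AB": 0.0, "BB": -1.0}   # coding for X_a
DOMINANCE = {"AA": 0.0, "AB": 1.0, "BB": 0.0}   # coding for X_d


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design: a genotype class may be absent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def single_marker_regression(genotypes, residuals):
    """Fit y = mu + beta_a*x_a + beta_d*x_d; return ([mu, beta_a, beta_d], F)."""
    rows = [[1.0, ADDITIVE[g], DOMINANCE[g]] for g in genotypes]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, residuals)) for i in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * x for c, x in zip(coefs, r)) for r in rows]
    mean_y = sum(residuals) / len(residuals)
    sse = sum((y - f) ** 2 for y, f in zip(residuals, fitted))
    sst = sum((y - mean_y) ** 2 for y in residuals)
    if sse == 0:
        return coefs, float("inf")  # perfect fit; F is unbounded
    df_error = len(residuals) - k
    f_stat = ((sst - sse) / 2) / (sse / df_error)
    return coefs, f_stat


# Toy example: nine hypothetical dogs, three per genotype class.
genotypes = ["AA", "AA", "AA", "AB", "AB", "AB", "BB", "BB", "BB"]
residuals = [1.0, 1.2, 1.1, 0.9, 0.7, 0.8, 0.1, 0.3, 0.2]
coefs, f_stat = single_marker_regression(genotypes, residuals)
print([round(c, 3) for c in coefs], round(f_stat, 2))
```

With this coding, the additive coefficient is half the difference between the two homozygote means, and the dominance coefficient is the deviation of the heterozygote mean from the homozygote midpoint.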
The variation explained by each SNP (S2SNP) was calculated as
To determine significant SNPs for the behavior traits, a significance threshold of a 1% point-wise p value from the F distribution was applied for each trait. Then, a best set of SNPs was selected from the significant SNPs using a stepwise regression procedure (Neter et al., 1990), because significant SNPs that are closely linked to each other (in LD) would yield redundant information in a marker-assisted selection program. Inclusion of each SNP in, and exclusion from, the stepwise model was determined at the 0.001 level.
RESULTS AND DISCUSSION
Heritability and correlations between behavior traits
Heritability estimates (h2) for the behavior traits are presented in Tables 3 and 4. The h2 estimates fell within a low range (0.00 to 0.17). In general, there was no noticeable difference in the estimates between the single and multiple trait analyses, except for AFB and ACT. None of the behavior traits except RBD had a large maternal h2 estimate. Nerve stability, ADP and RBD had low direct h2 estimates, i.e. close to zero.
Famula (2001) reported that heritabilities of dog behaviors were low to high (0.10 to 0.60), depending on the characters and breeds. Wilsson and Sundgren (1997), van der Waaij et al. (2008), and Ruefenacht et al. (2002) reported low to medium h2 estimates for NST (0.15 to 0.25), AFB (0.03 to 0.38), and SRP (0.09 to 0.19) in dogs. For the ACT trait studied by Wilsson and Sundgren (1997), and for related behavior traits such as temperament (van der Waaij et al., 2008; Ruefenacht et al., 2002) and energy (Bartlett, 1976), the h2 estimates ranged from 0.05 to 0.53. Reuterwall and Ryman (1973) reported h2 estimates of 0.09 to 0.17 for AFB and 0.00 to 0.04 for ADP. Goddard and Beilharz (1982) estimated an h2 of 0.10 for suspicion in Labrador retrievers. Our estimates were, in general, concordant with the above reports. Ruefenacht et al. (2002) reported that h2 estimates under quantitative measures for behaviors were lower than or similar to those under categorical measures, which was consistent with this study (Tables 3 and 4).
Wilsson and Sundgren (1997) reported small or negligible maternal effects for behaviors in old dogs. Our results showed that, in general, the maternal effects for behavior with quantitative measures were small, except for RBD (Table 3), but were moderate for AFB, ADP, and RBD with binary values (Table 4).
Genetic correlations (rG) were, in general, greater than phenotypic correlations (rP) (Table 5). Moderate positive rPs were observed between NST and AFB (0.68), between NST and ADP (0.6) and between AFB and ADP (0.62). There were strong genetic correlations among all the behavior traits, except for RBD (Table 5). More specifically, NST, AFB, WRN, and ADP were strongly genetically correlated with one another, while RBD had negative genetic correlations with other behavior traits, e.g. with ACT (−0.52).

Table 5. Phenotypic (below the diagonal) and genetic (above the diagonal) correlations among behavior traits with quantitative values
Ruefenacht et al. (2002) reported positive rPs (0.31 to 0.57) and rGs (0.34 to 0.83) among NST, SRP, and temperament (ACT in this study) in German Shepherd dogs. Van der Waaij et al. (2008) also reported ranges of 0.12 to 0.36 for rP and −0.51 to 0.64 for rG among traits comparable to those in this study (i.e., NST, AFB, SRP, and ACT), with the greatest rP (0.36) between NST and temperament and the greatest rG (0.64) between NST and AFB in German Shepherd dogs, which was similar to the results of this study (Table 5).
Significant single nucleotide polymorphisms for behavior by whole genome association studies
For the Sap183 data, a set of 15,825 SNPs was retained after the quality control tests. The number of available SNPs on each chromosome was proportional to chromosome length, e.g. 865 SNPs on CFA1 and 199 SNPs on CFA38 (results not shown). The physical map of the SNPs spanned about 2,181 Mb, with an average distance of 137.8±147.4 Kb between adjacent SNPs.
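As a quick consistency check on the reported spacing (a back-of-envelope calculation that ignores chromosome boundaries), dividing the mapped span by the number of retained SNPs gives roughly the stated average:

```latex
\frac{2{,}181 \times 10^{3}\ \text{Kb}}{15{,}825\ \text{SNPs}} \approx 137.8\ \text{Kb between adjacent SNPs}
```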
A total of 41 significant SNPs for the seven behavior traits were determined from the stepwise regression analyses (Table 6). The set of SNPs for each trait explained a large proportion of the total phenotypic variance (29% to 67%), which partly reflects overestimation of the SNP effects with the small sample size in this study (n = 183). In general, the SNPs for the behavior measures had both additive and dominance effects. However, the SNPs for WRN had mainly additive effects and those for SRP mainly dominance effects (Table 6). The SNPs were distributed across the canine genome, on 24 chromosomes (CFA). The greatest number of SNPs (6) was detected on CFA18, among which four SNPs for NST, AFB, ADP, and ACT were located at 23 or 26 Mb (Table 6). The significant SNPs in this study need to be verified using other samples before they can be used to improve behavior traits via marker-assisted selection in the Sapsaree population.
ACKNOWLEDGMENTS
This research was supported by the KNU research fund in 2012 and by the Technology Development Program for Agriculture and Forestry, Ministry of Agriculture, Forestry, and Fisheries, Republic of Korea, 2011.