Anim Biosci 2018; 31(1)
Kong, Anthony, Rowland, Khatri, and Kong: Genome re-sequencing to identify single nucleotide polymorphism markers for muscle color traits in broiler chickens



Meat quality, including muscle color, is an important trait in chickens, and continuous selective pressure for fast growth and high yield has negatively impacted it. This study was conducted to investigate the genetic variations responsible for regulating muscle color.


Whole genome re-sequencing using the Illumina HiSeq paired-end read method was performed on pooled DNA samples isolated from two broiler chicken lines divergently selected for muscle color (high muscle color [HMC] and low muscle color [LMC]) along with their random bred control line (RAN). Sequencing reads were aligned to the chicken reference genome sequence for Red Jungle Fowl (Galgal4) by reference-based genome alignment with the NGen program of the Lasergene software package. Potentially causal single nucleotide polymorphisms (SNPs) showing non-synonymous changes in coding DNA sequence regions were chosen in each line. Bioinformatic analyses to interpret the functions of genes retaining SNPs were performed using Ingenuity Pathway Analysis (IPA).


Millions of SNPs were identified, and a total of 2,884 SNPs (1,307 for HMC and 1,577 for LMC) showing >75% SNP rates could induce non-synonymous mutations in amino acid sequences. Of those, filtering for SNPs with read depths over 10 yielded 15 more reliable SNPs, 1 for HMC and 14 for LMC. The IPA analyses suggested that meat color in chickens is associated with chromosomal DNA stability, ubiquitylation (UBC) functions, and the quality and quantity of various collagen subtypes.


In this study, various potential genetic markers showing amino acid changes were identified in lines differing in meat color; these markers can be used in further animal selection strategies.


Meat quality in chickens is an important trait and includes pH, meat color, drip loss, tenderness, intramuscular fat content, and other fat traits such as the contents and proportions of abdominal and subcutaneous fat. Modern broilers grow very fast due to genetic selection, efficient production systems, improved nutrition and regular veterinary attention. However, selection for fast growth and high yield may have negatively impacted qualities of the meat [1]. The elucidation of the molecular mechanisms underlying meat quality traits in chickens will have both biological and economic consequences.
Quantitative trait loci (QTLs) for many traits in chickens have been studied for over 20 years to identify genetic markers and functional factors for physiological mechanisms. A variety of QTLs for meat quality traits and abdominal fat traits have been detected across various genomic regions [2]. These QTLs were detected by linkage analysis and by candidate gene analysis. Both methods have limitations. QTL regions identified by linkage analysis are generally large and require subsequent fine mapping to identify closely linked markers or causative variants. Candidate gene analysis, which selects genes based on putative physiological roles, may miss novel genes or pathways that influence the target traits [3]. Recently, a genome-wide association study (GWAS) in an F2 broiler resource population identified 14 new genes for several meat quality traits [4]. Most of the regions identified by GWAS were found in regulatory regions, which makes the identification of causal mutations (e.g. coding sequence changes, deletions, or duplications) difficult.
Two broiler chicken lines divergently selected for muscle color (high muscle color [HMC] and low muscle color [LMC]), along with their random bred control line (RAN), have been developed at the University of Arkansas, Fayetteville, Arkansas, United States [4,5]. The HMC and LMC lines have undergone 8 generations of selection based on muscle color and clearly show different muscle qualities [5]. After 8 generations of divergent selection, the HMC line showed an L* value ~4.2 higher than the parental RAN line, while the LMC line showed an L* value ~2.8 lower than the RAN line [5]. Compared directly, the L* value for the HMC line (53.91±0.28) was ~7.1 higher than that for the LMC line (46.86±0.20). This result clearly showed that meat color differs between the HMC and LMC lines.
In this study, whole genome re-sequencing using the Illumina sequencing platform was performed to investigate genetic variation underlying muscle color expression in the HMC and LMC lines compared with their parental random bred line, RAN. Millions of single nucleotide polymorphisms (SNPs) were identified by genome re-sequencing, and only potentially causal genes containing non-synonymous mutations, which can induce amino acid changes in proteins, are detailed in this study.


Genetic lines and Illumina sequencing

HMC and LMC lines were formed by divergent selection for muscle quality from a random bred control line (RAN) that has been maintained by N. B. Anthony at the University of Arkansas. The L* value, measured on breast fillets with a Minolta CR-300 Colorimeter (Minolta Italia S.P.A, Milano, Italy), is a key visual indicator of muscle color; selection methods and line development were reported previously by Harford et al [5]. Bird populations in the 8th generation of selection were used for genome sequencing. Blood (3 mL) was collected from 12 birds per line following an animal use protocol approved by IACUC (University of Arkansas; Project number: 11025). Genomic DNA was isolated from each whole blood sample using the QiaAmp DNA mini kit (Qiagen, Valencia, CA, USA) following the manufacturer’s instructions. DNA quality was determined by agarose gel electrophoresis, and the 10 highest-quality samples in each line were pooled to represent their respective line. Library preparation and Illumina genome sequencing of the pooled DNA samples were performed by the National Center for Genome Resources (NCGR; Santa Fe, NM, USA). Illumina HiSeq system 2×100 bp paired-end read technology was used for genome sequencing.

Genome sequence assembly and data analysis

The quality of raw reads was determined using the FastQC tool kit (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and adapters were trimmed using the bbduk.sh command-line tool of the bbmap toolkit (https://sourceforge.net/projects/bbmap/). Cleaned reads were aligned to the galgal4 chicken reference genome sequence retrieved from the National Center for Biotechnology Information. For the reference-based genome alignment, the NGen genome sequence assembly program of the Lasergene software package (DNAStar, Madison, WI, USA) was used. Assembly parameters were as follows: file format, BAM; mer size, 21; mer skip query, 2; minimum match percentage, 93; maximum gap size, 6; minimum aligned length, 35; match score, 10; mismatch penalty, 20; gap penalty, 30; SNP calculation method, diploid Bayesian; minimum SNP percentage, 5; SNP confidence threshold, 10; minimum SNP count, 1; minimum base quality score, 5. After assembly, the SeqMan Pro program of the Lasergene package was used for further analyses including SNP data.

SNP detection and analysis

SNP calling and further analysis followed a previous report [6]. Briefly, the JMP Genomics program (SAS Institute Inc., Cary, NC, USA) was used to filter unique SNPs for HMC and LMC chickens. SNPs occurring in both HMC and LMC lines were filtered out, leaving unique SNPs for each line with a relatively high SNP quality call (Q call >40 out of a maximum of 60). To call highly fixed and homozygous SNPs, SNP percentage (SNP%) was the next filtering point: SNPs below 75% were not considered (75% corresponds to, for example, 3 SNP reads at a read depth of 4). The cutoffs of Q call >40 and SNP% >75% were set by considering potential sequencing errors that can be generated by the massively parallel sequencing method. Then, line-specific SNPs were classified as parental (shared with RAN) or non-parental mutations. To focus on novel mutations that differ from parental genotypes, non-parental SNPs were chosen in each line. Potentially causal SNPs showing non-synonymous changes in coding DNA sequence (CDS) regions were chosen in each line. Because the read depth of many SNPs was low, unique SNPs with read depths over 10 were considered reliable SNPs specific to each genetically selected line. Reliable, mutational, and causal SNPs chosen by the criteria described above were confirmed by double-checking the raw assembly data in the alignment view to reduce false positives.
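The filtering cascade described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the JMP Genomics workflow used in the study: the `Snp` record, its field names, the `select_candidates` function, and the demo contig labels are all hypothetical. The SNP% and depth cutoffs are treated as inclusive here because Table 1 lists SNPs at exactly 0.75 and at depth 10, and the order of the filters is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """Hypothetical SNP record; field names are illustrative only."""
    line: str          # "HMC" or "LMC"
    contig: str
    pos: int
    called_base: str
    q_call: float      # SNP quality call (maximum 60 in the study)
    depth: int         # reads covering the position
    snp_count: int     # reads carrying the called base
    in_parental: bool  # True if the same SNP is also present in RAN
    effect: str        # e.g. "non_synonymous", "synonymous"

    @property
    def snp_pct(self) -> float:
        return 100.0 * self.snp_count / self.depth


def select_candidates(snps, min_q=40.0, min_pct=75.0, min_depth=10):
    """Keep line-unique, non-parental, non-synonymous SNPs passing all cutoffs."""
    def key(s):
        return (s.contig, s.pos, s.called_base)

    # Collect SNP keys per line so SNPs shared by HMC and LMC can be dropped.
    by_line = {}
    for s in snps:
        by_line.setdefault(s.line, set()).add(key(s))
    shared = by_line.get("HMC", set()) & by_line.get("LMC", set())

    return [
        s for s in snps
        if key(s) not in shared           # unique to one selected line
        and s.q_call > min_q              # Q call > 40
        and s.snp_pct >= min_pct          # SNP% cutoff (assumed inclusive)
        and not s.in_parental             # non-parental segregation
        and s.effect == "non_synonymous"  # amino acid change in CDS
        and s.depth >= min_depth          # depth cutoff (assumed inclusive)
    ]


# Made-up demo records (not data from the study).
demo = [
    Snp("HMC", "chr2", 100, "C", 55, 10, 9, False, "non_synonymous"),
    Snp("LMC", "chr1", 200, "A", 55, 4, 3, False, "non_synonymous"),    # low depth
    Snp("HMC", "chr3", 300, "T", 55, 12, 10, False, "non_synonymous"),  # shared
    Snp("LMC", "chr3", 300, "T", 55, 12, 10, False, "non_synonymous"),  # shared
    Snp("LMC", "chr5", 400, "G", 55, 12, 9, True, "non_synonymous"),    # parental
]
kept = select_candidates(demo)
print([(s.line, s.contig, s.pos) for s in kept])
```

Only the first demo record survives every filter; each of the others fails exactly one criterion, which makes the role of each step easy to see.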

Bioinformatic analyses

The effect of SNP variants (non-overlapping, >40 Q call, >75 SNP%, >10 depths and non-synonymous changes) on protein functionality was determined using the Ensembl Variant Effect Predictor (http://useast.ensembl.org/info/docs/tools/vep/index.html). Functional interpretation of genes retaining SNPs was carried out in the context of gene ontology and molecular networks using Ingenuity Pathway Analysis (IPA; Qiagen; www.ingenity.com). Since IPA is based on human and mouse bioinformatics, functionalities for selected genes in the chicken were interpreted based primarily on mammalian biological mechanisms. The number of molecules in the network was limited to 35, leaving only the most important ones based on the number of connections from each focus gene (a subset of uploaded significant genes having direct interactions with other genes in the database) to other significant genes [7].


Genome re-sequencing for HMC, LMC, and parental RAN line chickens

Genome sequencing of pooled DNA from 10 chickens per line yielded 5.1×, 6.5×, and 8.2× genome coverage for the HMC, LMC, and RAN lines, respectively. The total number of SNPs was 5.3, 6.0, and 4.4 million (~0.5% of template genome) in the HMC, LMC, and RAN genomes, respectively (data not shown). The large number of SNPs per chicken line reflects a minimum read coverage depth of 2 (number of read counts per nucleotide location). To identify genetic biomarkers responsible for regulating muscle color, high quality (Q call >40) SNPs found uniquely in HMC or LMC were selected by removing SNPs shared between HMC and LMC. Then, possibly fixed mutations showing >75% SNP rates were chosen as reliable marker SNPs. As a result, a total of 1,134,655 unique SNPs were identified throughout the HMC and LMC chicken genomes. The number of SNPs in each chromosome is shown in Figure 1. The HMC line (Figure 1A) showed 576,886 SNPs including 229,415 parental and 347,471 non-parental segregations, while the LMC line (Figure 1B) showed 557,769 SNPs including 253,055 parental and 304,714 non-parental segregations (information for all SNPs for HMC and LMC will be provided upon request). When SNPs were grouped by feature type of chromosome region, ~50% of SNPs were in intergenic (heterochromatic) regions and 26,800 SNPs were found in CDS (protein coding regions). Around 70% of SNPs in CDS regions were synonymous mutations that do not induce amino acid changes. A total of 2,884 SNPs (1,307 for HMC and 1,577 for LMC) could induce non-synonymous, frameshift, nonsense, or no-start mutations in proteins, suggesting that these 2,884 SNPs may be mutations that are part of the genetic components regulating muscle color in chickens.
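As a quick consistency check on the counts reported above, the following sketch re-adds the parental and non-parental SNP counts per line and derives the non-parental share. The dictionary layout and variable names are illustrative choices; only the numbers come from the text.

```python
# Unique SNP counts per selected line, as reported in the text.
counts = {
    "HMC": {"parental": 229_415, "non_parental": 347_471, "reported_total": 576_886},
    "LMC": {"parental": 253_055, "non_parental": 304_714, "reported_total": 557_769},
}

grand_total = 0
non_parental_share = {}
for line, c in counts.items():
    subtotal = c["parental"] + c["non_parental"]
    # Parental plus non-parental should reproduce the reported line total.
    assert subtotal == c["reported_total"], line
    grand_total += subtotal
    non_parental_share[line] = 100.0 * c["non_parental"] / subtotal
    print(f"{line}: {subtotal:,} unique SNPs, "
          f"{non_parental_share[line]:.1f}% non-parental")

# The two line totals should add up to the reported 1,134,655 unique SNPs.
assert grand_total == 1_134_655
print(f"Total unique SNPs: {grand_total:,}")
```

The derived shares show that the majority of line-unique SNPs in both lines are not shared with the RAN line.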
Of those 2,884 candidate SNPs associated with amino acid changes, 679 and 924 SNPs were parental (RAN line) segregations for HMC and LMC, respectively (data not shown; information for all SNPs for HMC and LMC will be provided upon request). To focus on novel mutations that differ from parental genotypes, non-parental SNPs showing non-synonymous changes in CDS regions were considered causal mutations in this study. Potentially causal SNPs with read depths over 10 (considered more reliable candidate genetic markers) were chosen for further bioinformatic analysis as described in Materials and Methods. In addition, each SNP position for reliable, mutational, and causal protein coding SNPs was re-scanned using the SeqMan Pro viewer program to reduce false positives due to possible errors in the SNP calling process (e.g. a SNP called at a position in one chicken line where the other chicken line had no read coverage). This process yielded 15 more reliable SNPs, 1 for HMC and 14 for LMC (Table 1); the genes containing these SNPs are listed in Table 2. The genes are: trafficking protein particle complex 8 (TRAPPC8) for HMC and apolipoprotein B antigen (APOB), ataxia telangiectasia mutated (ATM), coiled-coil domain containing 88A (CCDC88A; transcript variant 1), complement factor H (CFH), collagen, type XXVIII, alpha 1 (COL28A1), glycoprotein IX (GP9 [platelet]), ligase IV (LIG4, DNA, ATP-dependent), melanoma inhibitory activity family, member 3 (MIA3), microtubule associated monooxygenase, calponin and LIM domain containing 2 (MICAL2), RE1-silencing transcription factor (REST), small nuclear RNA activating complex, polypeptide 1 (SNAPC1, 43 kDa; transcript variant 1), TNF receptor-associated factor 1 (TRAF1), UTP18 small subunit processome component homolog (UTP18 [yeast]), and LOC428119 (uncharacterized protein) for LMC.
The effect of SNP variants on the protein functionality of the genes listed above was determined using the Ensembl Variant Effect Predictor. Two SNPs, found in ATM and APOB, were predicted to induce deleterious amino acid changes, while the others were tolerated mutations (data not shown). Both mutations in ATM and APOB were located in an armadillo (ARM) repeat superfamily domain (superfamily ID: SSF48371, from http://supfam.org/SUPERFAMILY/index.html).

Bioinformatic analyses with genes retaining SNPs associated with muscle color

The fifteen genes containing potentially causal SNPs, listed above, were subjected to gene network analysis using the IPA program; this analysis represents the intermolecular connections among interacting genes based on functional knowledge inputs. Among the available assay settings, the simplest setting of 35 focus molecules was used to analyze molecular gene networks, which generated only one network (Figure 2). ATM and LIG4 are associated with cellular assembly and organization. ATM, a protein kinase family member, plays a critical role in the cellular response to DNA damage by phosphorylating downstream targets such as BRCA1 (breast cancer 1, early onset) and p53 [8], in addition to a direct role in telomere maintenance [9]. LIG4 mainly functions in the repair of DNA double-strand breaks by the non-homologous end joining process, which maintains genome stability [10]. Moreover, CCDC88A, also known as Akt-phosphorylation enhancer (APE), can decrease DNA synthesis with suppression of protein kinase B phosphorylation [11]. Taken together, potentially causal amino acid mutations in molecules involved in DNA synthesis, telomere maintenance and DNA damage responses appear to be associated with muscle color expression in broiler chickens. However, the direct connection between phenotypic expression of muscle color and basic cellular functions for chromosomal DNA integrity, cellular structure, and proliferation is still unclear, and further investigations are needed to discover genetic markers and to understand the molecular regulation of muscle color expression.
Molecules in the network, such as APOB, ATM, MIA3, MICAL2, REST, TRAF1, UTP18, TRAPPC8, CCDC88A, and SNAPC1, were shown to bind directly to UBC (ubiquitin C) by proteomic analyses of ubiquitylated proteomes [12–15]. This suggests that various cellular functions, including protein degradation by ubiquitylation, may play a role in the expression of muscle color in chickens, though the specific role of each factor in regulating muscle color development needs further investigation. In addition, membrane-bound and secreted proteins, such as APOB, CFH, GP9, and COL28A1, were shown to be associated with regulating muscle color. Secreted proteins can be used as biomarkers in biofluids for the muscle color trait. Specifically, GP9 is found in blood platelets; thus an amino acid difference in GP9 could be used as a blood biomarker to select for muscle color traits in chickens [16]. Previous GWAS results for meat quality in chickens identified 14 marker genes including a collagen gene (collagen, type 1, alpha 2) [4]. The extracellular matrix of muscle is composed mostly of proteins of the collagen family, and muscle quality can be reflected by the relative amount and distribution of collagen fibers in muscle, depending on genetic, physiological and nutritional conditions [17]. This suggests that genetic alteration in collagen genes and the content of collagen proteins may function in regulating meat quality, especially meat color.
It is hypothesized that the SNPs found in this study are the direct result of divergent selection for L*; however, one cannot ignore the potential impact of random genetic drift. Each research line has been maintained with 24 sires and 3 dams per sire, providing an effective population size of 72. This is sufficient to manage inbreeding and significant impacts of drift in research populations [18]. It is possible, however, that sampling error was introduced by the initial pooling of 10 samples per line. These results could be further validated by testing the potential marker candidates identified by genome-wide screening in a larger number of birds.
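The effective population size quoted above is consistent with Wright's standard formula for unequal numbers of breeding males and females, Ne = 4·Nm·Nf / (Nm + Nf). The text does not state which formula was used, so the sketch below is an assumption that happens to reproduce the reported value; the function name is illustrative, and the expected inbreeding increment per generation, 1/(2·Ne), is a textbook approximation added for context.

```python
def effective_population_size(n_sires: int, n_dams: int) -> float:
    """Wright's formula for Ne with unequal numbers of breeding males and females."""
    return 4 * n_sires * n_dams / (n_sires + n_dams)


sires = 24
dams = sires * 3  # 3 dams per sire, as stated in the text

ne = effective_population_size(sires, dams)
delta_f = 1 / (2 * ne)  # approximate inbreeding increment per generation

print(f"Ne = {ne:.0f}")                       # Ne = 72
print(f"Expected dF per generation = {delta_f:.4f}")
```

With 24 sires and 72 dams the formula gives exactly 72, and the corresponding inbreeding increment is below 1% per generation under the usual idealized assumptions (random mating, no selection on family size).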
In this study, various potential genetic markers showing amino acid changes were identified in the differential meat color lines, HMC and LMC, through genome re-sequencing. From a functional standpoint, based on the interpretation of the factors involved, meat color in chickens appears to be associated with chromosomal DNA stability, ubiquitylation (UBC) functions, and the quality and quantity of various collagen subtypes in muscle. Since SNPs spread throughout genomic regions can influence the expression of muscle color by various mechanisms, such as altered gene expression (e.g. SNPs in promoter regions may regulate the expression of mRNAs and proteins), the genes listed in this study may not cover all potentially causal mutations for muscle color change. However, non-synonymous mutations induced by SNPs in protein coding regions, which were the main focus of this study, may be critical determinants of protein structure and function for the expression of muscle color. In this regard, the SNPs causing amino acid changes were analyzed first in this study, and further studies will be conducted to characterize SNPs found in other feature regions of the genome. Additionally, functional validation studies for the candidate factors will follow, using the muscle color chicken lines.



We certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript.


Licenses for the Lasergene software package (DNAStar, Madison, WI, USA) and the JMP Genomics program (SAS Institute Inc., Cary, NC, USA) were provided to the Cell and Molecular Biology Graduate Program by the University of Arkansas (Fayetteville, AR, USA). This work was supported by the Arkansas Bioscience Institute and the Arkansas Agricultural Experimental Station.

Figure 1
Number of unique single nucleotide polymorphisms (SNPs) per chromosome found in the high muscle color (HMC) (A) and low muscle color (LMC) (B) lines. Open bars indicate the number of SNPs derived from the parental line (RAN), while solid bars indicate the number of SNPs not shared with the parental line (RAN).
Figure 2
Gene network for genes containing 15 single nucleotide polymorphisms (SNPs) causing amino acid changes. Molecular interactions among important focus molecules are displayed. Gray symbols show the genes found in the list of SNPs while white symbols indicate neighboring genes, which are functionally associated, but not included in the gene list of SNP. Symbols for each molecule are presented according to molecular functions and type of interactions. Gene network image was created by IPA analysis.
Table 1
The 15 reliable marker single nucleotide polymorphisms inducing amino acid changes showing over 10 read depths
| Trait | Contig ID | Chr | Ref Pos | Ref base | Called base | SNP % (fraction) | Feature name | DNA change | Amino acid change | Depth | A Cnt | C Cnt | G Cnt | T Cnt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HMC | NC_006089 | 2 | 105986273 | T | C | 0.9 | TRAPPC8 | c.3032A>G | p.H1011R | 10 | 0 | 9 | 0 | -* |
| LMC | NC_006088 | 1 | 139473411 | C | A | 0.82 | LIG4 | c.1166G>T | p.T389K | 11 | 9 | -* | 0 | 0 |
| LMC | NC_006088 | 1 | 179548486 | C | G | 0.82 | ATM | c.2998G>C | p.A1000P | 11 | 0 | -* | 9 | 0 |
| LMC | NC_006088 | 1 | 193745605 | A | G | 0.8 | LOC428119 | c.118T>C | p.I40V | 10 | -* | 0 | 8 | 0 |
| LMC | NC_006089 | 2 | 24867966 | T | G | 0.8 | COL28A1 | c.1754A>C | p.Q585P | 10 | 0 | 0 | 8 | -* |
| LMC | NC_006090 | 3 | 58108 | G | T | 0.75 | CCDC88A | c.5215C>A | p.L1739I | 12 | 0 | 0 | -* | 9 |
| LMC | NC_006090 | 3 | 17375493 | C | A | 0.91 | MIA3 | c.4762G>T | p.A1588S | 11 | 10 | -* | 0 | 0 |
| LMC | NC_006090 | 3 | 101892286 | C | T | 0.9 | APOB | c.6704G>A | p.S2235N | 10 | 0 | -* | 0 | 9 |
| LMC | NC_006091 | 4 | 48626849 | A | G | 0.75 | REST | c.2591T>C | p.V864A | 12 | -* | 0 | 9 | 0 |
| LMC | NC_006092 | 5 | 7442160 | C | A | 0.82 | MICAL2 | c.4632G>T | p.Q1544H | 11 | 9 | -* | 0 | 0 |
| LMC | NC_006092 | 5 | 53874276 | A | G | 0.8 | SNAPC1 | c.1051T>C | p.S351P | 10 | -* | 0 | 8 | 0 |
| LMC | NC_006095 | 8 | 2656043 | T | C | 0.8 | CFH | c.1465A>G | p.K489E | 10 | 0 | 8 | 0 | -* |
| LMC | NC_006099 | 12 | 5105832 | G | A | 0.8 | GP9 | c.23C>T | p.A8V | 10 | 8 | 0 | -* | 0 |
| LMC | NC_006104 | 17 | 8316335 | C | T | 0.85 | TRAF1 | c.98C>T | p.R33Q | 13 | 0 | -* | 0 | 11 |
| LMC | NC_006105 | 18 | 5075503 | A | C | 0.83 | UTP18 | c.1288T>G | p.M430L | 12 | -* | 10 | 0 | 0 |

* - denotes reference base of Red Jungle Fowl.
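The SNP % column in Table 1 appears to be the count of reads carrying the called base divided by the read depth, expressed as a fraction. The sketch below recomputes that ratio from the depth and called-base counts in the table and compares it with the reported value to two decimals. The tuple layout and the names `table1`, `snp_fraction`, and `mismatches` are illustrative, and the interpretation of the column is an assumption inferred from the numbers.

```python
# (gene, depth, reads with the called base, reported SNP fraction) from Table 1
table1 = [
    ("TRAPPC8", 10, 9, 0.90),
    ("LIG4", 11, 9, 0.82),
    ("ATM", 11, 9, 0.82),
    ("LOC428119", 10, 8, 0.80),
    ("COL28A1", 10, 8, 0.80),
    ("CCDC88A", 12, 9, 0.75),
    ("MIA3", 11, 10, 0.91),
    ("APOB", 10, 9, 0.90),
    ("REST", 12, 9, 0.75),
    ("MICAL2", 11, 9, 0.82),
    ("SNAPC1", 10, 8, 0.80),
    ("CFH", 10, 8, 0.80),
    ("GP9", 10, 8, 0.80),
    ("TRAF1", 13, 11, 0.85),
    ("UTP18", 12, 10, 0.83),
]


def snp_fraction(called_count: int, depth: int) -> float:
    """Fraction of reads at a position that carry the called (non-reference) base."""
    return called_count / depth


# Rows whose recomputed fraction differs from the reported value by more
# than two-decimal rounding error.
mismatches = [
    gene for gene, depth, called, reported in table1
    if abs(snp_fraction(called, depth) - reported) > 0.005
]

for gene, depth, called, reported in table1:
    print(f"{gene:10s} {called:2d}/{depth:2d} = {snp_fraction(called, depth):.2f} "
          f"(reported {reported})")
print("Rows that disagree with the reported value:", mismatches)
```

Under this reading, every row is consistent with the reported value, and the lowest fractions (0.75, for CCDC88A and REST) sit exactly at the selection cutoff.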

Table 2
Gene name and functions of genes containing amino acid changes showing over 10 depth counts in high muscle color and low muscle color chickens
| Gene | Entrez gene name | Location | Type(s) |
|---|---|---|---|
| APOB | Apolipoprotein B | Extracellular space | Transporter |
| ATM | ATM serine/threonine kinase | Nucleus | Kinase |
| CCDC88A | Coiled-coil domain containing 88A | Cytoplasm | Other |
| CFH | Complement factor H | Extracellular space | Other |
| COL28A1 | Collagen, type XXVIII, alpha 1 | Extracellular space | Other |
| GP9 | Glycoprotein IX (platelet) | Plasma membrane | Other |
| LIG4 | Ligase IV, DNA, ATP-dependent | Nucleus | Enzyme |
| MIA3 | Melanoma inhibitory activity family, member 3 | Cytoplasm | Other |
| MICAL2 | Microtubule associated monooxygenase, calponin and LIM domain containing 2 | Cytoplasm | Enzyme |
| REST | RE1-silencing transcription factor | Nucleus | Transcription regulator |
| SNAPC1 | Small nuclear RNA activating complex, polypeptide 1, 43kDa | Nucleus | Other |
| TRAF1 | TNF receptor-associated factor 1 | Cytoplasm | Other |
| TRAPPC8 | Trafficking protein particle complex 8 | Cytoplasm | Transporter |
| UTP18 | UTP18 small subunit (SSU) processome component homolog (yeast) | Nucleus | Other |
| LOC428119 | Uncharacterized | N/A | N/A |


1. Dransfield E, Sosnicki AA. Relationship between muscle growth and poultry meat quality. Poult Sci 1999; 78:743–6.
2. Chicken QTLdb [Internet]. Ames, IA, USA: NAGRP Bioinformatics Team; [2017 June 15]. Available from: http://www.animalgenome.org/cgi-bin/QTLdb/GG/index

3. Fan B, Du ZQ, Gorbach DM, Rothschild MF. Development and application of high-density SNP arrays in genomic studies of domestic animals. Yi Chuan Xue Bao 2010; 23:833–47.
4. Sun Y, Zhao G, Liu R, et al. The identification of 14 new genes for meat quality traits in chicken using a genome-wide association study. BMC Genomics 2013; 14:458.
5. Harford ID, Pavlidis HO, Anthony NB. Divergent selection for muscle color in broilers. Poult Sci 2014; 93:1059–66.
6. Jang HM, Erf GF, Rowland KC, Kong BW. Genome resequencing and bioinformatic analysis of SNP containing candidate genes in the autoimmune vitiligo Smyth line chicken model. BMC Genomics 2014; 15:707.
7. Kong BW, Lee JY, Bottje WG, et al. Genome-wide differential gene expression in immortalized DF-1 chicken embryo fibroblast cell line. BMC Genomics 2011; 12:571.
8. Shiloh Y, Ziv Y. The ATM protein kinase: regulating the cellular response to genotoxic stress, and more. Nat Rev Mol Cell Biol 2013; 14:197–210.
9. Metcalfe JA, Parkhill J, Campbell L, et al. Accelerated telomere shortening in ataxia telangiectasia. Nat Genet 1996; 13:350–3.
10. Lieber MR, Ma Y, Pannicke U, Schwarz K. Mechanism and regulation of human non-homologous DNA end-joining. Nat Rev Mol Cell Biol 2003; 4:712–20.
11. Anai M, Shojima N, Katagiri H, et al. A novel protein kinase B (PKB)/ AKT-binding protein enhances PKB kinase activity and regulates DNA synthesis. J Biol Chem 2005; 280:18525–35.
12. Danielsen JM, Sylvestersen KB, Bekker-Jensen S, et al. Mass spectrometric analysis of lysine ubiquitylation reveals promiscuity at site level. Mol Cell Proteomics 2011; 10:M110.003590.
13. Emanuele MJ, Elia AE, Xu Q, et al. Global identification of modular cullin-RING ligase substrates. Cell 2011; 147:459–74.
14. Kim W, Bennett EJ, Huttlin EL, et al. Systematic and quantitative assessment of the ubiquitin-modified proteome. Mol Cell 2011; 44:325–40.
15. Wagner SA, Beli P, Weinert BT, et al. A proteome-wide, quantitative survey of in vivo ubiquitylation sites reveals widespread regulatory roles. Mol Cell Proteomics 2011; 10:M111.013284.
16. Modderman PW, Admiraal LG, Sonnenberg A, von dem Borne AE. Glycoproteins V and Ib-IX form a noncovalent complex in the platelet membrane. J Biol Chem 1992; 267:364–9.
17. McCormick RJ. Extracellular modifications to muscle collagen: implications for meat quality. Poult Sci 1999; 78:785–91.
18. Gowe RS, Robertson A, Latter BDH. Environment and poultry breeding problems 5. The design of poultry control strains. Poult Sci 1959; 38:462–71.

