Microsatellite Sequences of Mammals and Their Applications in Genome Analysis in Pigs - A Review

Microsatellites are short tandem repeats in which a monomer sequence of 1 to 6 bp is repeated several times. These repeats are thought to be generated by slipped strand mispairing. Based on the unique capability of alternating purine-pyrimidine residues to form Z-DNA, a possible role for microsatellites in gene regulation has been proposed. Microsatellites are highly polymorphic, follow Mendelian inheritance and are evenly distributed throughout the genomes of eukaryotes. They are easy to isolate, and polymerase chain reaction based typing of their alleles can be readily automated. These properties make them the preferred markers for comparing the genetic structure of closely related breeds/populations, for very high-resolution genetic mapping and for parentage testing. Microsatellites have rapidly replaced restriction fragment length polymorphism (RFLP) and random amplified polymorphic DNA (RAPD) markers in most population genetics applications in most species, including farm animals such as cattle, buffalo, goat, sheep and pigs. A growing number of reports describe the use of microsatellites in pigs, ranging from measuring genetic variation between breeds/populations and developing high-resolution genetic maps to identifying and mapping genes of biological and economic importance. (Asian-Aust. J. Anim. Sci. 2002. Vol 15, No. 12 : 1822-


INTRODUCTION
Progress in recombinant DNA technology and gene cloning during the last two decades has brought revolutionary changes to the field of genetics by providing several new approaches to genome analysis. It is now possible to uncover a large number of genetic polymorphisms at the DNA sequence level and to use them as markers for evaluating the genetic basis of observed phenotypic variability. Identification methods such as the typing of blood groups and biochemical polymorphisms have proved useful in pigs and other species (Widar et al., 1975; Tanaka et al., 1983; Oshi et al., 1990; VanZeveran et al., 1990a,b), but the discriminating power of these techniques is lower than that of DNA markers (VanZeveran et al., 1995; Goldstein and Schlotterer, 1999). Moreover, the typing can be done on only a very limited number of tissues, which is a significant limitation of such methods. Several DNA-based technologies for typing polymorphic loci have been developed in the last decade. These techniques include restriction fragment length polymorphism (RFLP), variable number of tandem repeats (VNTR), single strand conformational polymorphism (SSCP), denaturing gradient gel electrophoresis (DGGE), random amplified polymorphic DNA (RAPD) and methods that exploit the polymorphism of short tandem repeats (STR), called microsatellites.
Microsatellites are short segments of DNA in which a specific motif is tandemly repeated. They are sometimes referred to as short tandem repeats or simple repeats (Litt and Luty, 1989). Microsatellites are ubiquitous in the genomes of a wide range of organisms, and the number of repeats within many of them is highly variable. The polymerase chain reaction (PCR) provides a means for rapid analysis of the repeat number. Microsatellite polymorphisms are now used for a wide range of applications in genetics, including genetic distance studies between breeds/populations/individuals, linkage mapping of disease genes and quantitative traits, paternity testing and individual identification for selective breeding (Bruford and Wayne, 1993; Jarne and Lagoda, 1996; Goldstein and Schlotterer, 1999). This article reviews the occurrence, evolution, polymorphism and functional significance of microsatellites and their application in genome analysis in pigs.

OCCURRENCE AND DISTRIBUTION OF MICROSATELLITES
A striking feature of genomic organization in eukaryotes is that coding sequences constitute only a minor portion of the total genome, about 5 to 10 per cent in mammals (Hochgeschwender and Brennan, 1991). The apparently non-functional DNA is either single copy DNA or repetitive DNA. A schematic presentation of mammalian genome organisation is given in Figure 1. Repetitive elements may be interspersed in the genome or may occur as tandem repeats (Schmid and Jelinek, 1982). In mammals, two major groups of interspersed repetitive elements can be recognised: the short interspersed elements (SINEs) and the long interspersed elements (LINEs) (Schmid and Jelinek, 1982; Rogers, 1983; Singer and Skorownski, 1985; Singer et al., 1997). Both types of elements are proposed to have originated by the reverse flow of information through retroposition (Rogers, 1985).
Repetitive elements arranged in tandem are as common as the interspersed repeats. Tandem repeats are broadly referred to as satellite DNA, a designation derived from the satellite peaks obtained in classical DNA preparations during CsCl gradient centrifugation (Endow et al., 1975). The satellite types are further classified according to the size or location of the repeat as satellite, telomeric, minisatellite and microsatellite DNA (Brutlag, 1980; Prosser et al., 1986). Satellite DNA is characterized by huge arrays of short or long repeats spanning several million nucleotides and forms the typical centromere sequences in many mammals (Singer, 1982). Telomeric DNA, by contrast, is characteristic of the telomeric regions of chromosomes. The telomeric repeat region spans about 10-15 kb and is the site of telomerase activity (Biessmann and Mason, 1992). The third class of tandem repeats comprises minisatellite DNA (Jeffreys et al., 1985). These elements are composed of 10-60 bp units that may be repeated up to thousands of times.
The fourth class of tandem repeats is referred to as microsatellites (Goldstein and Pollock, 1997). In a microsatellite, a repeat motif of 1 to 6 base pairs is repeated up to a maximum of about 100 times. Microsatellites are abundant and evenly distributed throughout the genome, occurring about once in every 6 kb. However, some regions such as the centromeres, telomeres, nucleolar organiser regions and interstitial heterochromatin have lower densities (Starling et al., 1990; Wintero et al., 1992). The most common microsatellites in mammals are (A)n, (CA)n, (AAAT)n and (AG)n. For most motifs, short stretches of repeat units are generally more common than longer stretches (Beckmann and Weber, 1992).
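The definition above (a 1 to 6 bp motif repeated in tandem) can be turned into a simple sequence scan. The following is a minimal sketch, not a published detection tool: the function name `find_microsatellites` and the `min_repeats` threshold are illustrative assumptions, and only perfect (uninterrupted) repeats are reported.

```python
import re


def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    min_repeats is an illustrative cut-off, not a value from the review.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A motif of unit_len bases followed by at least min_repeats-1 copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated, e.g. "AA" or "CACA".
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Made-up test sequence: a (CA)12 tract and an (A)8 tract in unique flanking DNA.
example = "GGATC" + "CA" * 12 + "TTGC" + "A" * 8 + "GT"
print(find_microsatellites(example))
```

Positions are 0-based. A real genome scan would also need to handle imperfect and compound repeats, which this sketch deliberately ignores.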

EVOLUTION OF MICROSATELLITES
Simple tandem repeats are thought to be generated by slipped strand mispairing. According to this model, short deletions or insertions arise from intrahelical slippage events in short tandem repeat regions. Slipped strand mispairing, or replication slippage, refers to the out-of-register alignment of the two DNA strands following their dissociation while the DNA polymerase traverses the repetitive region (Levinson and Gutman, 1987). If the most 3' repeat unit of the nascent strand hybridizes with a complementary repeat unit downstream along the template strand, a loop forms in the nascent strand and, upon elongation, the new sequence becomes correspondingly longer than the template sequence. Conversely, if the incorrect alignment occurs upstream along the template strand, the new strand becomes shorter than the template sequence (Figure 2).
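The slippage model can be illustrated with a toy simulation in which each replication occasionally adds or removes one repeat unit. This is only a sketch under simplifying assumptions that are not taken from the review: slips change the tract by exactly one unit, gains and losses are equally likely, and the `slip_rate` values are arbitrary.

```python
import random


def replicate_with_slippage(repeat_count, slip_rate, rng, min_repeats=1):
    """Return the repeat count of the newly synthesised strand after one replication."""
    if rng.random() < slip_rate:
        # +1: loop in the nascent strand (new strand longer than the template).
        # -1: misalignment in the other direction (new strand shorter).
        step = rng.choice((+1, -1))
        return max(min_repeats, repeat_count + step)
    return repeat_count


def simulate_lineage(start_repeats, generations, slip_rate, seed=0):
    """Follow one allele through successive replications; returns the length history."""
    rng = random.Random(seed)
    count = start_repeats
    history = [count]
    for _ in range(generations):
        count = replicate_with_slippage(count, slip_rate, rng)
        history.append(count)
    return history


# Hypothetical run: a 20-unit tract followed for 1000 replications.
history = simulate_lineage(start_repeats=20, generations=1000, slip_rate=0.01)
print("alleles visited:", sorted(set(history)))
```

Because every change is a single-unit step, the alleles visited form a contiguous range of repeat counts, which is the kind of length variation the model predicts.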
Whether other mutation mechanisms contribute to length variation at microsatellite loci is a long-standing debate in the microsatellite literature (Kruglyak et al., 1998; Colson and Goldstein, 1999; Schlotterer, 2000). In theory, new length variants at repetitive DNA sequences can form through interchromosomal exchange, for example in conjunction with recombination (unequal crossing-over) or gene conversion. However, the role of recombination-like events in length mutations at microsatellite loci can be evaluated against the accumulated evidence in favor of replication slippage. Mahtani and Willard (1993) indicate the intrahelical mutation events. The rate and pattern of microsatellite mutation do not seem to differ between hemizygous chromosomes (e.g. the Y chromosome) and chromosome pairs (autosomes of diploids), suggesting that the mutation events do not require contact between homologous chromosomes (Kayser et al., 2000). Moreover, the character of the mutations is generally consistent with that arising from replication slippage.
In vitro experiments clearly demonstrate that microsatellite sequences have an intrinsic ability to undergo DNA slippage (Schlotterer and Tautz, 1992).

FUNCTIONAL SIGNIFICANCE OF MICROSATELLITES
The functional significance of microsatellites remains to be clearly understood. Several probable roles have been proposed, all more or less relating to the unique capability of alternating purine-pyrimidine residues to form Z-DNA (Hamada et al., 1984b). (GC)n or (CA)n tracts can change from B-DNA to Z-DNA in vitro in response to various environmental factors such as elevated ionic strength (Klysik et al., 1981), negative torsional stress (Hanniford and Pulleyblank, 1983), the presence of intercalators and the modification of guanine or cytosine residues (Rich et al., 1984). The identification of nuclear proteins that preferentially bind Z-DNA, and the in situ application of antibodies raised against Z-DNA, provide indirect evidence for the possible existence of Z-DNA in vivo (Nordheim et al., 1982).
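Which repeat motifs give an alternating purine-pyrimidine tract can be checked mechanically. The helper below is a hypothetical illustration: it only tests the alternation pattern and says nothing about whether a given tract actually adopts the Z conformation.

```python
PURINES = {"A", "G"}


def alternates_purine_pyrimidine(motif):
    """True if tandem copies of the motif strictly alternate purine and pyrimidine."""
    # Two copies are inspected so that the junction between repeat units is checked too.
    tract = motif.upper() * 2
    is_purine = [base in PURINES for base in tract]
    return all(a != b for a, b in zip(is_purine, is_purine[1:]))


for motif in ("GC", "CA", "A", "AG", "AAAT"):
    print(motif, alternates_purine_pyrimidine(motif))
```

Of the motifs tested here, only (GC)n and (CA)n alternate; (A)n, (AG)n and (AAAT)n do not.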
Once the Z-DNA structure had been demonstrated for alternating purine-pyrimidine residues, the involvement of this structure in the recombination process became a tempting suggestion (Jelinek et al., 1980). The stickiness of left-handed DNA could facilitate the correct pairing of homologous chromosomes during meiotic recombination (Blaho and Wells, 1989). The DNA sequences associated with synaptonemal complexes are rich in microsatellite sequences (Pearlman et al., 1992). The presence of microsatellite loci may also affect reciprocal meiotic exchange (Schultes and Szostak, 1991). The d(CA)n·d(GT)n microsatellites have been demonstrated to inhibit RecA-promoted strand exchange in vitro (Gendrel et al., 2000).
The possible association between simple repeats and gene regulation was prompted by the observation that a Z-DNA segment is present in the SV40 enhancer (Nordheim and Rich, 1983). An inserted (CA)15 repeat increases gene expression in vitro. Like the viral enhancers, it increases gene expression from a distance, is more effective near the promoter and works irrespective of orientation (Hamada et al., 1984a). However, if simple repeats do play a role in gene regulation, their influence is not uniform. The activation or inactivation of a gene by microsatellite repeats may be associated with their ability to interconvert between the B and Z forms of DNA in vivo (Santoro et al., 1984; Naylor and Clark, 1990). Such interconversions would enhance distortion of the DNA at a proximal or distal site, resulting in activation of gene expression (Albanase et al., 2001).
Simple tandem repeats are absent in bacteria and are more frequent in euchromatin than in heterochromatin (Stallings, 1992). Their conformational properties may provide suitable conditions for the repeated packaging and condensing of DNA during cell cycles (Stallings et al., 1991). These highly repetitive and conserved sequences might also function as a repository of unessential DNA sequences for use in the future evolution of the species. It is also possible that they have no function at all and are just "junk" DNA that is carried along by the replication and segregation of chromosomes. The validity of these postulated functions of microsatellites needs further investigation.

MICROSATELLITE POLYMORPHISM
Compared with most other types of DNA sequences, microsatellites are highly polymorphic, which makes them attractive as genetic markers (Goldstein and Schlotterer, 1999). For neutrally evolving DNA sequences, the amount of polymorphism is expected to be directly proportional to the mutation rate (Kimura, 1983). Hence, it has been assumed that the rate of mutation to new length variants at microsatellite loci is appreciable, and this idea has been substantiated by direct observation of spontaneous germline mutation events in pedigree analysis (Weber and Wong, 1993; Goldstein and Pollock, 1997).
Eukaryotic DNA sequences mutate at a rate of approximately 10⁻⁹ per nucleotide per generation (Crow, 1993). The mutation rate of microsatellites is several orders of magnitude higher, often quoted in the range of 10⁻³ to 10⁻⁴ per locus per generation (Weber and Wong, 1993). More recent data from large-scale human genome mapping and paternity testing based on many different loci suggest an even higher rate of about 2×10⁻³ per meiosis (Ellegren, 2000a; Kayser et al., 2000).
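To make the link between mutation rate and polymorphism concrete, the sketch below compares the quoted rates and evaluates the classical equilibrium heterozygosity expected under a strict stepwise mutation model, H = 1 - 1/sqrt(1 + 8·Ne·μ). The model and the effective population size `ne` are assumptions introduced here for illustration; the review itself does not specify either. Note also that the two rates are per nucleotide and per locus respectively, so the comparison is only indicative.

```python
import math


def orders_of_magnitude(rate_a, rate_b):
    """How many powers of ten rate_a exceeds rate_b by."""
    return math.log10(rate_a / rate_b)


def expected_heterozygosity_smm(ne, mu):
    """Equilibrium heterozygosity under a stepwise mutation model (assumed here)."""
    return 1.0 - 1.0 / math.sqrt(1.0 + 8.0 * ne * mu)


point_rate = 1e-9           # per nucleotide per generation (quoted in the text)
microsatellite_rate = 1e-3  # per locus per generation (upper end of the quoted range)
ne = 1000                   # hypothetical effective population size

print(orders_of_magnitude(microsatellite_rate, point_rate))
print(expected_heterozygosity_smm(ne, microsatellite_rate))
print(expected_heterozygosity_smm(ne, point_rate))
```

With these illustrative inputs the microsatellite-like rate gives a heterozygosity of about two thirds, whereas the nucleotide-like rate gives a value close to zero.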
The degree of polymorphism, at least for mammalian (CA)n repeats, is positively correlated with the average number of repeat units. As a rule of thumb, a mammalian microsatellite with fewer than 10 repeat units is likely to be monomorphic. On the other hand, repeats with an average number of iterated units exceeding 20 may have polymorphism information content (PIC) values of 0.6 or more (Vaiman et al., 1995). In compound repeats, more than one motif may be polymorphic. Microsatellite polymorphism has been well documented in fish, poultry and livestock species including pigs (Crooijmans et al., 1997; Yang et al., 1999; Diez-Tascon et al., 2000; Wimmers et al., 2000; Binadel et al., 2001; Bjornstad and Roed, 2001; Canon et al., 2001).
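The polymorphism information content mentioned above is computed from allele frequencies. The sketch below uses the standard definition, PIC = 1 - Σp_i² - Σ_{i<j} 2·p_i²·p_j²; the allele frequencies are made up for illustration and do not come from any study cited here.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content of one locus from its allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical loci: monomorphic, two equally frequent alleles, five equally frequent alleles.
for label, freqs in (("monomorphic", [1.0]), ("2 alleles", [0.5, 0.5]), ("5 alleles", [0.2] * 5)):
    print(label, round(pic(freqs), 3))
```

A two-allele locus cannot exceed a PIC of 0.375, so values of 0.6 or more require several alleles at appreciable frequency.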

APPLICATIONS OF MICROSATELLITES IN PIGS
Earlier, when microsatellite polymorphism could be detected only by laborious cloning and sequencing procedures, its importance for genome analysis appeared insignificant. The introduction of PCR has completely changed the picture. By designing synthetic oligonucleotides that flank a microsatellite, a locus-specific amplification of the repeat region can be primed (Saiki et al., 1988). The length of the amplified fragment varies according to the number of repeats and can be measured simply by electrophoresis of the amplified product (Weber and May, 1989; Smeets et al., 1991). An individual heterozygous for two size variants will thus give rise to two fragments of different lengths. Generally, stutter bands are visible along with the genuine alleles. These extra bands can be easily identified because they are less intense than the main bands (Jacob et al., 1991). Besides the obvious advantage of PCR-based analysis, the applicability of microsatellite markers in genome analysis depends primarily on three inherent properties: abundance, hypervariability and Mendelian inheritance. These properties make microsatellites very informative markers, and they are used for various applications in pigs and other species (Beuzen et al., 2000; Ellegren, 2000b; Binadel et al., 2001).
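The relation between repeat number and PCR fragment length can be written down directly: the fragment consists of the flanking sequence (including primers) plus the repeat tract. The sketch below is a simplified illustration; the flank length and fragment sizes are invented, and real allele calling must additionally cope with stutter bands and sizing error.

```python
def fragment_length(flank_bp, motif_len, repeats):
    """Expected amplicon size: flanking sequence plus the repeat tract."""
    return flank_bp + motif_len * repeats


def call_genotype(fragment_sizes_bp, flank_bp, motif_len):
    """Convert the observed main-band sizes at one locus into a pair of repeat counts.

    One band is read as a homozygote, two bands as a heterozygote.
    """
    counts = []
    for size in sorted(set(fragment_sizes_bp)):
        tract = size - flank_bp
        if tract < 0 or tract % motif_len:
            raise ValueError("size %d bp does not fit a whole number of repeats" % size)
        counts.append(tract // motif_len)
    if len(counts) == 1:
        counts = counts * 2
    if len(counts) != 2:
        raise ValueError("expected one or two main bands for a diploid individual")
    return tuple(counts)


# Hypothetical (CA)n locus with 100 bp of flanking sequence.
print(call_genotype([136, 142], flank_bp=100, motif_len=2))
print(call_genotype([140], flank_bp=100, motif_len=2))
```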

Genetic analysis of closely related breeds/populations
The allele frequency data obtained from PCR-based genotype scoring can be used to study the evolutionary relationships of closely related breeds/populations of a species or of closely related species (Bowcock et al., 1994; Laval et al., 2000). Their high degree of polymorphism makes microsatellites the markers of choice for such studies over conventional markers such as restriction fragment length polymorphisms, which generally have only two alleles and hence a maximum theoretical heterozygosity of 50% (Botstein et al., 1980). Microsatellites give discriminating and significantly concordant results as compared to RAPD (Bart-Delabasse et al., 2001).
A growing number of reports in farm animals, including pigs, describe the genetic characterization of breeds using microsatellite markers. Fredholm et al. (1993) characterized 24 porcine (dA-dC)n·(dT-dG)n microsatellites for genotyping four European pig breeds. Van Zeveran et al. (1995) used seven microsatellites to study four Belgian pig populations. The variation between the Chinese indigenous Meishan and Western breeds was also studied using microsatellite markers (Paszek et al., 1998a,b). Li et al. (2000b) reported variation among seven local pig breeds of China using six microsatellite loci. Niu et al. (2001) analysed 5 lineages of Xishuangbanna miniature pig inbred lines with 35 microsatellite loci. It is difficult, however, to pool the data from these studies to clarify the genetic relationships among the major pig breeds, because they do not use a common set of microsatellites for the analysis of genetic diversity.
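The 50% ceiling for two-allele markers follows from the expected heterozygosity, H = 1 - Σp_i², which for k alleles is maximised at 1 - 1/k when all alleles are equally frequent. A minimal sketch with illustrative frequencies:

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus under random mating: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


# Two alleles can never exceed 0.5; ten equally frequent alleles reach 0.9.
print(expected_heterozygosity([0.5, 0.5]))
print(expected_heterozygosity([0.9, 0.1]))
print(expected_heterozygosity([0.1] * 10))
```

A microsatellite with many alleles can therefore be informative in a much larger fraction of individuals than a two-allele marker.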
So that results are homogeneous and data can be compared at the international level, the Food and Agriculture Organization (FAO) has recommended a species-specific set of at least 25 microsatellite markers for such studies in pigs and other farm animals (FAO, 1998). Reports are now available on European (Laval et al., 2000), Iberian (Martinez et al., 2000), Chinese (Li et al., 2000a) and Indian (Behl et al., 2002) pig breeds using the FAO-recommended swine microsatellite loci. Such genetic relatedness/distance studies using microsatellites will help in assessing the genetic variation within and between breeds and in defining a diversity measure that permits breeds to be ranked for conservation purposes according to their relative contribution to genetic diversity. This will allow the future management of breeds to be based on greater knowledge of their genetic structure and of the relationships between populations/breeds.
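One widely used way to turn allele frequencies at a common marker set into a between-breed distance is Nei's standard genetic distance, D = -ln(Jxy / sqrt(Jx·Jy)). The review does not say which distance measures the cited studies used, so the choice of measure here is an assumption, and the breed data below are invented.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x and pop_y map locus name -> {allele: frequency}; only loci typed in
    both populations are used.
    """
    loci = sorted(set(pop_x) & set(pop_y))
    if not loci:
        raise ValueError("no shared loci")
    jx = jy = jxy = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    if jxy == 0.0:
        return math.inf  # the populations share no alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical allele frequencies (allele = fragment size in bp) for two breeds.
breed_a = {"locus1": {120: 0.5, 124: 0.5}, "locus2": {200: 0.7, 204: 0.3}}
breed_b = {"locus1": {120: 0.2, 124: 0.8}, "locus2": {200: 0.6, 208: 0.4}}
print(nei_standard_distance(breed_a, breed_b))
```

The calculation only makes sense when both populations are typed at the same loci, which is why a common recommended marker set matters for comparing studies.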

Linkage analysis and gene mapping
The use of microsatellites for genome-wide linkage searches has several advantages. Because microsatellites are highly polymorphic, a small number of families is sufficient to prove or disprove linkage. Also, multiple microsatellites can be analysed simultaneously, since several loci can be amplified in a single PCR and, provided that their allele sizes do not overlap, analysed in the same lane of a gel (Peelman et al., 1998). These properties of microsatellites, together with their ubiquity in the genome, make them an effective tool for linkage studies and gene mapping.
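The condition that allele sizes must not overlap can be checked before loci are combined in one lane. The sketch below uses a hypothetical helper and invented size ranges; it assumes all loci carry the same label, so that only fragment size distinguishes them.

```python
def can_share_lane(size_ranges):
    """True if no two loci have overlapping allele size ranges.

    size_ranges maps locus name -> (smallest_allele_bp, largest_allele_bp).
    """
    ordered = sorted(size_ranges.values())
    # After sorting by lower bound, any overlap shows up between neighbours.
    return all(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


# Hypothetical allele size ranges for candidate multiplex sets.
print(can_share_lane({"locusA": (100, 130), "locusB": (140, 180), "locusC": (200, 240)}))
print(can_share_lane({"locusA": (100, 150), "locusB": (140, 180)}))
```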
Genetic linkage analysis is based on the principle that if two genes are located close to each other on the same chromosome, they are likely to be inherited together. Linkage between two genetic loci is established when they show significant co-segregation in the offspring. Microsatellite markers, scattered throughout the mammalian genome, are probably the best markers available for linkage studies. Linkage maps built solely or mostly from microsatellite markers have been reported for various species including farm animals (Crawford et al., 1995; Vaiman et al., 1996; Barendse et al., 1997; Kappes et al., 1997). More than 1500 porcine microsatellite markers are now available (Binadel et al., 2001), and genetic maps of the porcine genome have been developed during the last decade using microsatellites (Ellegren et al., 1994; Rohrer et al., 1994; Archibald et al., 1995; Rohrer et al., 1996; Mikawa et al., 1999).
These linkage maps for the genomes of the pig and other domestic animals are important because they have made it possible to map disease genes and to genetically dissect phenotypic traits of agricultural or biological significance. Marklund et al. (1993) identified a linkage group of three microsatellite loci, blood group L, GBA and ATP1B1 on pig chromosome 4. Wintero et al. (1994) assigned the gene for porcine insulin-like growth factor 1 to chromosome 5 by linkage mapping with the S0005 microsatellite locus. Peelman (1999) identified four microsatellites as a preliminary diagnostic tool for typing pigs for resistance or sensitivity to K88 E. coli neonatal diarrhea. Hasan et al. (1999) employed 14 microsatellites to map the L-gulono-gamma-lactone oxidase gene, a candidate for vitamin C deficiency, to chromosome 14. A whole genome scan using 132 microsatellite markers was conducted to identify chromosomal regions that affect teat number in Chinese Meishan pigs and five commercial Dutch pig lines (Hirooka et al., 2001). Microsatellite loci have also been used for the detection and localisation of quantitative trait loci for growth and fatness in pigs (Marklund et al., 1999; Binadel et al., 2001; Wu et al., 2001). Genome scans using microsatellites to identify chromosomal regions influencing economic traits in pigs have been reported for other traits as well, for example teat number (Malek et al., 2001a,b). This information may subsequently be used in breeding programmes for domestic animals through marker-assisted selection (Soller and Beckmann, 1983).
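Significant co-segregation is conventionally quantified with a LOD score, the log10 ratio of the likelihood of the data at a recombination fraction θ to the likelihood under free recombination (θ = 0.5). The sketch below assumes the simplest case of phase-known, fully informative meioses; the counts are invented, and the review does not describe the statistical procedures used in the cited mapping studies.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD score for phase-known meioses at recombination fraction theta."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical data: 1 recombinant among 20 informative meioses.
for theta in (0.01, 0.05, 0.1, 0.2, 0.5):
    print(theta, round(lod_score(1, 20, theta), 2))
```

By convention a LOD score of about 3 or more is taken as evidence of linkage. Highly polymorphic markers make more meioses informative, which is one reason fewer families are needed.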

Individual identification and parentage testing
By detecting hypervariable sequences, PCR-based microsatellite typing provides a powerful tool for identity and paternity testing (Hagelberg et al., 1991). Selected microsatellite loci, amplified in a multiplex PCR system and electrophoresed in one gel lane, form a highly discriminating tool for parentage testing (Heyen et al., 1997; Peelman et al., 1998; Luikart et al., 1999). However, when an exclusion is based on a single microsatellite polymorphism, special attention must be given to whether the offspring and the parent in question are both homozygous. In such cases, non-paternity/non-maternity may be incorrectly diagnosed because of allele non-amplification. The precision of allele designation across gels is sufficient for five or six loci, thus allowing comparison between newly analysed and stored samples.
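The caution about homozygous genotypes can be built into a single-locus compatibility check. The function below is a hypothetical sketch: it treats an offspring and an alleged parent as compatible if they share at least one allele, and flags a mismatch between two apparent homozygotes as uncertain, since a shared non-amplifying allele could explain it.

```python
def check_parentage(offspring, alleged_parent):
    """Compare single-locus genotypes given as (allele, allele) tuples of fragment sizes."""
    if set(offspring) & set(alleged_parent):
        return "compatible"
    both_homozygous = (
        offspring[0] == offspring[1] and alleged_parent[0] == alleged_parent[1]
    )
    if both_homozygous:
        # Each may carry an undetected non-amplifying allele in common.
        return "exclusion_uncertain"
    return "excluded"


# Hypothetical genotypes (allele sizes in bp).
print(check_parentage((120, 124), (124, 130)))
print(check_parentage((120, 120), (124, 124)))
print(check_parentage((120, 122), (124, 130)))
```

In practice an exclusion would normally be confirmed at more than one locus before being accepted.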
Microsatellites have been successfully and extensively employed for parentage testing, individual identification and breed allocation in various domestic animals, especially dogs and horses (Fredholm and Wintero, 1996; Bowling et al., 1997; Guerand et al., 1997; Peelman et al., 1998; Vijh et al., 1999; Alter et al., 2001). Microsatellites have also been employed for such studies in wild animals (Talbot et al., 1996; Schnabel et al., 2000; Poetsch et al., 2001). However, very few such reports are available for pigs. Coppieters et al. (1993), based on the observed allele frequencies in the pig populations studied, estimated the exclusion probability of a quadruplex of microsatellites to be 0.96. Kaul et al. (2000) estimated the probability of identity of two random individuals from two different native Indian pig populations, based on 13 microsatellites, to be 3.51×10⁻¹⁹.
Compared with DNA profiling using VNTR loci, commonly applied in forensics, microsatellite typing offers several advantages (Devlin et al., 1990). For instance, since the alleles are discrete and definable, the risk of an apparent homozygosity excess, as may be seen with continuous allele systems, can be ruled out (Gerber et al., 2000). Such deviations may otherwise bias the estimates of population allele frequencies, values that are necessary for determining exclusion probabilities.
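Figures such as a combined exclusion probability or a probability of identity are obtained by combining per-locus values across loci. The sketch below shows the standard within-population calculation, assuming Hardy-Weinberg proportions, unrelated individuals and independent loci; it is not the procedure of the cited studies, and the allele frequencies are invented.

```python
from itertools import combinations
from math import prod


def probability_of_identity(freqs):
    """Chance that two unrelated individuals share the same genotype at one locus."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2.0 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def combined_probability_of_identity(loci_freqs):
    """Multiply per-locus values, assuming the loci are independent."""
    return prod(probability_of_identity(freqs) for freqs in loci_freqs)


def combined_exclusion(per_locus_exclusion):
    """Chance that at least one locus excludes a wrongly alleged parent."""
    return 1.0 - prod(1.0 - pe for pe in per_locus_exclusion)


# Hypothetical panel: 13 loci, each with eight equally frequent alleles.
panel = [[0.125] * 8 for _ in range(13)]
print(combined_probability_of_identity(panel))
print(combined_exclusion([0.5, 0.5, 0.5, 0.5]))
```

Because per-locus values multiply, a modest number of highly polymorphic loci quickly drives the probability of identity to very small values.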

CONCLUSION
Microsatellites are genetic markers that are useful in addressing questions at a variety of scales. More specifically, this genetic tool can help solve problems ranging from the individual-specific, such as questions of relatedness and parentage, through the genetic structure of populations and comparisons among breeds/populations/species, to linkage analysis and gene mapping. Further, microsatellites have several technical and analytical advantages that make them superior to genetic markers whose domains are far smaller. Thus, microsatellites are the markers of choice for genome analysis studies.

Figure 1. Schematic presentation of repetitive DNA of mammalian genome

Figure 2. Generation of microsatellite polymorphism by slipped strand mispairing. The repeat units are denoted by numbered arrows. Reprinted from Ellegren, 2000b; copyright (2000), with permission from Elsevier Science.