Marker Assisted Breeding and Map Based Cloning of Genes

2.1.1 Molecular Maps

With the advent of recombinant DNA technology came the application of cloned DNAs as probes to genomic DNA of the source organism and the revelation that different alleles could be detected between individuals based on restriction fragment length polymorphisms (Helentjaris et al. 1985) (Figure 1). Genetic maps based on these and other types of molecular markers (Table 1) were developed for many organisms, including crop plants such as rice (McCouch et al. 1988), lettuce (Kesseli et al. 1994), tomato (Tanksley et al. 1992), alfalfa (Brouwer and Osborn 1999), and Brassica spp. (Kole et al. 2002). Likewise, molecular maps were developed for key fungal pathogens such as Magnaporthe grisea (Farman and Leong 1995; Nitta et al. 1997; Skinner et al. 1993; Sweigard et al. 1993), Phytophthora infestans (van der Lee 2001), and Leptosphaeria maculans (Pongam et al. 1988). These studies also began to reveal the complexity of these genomes in terms of repeated DNAs, their function as transposable elements (Goff et al. 2002; Hamer et al. 1989; Kachroo et al. 1997), their distribution within the genome (Goff et al. 2002; McCouch et al. 1988; Nitta et al. 1997; Yu et al. 2002), and their role in genome evolution and host recognition (Farman 2002; Farman et al. 2002; Kang et al. 2001; Song et al. 1997; 1998). Comparative maps were generated in plants by mapping markers across genera and showed considerable synteny within families of plants (Ahn and Tanksley 1993; Bennetzen and Freeling 1993; Chen et al. 1997; Dunford et al. 1995; Gale and Devos 1998; Hulbert et al. 1990; Saghai Maroof et al. 1996; Tanksley et al. 1992). The mapping of phenotypic markers, both native and induced by mutation, followed closely behind and yielded precise information on the chromosomal location of genes important to plant defense (Ronald et al. 1992; Wang et al. 1995) and fungal host specificity (Dioh et al. 2000; Smith and Leong 1994; Sweigard et al. 1993), and led to their cloning by chromosome walking (Cao et al. 1997; Farman and Leong 1998; Martin et al. 1993; Orbach et al. 2000; Song et al. 1995; Sweigard et al. 1995). The cloning of a plethora of disease resistance genes from many plant species has shown that they belong to a small number of structural classes (Brueggeman et al. 2002; Chauhan and Leong 2002; Dangl and Jones 2001; Meyers et al. 1999; Xiao et al. 2001) (Figure 2). By contrast, the predicted structures of fungal cultivar specificity genes are quite diverse (Bohnert et al. 2001; De Wit and Joosten 1999; Orbach et al. 2000; Sweigard et al. 1995).

These studies have been complemented by the mapping of candidate genes such as those encoding the PR (pathogenesis-related) proteins in plants, which were discovered through differential expression of RNA and protein during plant infection (Muthukrishnan et al. 2001), or resistance gene analogs (RGAs) based on the conserved structural features of disease resistance genes (Boyko et al. 2002; Chauhan et al. 2002; Faris et al. 1999; Gebhardt and Valkonen 2001; Huang and Gill 2001; Li et al. 1999; Shen et al. 1998; Speulman et al. 1998). This analysis has been particularly well advanced in wheat and its relative Aegilops tauschii. Resistance and defense response genes in A. tauschii are localized in clusters, primarily in distal/telomeric regions of the genome (Boyko et al. 2002), while in Chinese Spring wheat defense response genes are localized in clusters and/or at distal regions of chromosomes (Li et al. 1999). In many cases, these genes or gene homologs have been correlated with loci that affect quantitative or single gene resistance in the respective plants. For example, QTLs with large effects in wheat were shown to contain RGAs or clusters of defense response genes such as catalase, chitinase, thaumatins, and an ion channel regulator (Faris et al. 1999). Similar results are emerging in the genomes of potato (Gebhardt and Valkonen 2001), Arabidopsis (Speulman et al. 1998), and pepper (Pflieger et al. 2001).

Figure 1 Cosegregation of an RFLP marker (R-2316) with the Pi-CO39(t) locus in homozygous susceptible F2 progenies. Genomic DNA of CO39 (R, resistant), 51583 (S, susceptible) and F2 progenies was digested with DraI, blotted and probed with R-2316. Recombinant progenies show DNA fragments from both parents. A phosphorimage of the Southern blot is shown.

Table 1 Molecular markers used in mapping of traits

Restriction fragment length polymorphism

Random amplified polymorphic DNA

Amplified polymorphic DNA

Cleaved amplified polymorphism

Amplified fragment length polymorphism

Polymorphism based on different numbers of mono-, di-, tri-, or tetranucleotide repeats

cDNA amplified restriction fragment length

Figure 2 Different classes (A-G) of plant disease resistance genes [reviewed in Chauhan and Leong (2002)]: Genes in classes A and B encode cytoplasmic proteins differing only in their N-terminal domains; Class C genes encode putative transmembrane molecules with an extracellular LRR domain; Xa21 is a transmembrane protein with an extracellular LRR domain; Pto is a cytoplasmic Ser/Thr kinase; RPW8 contains a putative N-terminal TM domain and a CC domain; Hm1 is a unique enzyme that detoxifies a fungal toxin. Abbreviations for domains: TIR, Drosophila Toll/Human Interleukin-1 receptor; CC, Coiled-coil; NBS, Nucleotide binding site; LRR, Leucine-rich repeat; TM, Transmembrane; Ser/Thr, Serine/threonine kinase.

Efforts are underway to functionally characterize 179 NBS-LRR-encoding genes in the Arabidopsis genome that may function as disease resistance genes (Figure 2). These have been organized into subclasses and their distribution mapped to the chromosomal sequence (Michelmore 2002). A publicly available, ordered draft sequence of the rice genome is anticipated by the end of 2002 and will allow comparisons across syntenic regions of grass genomes. The unordered draft sequence of rice varieties Nipponbare and 93-11 has revealed numerous NBS-LRR-containing sequences as well as sequences potentially encoding other minor classes of R genes and homologs of Arabidopsis genes known to control defense response signal transduction (Goff et al. 2002; Yu et al. 2002). Preliminary studies based on conservation of RGAs in comparative maps of the grasses have shown evidence for some conservation but also redistribution of this class of genes among the grasses (Leister et al. 1998). Likewise, a detailed comparison of a syntenic region between barley and rice did not reveal any candidate resistance gene in rice that could be the ortholog of Rpg1 in barley (Han et al. 1999; Kilian et al. 1997).

2.1.2 Differential cDNA-AFLP Screens

Differential cDNA-AFLP screening has been used to isolate genes specific to the hypersensitive response (HR) to the Cladosporium fulvum elicitor Avr4 in tomato. It led to the isolation of the previously known, corresponding disease resistance gene cluster Cf-4 as well as numerous new candidate genes involved in the HR (De Wit et al. 2002; Takken et al. 2001). The method is a robust and inexpensive way to identify differentially expressed genes. It involves the digestion of cDNAs with two different restriction enzymes and the amplification of the resulting products after ligation to adapters for these enzymes. The sizes of the resulting amplicons are measured by gel electrophoresis, and fragments of interest can be excised and sequenced. Comparison of this method with differential display has shown the cDNA-AFLP method to be superior (Jones and Harrower 1998). Using the cDNA-AFLP technique, Durrant et al. (2000) found a strong coincidence between the expression of genes involved in race-specific resistance and the wound response in tobacco cell cultures. Collectively, these candidate genes will provide additional markers for studies of disease resistance traits in the potato and tomato genomes. Infection-specific cDNA-AFLPs have also been identified in Arabidopsis thaliana inoculated with Peronospora parasitica (van der Biezen et al. 2000). Interestingly, most fragments were derived from the fungal pathogen, showing the power of this method to study genes expressed in the pathogen, which in many cases may represent a minor component of the mass of the tissue studied.

Previous genetic studies by Valent et al. (1991) have shown that infection of some grass hosts by M. grisea is a quantitatively inherited trait. The cDNA-AFLP approach would allow for the identification of a unique set of cDNA-AFLPs in each progeny showing varying degrees of pathogenicity, some of which may segregate with pathogenicity. This approach has recently been used to create genome-wide transcription maps of Arabidopsis and potato and to study inheritance of the cDNA-AFLPs in segregating populations (Brugmans et al. 2002). Thus phenotypes can be directly associated with molecular genotypes, and candidate gene fragments can be excised from gels for further analysis. We are using this approach in an attempt to identify major and minor genes controlling resistance to blast and drought tolerance in Eleusine coracana.

Computational methods for relating the sizes of the AFLP restriction fragment products to the restriction fragment products predicted from sequenced cDNA libraries have been developed and used to identify putative, infection stage-specific pathogenicity factors from the plant pathogenic nematode Globodera rostochiensis without the need to sequence the gel fragments (Qin et al. 2001). For those organisms having fully sequenced, annotated full-length cDNA libraries, such as Arabidopsis (Seki et al. 2002), this approach provides for the rapid functional classification of the cDNA-AFLPs.
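The idea of matching gel band sizes to fragments predicted from sequenced cDNAs can be illustrated with a minimal sketch. It is not the program used by Qin et al. (2001). The enzyme pair (EcoRI and MseI), the function names, and the simplifications (complete double digestion, a single fixed adapter length, a size tolerance in nucleotides) are all assumptions made for illustration.

```python
import re

# Hypothetical enzyme pair chosen for illustration: recognition site and
# the offset of the top-strand cut within that site.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "MseI": ("TTAA", 1),     # T^TAA
}


def cut_positions(seq, site, offset):
    """Return top-strand cut coordinates for every occurrence of `site`."""
    # A lookahead pattern is used so that overlapping sites are also found.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def predicted_fragments(seq, enzyme_a, enzyme_b):
    """Lengths of fragments bounded by one cut from each enzyme.

    Assumes a complete double digest, so only adjacent cuts are paired.
    """
    cuts = []
    for name in (enzyme_a, enzyme_b):
        site, offset = ENZYMES[name]
        cuts += [(pos, name) for pos in cut_positions(seq, site, offset)]
    cuts.sort()
    return [
        right - left
        for (left, e_left), (right, e_right) in zip(cuts, cuts[1:])
        if e_left != e_right  # fragment carries one adapter of each type
    ]


def match_bands(observed_sizes, library, enzyme_a="EcoRI", enzyme_b="MseI",
                adapter_length=0, tolerance=1):
    """Map each observed band size to the cDNA ids predicting that size."""
    hits = {size: [] for size in observed_sizes}
    for cdna_id, seq in library.items():
        for length in predicted_fragments(seq.upper(), enzyme_a, enzyme_b):
            for size in observed_sizes:
                if abs(length + adapter_length - size) <= tolerance:
                    hits[size].append(cdna_id)
    return hits
```

Calling `match_bands` with a list of band sizes and a dictionary of cDNA sequences returns, for each band, the cDNAs that could have produced it. In practice several cDNAs may predict the same size, so a match of this kind nominates candidates rather than proving identity.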

2.1.3 Identification of QTL-Associated Genes

Very few QTL studies in plants have led to the cloning of a single gene within a QTL that is responsible for the variation seen [reviewed in Buckler and Thornsberry (2002)]. These few examples represent QTLs that had major effects on variation. Buckler and Thornsberry (2002) have proposed that association approaches should also be considered, both to improve resolution and to reduce the time of analysis: mapping populations are not needed because natural variation in a population is investigated instead. The resolution of association that can be obtained depends on the linkage disequilibrium (LD) structure of the population of the organism being studied and on some insight into the candidate gene(s) to target. Studies of LD structure have shown that inbreeding plants such as Arabidopsis have large LD structures on the order of 250 kb or 1 cM (Nordborg et al. 2002), while outbreeding plants like maize have very small LD structures on the order of kilobases (Buckler and Thornsberry 2002). Using this approach, Thornsberry et al. (2001) were able to associate polymorphisms found in the Dwarf8 gene of maize with variation seen in flowering time.
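The extent of LD between two polymorphic sites is commonly summarized by the statistics D, D' and r². The sketch below computes them from phased two-locus haplotypes using the standard textbook definitions. The function name and input format are assumptions for illustration, and the code is not taken from the studies cited above.

```python
from collections import Counter


def ld_statistics(haplotypes):
    """Compute D, |D'| and r^2 for two biallelic loci.

    `haplotypes` is an iterable of (allele_at_locus_1, allele_at_locus_2)
    pairs, one pair per phased chromosome sampled from the population.
    """
    haplotypes = [tuple(h) for h in haplotypes]
    n = len(haplotypes)
    alleles1 = sorted({h[0] for h in haplotypes})
    alleles2 = sorted({h[1] for h in haplotypes})
    if len(alleles1) != 2 or len(alleles2) != 2:
        raise ValueError("both loci must be biallelic and polymorphic")

    # Pick one reference allele per locus; the choice does not affect r^2.
    a, b = alleles1[0], alleles2[0]
    counts = Counter(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == a) / n
    p_b = sum(1 for h in haplotypes if h[1] == b) / n
    p_ab = counts[(a, b)] / n

    # D is the excess of the a-b haplotype over its expectation
    # under independent assortment of the two loci.
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return {"D": d, "D_prime": d_prime, "r2": r2}
```

Plotting r² against the physical distance between pairs of sites is one common way to visualize how quickly LD decays, which is the property that separates the large LD blocks described for Arabidopsis from the short ones described for maize.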

In Saccharomyces cerevisiae, whose entire genome sequence is known, a QTL for high temperature growth (Htg) commonly found in clinical isolates was rigorously analyzed to identify the responsible genes (Steinmetz et al. 2002). Using reciprocal hemizygosity tests, which involve selective disruption of candidate genes in both parental genomes followed by formation of diploid hybrids among these strains, three genes in the QTL interval were found to contribute to the phenotype. However, the alleles of two genes came from one parent strain while that of the third came from the other parent. By contrast, attempts to employ natural sequence variation or mRNA expression levels determined from several natural isolates of yeast did not provide a clue as to which gene(s) in the interval contributed to the phenotype. Thus employing LD and association by descent to accelerate gene identification in an interval may not always succeed, and genetic studies will be required to study inheritance and create reciprocal hemizygotes in targeted regions of the genome. The use of allele-specific gene silencing methods (see below) may allow this to be done in the F1 generation of plants, while targeted gene disruption methods can be used in fungi such as Ustilago maydis that have a stable diploid phase and a facile gene knockout system. Transformation of haploid fungi with wild-type and disrupted, endogenous or alternative alleles of candidate genes might be a useful strategy for those fungi that cannot form stable diploids. In fact, these strategies have been used to unravel the complex functions of the east and west alleles of the b mating type locus of U. maydis (Gillissen et al. 1992; Kamper et al. 1995). Gene silencing has been used in fungi such as P. infestans, in which silencing of the fungal elicitin INF1 increased virulence on Nicotiana benthamiana (Kamoun et al. 1998).

2.1.4 Candidate Gene Validation

It should be emphasized that candidate genes are simply "candidate" genes and that association of a gene's function with a genetically and/or expression-defined phenotype must be confirmed by transformation and complementation tests (Farman and Leong 1998; Orbach et al. 2000; Song et al. 1995; Wang et al. 1999; Yoshimura et al. 1998). Recently, gene silencing has also been successfully applied to study gene function in several plants (Azevedo et al. 2002; Baulcomb et al. 2002; Peart et al. 2002; Wesley et al. 2001). This method involves cloning a small fragment (~200 nucleotides) of a gene into an expression vector and transforming the plant [reviewed in Baulcomb et al. (2002)]. The resulting small RNA is made double stranded and is then digested into small dsRNA fragments (siRNA, small interfering RNA), which are thought to guide RNase to the nascent wild-type gene transcript and cause it to be degraded, thus leading to a net loss in the gene's expression. Direct bombardment of plant cells with dsRNA is also possible (Schweizer et al. 2000). In the studies cited above, the specific genes RAR1 in barley, and Rx, N, Pto, and EDS1 in N. benthamiana, all implicated in disease resistance signaling, were silenced and found to be essential for the signaling process. This approach is being extended to the high-throughput analysis of the HR candidate genes from tomato noted earlier (De Wit et al. 2002) as well as of a normalized cDNA library from N. benthamiana (Baulcomb et al. 2002). Interestingly, in the preliminary round of analysis many genes in N. benthamiana were found to affect the HR without affecting pathogen growth, whereas silencing of other genes affected both phenotypes. Moreover, almost 1% of the genome appears to affect the HR.

2.1.5 Integration of Molecular Biology with Classical Breeding

The application of molecular markers to traditional breeding has provided a powerful method to accelerate breeding, as phenotypic tests are not essential and rapid DNA isolation methods using hole punch-sized pieces of leaf tissue are possible in young seedlings (Huang et al. 1997). Thus tightly cosegregating or gene(allele)-specific markers can be followed, and confirmation of phenotype can be done on a selected set of plants within a population that are destined for further crossing. Phenotypic validation is essential because recombination, gene conversion or other confounding events can take place, even within the gene being studied, leading to inaccurate scoring based on markers alone. As we learn more about the function of plant genes and specific alleles of these genes in disease resistance through mapping and functional tests, we can anticipate the increased application of molecular markers and gene chips (see later) to the breeding of disease resistance in plants. In particular, it will be interesting to know what contribution pathogenesis-related proteins, which are generically present in all plant genomes, make to quantitative resistance. Is expression more efficient in some genomes than others because of the genes' placement in clusters and/or their specific transcription regulatory elements and/or their duplication or absence in some genomes and/or the efficacy of specific alleles? Likewise, what is the genetic and molecular basis of host specialization in the fungi? The location of many disease resistance and defense genes at the ends of chromosomes in wheat may affect their stability through recombination and chromosome breakage as well as their expression through unique chromatin organization (Faris et al. 2000). The telomeric location of AVR1-PITA in M. grisea has been shown to contribute to its instability, leading to strains with increased virulence (Orbach et al. 2000). The BUF1 gene of M. grisea appears to be readily deleted in one parental chromosome by intrachromosomal recombination of repeated DNA flanking the locus as a result of mispairing of homologous chromosomes during meiosis (Farman 2002). In Alternaria alternata a conditionally dispensable chromosome controls production of a host-specific toxin (Hatta et al. 2002). Likewise, Han et al. (2001) found that genes required for pathogenicity on pea are present on a dispensable chromosome of Nectria haematococca.
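Returning to the point about tightly cosegregating markers: how closely a marker tracks a gene can be expressed as the recombination fraction between them, and from it an approximate map distance. The sketch below assumes the simplest scoring scheme, in which each progeny (for example, haploid fungal progeny or backcross plants) is scored for which parent contributed the marker allele and which parent contributed the trait allele. The function names and the 'A'/'B' parent coding are hypothetical, and the Kosambi mapping function is used as one common choice rather than as the method of any study cited here.

```python
import math


def recombination_fraction(marker_calls, trait_calls):
    """Fraction of progeny in which marker and trait come from different parents.

    Both arguments are equal-length sequences of parental-origin codes
    (for example 'A' or 'B'), one entry per progeny.
    """
    if len(marker_calls) != len(trait_calls) or not marker_calls:
        raise ValueError("need two non-empty sequences of equal length")
    recombinants = sum(m != t for m, t in zip(marker_calls, trait_calls))
    return recombinants / len(marker_calls)


def kosambi_cm(r):
    """Convert a recombination fraction (0 <= r < 0.5) to centimorgans."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


# Ten progeny, one of which is recombinant between marker and trait.
marker = "AAAAABBBBB"
trait = "AAAABBBBBB"
r = recombination_fraction(marker, trait)
distance = kosambi_cm(r)
```

Any progeny counted as recombinant here is one in which marker-based scoring alone would have given the wrong answer, which is why the phenotype of selected plants still needs to be confirmed.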

Exploitation of the multitude of novel genotypes found in plant germplasm now available in gene banks will often require the methods of molecular mapping and candidate gene isolation described above to identify the genes that contribute to these unique phenotypes (Fulton et al. 1997; Tanksley and McCouch 1997; Xiao et al. 1998). This is true even for plant genomes for which the entire genome sequence is known, which allows candidate genes to be isolated in related genomes. For example, the short stature mutation sd1 found in the green revolution rice variety IR8 lies in a gibberellin biosynthetic gene, while the semidwarf phenotype in green revolution varieties of wheat is conferred by mutations in the gibberellin signaling pathway (Sasaki et al. 2002). In addition, many previously unidentified genes have been found in every genome that has been sequenced (Goff et al. 2002; Yu et al. 2002). In the case of disease resistance genes, the functions of only a few are known in each sequenced genome, and not all LRR-containing coding sequences are likely to function in disease resistance. For example, an LRR receptor-like transmembrane protein kinase gene that is gibberellin-induced and specifically expressed in growing tissues of deep-water rice may function in hormone signaling (van der Knaap 1999). The genetic location of disease resistance genes must be determined in the genome that contains the functional gene if the reference genome sequence lacks a functional copy. These tenets also apply to the identification of fungal genes involved in plant recognition. Examples exist for the complete absence of a recessive gene in fungal strains that have lost cultivar specificity (van den Ackerveken et al. 1992; Farman et al. 2002). Furthermore, the recently released genomic DNA sequence of M. grisea, which has 7X coverage, does not contain the AVR1-CO39 cultivar specificity gene (RS Chauhan, D Lazaro, and SA Leong, unpublished data). For this AVR gene the genome sequence is thus useless without precise genetic mapping data, which can be used to identify, in the sequenced reference genome, a contiguous sequence spanning the interval between the flanking markers. This sequence can then be used to develop new genetic markers and to probe libraries of a strain that does carry AVR1-CO39. In fact, this AVR gene was originally cloned using a more laborious chromosome walking strategy in the genome of a strain carrying the functional gene (Farman and Leong 1998).
