Strategy. A physical map is the ordering of cosmid clones by their
position along a chromosome. Construction of a physical map begins with the creation of an
initial, partially ordered collection of clones, which is then edited to create a final
map. Editing includes integrating the physical map (ordered clones) with other genetic
information such as ESTs to verify and, if necessary, correct the physical map under
construction. Ordering of the clones will be accomplished by using the chromosomal clones
with unique assignments (S) as probes to link with other clones via computerized
compression analysis (a chromosome walk) as described in Prade, R.A., Griffith, J., Kochut,
K., Arnold, J., and Timberlake, W.E. (1997). PNAS USA, 94, 14564-14569. This ordering
will be done under the supervision of the Co-investigator, Dr. Jonathan Arnold, UGA.
a. Chromosomal assignment. Chromosomal assignment can be
accomplished in 2 ways. In the first, using a random genomic library, DNA from each of the 16 Pc
chromosomes is extracted, made radioactive using random hexamer priming with 32P-CTP,
and used to probe the membranes containing the cosmid libraries. Nylon membranes are
prehybridized and hybridized overnight using standard procedures. Membranes are exposed to
X-ray film for 24-72 hrs and visually examined for positive signals. Cosmids are then
assigned to chromosomes and classified as repetitive, hybridizing to all 16 chromosomes
(R); hybridizing to a limited set of chromosomes, >1 and <16 (L); or specific to a
particular chromosome (S). With this strategy, chromosome-specific clones will be picked
and distributed in 96-well microtiter plates to create chromosome libraries (non-ordered).
These libraries will then become immediately available to the scientific community through
the Fungal Genetics Stock Center to facilitate ongoing studies. Likewise, mini-libraries
of repetitive clones (likely containing msg and related genes) and limited
repetitive elements will be created. In the second strategy, mini-libraries are
constructed directly from eluted chromosomal DNA, so most clones derived from a chromosome
are by definition assigned to it. However, identification of repetitive and limited
repetitive elements will require the same chromosomal labeling strategy used in the
previous approach.
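
The R/L/S classification described above can be sketched as follows. This is a minimal illustration, assuming that each clone's film readings have already been reduced to the set of chromosome probes that gave a positive signal; the function name, the clone identifiers, and the "none" label for clones with no signal are hypothetical and not part of the protocol.

```python
from typing import Dict, Set

N_CHROMOSOMES = 16  # number of Pc chromosomes stated in the text


def classify_clone(positive: Set[int], n_chromosomes: int = N_CHROMOSOMES) -> str:
    """Classify a clone from the set of chromosome probes it hybridized to."""
    hits = len(positive)
    if hits == 0:
        return "none"  # no signal; not a class named in the text (assumption)
    if hits == 1:
        return "S"     # specific to a particular chromosome
    if hits == n_chromosomes:
        return "R"     # repetitive: hybridizes to all chromosomes
    return "L"         # limited: more than 1 and fewer than all chromosomes


# Hypothetical scoring results: clone id -> chromosomes with a positive signal
scores: Dict[str, Set[int]] = {
    "1A12": {2, 3, 4},
    "1B03": {7},
    "2H11": set(range(1, N_CHROMOSOMES + 1)),
}
classes = {clone: classify_clone(chroms) for clone, chroms in scores.items()}
print(classes)
```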
b. Probe/Clone Hybridization. Two strategies are possible for labeling of specific
probes to be used for construction of the physical map: probing with the cosmid insert or
primer extension of the insert ends using the SP6 and T7 bacteriophage promoters. In our
preliminary work, we explored both methods. Use of gel-isolated cosmid inserts was
time-consuming and unnecessary in our evaluation. Priming off both ends of the cosmid
permitted a rapid identification of linking clones as well as clones containing repetitive
elements. A commercially available standard mini-prep procedure (Qiagen, Santa Clarita,
CA) will be used to isolate cosmid DNA. Each clone will be double stamped in a 5 x 5 array
using robotics to reduce false positives. Each end of the isolated cosmid insert will be
labeled using the SP6 and T7 promoters and radiolabeled dNTPs and hybridized to the
stamped libraries after standard prehybridization conditions. High stringency conditions
will be used for hybridization (60-65 °C) and for washes (2X SSC twice for 15
minutes, decreasing to 0.1X SSC at the temperature of hybridization) to increase
specificity.
The robotics hardware and compression software available at UGA will
hasten this process by providing the ability to create high density stamped membranes
(2400 clones), permitting more rapid walking, and by eliminating redundant clones to create a
compressed library. The software and robotics will analyze the membranes and select the
next probe/clone to be used to link to the next DNA sequence/chromosome, increasing the
efficiency of this process. Membranes will be scanned with the Packard Instant Imager and
the digital data deposited in a text file for loading into the Fungal Genome Database
(25).
It is estimated that with the use of the robotics systems currently
available at UGA, about 48 probings will be performed per week for the Pc Genome project.
Using the observed rate of progress of the A. nidulans mapping project, a contig
map of P. carinii should be generated in 8 weeks with a probe density of 1 cosmid
probe end per 29 kb, with robotic assistance. This mapping strategy will be repeated twice
for both the rat and human Pc genomes to reduce greatly the frequency of false joins
between contigs and to yield a physical map at 13 kb for both organisms. This resolution
of the physical map will permit the recovery of almost any DNA fragment by long distance
PCR.
c. Cataloguing of assigned clones. Clones will be assigned a call
number based on the plate of origin, position in the 4 x 4 array on each membrane, and
well number. The hybridization results are then entered according to call number
and result (R, L, or S; L results are listed as to the chromosome bands of identity, e.g.
1A12, L-234). Once the project is underway, the results from hybridizations will be
automatically fed into the Fungal Genome Data Base (FGDB) at the University of Georgia,
which then selects the probes/clones for subsequent screening.
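
A catalogue entry of the kind shown above ("1A12, L-234") could be parsed into a structured record as in the sketch below. It rests on several assumptions that the text does not confirm: that "1A12" reads as plate 1, well A12; that the digits after "L-" each name one chromosome band; and that the array position is recorded elsewhere. The record type and function name are hypothetical, and this is not the FGDB input format.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HybridizationRecord:
    plate: int
    well: str                # e.g. "A12" in a 96-well plate
    result: str              # "R", "L", or "S"
    bands: Tuple[int, ...]   # chromosome bands listed after the result, if any


# Assumed layout: <plate><row A-H><column>, <R|L|S>[-<band digits>]
_RECORD = re.compile(r"^(\d+)([A-H])(\d{1,2}),\s*([RLS])(?:-(\d+))?$")


def parse_record(text: str) -> HybridizationRecord:
    """Parse one catalogue entry; raise ValueError if it does not fit the assumed layout."""
    match = _RECORD.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized catalogue entry: {text!r}")
    plate, row, col, result, digits = match.groups()
    # One digit per band is an assumption; bands 10-16 would need a delimiter.
    bands = tuple(int(d) for d in (digits or ""))
    return HybridizationRecord(int(plate), f"{row}{int(col):02d}", result, bands)


rec = parse_record("1A12, L-234")
print(rec.plate, rec.well, rec.result, rec.bands)
```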
d. Constructing ordered clone banks. After fingerprinting each
cosmid in terms of chromosome and hybridization with a panel of probes, the cosmids will
then be ordered automatically by comparison of their "digital fingerprints" into
a 2-way layout. From an unordered binary data matrix, distances can be computed between
each pair of clones. Using this distance matrix, random cost algorithms in the Fungal
Genome Data Base (FGDB) will then be used to quickly order clones (in rows) by their
position along the chromosome. The transpose of the binary data matrix is then used to
order the cosmid probes (in columns) into cells, using the whole cosmid library as probes.
The final result is a redundant ordering of the library down the rows and a minimal tiling
across the columns. We plan to shift over to a maximum likelihood procedure (Kececioglu,
Shete, and Arnold, 2nd International Fungal Genome Workshop, Athens, GA, May
1998) for constructing these chromosome walks from the binary data matrix, once we have
finished verifying the code and improving the performance of the ML procedure relative to
the random cost procedure.
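
The ordering step can be illustrated with a small sketch that computes Hamming distances between clone fingerprints and then searches for the clone order with the smallest sum of adjacent distances. The search shown here is plain simulated annealing, used as a stand-in; it is not the random cost algorithm in the FGDB, nor the maximum likelihood procedure. The fingerprint matrix, parameter values, and function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of probes at which two clone fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(fingerprints: Sequence[Sequence[int]]) -> List[List[int]]:
    n = len(fingerprints)
    return [[hamming(fingerprints[i], fingerprints[j]) for j in range(n)] for i in range(n)]


def path_cost(order: Sequence[int], dist: Sequence[Sequence[int]]) -> int:
    """Sum of distances between clones placed next to each other."""
    return sum(dist[order[k]][order[k + 1]] for k in range(len(order) - 1))


def anneal_order(dist, steps=5000, t_start=2.0, t_end=0.01, seed=0) -> List[int]:
    """Simulated annealing over clone orders; returns the best order seen."""
    rng = random.Random(seed)
    n = len(dist)
    order = list(range(n))
    cost = path_cost(order, dist)
    best, best_cost = order[:], cost
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
        i, j = sorted(rng.sample(range(n), 2))
        candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]  # reverse a segment
        c = path_cost(candidate, dist)
        if c <= cost or rng.random() < math.exp((cost - c) / temp):
            order, cost = candidate, c
            if cost < best_cost:
                best, best_cost = order[:], cost
    return best


# Hypothetical fingerprints (rows = clones, columns = probes), in scrambled row order
fingerprints = [
    [0, 0, 0, 1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0, 0],
]
dist = distance_matrix(fingerprints)
order = anneal_order(dist)
print(order, path_cost(order, dist))
```

Either direction along the chromosome gives the same cost, so the recovered order may be the reverse of the true one.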
e. Assessment of contig maps. The bootstrap method will be used to
assess the reliability of each linkage in the physical map. With the bootstrap method,
random subsets of probes are used to reconstruct the physical map and to ascertain
whether or not a clone was linked to (adjacent to) a neighboring clone via the
computational methods described above. Multiple rounds of this procedure are performed.
The number of times that a linkage appears is scored. The entire process is repeated 1000
times. This produces a measure of the probability that a particular linkage actually
exists. For example, if a linkage is only supported by one probe, this linkage will have
decreased confidence because it will often not appear in resampling.
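
The effect of probe support on bootstrap confidence can be seen in the following sketch. To keep it short, the map-building step is replaced by a deliberately simple rule that is an assumption of this example: two clones count as linked when they share at least one positive probe among those resampled. Probes are resampled with replacement, and the data matrix and function names are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Iterable, Sequence, Set, Tuple

Pair = Tuple[int, int]


def links(fingerprints: Sequence[Sequence[int]], probes: Iterable[int]) -> Set[Pair]:
    """Clone pairs sharing at least one positive probe among `probes` (stand-in for map building)."""
    probes = list(probes)
    n = len(fingerprints)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if any(fingerprints[i][p] and fingerprints[j][p] for p in probes)
    }


def bootstrap_support(fingerprints, n_rounds: int = 1000, seed: int = 0) -> Dict[Pair, float]:
    """Fraction of bootstrap rounds in which each full-data link reappears."""
    rng = random.Random(seed)
    n_probes = len(fingerprints[0])
    counts: Counter = Counter()
    for _ in range(n_rounds):
        sample = [rng.randrange(n_probes) for _ in range(n_probes)]  # with replacement
        counts.update(links(fingerprints, sample))
    reference = links(fingerprints, range(n_probes))
    return {pair: counts[pair] / n_rounds for pair in sorted(reference)}


# Hypothetical data: clones 0 and 1 share three probes; clones 1 and 2 share only one
fingerprints = [
    [1, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
]
support = bootstrap_support(fingerprints)
for pair, value in support.items():
    print(pair, round(value, 3))
```

Under this rule the link resting on a single probe is missed whenever that probe is not drawn, so its support is expected to be clearly lower than that of the link resting on three probes.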
f. Confirmation of contig order. Weak links flagged by bootstrap
resampling or the rules of thumb of Arratia et al. will initiate a visual re-examination
of the images stored in the Packard Instant Imager originally read by the robotics
system. Weak links that were not due to mistakes at this level will be tested by
additional hybridization studies. Colonies for probes used to map a chromosome will be
re-picked and their DNAs arrayed on membranes for hybridization. Cosmid mini-preps of
pools of clones from the ends of the same inferred contig will be hybridized to the
arrayed probe clones spotted onto membranes to confirm contig boundaries. If more than one
positive shows up on the membrane, then the gap is closed. Repeats located within the
middle of one probe and at the end of another would account for "nonreciprocal
hybridization". That is, a probe used early in the mapping experiment does not
hybridize to a later probe, but the later probe is found to hybridize with the earlier
probe. This would arise if the repeat is located in the middle of the early probe and at
the end of the late probe. These nonreciprocal hybridizations will be flagged and
sidestepped where possible by choosing another S-clone that hybridizes to the late probe
and does not hybridize to the early probe (i.e., misses the repeat).
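
Flagging nonreciprocal hybridizations amounts to looking for asymmetric entries among clones that have each been used as a probe. A minimal sketch follows, assuming the results are held as a mapping from each probe to the set of clones it lit up; the clone names and the function name are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def nonreciprocal_pairs(hits: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Return (probe, target) pairs where probe hits target but target, used as a probe, misses probe."""
    flagged = []
    for probe in sorted(hits):
        for target in sorted(hits[probe]):
            if target == probe or target not in hits:
                continue  # skip self-hits and clones never used as probes
            if probe not in hits[target]:
                flagged.append((probe, target))
    return flagged


# Hypothetical results: "late" hybridizes to "early", but "early" did not hybridize to "late"
hits = {
    "early": {"early", "mid"},
    "mid": {"mid", "early"},
    "late": {"late", "early"},
}
flagged = nonreciprocal_pairs(hits)
print(flagged)
```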
g. Data storage, retrieval and distribution. All data are being made available in 5
modes: (i) anonymous ftp from the server FUNGUS.GENETICS.UGA.EDU; (ii) the World Wide Web
at http://fungus.genetics.uga.edu:5080; (iii) X-Windows connection directly to the FGDB;
(iv) remote connection to FGDB through client software on PCs and Macintoshes, such as
ODS; and (v) Genetic Maps. The database system is being constantly refined and will be
coupled to an Internet server to the WWW.
As the walk progresses, the software will provide a "real-time" diagram of the
chromosomal physical map, updated via the FGDB (http://fungus.genetics.uga.edu:5080).