Introduction to Microarray and DNA sequencing

Author: Tamsyn Hardy
Genomics Facility [email protected] www.uoguelph.ca/~genomics

– DNA sequencing
– Microarray slides production
– RNA quality assessment
– Microarray labeling and hybridization
– Microarray slide scanning
– Microarray data analysis

Agenda
• Review of DNA sequencing protocol
• Introduction to Microarray technology
• Demonstration of technology (Room 1401, NSC)

DNA sequencing
• What is DNA sequencing?
– DNA sequencing is the process of determining the precise order of nucleotide bases in a section of a DNA molecule.
• The DNA molecule has a negative charge, which allows separation
• Gel electrophoresis
• EtBr staining
• UV illumination
• DNA cut with restriction enzymes produced “maps”

[Timeline figure: 1970, 1995, 2004, 2008]

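How DNA Sequencing works
Why are dideoxynucleotides (ddNTPs) used to stop DNA synthesis? A ddNTP lacks the 3'-OH group that DNA polymerase needs to attach the next nucleotide, so the strand cannot be extended once a ddNTP is incorporated.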


• How does DNA sequencing work? The reaction needs:
– DNA template
– Primer to initiate DNA synthesis. This primer determines the starting point of the sequence being read and the direction of the sequencing reaction.
– An enzyme called DNA polymerase
– Four nucleotides (dNTPs)
– Four ddNTPs with different fluorescent labels
• Cycle steps: Denature, Anneal, Extend
http://seqcore.brcf.med.umich.edu/doc/dnaseq/trouble/

Because billions of DNA molecules are present in the reaction, a strand can be terminated at any position. This results in a collection of DNA strands of many different lengths.
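The idea that termination at every position yields a readable ladder of fragments can be sketched in a few lines of Python. This is only an illustrative simulation, not the analyzer's software: the function names are hypothetical, the template is assumed to be written 3'→5' so that the new strand reads 5'→3' in the same index order, and one fragment per position stands in for the billions of molecules in a real reaction.

```python
import random

# Watson-Crick pairing used to build the newly synthesized strand
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def terminated_fragments(template_3to5):
    """Return one (length, terminal ddNTP) pair for every possible stop position."""
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    return [(length, new_strand[length - 1])
            for length in range(1, len(new_strand) + 1)]


def read_by_size(fragments):
    """Order fragments shortest first and read the dye on each terminal base."""
    return "".join(dye for _, dye in sorted(fragments))


fragments = terminated_fragments("ATGGCA")
random.Random(0).shuffle(fragments)   # fragments are mixed together in the tube
print(read_by_size(fragments))        # prints TACCGT
```

Sorting by length plays the role of electrophoresis here: once the fragments are in size order, the terminal labels spell out the synthesized strand.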


Applied Biosystems 3730 DNA Analyzer

The 48-capillary 3730 DNA Analyzer is used for sequencing 48 DNA samples simultaneously.



The 48-capillary 3730 DNA Analyzer uses capillary electrophoresis to separate various DNA fragments.



Negatively charged DNA is attracted to the positively charged capillary tip and migrates through the liquid polymer in the capillary.



The various DNA strands are separated according to their length. The smallest fragments migrate fastest, so fragments reach the detection window in order of increasing size.



As each fragment passes by the detection window, lasers excite the labeled dye on the fragment and the fluorescent signal of the terminating ddNTP is captured.

DNA sequencing steps
1. Collecting samples: All DNAs, template or primer, should be supplied in water or 10 mM Tris-HCl, pH 8.0. Salts and other charged molecules may cause loss of signal. It is critically important that the DNA template be clean and free of contaminants!
2. Run PCR
3. Purification of DNA samples: use Sephadex to remove unincorporated dideoxynucleotides and salts
4. Run ABI 3730 DNA Analyzer
5. Distribute sequencing data

Next Generation Sequencing Technologies
• Pyrosequencing uses 4 enzymes in a single dNTP wash:
– DNA polymerase, which incorporates the nucleotide and releases pyrophosphate (PPi)
– ATP sulfurylase, which converts the pyrophosphate to ATP in the presence of APS
– Luciferase, which uses the ATP to produce light
– Apyrase, which removes the unincorporated nucleotides and ATP
• For each base position there are 4 washes
• If no nucleotide is incorporated, no light signal is produced
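The wash-and-detect logic above can be sketched as a small simulation. This is an illustration under stated assumptions, not instrument code: the function names are hypothetical, the washes are assumed to cycle in the fixed order A, C, G, T, and the light signal is idealized as exactly proportional to the number of bases incorporated in a wash.

```python
def simulate_flowgram(strand, flow_order="ACGT"):
    """Return (nucleotide, signal) for each wash while synthesizing `strand`."""
    if set(strand) - set(flow_order):
        raise ValueError("strand contains a base that is never washed in")
    position = 0
    flows = []
    while position < len(strand):
        for nucleotide in flow_order:          # one wash per dNTP
            incorporated = 0
            # Polymerase adds this dNTP as long as it matches the next base
            while position < len(strand) and strand[position] == nucleotide:
                incorporated += 1
                position += 1
            # No incorporation means no PPi, no ATP and therefore no light
            flows.append((nucleotide, incorporated))
    return flows


def decode_flowgram(flows):
    """Rebuild the sequence from the per-wash light signals."""
    return "".join(nucleotide * signal for nucleotide, signal in flows)


flows = simulate_flowgram("GATTC")
print(flows[:4])               # first cycle of four washes
print(decode_flowgram(flows))  # prints GATTC
```

A wash that gives a signal of 2 (as for the two consecutive T bases here) corresponds to two incorporations in a single wash.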

Next Generation Sequencing Technologies
Sequencing by synthesis - Solexa technology
• Adapters
• Flow channels
• Solid phase bridge amplification
• Clusters of dsDNA
• Incorporation of fluorescent dNTPs
• Reversible terminators
• Removal of fluorescence
• Repetition to build sequence

Next Generation Sequencing Technologies
• The Applied Biosystems SOLiD™ enables parallel sequencing of clonally amplified DNA fragments linked to beads. The sequencing methodology is based on sequential ligation of dye-labeled probes with two-base encoding.
• “Frameshift” ligation of primer leads to dual sequencing coverage. Overlapping fragment reads of 35-50 bases are assembled by powerful software algorithms.
• Each SOLiD™ slide can generate over 3 gigabytes of sequencing data.

DNA Sequencing Websites to explore
http://www.dnalc.org/ddnalc/resources/animations.html
http://bioweb.uwlax.edu/GenWeb/Molecular/Theory/DNA_sequencing/dna_sequencing.htm
http://smcg.ccg.unam.mx/enp-unam/03-EstructuraDelGenoma/animaciones/secuencia.swf
http://seqcore.brcf.med.umich.edu/doc/dnaseq/trouble/
http://www.pyrosequecing.com/DynPage.aspx?id=7454
http://marketing.appliedbiosystems.com/images/Product/Solid_Knowledge/PDF/SOLiD_Brochure_092707.pdf
http://www.illumina.com/media.ilmn?Title=Sequencing-BySynthesis%20Demo&Cap=&PageName=solexa%20technology&PageURL=203&Media=1

Microarray technology and applications

Introduction to Microarray Technology

• Microarray technology was first published in 1995 by M. Schena and coworkers (Science, 270:467-470).
• A microarray is typically a glass slide onto which DNA molecules are attached at fixed spots.
• There may be tens of thousands of spots on an array. For gene expression studies, each spot ideally should identify one gene or one exon in the genome.
• It is a high-throughput technology that allows detection of thousands of genes simultaneously.

Northern Blotting and Microarrays
• Northern blotting measures relative expression levels of mRNA:
– RNA isolation and purification
– Electrophoresis on a gel
– The gel is probed by hybridizing with a labeled gene under study.
• Example: Gene X is labeled and hybridized to a filter containing total RNA from various tissues. Gene X is expressed only in testis RNA.
• Northern blotting can study only one gene at a time. It cannot monitor the expression of all the genes in the genome.
• The microarray technique is similar to Northern blotting in that it measures relative expression levels of mRNA.
• DNA microarrays rely on the hybridization properties of nucleic acids to monitor DNA or RNA abundance on a genomic scale in different types of cells.
• Principle: base-pairing hybridization

Microarray Terminology
• Probe: DNA spotted on the array/chip = spot, oligo, immobilized substrate (in Northern analysis it is the labeled cDNA in solution)
• Target: cDNA hybridized to the array = mobile substrate (in Northern analysis it is the mRNA bound to the membrane)
• The terms slide, array and chip are used interchangeably to refer to the printed microarray.

If a gene is expressed at the same level in the normal and test conditions, its spot appears yellow. If a gene is expressed at a higher level in the test condition, its spot appears red. This gives a list of genes that are up-regulated in the test condition.
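The color rule above amounts to comparing the two channel intensities of a spot. The sketch below is a hedged illustration only: it assumes the test sample is labeled with the red dye (Cy5) and the normal sample with the green dye (Cy3), it adds the complementary "green" case that the slide does not state, and the 2-fold cutoff and the function name are hypothetical choices.

```python
def spot_color(test_red, normal_green, fold=2.0):
    """Classify a spot from its red (test) and green (normal) intensities."""
    if test_red <= 0 or normal_green <= 0:
        raise ValueError("intensities must be positive")
    ratio = test_red / normal_green
    if ratio >= fold:
        return "red"      # higher expression in the test condition
    if ratio <= 1.0 / fold:
        return "green"    # assumption: lower expression in the test condition
    return "yellow"       # similar expression in both conditions


# Hypothetical spots: gene name -> (red intensity, green intensity)
spots = {"geneA": (4000, 1000), "geneB": (1000, 1000), "geneC": (500, 2000)}
up_regulated = [gene for gene, (red, green) in spots.items()
                if spot_color(red, green) == "red"]
print(up_regulated)   # prints ['geneA']
```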

Publications using Microarray technology

Year    # of publications
1999    83
2000    289
2001    808
2002    1461
2003    2284
2004    3306
2005    4335

[Bar chart: number of publications per year, 1999-2005]

Microarray steps
• Start with individual genes, e.g. the ~6,200 genes of the yeast genome
• Amplify all of them using PCR and put the cDNA into 96- or 384-well plates.
• “Spot” them on a slide using a robot

Note: some investigators use the terms probe and target interchangeably, so careful reading is required.

Address Biological Questions
• What genes are involved in a particular biological process?
• What genes are turned on?
• What genes are turned off?
• What genes are the key elements in a biological process?
• Do similar clinical samples share similar gene expression profiles?

Microarray Fabrication
PCR amplification, then arrayed library (96- or 384-well plates), then spotting as a microarray on glass slides.
(1) The robot and software are used to deliver samples to the glass slide for microarray fabrication.
(2) The ChipWriter Pro robotic arraying system includes a water bath, a sonicator, a vacuum for pin cleaning, a pin blotter station, and a 384-well microtiter plate holder.
(3) UV crosslinking helps to fix probes onto the array, and rehydration helps to make spots uniform.

DNA Microarray Fabrication
1. Contact printing: pins (uptake 0.25 μl, dispense 0.6 nl, approximately 1-10 ng per spot) with 100 μm feature size
2. Non-contact printing: 1 drop = 100 picolitres
3. Photolithography (for Affymetrix chip)
4. Agilent Technology (www.agilent.com)
5. Illumina (www.illumina.com)

Microarray technology: Array Chip Types
1. DNA chip (two-channel array):
– Probe cDNA (500~5,000 bases long) or oligo (50-70mer) is immobilized on a glass surface
– A robot is used to apply the spots
– This technique was first developed at Stanford University.
2. Gene chip (called Affymetrix chip):
– Developed at Affymetrix, Inc., under the GeneChip® trademark
– Each gene has 16-20 pairs of probes synthesized on the chip
– Each probe pair has two sets of oligonucleotides:
Perfect match (PM): ATG…C…TGC (20-25 bases)
Mismatch (MM, one base change): ATG…T…TGC

Microarray Fabrication
Affymetrix uses a different technology for chip fabrication. It is a combination of light and chemistry.
• A light-sensitive chemical compound is coated on the surface to prevent coupling between the wafer and the first nucleotide of the DNA probe being created.
• Lithographic masks are used to either block or transmit light onto specific locations of the wafer surface.
• The surface is then flooded with a solution containing either A, T, C, or G, and coupling occurs only in those regions on the glass that have been deprotected through illumination.
• The coupled nucleotide also bears a light-sensitive protecting group, so the cycle can be repeated.
• The process is repeated until the probes reach their full length, usually 25 nucleotides.

The result for a given gene is the average difference between Perfect match (PM) and Mismatch (MM, one base change) signals over all of its probe pairs.
http://www.affymetrix.com/technology/manufacturing/index.affx
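The average-difference summary described above is a short calculation. The sketch below shows it on invented intensities; the function name and the numbers are hypothetical, and real Affymetrix software applies further processing that is not modeled here.

```python
def average_difference(pm_signals, mm_signals):
    """Mean of (PM - MM) over the probe pairs of one gene."""
    if not pm_signals or len(pm_signals) != len(mm_signals):
        raise ValueError("need the same non-zero number of PM and MM values")
    differences = [pm - mm for pm, mm in zip(pm_signals, mm_signals)]
    return sum(differences) / len(differences)


# Three hypothetical probe pairs for one gene (a real chip has 16-20)
pm = [1200, 900, 1500]
mm = [200, 300, 400]
print(average_difference(pm, mm))   # prints 900.0
```

Subtracting the mismatch signal is meant to remove the part of the perfect-match signal that comes from non-specific hybridization.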

Microarray technology

                        cDNA microarray        Affymetrix array
Probes per gene         One probe              16-20 probe pairs
Probe length            Varying length         25 nucleotides
Targets per array       Two target samples     One target sample
Cost of slide/chip      $60-200                $200-400
Cost of hybridization   $100-300               $400-800

Microarray steps
• Extract total RNA (RNA quality check)
• Convert mRNA into colored cDNA (fluorescently labeled)
• Mix labeled cDNA together
• Hybridize cDNA with array
• Wash unhybridized cDNA off
• Read array with laser
• Analyze images

RNA Quality
• RNA quality is essential for gene expression analysis using microarray technology.
• Good-quality RNA has an OD 260/280 ratio of 1.8 to 2.0, but badly degraded RNA can also give a very good ratio, so the ratio alone does not show that the RNA is intact.
• Ribonuclease (RNase) is an enzyme that catalyzes the breakdown of RNA into smaller components. It is critical to perform every step in an RNase-free environment.

RNA Quality: BioAnalyzer
• High-quality total RNA has a very specific trace profile, dominated by the 18S and 28S ribosomal peaks.
• Any degradation in the sample will be seen as a decrease in the size of these peaks and an increase in smaller-sized RNA fragments.
• On the Bioanalyzer chip, microchannels are filled with the gel-dye mix. The dye intercalates directly with RNA. As RNAs pass the detection window, a laser excites the fluorescence of the dye to detect the RNA.

Direct labeling of cDNA
[Figure: mRNA poly(A) tail (AAAAAAA) annealed to an oligo(dT) primer (TTTTTTT); cDNA synthesis by reverse transcriptase with dNTPs and cyanine dye labeled dCTP (Cy3-dCTP or Cy5-dCTP)]
Dye-conjugated fluorescent dNTPs are incorporated into cDNA during reverse transcription of the mRNA target.
Problems:
1. Unequal incorporation of Cy5 vs. Cy3
2. Poor overall incorporation of direct-conjugated nucleotides

Amino-allyl indirect labeling
[Figure: cDNA synthesis by reverse transcriptase with dNTPs and amino-allyl modified dUTP, followed by Cy3/Cy5 addition]
The indirect labeling method takes two steps:
1. cDNA synthesis: the amino-allyl dUTP is incorporated into cDNA during reverse transcription.
2. Coupling: the primary amine groups introduced in the first step are conjugated with the succinimidyl ester of Cy3/Cy5.

Hybridization
Binding of cDNA target samples to cDNA probes on the slide
• Mix the two labeled cDNAs and hybridize
– Prehybridize the slide at 42°C for 45 min (optional)
– Hybridize the preheated labeled cDNA at 42°C for 16-20 hours
• Wash the array to remove unbound labeled cDNA
• Dry the slide and scan it to a 16-bit TIFF image file
• Bioinformatics analysis of microarray data

Array Scanning
GenePix lasers scan microarray slides at two wavelengths: 532 nm (green) and 635 nm (red).

Image Analysis
1. Gridding. Estimate the location of spot centers.
2. Segmentation. Classify pixels as foreground (signal) or background.
3. Information extraction. For each spot on the array and each dye:
– signal intensities: Rfg, Gfg
– background intensities: Rbg, Gbg

The Genepix array layout file gives information of gene name for every spot on the array.

The signal intensities of the spots are correlated with the concentrations of target mRNA samples.
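One common way to use the four extracted values (Rfg, Rbg, Gfg, Gbg) is to subtract the background from each channel and take the log2 ratio of red to green. The slides do not spell out this calculation, so the sketch below is an assumption; the function name and the `floor` parameter (which keeps the ratio defined when background exceeds signal) are hypothetical.

```python
import math


def log2_ratio(r_fg, r_bg, g_fg, g_bg, floor=1.0):
    """Background-corrected log2(red / green) for one spot."""
    red = max(r_fg - r_bg, floor)      # corrected Cy5 (635 nm) signal
    green = max(g_fg - g_bg, floor)    # corrected Cy3 (532 nm) signal
    return math.log2(red / green)


# Hypothetical spot: red is 4 times brighter than green after correction
print(log2_ratio(r_fg=2100, r_bg=100, g_fg=600, g_bg=100))   # prints 2.0
```

On the log2 scale, a value of 0 means equal signal in both channels, and each unit corresponds to a 2-fold difference.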


Bioinformatics analysis
Software Tools
• GeneSpring (Agilent Technology)
• Expressionist (GeneData)
• GeneTraffic (Iobion)
• Spotfire (Spotfire)
• Cluster and TreeView (free)
• Acuity
• …

Data Normalization
Why? There are many sources of systematic variation that affect measurements of gene expression levels.
• Difference in labeling efficiency of the two fluorescent dyes
• Difference in the amounts of starting mRNA material in the two samples
• Difference in hybridization/washing

Normalization is the process of removing such variations to make data comparable.

There are many methods of normalization:
– Normalize to housekeeping genes (genes whose expression levels do not change across experimental conditions).
– “Normalize to a Median” divides all of the measurements on each chip by the median value of that chip.
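The "Normalize to a Median" method described above can be sketched with the Python standard library. This is a minimal illustration with invented intensities, not the implementation used by any of the listed software tools; the function name is hypothetical.

```python
from statistics import median


def normalize_to_median(chip_values):
    """Divide every measurement on one chip by that chip's median."""
    chip_median = median(chip_values)
    if chip_median == 0:
        raise ValueError("median is zero; cannot normalize")
    return [value / chip_median for value in chip_values]


# Two hypothetical chips where chip B is uniformly twice as bright as chip A
chip_a = [100, 200, 400]
chip_b = [200, 400, 800]
print(normalize_to_median(chip_a))   # prints [0.5, 1.0, 2.0]
print(normalize_to_median(chip_b))   # prints [0.5, 1.0, 2.0]
```

After normalization both chips have a median of 1, so an overall brightness difference between chips no longer looks like a difference in expression.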

Bioinformatics analysis
Lowess intensity-dependent normalization is used for two-color data. It adjusts for intensity-dependent variation due to dye properties. Dye bias is caused by inconsistencies in the relative fluorescence intensity between Cy5 and Cy3.

• Clustering methods
One of the goals of microarray data analysis is to cluster genes or samples with similar expression profiles together, to make meaningful biological inferences about the set of genes or samples.

– Hierarchical
– Non-hierarchical
» K-Means Clustering
» Self-Organizing Maps (SOM)

[Figures: example of the hierarchical method; example of the non-hierarchical method]
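As an illustration of a non-hierarchical method, K-means clustering of expression profiles can be sketched in plain Python. This is a minimal sketch under stated assumptions: the profiles are invented log ratios, the first k profiles are used as starting centroids (real tools usually pick random starts), and the function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iterations=20):
    """Group profiles into k clusters; returns (labels, centroids)."""
    centroids = [list(profile) for profile in profiles[:k]]   # assumed start
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each gene joins its nearest centroid
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in profiles]
        # Update step: each centroid becomes the mean of its members
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:
                centroids[c] = [sum(column) / len(members)
                                for column in zip(*members)]
    return labels, centroids


# Hypothetical log ratios for four genes across three conditions
profiles = [[2.0, 2.1, 1.9], [-1.8, -2.0, -2.2],
            [1.9, 2.2, 2.0], [-2.1, -1.9, -2.0]]
labels, centroids = kmeans(profiles, k=2)
print(labels)   # prints [0, 1, 0, 1]
```

Genes 1 and 3 (up in all conditions) end up in one cluster and genes 2 and 4 (down in all conditions) in the other.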

Bioinformatics analysis

– Gene classification
– Functional screening
• Functional annotation: What is the functional behavior of a particular gene?
• Pathways: Which pathways is the gene involved in?
Annotation can be obtained by searching bioinformatics databases.

Cost of Microarray
• cDNA microarray slide: $60-200
• Reagents for labeling and hybridization: $100-200 (Cy3/Cy5: $30/pair, enzyme and buffer, purification module, etc.)
• Affymetrix chip: $200-500
• Affymetrix chip hybridization: $400-800
• Pin: $200-300/pin
• Arrayer: $140,000
• Scanner: $120,000

About Genomics Facility
This facility is supervised by:
Dr. Teresa Crease
Professor, Dept. of Integrative Biology
Rm 1401, Science Complex
Email: [email protected]

The technicians working in the facility are:
Angela Holliss (DNA sequencing)
Rm 1401, Science Complex
Email: [email protected]
Phone: 519-824-4120 x58357

Jing Zhang (microarray)
Rm 1401, Science Complex
Email: [email protected]
Phone: 519-824-4120 Ext. 53380

http://www.uoguelph.ca/~genomics