Sample & Assay Technologies
Next Generation Sequencing: An introduction to applications and technologies
Quan Peng, Ph.D.
Scientist, R&D
Welcome to the three-part webinar series Next Generation Sequencing and its role in cancer biology
Webinar 1: Next-generation sequencing, an introduction to technology and applications Date: April 4, 2013 Speaker: Quan Peng, Ph.D.
Webinar 2: Next-generation sequencing for cancer research Date: April 11, 2013 Speaker: Vikram Devgan, Ph.D., MBA
Webinar 3: Next-generation sequencing data analysis for genetic profiling Date: April 18, 2013 Speaker: Ravi Vijaya Satya, Ph.D.
Agenda
• Next Generation Sequencing: Background, Technologies, Applications, Workflow
• Targeted Enrichment: Methodology, Data analysis, New product released!
DNA Sequencing – The Past Decade
[Figure: DNA sequencing output (Kb) over the past decade; axis ticks from 10E+2 to 10E+8]
Adapted from ER Mardis. Nature 470, 198-203 (2011) doi:10.1038/nature09796
Rapid Decrease in Cost
What is Next-Generation Sequencing?
Sanger Sequencing
• DNA is fragmented
• Cloned to a plasmid vector
• Cyclic sequencing reaction
• Separation by electrophoresis
• Readout with fluorescent tags
NGS: Massively Parallel Sequencing
• DNA is fragmented
• Adaptors ligated to fragments (Library construction)
• Clonal amplification of fragments on a solid surface (Bridge PCR or Emulsion PCR)
• Direct step-by-step detection of each nucleotide base incorporated during the sequencing reaction
Bridge PCR
• DNA fragments are flanked with adaptors (Library)
• A flat surface (chip) is coated with two types of primers, corresponding to the adaptors
• Amplification proceeds in cycles, with one end of each bridge tethered to the surface
• Clusters of DNA molecules are generated on the chip; each cluster originates from a single DNA fragment
• Used by Illumina
Illumina HiSeq/MiSeq
• Run time: 1-10 days
• Produces 2-600 Gb of sequence
• Read length: 2x100 bp to 2x250 bp (paired-end)
• Cost: $0.05-$0.4/Mb
Single-end vs. paired-end reading
[Figure: single-end reading; 2nd strand synthesis; paired-end reading]
• Single-end reading (SE): the sequencer reads a fragment from one end only
• Paired-end reading (PE): the sequencer reads both ends of the same fragment
• PE gives more sequencing information, so reads can be placed (“mapped”) more accurately
• PE may not be required for all experiments, and it is more expensive and time-consuming
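The difference between the two read modes can be sketched on a plain string. This is a toy model only: the function names and the example fragment are hypothetical, and it assumes the second read of a pair is reported as the reverse complement of the far end of the fragment.

```python
# Toy model of single-end vs. paired-end reading on a DNA string.
# All names here are illustrative; a real sequencer reads molecules, not strings.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def single_end_read(fragment: str, read_length: int) -> str:
    """Read read_length bases starting from one end of the fragment."""
    return fragment[:read_length]


def paired_end_reads(fragment: str, read_length: int) -> tuple:
    """Read read_length bases from each end of the same fragment.

    Assumption: read 2 comes from the opposite strand, so it is the
    reverse complement of the far end of the fragment.
    """
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment)[:read_length]
    return read1, read2


fragment = "AAACCCGGGT"
se_read = single_end_read(fragment, 3)   # "AAA"
pe_reads = paired_end_reads(fragment, 3)  # ("AAA", "ACC")
```

Because both reads of a pair come from one fragment, a mapper can use their expected distance and orientation as an extra constraint when placing them.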
Emulsion PCR
• Fragments, with adaptors, are PCR amplified within a water droplet in oil
• One primer is attached to the surface of a bead
• DNA molecules are synthesized on the beads; each bead bears DNA originating from a single DNA fragment
• Beads with DNA are then deposited into the wells of sequencing chips, one bead per well
• Used by Roche 454, Ion Torrent and SOLiD
Ion PGM/Proton
• Run time: 3 hrs
• Read length: 100-300 bp; homopolymers can be an issue
• Throughput determined by chip size (pH meter array): 10 Mb to 5 Gb
• Cost: $1-$20/Mb
Multiplex Sequencing – Barcoding Samples
• Depending on the application, we may not need to generate so many reads per sample
• Multiple samples, each with a different index (barcode), can be combined and put into one sequencing run or one sequencing lane
• Saves money on sequencing costs (pay per sample)
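After a pooled run, reads have to be assigned back to their samples by index. A minimal sketch of that step follows. It assumes the barcode is the first few bases of each read, that all barcodes have the same length, and that only exact matches count; the sample sheet and reads are made up, and real platforms often read the index separately and tolerate mismatches.

```python
from collections import defaultdict

# Hypothetical sample sheet: barcode (index) sequence -> sample name.
SAMPLE_BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}


def demultiplex(reads, barcodes):
    """Group pooled reads by sample using an exact-match inline barcode.

    Assumptions: the barcode occupies the first bases of the read and
    every barcode in the sheet has the same length.
    """
    barcode_length = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        index, insert = read[:barcode_length], read[barcode_length:]
        # Reads whose index is not in the sheet are kept aside, not dropped.
        sample = barcodes.get(index, "undetermined")
        by_sample[sample].append(insert)
    return dict(by_sample)


pooled_reads = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTTTTGGG", "NNNNACGTAC"]
result = demultiplex(pooled_reads, SAMPLE_BARCODES)
# result["sample_A"] holds the two inserts that carried the ACGT barcode.
```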
NGS Applications
Next Generation Sequencing
• Genomics: DNA-Seq. Mutation, SNVs, Indels, CNVs, Translocation
• Transcriptomics: RNA-Seq. Expression level, Novel transcripts, Fusion transcript, Splice variants
• Epigenomics: ChIP-Seq, Methyl-Seq. Global mapping of DNA-protein interactions, DNA methylation, histone modification
• Metagenomics: MicrobialSeq. Microbial genome sequence, Microbial ID, Microbiome sequencing
Next Generation Sequencing Workflow
Sample preparation
• Isolate samples (DNA/RNA)
• Qualify and quantify samples
• Several hours to days
Library construction
• Prepare platform-specific library
• Qualify and quantify library
• 4-8 hours
Sequencing
• Perform sequencing run on NGS platform
• 8 hours to several days
Data analysis
• Application-specific data analysis pipeline
• Several hours to days
QIAGEN’s Solution for NGS Workflow
Sample preparation
• Target Enrichment kit
• HMW DNA prep kit
• Single Cell/WGA kit
• rRNA depletion kit
• ChIP-seq Kit
• Pathogen bacteria prep kit
Library construction
• Library construction kit
• MinElute size selection kit
• Library quantification kit
Sequencing
Data analysis
• GeneRead DNAseq data analysis web portal
Result validation
• RT2 Profiler PCR Arrays
• Somatic Mutation PCR Arrays
• Pyrosequencing
• CNA/CNV PCR Arrays
• EpiTect ChIP PCR Arrays
• SNP PCR Arrays
GeneRead DNAseq Gene Panel: Targeted Sequencing
What is targeted sequencing?
• Sequencing a subset of regions of the whole genome
Why do we need targeted sequencing?
• Not all regions in the genome are of interest or relevant to a specific study
• Exome sequencing: sequencing most of the coding regions of the genome (exome). Protein-coding regions constitute less than 2% of the entire genome
• Focused panel/hot spot sequencing: focused on the genes or regions of interest
What are the advantages of focused panel sequencing?
• More coverage per sample, more sensitive mutation detection
• More samples per run, lower cost per sample
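The coverage advantage is simple arithmetic: for a fixed run output, mean depth scales inversely with target size and with the number of pooled samples. The sketch below uses illustrative numbers only. The 1 Gb run output and the 0.5 Mb panel size are hypothetical, the genome size is the usual approximation of 3 billion bases, and the exome size is taken as the 2% upper bound quoted above.

```python
def mean_depth(run_output_bases, target_size_bases, samples=1, on_target_fraction=1.0):
    """Approximate mean coverage depth per sample.

    mean depth = (run output x on-target fraction / samples) / target size
    """
    usable_bases = run_output_bases * on_target_fraction / samples
    return usable_bases / target_size_bases


RUN_OUTPUT = 1_000_000_000      # hypothetical run producing 1 Gb
WHOLE_GENOME = 3_000_000_000    # approximate human genome size
EXOME = 60_000_000              # 2% of the genome, the upper bound quoted above
FOCUSED_PANEL = 500_000         # hypothetical 0.5 Mb gene panel

genome_depth = mean_depth(RUN_OUTPUT, WHOLE_GENOME)             # well below 1x
exome_depth = mean_depth(RUN_OUTPUT, EXOME)                     # roughly 17x
panel_depth = mean_depth(RUN_OUTPUT, FOCUSED_PANEL)             # 2000x
pooled_panel_depth = mean_depth(RUN_OUTPUT, FOCUSED_PANEL, 10)  # 200x for each of 10 samples
```

With these assumed numbers, the same run that cannot cover a genome once still gives each of ten pooled panel samples 200x mean depth.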
Target Enrichment - Methodology
Hybridization capture
• Large DNA input (1 µg)
• Long processing time (2-3 days)
• Large throughput (Mb region to whole exome)
Workflow (in order): Sample preparation (DNA isolation), Library construction, Hybridization capture (24-72 hrs), Sequencing, Data analysis
Target Enrichment - Methodology
Multiplex PCR
• Small DNA input (< 100 ng)
• Short processing time (several hrs)
• Relatively small throughput (kb to Mb region)
Workflow (in order): Sample preparation (DNA isolation), PCR target enrichment (2 hours), Library construction, Sequencing, Data analysis
GeneRead DNAseq Gene Panel
• Multiplex PCR-based target enrichment for DNA sequencing
• Covers all human exons (coding region + UTR)
• Gene primer sets are divided into 4 tubes; up to 1200-plex in each tube
GeneRead DNAseq Gene Panel: Focus on your Disease of Interest
Comprehensive Cancer Panel (124 genes)
Disease Focused Gene Panels (20 genes)
• Breast cancer
• Colon cancer
• Gastric cancer
• Leukemia
• Liver cancer
• Lung cancer
• Ovarian cancer
• Prostate cancer
[Figure labels: Genes Involved in Disease; Genes with High Relevance]
GeneRead DNAseq Custom Panel
NGS Data Analysis
• Base calling: from raw data to DNA sequences, generate sequencing reads
• Mapping to a reference: align the reads to reference sequences. This can be thought of as running BLAST on millions of sequences against a reference database
• Variant identification: identify the differences between sample DNA and reference DNA
• Variant prioritization/filtering/validation
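The mapping and variant identification steps can be illustrated with a deliberately naive sketch: slide each read along the reference, keep the position with the fewest mismatches, then report where the read differs. This brute-force search is an assumption made for clarity, not how production aligners work (they use indexed references and handle gaps); the function names and sequences are hypothetical.

```python
def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, max_mismatches=2):
    """Return (position, mismatches) of the best ungapped placement.

    Returns (None, None) when no placement has at most max_mismatches.
    Ties are resolved in favor of the leftmost position.
    """
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        mm = hamming(read, reference[pos:pos + len(read)])
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    if best_pos is None:
        return None, None
    return best_pos, best_mm


def differences(read, reference, pos):
    """List (reference position, reference base, read base) for each mismatch."""
    return [
        (pos + i, reference[pos + i], base)
        for i, base in enumerate(read)
        if base != reference[pos + i]
    ]


reference = "TTGACCGATAGGCTAACGTCA"
read = "CGATCGG"  # matches the reference at position 5 with one mismatch

position, mismatches = map_read(read, reference)
variants = differences(read, reference, position)
# variants == [(9, "A", "C")]: reference base A at position 9 is read as C
```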
NGS Data Analysis
[Figure: sequencing reads aligned to a reference sequence; at one position the reference base is A and the aligned reads show C]
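A position where several aligned reads disagree with the reference is a candidate variant. The following sketch counts bases per reference position (a pileup) and reports positions where a non-reference base is frequent enough. The thresholds `min_depth` and `min_fraction`, the sequences, and the function names are all hypothetical, and the reads are assumed to be already aligned without gaps.

```python
from collections import Counter, defaultdict


def pileup(aligned_reads):
    """Count observed bases at each reference position.

    aligned_reads is a list of (start position, read sequence) pairs.
    """
    columns = defaultdict(Counter)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            columns[start + offset][base] += 1
    return columns


def call_variants(reference, aligned_reads, min_depth=3, min_fraction=0.2):
    """Report (position, ref base, alt base, alt count, depth) for candidate variants.

    Both thresholds are illustrative assumptions, not recommended settings.
    """
    calls = []
    for pos, counts in sorted(pileup(aligned_reads).items()):
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        for base, count in counts.items():
            if base != reference[pos] and count / depth >= min_fraction:
                calls.append((pos, reference[pos], base, count, depth))
    return calls


reference = "GATTACAGATTACA"
aligned_reads = [
    (0, "GATTCCAG"),
    (2, "TTCCAGAT"),
    (3, "TACAGATT"),
    (4, "CCAGATTA"),
]
calls = call_variants(reference, aligned_reads)
# calls == [(4, "A", "C", 3, 4)]: 3 of 4 reads show C where the reference has A
```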
NGS Data Analysis: Sequencing Depth
• Coverage depth (or depth of coverage): how many times each base has been sequenced or read
• Unlike Sanger sequencing, in which each sample is sequenced 1-3 times to be confident of its nucleotide identity, NGS generally needs to cover each position many times to make a confident base call, due to its relatively high error rate (0.1-1% vs 0.001-0.01%)
• Increasing coverage depth also helps to identify low-frequency mutations in heterogeneous samples such as cancer samples
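Why depth matters for low-frequency mutations can be shown with a simple binomial model. It assumes each read covering a position carries the mutation independently with probability equal to the mutant allele fraction, and it ignores sequencing error entirely; the requirement of 5 supporting reads and the 5% allele fraction are hypothetical choices for illustration.

```python
from math import comb


def prob_at_least(k, depth, fraction):
    """Probability of seeing at least k mutant reads among `depth` reads.

    Binomial model: each read is mutant with probability `fraction`,
    independently of the others. Sequencing error is not modeled.
    """
    prob_fewer = sum(
        comb(depth, i) * fraction**i * (1 - fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - prob_fewer


# Chance of observing at least 5 supporting reads for a 5% mutation
# at increasing coverage depth.
detection = {depth: prob_at_least(5, depth, 0.05) for depth in (30, 100, 1000)}
```

Under these assumptions the detection probability rises steadily with depth, which is the reason deep coverage is preferred for heterogeneous samples.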
[Figure: NGS reads aligned to a reference sequence, with positions marked coverage depth = 4, coverage depth = 3 and coverage depth = 2]
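Coverage depth at each position is just the number of aligned reads that span it. A minimal sketch, assuming reads are given as hypothetical (start, length) pairs on a 0-based reference and are aligned without gaps:

```python
def coverage_depth(reference_length, aligned_reads):
    """Return a list with the number of reads covering each reference position.

    aligned_reads is a list of (start, length) pairs; parts of a read that
    run past the end of the reference are ignored.
    """
    depth = [0] * reference_length
    for start, length in aligned_reads:
        for pos in range(start, min(start + length, reference_length)):
            depth[pos] += 1
    return depth


# Four 6-base reads staggered along a 12-base reference.
reads = [(0, 6), (2, 6), (3, 6), (4, 6)]
depth = coverage_depth(12, reads)
# depth == [1, 1, 2, 3, 4, 4, 3, 3, 2, 1, 0, 0]
```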
NGS Data Analysis: Specificity
• Specificity: the percentage of sequences that map to the intended target regions of interest (ROI)
• Specificity = number of on-target reads / total number of reads
[Figure: NGS reads aligned to a reference sequence containing ROI 1 and ROI 2; reads are labeled as on-target or off-target]
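The specificity ratio can be computed directly from read and ROI coordinates. The sketch below assumes half-open (start, end) intervals on a single reference and counts a read as on-target if it overlaps any ROI by at least one base; that overlap rule, like the coordinates, is an assumption for illustration.

```python
def overlaps(read_start, read_end, roi_start, roi_end):
    """True if two half-open intervals share at least one position."""
    return read_start < roi_end and roi_start < read_end


def specificity(reads, rois):
    """Fraction of reads that overlap at least one region of interest."""
    if not reads:
        return 0.0
    on_target = sum(
        any(overlaps(start, end, roi_start, roi_end) for roi_start, roi_end in rois)
        for start, end in reads
    )
    return on_target / len(reads)


rois = [(100, 200), (500, 600)]  # ROI 1 and ROI 2
reads = [(90, 140), (150, 200), (300, 350), (520, 570), (590, 640)]
value = specificity(reads, rois)
# value == 0.8: four of the five reads touch an ROI
```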
NGS Data Analysis: Uniformity
Coverage uniformity: measures the evenness of the coverage depth across target positions
• Calculate the coverage depth of each position
• Calculate the median coverage depth
• Set the lower boundary of the coverage depth relative to the median depth (e.g. 0.1 x median coverage depth)
• Calculate the percentage of the target region covered at or above the lower boundary
[Figure: NGS reads aligned to a reference sequence, with positions marked coverage depth = 10, coverage depth = 3 and coverage depth = 2]
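The uniformity steps translate almost line for line into code. This sketch takes a list of per-position depths over the target region (made-up values here) and uses the 0.1 x median boundary given as the example; returning 0.0 when the median depth is zero is an added assumption to avoid reporting an uncovered target as uniform.

```python
from statistics import median


def coverage_uniformity(depths, fraction_of_median=0.1):
    """Fraction of target positions covered at or above a median-based boundary."""
    if not depths:
        return 0.0
    median_depth = median(depths)
    if median_depth == 0:
        # Assumption: a target with zero median depth is treated as not uniform.
        return 0.0
    lower_boundary = fraction_of_median * median_depth
    covered = sum(depth >= lower_boundary for depth in depths)
    return covered / len(depths)


# Hypothetical per-position depths across a small target region.
depths = [100, 120, 90, 110, 5, 0, 95, 105, 130, 8]
uniformity = coverage_uniformity(depths)
# median is 97.5, boundary is 9.75, and 7 of 10 positions pass, so uniformity == 0.7
```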
QIAGEN’s Solution
Free, complete and easy-to-use data analysis with web-based software
Summary
Run Summary
• Specificity
• Coverage
• Uniformity
• Numbers of SNPs and Indels
Summary By Gene
• Specificity
• Coverage
• Uniformity
• Numbers of SNPs and Indels
Features of Variant Report
• SNP detection
• Indel detection
QIAGEN’s GeneRead DNAseq Gene Panel System
FOCUS ON YOUR RELEVANT GENES
• Focused: Biologically relevant content selection enables deep sequencing on relevant genes and identification of rare mutations
• Flexible: Mix and match any gene of interest
• NGS platform independent: Functionally validated for PGM, MiSeq/HiSeq
• Integrated controls: Enabling quality control of prepared library before sequencing
• Free, complete and easy-to-use data analysis tool
Upcoming webinars
Next Generation Sequencing and its role in cancer biology
Webinar 2: Next-generation sequencing for cancer research Date: April 11, 2013 Speaker: Vikram Devgan, Ph.D., MBA Register here: https://www2.gotomeeting.com/register/126404050
Webinar 3: Next-generation sequencing data analysis for genetic profiling Date: April 18, 2013 Speaker: Ravi Vijaya Satya, Ph.D. Register here: https://www2.gotomeeting.com/register/966970098