A POPULATION GENETIC ANALYSIS OF FORENSIC IDENTITY AND ANCESTRY MARKERS IN INDONESIA

Samantha Jane Venables

A thesis submitted for the degree of Doctor of Philosophy (Applied Science)

National Centre for Forensic Studies, University of Canberra

December 2012

COPYRIGHT

Under Section 35 of the Copyright Act of 1968, the author of this thesis is the owner of any copyright subsisting in the work, even though it is unpublished. Under Section 31(1)(a)(i), copyright includes the exclusive right to ‘reproduce the work in a material form’. Thus, copyright is infringed by a person who, not being the owner of the copyright, reproduces or authorises the reproduction of the work, or of more than a reasonable part of the work, in a material form, unless the reproduction is a ‘fair dealing’ with the work ‘for the purpose of research or study’ as further defined in Sections 40 and 41 of the Act.

This thesis, entitled “A population genetic analysis of forensic identity and ancestry markers in Indonesia”, must therefore be copied or used under the normal conditions of scholarly fair dealing for the purposes of research, criticism or review, as outlined in the provisions of the Copyright Act 1968. In particular, no results or conclusions should be extracted from it, nor should it be copied or closely paraphrased in whole or in part without the written consent of the author. Proper written acknowledgement should be made for any assistance obtained from this thesis. Copies of the thesis may be made by a library on behalf of another person provided the officer in charge of the library is satisfied that the copy is being made for the purposes of research or study.

_____________________________________________ Signature of Candidate

_____________________________________________ Date

ABSTRACT

Population sub-structure and remote inbreeding cause individuals to become more genetically related. Since related individuals are more likely to share a DNA profile than unrelated individuals, the presence of population sub-structure causes inaccuracies in the match probability calculations that are presented in court by forensic biologists. Frequently, a co-ancestry (θ) correction is applied to match probability calculations to accommodate the variations that exist between subpopulations. Population sub-structure within Indonesia is highly probable given that the colonisation history of the archipelago has ensured the co-existence of numerous distinct traditional cultural and linguistic groups. The geography of Indonesia could also contribute to the formation of population sub-structure, with the 240 million residents spread across 6,000 inhabited islands, some of which possess significant mountain ranges. However, the population genetic features of Indonesia have not been examined thoroughly and adequate allele frequency data for forensic identity markers in Indonesia do not exist. The aims of this project were to: a) genotype an extensive Indonesian sample set, covering many geographic locations and representing numerous linguistic and cultural groups, using forensic identity markers (autosomal STRs) and ancestry markers (autosomal SNPs) and identify genetically distinct subpopulations; b) compare the genetic subpopulation boundaries revealed through forensic STR and autosomal SNP genotyping and assess the efficacy of each marker type for detecting population differentiation and defining the genetic subpopulations within Indonesia; and c) construct defensible DNA databases for the genetic subpopulations discovered within Indonesia and determine appropriate sub-structure correction factors for use in forensic casework match probability assessments.

In total, 15 forensic identity markers (Identifiler® kit) and 13 autosomal ancestry SNPs were genotyped in an extensive sample set from the Indonesian archipelago. Population genetic analyses revealed that both forensic identity and autosomal ancestry markers were able to discern population sub-structure within Indonesia and that the overarching patterns of population differentiation were the same. It was determined that island-based subpopulations were the most suitable for the purposes of creating forensic population databases. Appropriate values for the co-ancestry (θ) correction factor were determined to be θ=0.03 (3%) for Island Sumatra (N=129), θ=0.02 (2%) for Flores (N=245) and Sumba (N=174) and θ=0.01 (1%) for Mainland Sumatra (N=294), Sulawesi (N=419) and Java and Bali (N=484). The island-based subpopulations of Kalimantan (N=26) and Papua (N=57) are currently too small to make informed decisions about the appropriate value of θ. If desired by the Indonesian jurisdictions, each population database could be used with θ=0.03 (3%) to assist the implementation of these population databases in routine forensic DNA casework.

In summary, this study has generated allele frequency data for 15 forensic identity markers and characterised the levels of population sub-structure present within Indonesia using forensic identity and ancestry markers. Genetically appropriate population databases for the island-based subpopulations (Island Sumatra, Mainland Sumatra, Java & Bali, Kalimantan, Sulawesi, Flores, Sumba and Papua) have been generated and appropriate θ-correction factors are now available for use in the calculation of DNA match statistics.
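As an illustration of how a θ-correction enters a match probability, the sketch below uses the Balding and Nichols formulae recommended in the 1996 NRC II report. The abstract does not state which formula the thesis applies, so this choice is an assumption. The function names and the allele frequencies are hypothetical and are not taken from the thesis data; only the θ values (0.01 and 0.03) come from the abstract.

```python
def match_probability(p, q=None, theta=0.0):
    """Theta-corrected genotype match probability at one locus.

    Assumed formula (Balding and Nichols, NRC II recommendation 4.2):
      homozygote   AA: [2t + (1-t)p][3t + (1-t)p] / [(1+t)(1+2t)]
      heterozygote AB: 2[t + (1-t)p][t + (1-t)q] / [(1+t)(1+2t)]
    p, q are allele frequencies; pass q=None for a homozygote.
    With theta = 0 these reduce to p**2 and 2*p*q.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if q is None:
        return (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p) / denom
    return 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q) / denom


def profile_probability(genotypes, theta):
    """Multiply per-locus probabilities, assuming independence between loci."""
    result = 1.0
    for p, q in genotypes:
        result *= match_probability(p, q, theta)
    return result


if __name__ == "__main__":
    # Hypothetical two-locus profile: one homozygote, one heterozygote.
    profile = [(0.10, None), (0.10, 0.20)]
    for theta in (0.0, 0.01, 0.03):
        print(f"theta={theta:.2f}  match probability={profile_probability(profile, theta):.6f}")
```

A larger θ raises the reported match probability, most noticeably for rare alleles, which is why a single θ=0.03 across all databases is the conservative option of the values listed above.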


ACKNOWLEDGEMENTS

There are so many people who have contributed to making my PhD journey all that it was, and what a journey this has been! First and foremost, I must thank my supervisors Dr. Runa Daniel, Professor Stephen Sarre, Dr. Roland van Oorschot, Dr. Simon Walsh and Dr. Dennis McNevin. Initially, the thought of having a supervisory “subpopulation” terrified me, but all of you defied every expectation that I had of what a supervisor should be. You all willingly shared your expertise, showed commitment to my project and provided me with invaluable guidance and much appreciated encouragement. I am especially grateful for your contributions to the completion of this thesis. I have thoroughly enjoyed the opportunity to work with you all and hope that we can collaborate again in the future.

This project would not have been possible without the support of the Australian Federal Police, Forensic and Data Centres who generously provided research funding for consumables and access to an A.C.T. Caucasian STR dataset. The Eijkman Institute for Molecular Biology, through Professor Herawati Sudoyo, provided the extensive Indonesian population sample set for genotyping and the Indonesian National Police provided STR genotype data to complement the Eijkman sample set. The University of Canberra provided me with an Australian Postgraduate Award. In addition, I was kindly granted use of the Institute for Applied Ecology’s Wildlife Genetics Laboratory during my SNP genotyping.

The National Centre for Forensic Studies at the University of Canberra provided a stimulating and productive working environment and funding for attendance at national and international conferences. Thank you to all of my colleagues, particularly Daniel Augustinus and Liz Peters who were fantastic office mates and Dr. Greg Adcock and Dr. Michelle Gahan for technical assistance and advice.

Carolyn McLaren, we started our forensic science journey together as naïve undergraduates and here we are, many years later, completing our PhD journey! Your friendship was a positive influence on me throughout our many years of research. Your hard work and conscientious approach to juggling study and your family reminded me in the hard times that it can be done.


I am grateful to the individuals from the AFP Forensic Biology team who have kindly and patiently answered my many questions over the years, especially Andrew Preston, Paul Roffey, Jo Lee, Tim Shaw and Shelley Fountain. During my time working at Weston, you passed on many things that one can only learn from years of setting up endless amps and analysing electropherograms. Who would have thought that too much DNA and no DNA can produce the same results!

Without my amazing friends I would not have been able to endure the tough times, nor enjoy the many triumphs along the way. Laura and Stacey provided moral support in the forms of food and entertainment throughout my candidature and were generous with doses of encouragement. Vicki was a fantastic listener and always had time for a much needed chat when I dropped by to see her. You were my lifeline to the world outside of my project while I was working nightshift, so thank you! Karen deserves a special mention; a fantastic boss, mentor, and friend, who illuminated the (sometimes bumpy and winding) path to completion with her words of wisdom. I am lucky to count you all as friends.

My wonderful family has provided endless support throughout my PhD journey and they all deserve thanks: the Dawes, Orford and Venables families. Your love and never-ending encouragement have been much appreciated. You have been with me every step of the way, encouraging me to accomplish my lofty goals and supporting me in every way possible. None of my accomplishments would have been possible without you all.

Last, but certainly not least, I would like to sincerely thank my husband Doug. You have been with me from day one of this journey and you are still here at the end, four years and a wedding later. You have been tolerant and patient throughout my studies, particularly for the eight weeks that I essentially lived in the laboratory, and your support made such daunting tasks achievable. Your ability to motivate me, coupled with your perfectly timed gifts of hot chips with gravy, sustained me throughout the thesis writing process. Thank you, I couldn’t have done this without you.


I dedicate this thesis to my grandparents, Bryan and Loretta Dawes; my parents, Sean and Kim Orford; and my husband, Doug Venables. Your unwavering confidence in my abilities still astounds me.


ABBREVIATIONS

A           Adenine
A.C.T.      Australian Capital Territory
A.D.        Anno Domini
AFP         Australian Federal Police
ALFRED      Allele Frequency Database
AMOVA       Analysis of Molecular Variance
ANOVA       Analysis of Variance
BGA         Biogeographical Ancestry
bp          Base pair
C           Cytosine
CCD         Charge-Coupled Device
CE          Capillary Electrophoresis
CEPH        Centre d'Etude du Polymorphisme Humain
CODIS       Combined DNA Index System
ddNTP       Dideoxyribonucleotide triphosphate
DNA         Deoxyribonucleic Acid
dNTP        Deoxyribonucleotide triphosphate
dsDNA       Double stranded DNA
Exo         Exonuclease
FIS / f     Inbreeding parameter within populations
FIT / F     Total inbreeding coefficient
FST / θ     Co-ancestry coefficient
FROG-kb     Forensic Research/Reference on Genetics knowledge base
G           Guanine
GDA         Genetic Data Analysis
HD          Defence hypothesis
HP          Prosecution hypothesis
HEO         Hexaethyleneoxide
HRM         High Resolution Melting
hTERT       Human Telomerase Reverse Transcriptase
HUGO        Human Genome Organisation
In          Rosenberg’s informativeness for assignment
INP         Indonesian National Police
IPC         Internal Positive Control
kV          Kilovolt
LR          Likelihood Ratio
Mb          Megabase pair
MCMC        Markov Chain Monte Carlo
µL          Microlitre
MGB         Minor Groove Binder
mtDNA       Mitochondrial DNA
NFQ         Non Fluorescent Quencher
ng          Nanogram
N.T.        Northern Territory
NTC         No Template Control
NRY         Non-Recombining portion of the Y-chromosome
OL          Off Ladder
PanSNP      The HUGO Pan Asian SNP Consortium
PC          Positive Control
PCR         Polymerase Chain Reaction
PCoA        Principal Co-ordinates Analysis
PNG         Papua New Guinea
RFU         Relative Fluorescence Unit
SAP         Shrimp Alkaline Phosphatase
SBE         Single Base Extension
SNP         Single Nucleotide Polymorphism
STR         Short Tandem Repeat
T           Thymine
Tm          Melting Temperature
TE          Tris Ethylene Diamine Tetra Acetic Acid (buffer)
TPM         Truncated Product Method
U.S.A.      United States of America
UV          Ultraviolet (radiation)

TABLE OF CONTENTS

Copyright
Abstract
Certificate of Authorship of Thesis
Acknowledgements
Abbreviations
Table of Contents
List of Figures
List of Tables
Publications and Presentations
    Publications
    Presentations
1. Introduction
    1.1 The evolutionary history of Homo sapiens led to genetic differences between populations
        1.1.1 The settlement of Australia, Papua New Guinea and South East Asia had lasting cultural and genetic impacts
    1.2 Detecting genetic differences between populations
        1.2.1 Forensic Identity Markers
        1.2.2 Forensic Ancestry Markers
    1.3 Implications of population sub-structure in forensic casework
        1.3.1 Quantifying population sub-structure
        1.3.2 Use of likelihood ratios to determine DNA profile probability
        1.3.3 Population genetic principles that impact forensic DNA match probabilities
    1.4 Indonesia: A regional context
        1.4.1 Population Diversity
        1.4.2 Forensic DNA analysis in Indonesia
    1.5 Aims and Approaches
2. Methods
    2.1 Samples utilised in this study
        2.1.1 Indonesian samples collected by the Eijkman Institute for Molecular Biology
        2.1.2 Indonesian genotypes from Indonesian National Police samples
    2.2 Sterilisation Procedures
    2.3 DNA Quantitation
    2.4 Analysis of forensic identity markers
        2.4.1 STR amplification
        2.4.2 STR profiling
        2.4.3 STR data analysis
    2.5 Analysis of ancestry informative SNP markers
        2.5.1 SNP selection
        2.5.2 SNP primer design
        2.5.3 SNP genotyping
        2.5.4 SNP data analysis
    2.6 Population genetic analyses conducted on the STR and SNP data
        2.6.1 Assessing departure from independence
        2.6.2 Calculating co-ancestry distances and generating neighbour-joining trees
        2.6.3 Calculating Wright’s F-statistics
        2.6.4 Assessing population structure
        2.6.5 Determining allele frequencies
3. Forensic Identity Marker Results
    3.1 STR genotyping
        3.1.1 Variant alleles detected in the Eijkman data set
    3.2 Evaluating the data according to subpopulation of origin
        3.2.1 Assessing departure from independence
        3.2.2 Determining co-ancestry distance
        3.2.3 Calculating Wright’s F-statistics
        3.2.4 Assessing population structure
    3.3 Evaluating the data according to island of origin
        3.3.1 Assessing departure from independence
        3.3.2 Determining co-ancestry distance
        3.3.3 Calculating Wright’s F-statistics
        3.3.4 Assessing population structure
    3.4 STR allele frequency data for island-based subpopulations
4. Autosomal Ancestry Marker Results
    4.1 HRM method optimisation
        4.1.1 Assessment of HRM kit sensitivity and performance
        4.1.2 Reproducibility of SNP genotyping using MeltDoctor™ HRM MasterMix
        4.1.3 SNP genotype confirmation using restriction enzyme digests
    4.2 The utility of HRM as a forensic SNP genotyping method
        4.2.1 SNP genotyping
        4.2.2 Multiplex capability
        4.2.3 Reproducibility of the HRM SNP genotyping assays
    4.3 Evaluating the data according to subpopulation of origin
        4.3.1 Assessing departure from independence
        4.3.2 Determining co-ancestry distance
        4.3.3 Calculating Wright’s F-statistics
        4.3.4 Assessing population structure
    4.4 Evaluating the data according to island of origin
        4.4.1 Assessing departure from independence
        4.4.2 Determining co-ancestry distance
        4.4.3 Calculating Wright’s F-statistics
        4.4.4 Assessing population structure
    4.5 Applicability of the autosomal SNPs as ancestry markers within Indonesia
        4.5.1 Informativeness of the SNP markers for ancestry inference
        4.5.2 Population assignment
    4.6 SNP allele frequency data for island-based subpopulations
5. Discussion
    5.1 Genetic population sub-structure in Indonesia
        5.1.1 Insights from forensic identity markers
        5.1.2 Insights from autosomal ancestry markers
        5.1.3 Forensic identity and autosomal ancestry markers have different capacities for detecting population sub-structure
    5.2 Forensic implications of population sub-structure in Indonesia
    5.3 Incidental insights
        5.3.1 Rare alleles at the D21S11 locus
        5.3.2 Autosomal ancestry markers
    5.4 The utility of HRM for forensic SNP genotyping
6. Conclusions
Appendix 1 - Identifiler® kit Details
Appendix 2 - STR Variant Confirmations
Appendix 3 - STR Allele Frequency Tables
Appendix 4 - ANOVA results for within run variation
Appendix 5 - ANOVA results for between run variation
Appendix 6 - SNP Allele Frequency Tables
References

LIST OF FIGURES Figure 1.1

Major Homo sapiens population movements and their timings......................... 1

Figure 1.2

A map of Oceania, designating the areas of Micronesia, Melanesia and Polynesia, as the terms are used in this study............................................................ 3

Figure 1.3

Indonesia and surrounding countries in South East Asia. The Indonesian islands referred to in this thesis are indicated in bold. ................. 5

Figure 1.4

The internal repeat structure of the complex STR D21S11. ..............................12

Figure 1.5

Overview of the processes underpinning the 5' nuclease (TaqMan®) assays utilised by the Quantifiler® Human DNA quantitation kit. ..................15

Figure 1.6

A SNP involves the mutation of a single base (underlined). ..............................21

Figure 1.7

A schematic representation of the types of bi-allelic SNPs. ...............................22

Figure 1.8

A schematic representation of the SNaPshot® process........................................32

Figure 1.9

Distribution of various language families throughout the Indonesian archipelago. ...............................................................................................................................44

Figure 1.10

The major branches of the Austronesian language family. ................................45

Figure 2.1

Collection sites in the Indonesian archipelago for samples used in this study. ...................................................................................................................................49

Figure 2.2

A schematic representation illustrating the use of the “Manual Call” function........................................................................................................................................67

Figure 3.1

The p-p plot for the combined Indonesian data set (N=1,828). .......................77

Figure 3.2 a) p-p plots for the Indonesian subpopulations under study (N>20).................78 Figure 3.3

A neighbour-joining tree showing the genetic relationships between the 36 Indonesian subpopulations (based on the co-ancestry distance matrix) and an A.C.T. Caucasian out-group (npops=37 and N=2,051). ....................................................................................................................................83

Figure 3.4

Results from the AMOVA analysis for the 36 originally sampled subpopulations. .......................................................................................................................84

Figure 3.5

Principal Co-ordinates Analysis conducted on the full distance matrix for the 36 Indonesian subpopulations .........................................................................87 xiv

Figure 3.6

Results from the STRUCTURE (v 2.3.4) analyses. ........................................................88

Figure 3.7

p-p plots for the six island-based subpopulations. .................................................90

Figure 3.8

The neighbour-joining tree for the subpopulations from Flores. ...................94

Figure 3.9

The neighbour-joining tree for the six subpopulations from Sulawesi........94

Figure 3.10

The neighbour-joining tree for the subpopulations from Sumba. ..................95

Figure 3.11

The neighbour-joining tree for subpopulations from Sumatra........................95

Figure 3.12

p-p plots for the regional subpopulations of Island Sumatra and Mainland Sumatra ..................................................................................................................96

Figure 3.13

The neighbour-joining tree depicting the genetic relationships between eight island-based subpopulations within Indonesia and an A.C.T. Caucasian out group. ............................................................................................. 100

Figure 3.14

AMOVA analysis in GENALEX v6.5 for the eight island-based subpopulations. .................................................................................................................... 102

Figure 3.15

Principal Co-ordinates Analysis of the eight island based subpopulations showing the west/east subpopulation clusters.................. 104

Figure 3.16

Results from the STRUCTURE (V 2.3.4) analyses....................................................... 105

Figure 4.1

Sensitivity results for the a) MeltDoctor™ HRM MasterMix, b) Precision Melt Supermix, c) KAPA HRM Fast Mix and d) SensiFast™ HRM kits. .................................................................................................................................. 107

Figure 4.2

HRM genotyping results for SNP rs733559 in the samples used for the assessment of kit sensitivity and performance. ............................................ 112

Figure 4.3

Gel images of the restriction enzyme digest confirming the genotype of SNP rs733559. ................................................................................................................. 112

Figure 4.4

HRM genotyping results for SNP rs733559............................................................ 113

Figure 4.5

Gel images of the restriction enzyme digest confirmation of sample genotypes for SNP rs733559. ........................................................................................ 114

Figure 4.6

HRM genotypes for SNP rs310850.............................................................................. 115

Figure 4.7

Gel images of the restriction enzyme digest confirming genotypes for SNP rs310850........................................................................................................................ 116

Figure 4.8

HRM results for SNP rs3892905. ............................................................................... 117

Figure 4.9

Gel images from the restriction enzyme digest confirming genotypes for SNP rs3892905.............................................................................................................. 118

Figure 4.10

Duplex experiment for samples 5, 6 and 7 using rs963170 and rs10494531. ........................................................................................................................... 121

Figure 4.11

p-p plot for the combined Indonesian data set (N=1,284). ............................ 126

Figure 4.12

p-p plots for the Indonesian subpopulations under study (N>20).............. 127

Figure 4.13

Neighbour-joining tree for the 31 Indonesian subpopulations .................... 132

Figure 4.14

AMOVA results for the Indonesian dataset partitioned into the 31 originally sampled subpopulations............................................................................. 133

Figure 4.15

Principal Co-ordinates Analysis conducted on the full distance matrix for the 31 Indonesian subpopulations. ..................................................................... 136

Figure 4.16

Results from the STRUCTURE (v2.3.4) analyses. ..................................................... 137

Figure 4.17

p-p plots for six island-based subpopulations. ..................................................... 139

Figure 4.18

A neighbour-joining tree for the eight island-based subpopulations. ....... 144

Figure 4.19

AMOVA results for the Indonesian dataset partitioned into island-based subpopulations........................................................................................................ 146

Figure 4.20

The Principal Co-ordinates Analysis conducted on the eight island-based subpopulations showing the west/east clustering................................ 148

Figure 4.21

Results from STRUCTURE (v2.3.4) analyses conducted on the eight island-based subpopulations. ........................................................................................ 149

Figure 4.22

A plot of Rosenberg’s Accumulated In Divergence for the 12 SNPs used in this study. ................................................................................................................ 150


LIST OF TABLES

Table 2.1

The distribution of Indonesian population samples collected by the Eijkman Institute for Molecular Biology. ....................................................................50

Table 2.2

The distribution of Indonesian population samples collected by the Indonesian National Police. ...............................................................................................50

Table 2.3

Details of half volume AmpFℓSTR® Identifiler® reactions. ................................52

Table 2.4

PCR conditions for DNA amplification with the AmpFℓSTR® Identifiler® PCR amplification kit ...................................................................................53

Table 2.5

An example 96-well plate layout for CE separation of amplified DNA fragments....................................................................................................................................54

Table 2.6

Peak height requirements for allelic designation. .................................................55

Table 2.7

Information on the SNPs analysed in this study......................................................57

Table 2.8

Details of the forward primers designed for SNP genotyping ..........................59

Table 2.9

Details of the reverse primers designed for SNP genotyping ...........................59

Table 2.10

PCR set up and thermal cycling protocols for the HRM kits tested. ...............61

Table 2.11

Sensitivity and performance testing conducted on each HRM kit. ................62

Table 2.12

Details of SNPs and restriction enzymes used to confirm that the HRM SNP genotypes were accurate...............................................................................64

Table 2.13

The PCR and thermal cycling protocols for the restriction digest reactions used to confirm the genotypes of SNPs rs733559 (a), rs310850 and rs3892905 (b). ..........................................................................................64

Table 2.14

The optimised PCR and thermal cycling protocols for SNP genotyping using MeltDoctor® on the ViiA 7 instrument ............................................................65

Table 2.15

An example 96-well plate layout for SNP genotyping using HRM analysis. ......................................................................................................................................66

Table 2.16

GDA parameters for equilibrium analysis. .................................................................70

Table 2.17

GDA parameters used for estimating co-ancestry distances between subpopulations. .......................................................................................................................71

Table 2.18

Options for the estimation of Wright’s F-statistics using GDA. ........................72

Table 3.1

Variant alleles detected during STR genotyping .....................................................75

Table 3.2

Truncated product method results for Indonesian subpopulations (N>20) and the combined Indonesian dataset. .......................................................80

Table 3.3

Co-ancestry distances (x100) for the 36 Indonesian subpopulations studied. ........................................................................................................................................82

Table 3.4

Wright’s F-statistics calculated for the Indonesian dataset partitioned into 36 subpopulations. .......................................................................................................84

Table 3.5

Subpopulation designations for the STRUCTURE analysis on the STR data. ...............................................................................................................................................85

Table 3.6

Truncated product method results for the six island-based subpopulations. .......................................................................................................................91

Table 3.7

Co-ancestry distances (x100) for island-based subpopulation groupings ....................................................................................................................................93

Table 3.8

The truncated product method results for the regional subpopulations of Island Sumatra and Mainland Sumatra. ...............................97

Table 3.9

Co-ancestry distances (x100) for subpopulations within a) Island Sumatra and b) Mainland Sumatra. ...............................................................................97

Table 3.10

Co-ancestry distance (x100) matrix of the eight island-based Indonesian subpopulations. ..............................................................................................99

Table 3.11

Wright’s F-statistics calculated for the Indonesian dataset partitioned into eight island-based subpopulations and for each island-based subpopulation. ...................................................................................................................... 101

Table 3.12

Island-based subpopulations used for the STRUCTURE analysis ..................... 103

Table 4.1

MeltDoctor® HRM MasterMix genotype reproducibility study..................... 108

Table 4.2

Genotypes of samples used during the optimisation of the HRM SNP genotyping method. ............................................................................................................ 109

Table 4.3

SNP genotypes of the three samples tested with the duplex and the number of melting domains expected in the melting profile ......................... 120

Table 4.4

A summary of the one-way ANOVAs conducted on the variation in temperature means for each genotype across a single HRM genotyping run...................................................................................................................... 123

Table 4.5

A summary of the one-way ANOVAs conducted on the temperature variation for each genotype across different HRM genotyping runs.......... 124

Table 4.6

Truncated product method results for Indonesian subpopulations (N>20) and the combined Indonesian dataset. .................................................... 129

Table 4.7

Co-ancestry distances (x100) for the 31 Indonesian subpopulations studied. ..................................................................................................................................... 131

Table 4.8

Wright’s F-statistics calculated for the Indonesian dataset partitioned into 31 subpopulations. .................................................................................................... 133

Table 4.9

Subpopulation designations for the STRUCTURE analyses on the SNP data. ............................................................................................................................................ 134

Table 4.10

Tests of independence for six island-based subpopulations obtained with the truncated product method. ......................................................................... 140

Table 4.11

Co-ancestry distances (x100) for subpopulations within island-based groups. ...................................................................................................................................... 141

Table 4.12

Co-ancestry distances (x100) for the eight island-based subpopulations. ................................................................................................................... 143

Table 4.13

Wright’s F-statistics calculated for the Indonesian dataset partitioned into eight island-based subpopulations and for each island-based subpopulation. ...................................................................................................................... 145

Table 4.14

Island-based subpopulations used for the STRUCTURE (v2.3.4) analyses. ................................................................................................................................... 147

Table 4.15

Results of the population assignment analysis for the eight island-based subpopulations performed in GENALEX v6.5............................................. 151

Table 4.16

Results of the population assignment analysis for region-based subpopulations performed in GENALEX v6.5. ........................................................ 152


PUBLICATIONS AND PRESENTATIONS

Publications

Venables, S.J., McNevin, D.M., Daniel, R., Sarre, S.D., van Oorschot, R.A.H., and Walsh, S.J. (2011). An in-depth population genetic analysis of forensic short tandem repeat loci in Indonesia. Forensic Science International: Genetics Supplement Series. 3: e157–e158.

Presentations

Venables, S.J., Daniel, R., Sarre, S.D., Sudoyo, H., van Oorschot, R.A.H., Walsh, S.J., and McNevin, D.M. (2012). Development of Forensically Relevant STR Allele Frequency Databases for use in Indonesia. ANZFSS 21st International Symposium on the Forensic Sciences, Hobart, September 2012 (Poster).

Venables, S.J., McNevin, D.M., Daniel, R., Sarre, S.D., van Oorschot, R.A.H., and Walsh, S.J. (2011). An in-depth population genetic analysis of forensic short tandem repeat loci in Indonesia. International Society of Forensic Genetics, Vienna, August 2011 (Poster).

Venables, S.J., McNevin, D.M., Daniel, R., Sarre, S.D., van Oorschot, R.A.H., and Walsh, S.J. (2010). Population genetics of forensic STR loci within South East Asia and Pacific Island populations. ANZFSS 20th International Symposium on the Forensic Sciences, Sydney, September 2010 (Poster).
