A POPULATION GENETIC ANALYSIS OF FORENSIC IDENTITY AND ANCESTRY MARKERS IN INDONESIA
Samantha Jane Venables
A thesis submitted for the degree of Doctor of Philosophy (Applied Science)
National Centre for Forensic Studies University of Canberra
December 2012
COPYRIGHT Under Section 35 of the Copyright Act of 1968, the author of this thesis is the owner of any copyright subsisting in the work, even though it is unpublished. Under Section 31(1)(a)(i), copyright includes the exclusive right to ‘reproduce the work in a material form’. Thus, copyright is infringed by a person who, not being the owner of the copyright, reproduces or authorises the reproduction of the work, or of more than a reasonable part of the work, in a material form, unless the reproduction is a ‘fair dealing’ with the work ‘for the purpose of research or study’ as further defined in Sections 40 and 41 of the Act. This thesis, entitled “A population genetic analysis of forensic identity and ancestry markers in Indonesia”, must therefore be copied or used under the normal conditions of scholarly fair dealing for the purposes of research, criticism or review, as outlined in the provisions of the Copyright Act 1968. In particular, no results or conclusions should be extracted from it, nor should it be copied or closely paraphrased in whole or in part without the written consent of the author. Proper written acknowledgement should be made for any assistance obtained from this thesis. Copies of the thesis may be made by a library on behalf of another person provided the officer in charge of the library is satisfied that the copy is being made for the purposes of research or study.
_____________________________________________ Signature of Candidate
_____________________________________________ Date
ABSTRACT Population sub-structure and remote inbreeding cause individuals to become more genetically related. Since related individuals are more likely to share a DNA profile than unrelated individuals, the presence of population sub-structure causes inaccuracies in the match probability calculations that are presented in court by forensic biologists. Frequently, a co-ancestry (θ) correction is applied to match probability calculations to accommodate the variations that exist between subpopulations. Population sub-structure within Indonesia is highly probable given that the colonisation history of the archipelago has ensured the co-existence of numerous distinct traditional cultural and linguistic groups. The geography of Indonesia could also contribute to the formation of population sub-structure, with the 240 million residents spread across 6,000 inhabited islands, some of which possess significant mountain ranges. However, the population genetic features of Indonesia have not been examined thoroughly and adequate allele frequency data for forensic identity markers in Indonesia does not exist. The aims of this project were to: a) genotype an extensive Indonesian sample set, covering many geographic locations and representing numerous linguistic and cultural groups, using forensic identity markers (autosomal STRs) and ancestry markers (autosomal SNPs) and identify genetically distinct subpopulations; b) compare the genetic subpopulation boundaries revealed through forensic STR and autosomal SNP genotyping and assess the efficacy of each marker type for detecting population differentiation and defining the genetic subpopulations within Indonesia; and c) construct defensible DNA databases for the genetic subpopulations discovered within Indonesia and determine appropriate sub-structure correction factors for use in forensic casework match probability assessments.
In total, 15 forensic identity markers (Identifiler® kit) and 13 autosomal ancestry SNPs were genotyped in an extensive sample set from the Indonesian archipelago. Population genetic analyses revealed that both forensic identity and autosomal ancestry markers were able to discern population sub-structure within Indonesia and that the overarching patterns of population differentiation were the same. It was determined that island-based subpopulations were the most suitable for the purposes of creating forensic population databases. Appropriate values for the co-ancestry (θ) correction factor were determined to be θ=0.03 (3%) for Island Sumatra (N=129), θ=0.02 (2%) for Flores (N=245) and Sumba (N=174) and θ=0.01 (1%) for Mainland Sumatra (N=294), Sulawesi (N=419) and Java and Bali (N=484). The island-based subpopulations of Kalimantan (N=26) and Papua (N=57) are currently too small to make informed decisions about the appropriate value of θ. If desired by the Indonesian jurisdictions, each population database could be used with a θ=0.03 (3%), to assist the implementation of these population databases into routine forensic DNA casework. In summary, this study has generated allele frequency data for 15 forensic identity markers and characterised the levels of population sub-structure present within Indonesia using forensic identity and ancestry markers. Genetically appropriate population databases for the island-based subpopulations (Island Sumatra, Mainland Sumatra, Java & Bali, Kalimantan, Sulawesi, Flores, Sumba and Papua) have been generated and appropriate θ-correction factors are now available for use in the calculation of DNA match statistics.
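To illustrate how a θ value of the kind reported above enters a match probability, the following is a minimal sketch rather than the calculation used in this thesis. It assumes the widely used Balding-Nichols conditional match probabilities for a single autosomal locus; the function names and the allele frequencies in the usage note are hypothetical and chosen only for illustration.

```python
def theta_corrected_match_probability(p_a, p_b=None, theta=0.0):
    """Conditional match probability at one locus (Balding-Nichols form).

    p_a, p_b: allele frequencies; leave p_b as None for a homozygous genotype.
    theta:    co-ancestry coefficient (F_ST), e.g. 0.01 to 0.03.
    """
    denominator = (1 + theta) * (1 + 2 * theta)
    if p_b is None:
        # Homozygote AA: [2θ + (1-θ)p][3θ + (1-θ)p] / [(1+θ)(1+2θ)]
        numerator = (2 * theta + (1 - theta) * p_a) * (3 * theta + (1 - theta) * p_a)
    else:
        # Heterozygote AB: 2[θ + (1-θ)p][θ + (1-θ)q] / [(1+θ)(1+2θ)]
        numerator = 2 * (theta + (1 - theta) * p_a) * (theta + (1 - theta) * p_b)
    return numerator / denominator


def profile_match_probability(loci, theta):
    """Multiply per-locus probabilities, assuming independence between loci.

    loci: iterable of (p_a, p_b) tuples, with p_b=None for homozygous loci.
    """
    probability = 1.0
    for p_a, p_b in loci:
        probability *= theta_corrected_match_probability(p_a, p_b, theta)
    return probability


if __name__ == "__main__":
    # Hypothetical two-locus profile: one heterozygous and one homozygous locus.
    example_profile = [(0.1, 0.2), (0.1, None)]
    for theta in (0.0, 0.01, 0.03):
        print(theta, profile_match_probability(example_profile, theta))
```

With θ=0 the expressions reduce to the familiar 2pq and p² genotype frequencies; any positive θ raises the per-locus match probability, which makes the reported statistic more conservative.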
ACKNOWLEDGEMENTS There are so many people who have contributed to making my PhD journey all that it was, and what a journey this has been! First and foremost, I must thank my supervisors Dr. Runa Daniel, Professor Stephen Sarre, Dr. Roland van Oorschot, Dr. Simon Walsh and Dr. Dennis McNevin. Initially, the thought of having a supervisory “subpopulation” terrified me, but all of you defied every expectation that I had of what a supervisor should be. You all willingly shared your expertise, showed commitment to my project and provided me with invaluable guidance and much appreciated encouragement. I am especially grateful for your contributions to the completion of this thesis. I have thoroughly enjoyed the opportunity to work with you all and hope that we can collaborate again in the future. This project would not have been possible without the support of the Australian Federal Police, Forensic and Data Centres who generously provided research funding for consumables and access to an A.C.T. Caucasian STR dataset. The Eijkman Institute for Molecular Biology, through Professor Herawati Sudoyo, provided the extensive Indonesian population sample set for genotyping and the Indonesian National Police provided STR genotype data to complement the Eijkman sample set. The University of Canberra provided me with an Australian Postgraduate Award. In addition, I was kindly granted use of the Institute for Applied Ecology’s Wildlife Genetics Laboratory during my SNP genotyping. The National Centre for Forensic Studies at the University of Canberra provided a stimulating and productive working environment and funding for attendance at national and international conferences. Thank you to all of my colleagues, particularly Daniel Augustinus and Liz Peters who were fantastic office mates and Dr. Greg Adcock and Dr. Michelle Gahan for technical assistance and advice. 
Carolyn McLaren, we started our forensic science journey together as naïve undergraduates and here we are, many years later, completing our PhD journey! Your friendship was a positive influence on me throughout our many years of research. Your hard work and conscientious approach to juggling study and your family reminded me in the hard times that it can be done.
I am grateful to the individuals from the AFP Forensic Biology team who have kindly and patiently answered my many questions over the years, especially Andrew Preston, Paul Roffey, Jo Lee, Tim Shaw and Shelley Fountain. During my time working at Weston, you passed on many things that one can only learn from years of setting up endless amps and analysing electropherograms. Who would have thought that too much DNA and no DNA can produce the same results! Without my amazing friends I would not have been able to endure the tough times, nor enjoy the many triumphs along the way. Laura and Stacey provided moral support in the forms of food and entertainment throughout my candidature and were generous with doses of encouragement. Vicki was a fantastic listener and always had time for a much needed chat when I dropped by to see her. You were my lifeline to the world outside of my project while I was working nightshift, so thank you! Karen deserves a special mention; a fantastic boss, mentor, and friend, who illuminated the (sometimes bumpy and winding) path to completion with her words of wisdom. I am lucky to count you all as friends. My wonderful family has provided endless support throughout my PhD journey and they all deserve thanks: the Dawes, Orford and Venables families. Your love and never-ending encouragement have been much appreciated. You have been with me every step of the way, encouraging me to accomplish my lofty goals and supporting me in every way possible. None of my accomplishments would have been possible without you all. Last, but certainly not least, I would like to sincerely thank my husband Doug. You have been with me from day one of this journey and you are still here at the end, four years and a wedding later. You have been tolerant and patient throughout my studies, particularly for the eight weeks that I essentially lived in the laboratory, and your support made such daunting tasks achievable. 
Your ability to motivate me, coupled with your perfectly timed gifts of hot chips with gravy, sustained me throughout the thesis writing process. Thank you, I couldn’t have done this without you.
I dedicate this thesis to my grandparents, Bryan and Loretta Dawes; my parents, Sean and Kim Orford; and my husband, Doug Venables. Your unwavering confidence in my abilities still astounds me.
ABBREVIATIONS
A           Adenine
A.C.T.      Australian Capital Territory
A.D.        Anno Domini
AFP         Australian Federal Police
ALFRED      Allele Frequency Database
AMOVA       Analysis of Molecular Variance
ANOVA       Analysis of Variance
BGA         Biogeographical Ancestry
bp          Base pair
C           Cytosine
CCD         Charge-Coupled Device
CE          Capillary Electrophoresis
CEPH        Centre d'Etude du Polymorphisme Humain
CODIS       Combined DNA Index System
ddNTP       Dideoxyribonucleotide triphosphate
DNA         Deoxyribonucleic Acid
dNTP        Deoxyribonucleotide triphosphate
dsDNA       Double stranded DNA
Exo         Exonuclease
FIS / f     Inbreeding parameter within populations
FIT / F     Total inbreeding coefficient
FST / θ     Co-ancestry coefficient
FROG-kb     Forensic Research/Reference on Genetics knowledge base
G           Guanine
GDA         Genetic Data Analysis
HD          Defence hypothesis
HP          Prosecution hypothesis
HEO         Hexaethyleneoxide
HRM         High Resolution Melting
hTERT       Human Telomerase Reverse Transcriptase
HUGO        Human Genome Organisation
In          Rosenberg’s informativeness for assignment
INP         Indonesian National Police
IPC         Internal Positive Control
kV          Kilovolt
LR          Likelihood Ratio
Mb          Mega base pair
MCMC        Markov Chain Monte Carlo
µL          Microlitre
MGB         Minor Groove Binder
mtDNA       mitochondrial DNA
NFQ         Non Fluorescent Quencher
ng          Nanogram
N.T.        Northern Territory
NTC         No Template Control
NRY         Non-Recombining portion of the Y-chromosome
OL          Off Ladder
PanSNP      The HUGO Pan Asian SNP Consortium
PC          Positive Control
PCR         Polymerase Chain Reaction
PCoA        Principal Co-ordinates Analysis
PNG         Papua New Guinea
RFU         Relative Fluorescence Unit
SAP         Shrimp Alkaline Phosphatase
SBE         Single Base Extension
SNP         Single Nucleotide Polymorphism
STR         Short Tandem Repeat
T           Thymine
Tm          Melting Temperature
TE          Tris Ethylene Diamine Tetra Acetic Acid (buffer)
TPM         Truncated Product Method
U.S.A.      United States of America
UV          Ultraviolet (radiation)
TABLE OF CONTENTS
Copyright
Abstract
Certificate Of Authorship Of Thesis
Acknowledgements
Abbreviations
Table of Contents
List of Figures
List of Tables
Publications and Presentations
Publications
Presentations
1. Introduction
1.1 The evolutionary history of Homo sapiens led to genetic differences between populations
1.1.1 The settlement of Australia, Papua New Guinea and South East Asia had lasting cultural and genetic impacts
1.2 Detecting genetic differences between populations
1.2.1 Forensic Identity Markers
1.2.2 Forensic Ancestry Markers
1.3 Implications of population sub-structure in forensic casework
1.3.1 Quantifying population sub-structure
1.3.2 Use of likelihood ratios to determine DNA profile probability
1.3.3 Population genetic principles that impact forensic DNA match probabilities
1.4 Indonesia: A regional context
1.4.1 Population Diversity
1.4.2 Forensic DNA analysis in Indonesia
1.5 Aims and Approaches
2. Methods
2.1 Samples utilised in this study
2.1.1 Indonesian samples collected by the Eijkman Institute for Molecular Biology
2.1.2 Indonesian genotypes from Indonesian National Police samples
2.2 Sterilisation Procedures
2.3 DNA Quantitation
2.4 Analysis of forensic identity markers
2.4.1 STR amplification
2.4.2 STR profiling
2.4.3 STR data analysis
2.5 Analysis of ancestry informative SNP markers
2.5.1 SNP selection
2.5.2 SNP primer design
2.5.3 SNP genotyping
2.5.4 SNP data analysis
2.6 Population genetic analyses conducted on the STR and SNP data
2.6.1 Assessing departure from independence
2.6.2 Calculating co-ancestry distances and generating neighbour-joining trees
2.6.3 Calculating Wright’s F-statistics
2.6.4 Assessing population structure
2.6.5 Determining allele frequencies
3. Forensic Identity Marker Results
3.1 STR genotyping
3.1.1 Variant alleles detected in the Eijkman data set
3.2 Evaluating the data according to subpopulation of origin
3.2.1 Assessing departure from independence
3.2.2 Determining co-ancestry distance
3.2.3 Calculating Wright’s F-statistics
3.2.4 Assessing population structure
3.3 Evaluating the data according to island of origin
3.3.1 Assessing departure from independence
3.3.2 Determining co-ancestry distance
3.3.3 Calculating Wright’s F-statistics
3.3.4 Assessing population structure
3.4 STR allele frequency data for island-based subpopulations
4. Autosomal Ancestry Marker Results
4.1 HRM method optimisation
4.1.1 Assessment of HRM kit sensitivity and performance
4.1.2 Reproducibility of SNP genotyping using MeltDoctor™ HRM MasterMix
4.1.3 SNP genotype confirmation using restriction enzyme digests
4.2 The utility of HRM as a forensic SNP genotyping method
4.2.1 SNP genotyping
4.2.2 Multiplex capability
4.2.3 Reproducibility of the HRM SNP genotyping assays
4.3 Evaluating the data according to subpopulation of origin
4.3.1 Assessing departure from independence
4.3.2 Determining co-ancestry distance
4.3.3 Calculating Wright’s F-statistics
4.3.4 Assessing population structure
4.4 Evaluating the data according to island of origin
4.4.1 Assessing departure from independence
4.4.2 Determining co-ancestry distance
4.4.3 Calculating Wright’s F-statistics
4.4.4 Assessing population structure
4.5 Applicability of the autosomal SNPs as ancestry markers within Indonesia
4.5.1 Informativeness of the SNP markers for ancestry inference
4.5.2 Population assignment
4.6 SNP allele frequency data for island-based subpopulations
5. Discussion
5.1 Genetic population sub-structure in Indonesia
5.1.1 Insights from forensic identity markers
5.1.2 Insights from autosomal ancestry markers
5.1.3 Forensic identity and autosomal ancestry markers have different capacities for detecting population sub-structure
5.2 Forensic implications of population sub-structure in Indonesia
5.3 Incidental insights
5.3.1 Rare alleles at the D21S11 locus
5.3.2 Autosomal ancestry markers
5.4 The utility of HRM for forensic SNP genotyping
6. Conclusions
Appendix 1 - Identifiler® kit Details
Appendix 2 - STR Variant Confirmations
Appendix 3 - STR Allele Frequency Tables
Appendix 4 - ANOVA results for within run variation
Appendix 5 - ANOVA results for between run variation
Appendix 6 - SNP Allele Frequency Tables
References
LIST OF FIGURES Figure 1.1
Major Homo sapiens population movements and their timings......................... 1
Figure 1.2
A map of Oceania, designating the areas of Micronesia, Melanesia and Polynesia, as the terms are used in this study............................................................ 3
Figure 1.3
Indonesia and surrounding countries in South East Asia. The Indonesian islands referred to in this thesis are indicated in bold. ................. 5
Figure 1.4
The internal repeat structure of the complex STR D21S11. ..............................12
Figure 1.5
Overview of the processes underpinning the 5' nuclease (TaqMan®) assays utilised by the Quantifiler® Human DNA quantitation kit. ..................15
Figure 1.6
A SNP involves the mutation of a single base (underlined). ..............................21
Figure 1.7
A schematic representation of the types of bi-allelic SNPs. ...............................22
Figure 1.8
A schematic representation of the SNaPshot® process........................................32
Figure 1.9
Distribution of various language families throughout the Indonesian archipelago. ...............................................................................................................................44
Figure 1.10
The major branches of the Austronesian language family. ................................45
Figure 2.1
Collection sites in the Indonesian archipelago for samples used in this study. ...................................................................................................................................49
Figure 2.2
A schematic representation illustrating the use of the “Manual Call” function........................................................................................................................................67
Figure 3.1
The p-p plot for the combined Indonesian data set (N=1,828). .......................77
Figure 3.2 a) p-p plots for the Indonesian subpopulations under study (N>20).................78 Figure 3.3
A neighbour-joining tree showing the genetic relationships between the 36 Indonesian subpopulations (based on the co-ancestry distance matrix) and an A.C.T. Caucasian out-group (npops=37 and N=2,051). ....................................................................................................................................83
Figure 3.4
Results from the AMOVA analysis for the 36 originally sampled subpopulations. .......................................................................................................................84
Figure 3.5
Principal Co-ordinates Analysis conducted on the full distance matrix for the 36 Indonesian subpopulations .........................................................................87 xiv
Figure 3.6
Results from the STRUCTURE (v 2.3.4) analyses. ........................................................88
Figure 3.7
p-p plots for the six island-based subpopulations. .................................................90
Figure 3.8
The neighbour-joining tree for the subpopulations from Flores. ...................94
Figure 3.9
The neighbour-joining tree for the six subpopulations from Sulawesi........94
Figure 3.10
The neighbour-joining tree for the subpopulations from Sumba. ..................95
Figure 3.11
The neighbour-joining tree for subpopulations from Sumatra........................95
Figure 3.12
p-p plots for the regional subpopulations of Island Sumatra and Mainland Sumatra ..................................................................................................................96
Figure 3.13
The neighbour-joining tree depicting the genetic relationships between eight island-based subpopulations within Indonesia and an A.C.T. Caucasian out group. ............................................................................................. 100
Figure 3.14
AMOVA analysis in GENALEX v6.5 for the eight island-based subpopulations. .................................................................................................................... 102
Figure 3.15
Principal Co-ordinates Analysis of the eight island based subpopulations showing the west/east subpopulation clusters.................. 104
Figure 3.16
Results from the STRUCTURE (V 2.3.4) analyses....................................................... 105
Figure 4.1
Sensitivity results for the a) MeltDoctor™ HRM MasterMix, b) Precision Melt Supermix, c) KAPA HRM Fast Mix and d) SensiFast™ HRM kits. .................................................................................................................................. 107
Figure 4.2
HRM genotyping results for SNP rs733559 in the samples used for the assessment of kit sensitivity and performance. ............................................ 112
Figure 4.3
Gel images of the restriction enzyme digest confirming the genotype of SNP rs733559. ................................................................................................................. 112
Figure 4.4
HRM genotyping results for SNP rs733559............................................................ 113
Figure 4.5
Gel images of the restriction enzyme digest confirmation of sample genotypes for SNP rs733559. ........................................................................................ 114
Figure 4.6
HRM genotypes for SNP rs310850.............................................................................. 115
Figure 4.7
Gel images of the restriction enzyme digest confirming genotypes for SNP rs310850........................................................................................................................ 116
Figure 4.8
HRM results for SNP rs3892905. ............................................................................... 117 xv
Figure 4.9
Gel images from the restriction enzyme digest confirming genotypes for SNP rs3892905.............................................................................................................. 118
Figure 4.10
Duplex experiment for samples 5, 6 and 7 using rs963170 and rs10494531. ........................................................................................................................... 121
Figure 4.11
p-p plot for the combined Indonesian data set (N=1,284). ............................ 126
Figure 4.12
p-p plots for the Indonesian subpopulations under study (N>20).............. 127
Figure 4.13
Neighbour-joining tree for the 31 Indonesian subpopulations .................... 132
Figure 4.14
AMOVA results for the Indonesian dataset portioned into the 31 originally sampled subpopulations............................................................................. 133
Figure 4.15
Principal Co-ordinates Analysis conducted on the full distance matrix for the 31 Indonesian subpopulations. ..................................................................... 136
Figure 4.16
Results from the STRUCTURE (v2.3.4) analyses.. ..................................................... 137
Figure 4.17
p-p plots for six island-based subpopulations.. ..................................................... 139
Figure 4.18
A neighbour-joining tree for the eight-island based subpopulations. ....... 144
Figure 4.19
AMOVA results for the Indonesian dataset partitioned into islandbased subpopulations........................................................................................................ 146
Figure 4.20
The Principal Co-ordinates Analysis conducted on the eight island-based subpopulations showing the west/east clustering................................ 148
Figure 4.21
Results from STRUCTURE (v2.3.4) analyses conducted on the eight island-based subpopulations. ........................................................................................ 149
Figure 4.22
A plot of Rosenberg’s Accumulated In Divergence for the 12 SNPs used in this study. ................................................................................................................ 150
LIST OF TABLES
Table 2.1
The distribution of Indonesian population samples collected by Eijkman Institute for Molecular Biology. ....................................................................50
Table 2.2
The distribution of Indonesian population samples collected by the Indonesian National Police. ...............................................................................................50
Table 2.3
Details of half volume AmpFℓSTR® Identifiler® reactions. ................................52
Table 2.4
PCR conditions for DNA amplification with the AmpFℓSTR® Identifiler® PCR amplification kit ...................................................................................53
Table 2.5
An example 96-well plate layout for CE separation of amplified DNA fragments....................................................................................................................................54
Table 2.6
Peak height requirements for allelic designation.. .................................................55
Table 2.7
Information on the SNPs analysed in this study......................................................57
Table 2.8
Details of the forward primers designed for SNP genotyping ..........................59
Table 2.9
Details of the reverse primers designed for SNP genotyping ...........................59
Table 2.10
PCR set up and thermal cycling protocols for the HRM kits tested. ...............61
Table 2.11
Sensitivity and performance testing conducted on each HRM kit.. ................62
Table 2.12
Details of SNPs and restriction enzymes used to confirm that the HRM SNP genotypes were accurate...............................................................................64
Table 2.13
The PCR and thermal cycling protocols for the restriction digest reactions used to confirm the genotypes of SNPs rs733559 (a), rs310850 and rs3892905 (b). ..........................................................................................64
Table 2.14
The optimised PCR and thermal cycling protocols for SNP genotyping using MeltDoctor® on the ViiA 7 instrument ............................................................65
Table 2.15
An example 96-well plate layout for SNP genotyping using HRM analysis.. ......................................................................................................................................66
Table 2.16
GDA parameters for equilibrium analysis. .................................................................70
Table 2.17
GDA parameters used for estimating co-ancestry distances between subpopulations. .......................................................................................................................71
Table 2.18
Options for the estimation of Wright’s F-statistics using GDA. ........................72
Table 3.1
Variant alleles detected during STR genotyping .....................................................75
Table 3.2
Truncated product method results for Indonesian subpopulations (N>20) and the combined Indonesian dataset.. .......................................................80
Table 3.3
Co-ancestry distances (x100) for the 36 Indonesian subpopulations studied. ........................................................................................................................................82
Table 3.4
Wright’s F-statistics calculated for the Indonesian dataset partitioned into 36 subpopulations. .......................................................................................................84
Table 3.5
Subpopulation designations for the STRUCTURE analysis on the STR data. ...............................................................................................................................................85
Table 3.6
Truncated product method results for the six island-based subpopulations. .......................................................................................................................91
Table 3.7
Co-ancestry distances (x100) for island-based subpopulation groupings ....................................................................................................................................93
Table 3.8
The truncated product method results for the regional subpopulations of Island Sumatra and Mainland Sumatra. ...............................97
Table 3.9
Co-ancestry distances (x100) for subpopulations within a) Island Sumatra and b) Mainland Sumatra. ...............................................................................97
Table 3.10
Co-ancestry distance (x100) matrix of the eight island-based Indonesian subpopulations. ..............................................................................................99
Table 3.11
Wright’s F-statistics calculated for the Indonesian dataset partitioned into eight island-based subpopulations and for each island-based subpopulation. ...................................................................................................................... 101
Table 3.12
Island-based subpopulations used for the STRUCTURE analysis ..................... 103
Table 4.1
MeltDoctor® HRM MasterMix genotype reproducibility study..................... 108
Table 4.2
Genotypes of samples used during the optimisation of the HRM SNP genotyping method. ............................................................................................................ 109
Table 4.3
SNP genotypes of the three samples tested with the duplex and the number of melting domains expected in the melting profile ......................... 120
Table 4.4
A summary of the one-way ANOVAs conducted on the variation in temperature means for each genotype across a single HRM genotyping run...................................................................................................................... 123
Table 4.5
A summary of the one-way ANOVAs conducted on the temperature variation for each genotype across different HRM genotyping runs.......... 124
Table 4.6
Truncated product method results for Indonesian subpopulations (N>20) and the combined Indonesian dataset.. .................................................... 129
Table 4.7
Co-ancestry distances (x100) for the 31 Indonesian subpopulations studied. ..................................................................................................................................... 131
Table 4.8
Wright’s F-statistics calculated for the Indonesian dataset partitioned into 31 subpopulations. .................................................................................................... 133
Table 4.9
Subpopulation designations for the STRUCTURE analyses on the SNP data. ............................................................................................................................................ 134
Table 4.10
Tests of independence for six island-based subpopulations obtained with the truncated product method.. ......................................................................... 140
Table 4.11
Co-ancestry distances (x100) for subpopulations within island-based groups. ...................................................................................................................................... 141
Table 4.12
Co-ancestry distances (x100) for the eight island-based subpopulations.. ................................................................................................................... 143
Table 4.13
Wright’s F-statistics calculated for the Indonesian dataset partitioned into eight island-based subpopulations and for each island-based subpopulation. ...................................................................................................................... 145
Table 4.14
Island-based subpopulations used for the STRUCTURE (v2.3.4) analyses. ................................................................................................................................... 147
Table 4.15
Results of the population assignment analysis for the eight island-based subpopulations performed in GENALEX v6.5............................................. 151
Table 4.16
Results of the population assignment analysis for region-based subpopulations performed in GENALEX v6.5.. ........................................................ 152
PUBLICATIONS AND PRESENTATIONS
Publications
Venables, S.J., McNevin, D.M., Daniel, R., Sarre, S.D., van Oorschot, R.A.H., and Walsh, S.J. (2011). An in-depth population genetic analysis of forensic short tandem repeat loci in Indonesia. Forensic Science International: Genetics Supplement Series. 3: e157–e158.
Presentations
Venables, S.J., Daniel, R., Sarre, S.D., Sudoyo, H., van Oorschot, R.A.H., Walsh, S.J., and McNevin, D.M. (2012). Development of Forensically Relevant STR Allele Frequency Databases for use in Indonesia. ANZFSS 21st International Symposium on the Forensic Sciences, Hobart, September 2012 (Poster).
Venables, S.J., McNevin, D.M., Daniel, R., Sarre, S.D., van Oorschot, R.A.H., and Walsh, S.J. (2011). An in-depth population genetic analysis of forensic short tandem repeat loci in Indonesia. International Society of Forensic Genetics, Vienna, August 2011 (Poster).
Venables, S.J., McNevin, D.M., Daniel, R., Sarre, S.D., van Oorschot, R.A.H., and Walsh, S.J. (2010). Population genetics of forensic STR loci within South East Asia and Pacific Island populations. ANZFSS 20th International Symposium on the Forensic Sciences, Sydney, September 2010 (Poster).