Bachman, L.F. Kunnan, A.J. Workbook


Book reviews

Bachman, L.F. and Kunnan, A.J. 2005: Statistical analyses for language assessment workbook and CD ROM. Cambridge: Cambridge University Press. 171 pp. ISBN: 0 521 60906 7 (paperback) £25.70 or $46.00.

Bachman and Kunnan (2005) is the workbook for Bachman (2004), which will hereafter be referred to as the main book (see separate review in this volume). Any workbook is defined in large part by its relationship to the main book, and this workbook is no exception. This workbook does not stand alone, depending as it does on the explanations in the main book. Yet it will clearly help any students who are serious about learning the material in the main book because this workbook provides carefully planned reinforcement in a variety of practical examples and hands-on exercises. The workbook also includes a CD that is loaded with useful data and other sorts of files. Given that the workbook and CD are complex and highly interrelated, I will begin by discussing the workbook and then turn to the CD, after which I will consider how well the two fit together as a package with the main book.

Workbook

At a superficial level, the first thing I noticed about the workbook was that it provides a useful Table of Contents and a one-level Index. The workbook also shares the same attractive layout as the main book and is presented in the same concise and clear writing style.

Table 1 presents a detailed summary of the contents of both the main book and the workbook. Notice that the table shows the overall and chapter outlines of the main book in the first two columns along with numerous other details about the workbook in the remaining four columns. The first column also shows the organization for the workbook because it has exactly the same three main parts and 10 chapter headings as the main book. The third column shows that all 10 chapters in the workbook follow the same internal pattern with Key concepts and Conceptual exercises sections. Most of the chapters also have sections for Hand calculations with small data sets, SPSS examples, and SPSS exercises. These repetitive patterns should help students know what to expect in each chapter and help them coordinate their reading and practice in the main book and workbook. Naturally, there is some variation in this pattern as appropriate for the various topics involved in the chapters (more on this below).

Table 1 Comparison of content in Bachman (2004) with Bachman & Kunnan (2005)

Columns, in order: (1) Overall outline; (2) Topics covered in Bachman (2004) main book; (3) Topics covered in Bachman & Kunnan (2005) workbook (all chapters: Key concepts; Conceptual exercises, written in the book, with answer keys in .rtf files on CD); (4) Hand calculations with small data sets and (5) SPSS exercises (write answers in book; answer keys in .rtf files on CD); (6) Calculations in Excel.

Part I: Basic concepts and statistics

Chapter 1 “Basic concepts and terms”
Main book: Test usefulness, the nature of language assessment, uses of language assessments for relative and absolute decisions, the steps in the measurement process, measurement scales, limitations on measurement, norm-referenced (NR) and criterion-referenced (CR) frames of reference, and using statistics for understanding and interpreting test scores
Workbook: Introducing SPSS (SPSS terms, Using SPSS, Steps in using SPSS)
Hand calculations: None
SPSS exercises: SPSS exercise (naming variables, entering data)
Excel: None

Chapter 2 “Describing test scores”
Main book: Score distributions, graphically representing score distributions (including an excellent and useful discussion of box-and-whisker plots), shapes of score distributions, descriptive statistics of mode, median, and mean, indicators of variability, indications of skewness and kurtosis, and applications of descriptive statistics to test development and use
Workbook: SPSS example (FREQUENCIES procedure including all descriptive statistics, EXPLORE procedure including stem-and-leaf and box-and-whisker plots); Background for SPSS exercises
Hand calculations: N size, mode, median, mean, range, standard deviation
SPSS exercises: SPSS data sets available; SPSS exercises (FREQUENCIES for wide variety of descriptive statistics plus histograms, interpreting histograms)
Excel: Excel data sets available (Means, SDs, 1 correlation coefficient calculated)

Chapter 3 “Investigating relationships among different sets of test scores”
Main book: Concept of correlation, Pearson and Spearman correlation coefficients, problems and solutions related to correlation coefficients, interpreting correlation coefficients, and some advanced correlational procedures (including short overviews of multiple linear regression analysis, factor analysis, path analysis, and structural equation modeling)
Workbook: SPSS example (FREQUENCIES procedure for checking univariate data including histograms, skewness, and kurtosis, EXPLORE procedure including extreme values, scatterplots, CORRELATION procedure for choosing and calculating Pearson and Spearman correlation coefficients); Background for SPSS exercises (steps for all above SPSS procedures)
Hand calculations: Linear & curvilinear relationships, plotting scatterplots, Pearson and Spearman coefficients
SPSS exercises: SPSS data sets available; SPSS exercises including (FREQUENCIES, EXPLORE, SCATTERPLOT, & CORRELATIONS)
Excel: Excel data sets available (Means, SDs, Pearson correlation coefficients calculated)

Part II: Statistics for test analysis & improvement

Chapter 4 “Analyzing test tasks”
Main book: Classical item analysis for NR and CR tests, using item statistics to improve tests, item selection procedures, as well as brief introductions to item response theory and many-facet Rasch measurement
Workbook: SPSS example (RELIABILITY procedure for examining each item: scale mean if item deleted, scale variance if item deleted, corrected item-total, squared multiple correlation, alpha if item deleted)
Hand calculations: Item difficulty, item discrimination, CR difference index
SPSS exercises: SPSS data sets available; SPSS exercise RELIABILITY procedure for examining number of cases, number of items, overall alpha, mean for scale, scale standard deviation, two easiest and two most difficult items (including item difficulty, item SD, and item-total correlation) for different subtests
Excel: Excel data set available for TULC (no statistics calculated)

Chapter 5 “Investigating reliability for norm-referenced tests”
Main book: Sources of variation in language testing scores, classical test theory (CTT) and definitions of reliability, CTT reliability estimates (of internal consistency, test-retest, parallel forms, & rater consistency estimates), CTT standard error of measurement, limitations of CTT, and brief coverage of generalizability theory and item response theory (IRT) precision of measurement, as well as factors that affect NR reliability
Workbook: SPSS example (RELIABILITY procedure for estimating Cronbach alpha and Guttman split-half corrected)
Hand calculations: Spearman correlation, Spearman-Brown correction, Guttman split-half, and Cronbach alpha
SPSS exercises: SPSS data sets available; SPSS exercise RELIABILITY procedure for estimating Cronbach alpha and Guttman split-half corrected (1st-2nd half, and odd-even)
Excel: Two small 15-student data sets in Excel format available (with M, SD, SD2, Pearson r, t-test, and F-ratio statistics calculated)

Chapter 6 “Investigating reliability for criterion-referenced tests”
Main book: Reliability in CR tests, the index of dependability, CR standard errors of measurement and confidence intervals, CR agreement indices of threshold loss agreement and squared-error loss agreement, as well as factors that affect CR reliability estimates
Workbook: SPSS example (nothing from SPSS here, authors refer readers to GENOVA & mGENOVA programs); Illustrative research study taken from Kunnan (1992) with a few conceptual exercises (agreement, kappa, phi, phi lambda, F, & KR-20)
Hand calculations: Phi, SEMabs, SEmeas(Xi), CI, phi lambda, agreement, kappa
SPSS exercises: None
Excel: One small 20-student Excel data set (with M, SD, SD2, and Pearson r statistics calculated) for hand calculations

Part III: Statistics for test use

Chapter 7 “Stating hypotheses and making statistical inferences”
Main book: Differences and relationships between samples, formulating hypotheses, data collection designs, statistical inference, sampling distributions, the normal distribution, one- and two-tailed probabilities, levels of significance, and errors in hypothesis testing
Workbook: None beyond Key concepts; Conceptual exercises
Hand calculations: None
SPSS exercises: None
Excel: None

Chapter 8 “Tests of statistical significance”
Main book: The standard error of the mean from a sample, the relationship between standard error and sample statistics, differences between means in large samples, confidence intervals, differences between means in small samples using the t-test, confidence intervals based on z-scores and t-tests, comparing more than two group means using one-way analysis of variance (ANOVA), statistical significance of correlation coefficients, and very brief treatments of effect size, non-parametric tests, and N-way, multivariate, and repeated ANOVA
Workbook: SPSS examples (EXPLORE for histogram, descriptive statistics, confidence intervals; T-TEST for independent and dependent samples; ONEWAY for five level one-way ANOVA & post hoc comparisons)
Hand calculations: z-tests, t-tests, confidence intervals, p values
SPSS exercises: SPSS data sets available; SPSS exercise - T-TEST for t-test, confidence intervals; ONEWAY for 8-level one-way ANOVA F-ratio and post hoc comparisons
Excel: Four Excel files which include data and illustrations for calculating independent & dependent t-tests with p values, one-way ANOVA no p values, & graph for five means

Chapter 9 “Investigating validity”
Main book: The unitary view of validity, interpretive arguments, validation claims, quantitative approaches to validation, the process of validation, and examples of quantitative validation studies that show analysis of test content, analysis of processes used in taking tests, use of exploratory factor analysis, multitrait-multimethod (MTMM) analysis, confirmatory factor analysis of MTMM correlations, and comparisons of equivalent groups & different groups
Workbook: Write short answers to the following scenarios; Research designs; References to illustrate analysis of: test content, test-taking processes, correlations among scores from different tests, differences among comparison groups, differences among non-equivalent criterion groups
Hand calculations: None
SPSS exercises: None
Excel: None

Chapter 10 “Reporting and interpreting test scores”
Main book: Reporting test scores in ways that are meaningful to test users, problems interpreting test scores, CR scores, NR scores, comparing different types of NR scores, appropriateness of NR scores, and combining scores from different measures
Workbook: SPSS examples (FREQUENCIES, DESCRIPTIVES, & COMPUTE used for calculating percentiles, descriptive statistics, skew, kurtosis, ses, sek, z-scores, T-scores, and graphing histogram)
Hand calculations: Mean, standard deviation, z-scores, T-scores, weighted scores
SPSS exercises: SPSS data sets available; SPSS exercises - FREQUENCIES, DESCRIPTIVES, & COMPUTE used for calculating percentiles, descriptive statistics, skew, kurtosis, ses, sek, z-scores, T-scores, and graphing histogram
Excel: Eight Excel files with data sets and answer keys for z-score & T-score calculations

The last three columns in Table 1 are labeled as follows: Hand calculations with small data sets, SPSS exercises, and Calculations in Excel. Given that Table 1 provides a detailed description of the material presented in the workbook, I will not do so again here in prose. However, comparing the second column with the other columns (especially the third one) reveals that some topics covered in the main book are not included in the workbook, and conversely, some topics in the workbook are not covered in the main book.

Topics found in main book but not in workbook. Those topics covered in the main book but not in the workbook tend to be ones that get little coverage in the main book to start with. These include the short overviews of multiple linear regression analysis, path analysis, factor analysis, and structural equation modeling found in Chapter 3; the brief introductions to item response theory (IRT) and many-facet Rasch measurement found in Chapter 4; the short discussions of generalizability theory and reliability in IRT in Chapter 5; and exploratory factor analysis, multitrait-multimethod (MTMM) analysis, confirmatory factor analysis of MTMM correlations, and comparisons of equivalent groups and different groups in Chapter 9. In fairness to the authors, they do refer students to additional information sources for the above topics. For example, in Chapter 6, the authors refer students to the generalizability software programs GENOVA and mGENOVA, which are downloadable from the internet, and provide some parts of an Illustrative research study taken from Kunnan (1992) with conceptual exercises that cover agreement, kappa, phi, phi lambda, as well as F and KR-20. Also in Chapter 9, the authors provide a number of useful references that illustrate analysis of test content, test-taking processes, correlations among scores from different tests, differences among comparison groups, and differences among non-equivalent criterion groups.

Topics found in workbook but not in main book. Oddly, in Chapter 2, one of the Excel data files that contains calculations for a number of descriptive statistics also includes a Pearson correlation coefficient before that statistic is introduced in the main book. The same sort of thing occurs in Chapter 5, where t and F-ratio statistics are calculated in an Excel data set before those statistics have been explained in the main book. In Chapter 6, the F-ratio again appears in an exercise associated with the Illustrative research study before it has been explained in the main book. I also found it curious that, unlike the main book, the workbook uses and apparently advocates the use of the Spearman rank-order correlation coefficient (in lieu of the Pearson correlation coefficient) in calculating split-half reliability. This use of the Spearman coefficient does not appear in the main book, but it shows up twice in the hand calculations in Chapter 5 in the workbook. The Pearson coefficient is not much more difficult to calculate (see main book pp. 85–87) than the Spearman rank-order coefficient, yet the Pearson coefficient is considered to be more precise.

General comments about the workbook. As noted in the heading of the third column of Table 1, the most consistent sections included in the workbook chapters are the Key concepts and Conceptual exercises, which appear in all chapters. The Key concepts section is simply a list of the important terminology found in each chapter with no definitions provided (though the terms are defined in the text of the main book). The Conceptual exercises are very useful and in themselves make the workbook worth having for anyone interested in learning the material in the main book. Students can write their answers directly in the workbook and then check the answer keys, which are provided in .rtf files on the accompanying CD. Another positive feature of the workbook is that conceptual questions are not restricted to the Conceptual exercises, but are also sprinkled throughout the Hand calculations with small data sets and SPSS exercises. Thus students are frequently encouraged not only to calculate statistics but also to think about what those statistics mean.

Emphasis on SPSS. One persistent problem with this workbook is the extent to which it relies on the SPSS statistical program. This is problematic for two reasons: (a) not everybody has access to SPSS on their computer and (b) SPSS is not particularly well suited to doing test development and validation statistics. According to the Preface in the workbook, the accompanying CD contains, among other things, “. . . SPSS programs for running the statistical analyses for these data sets . . .” and “In addition, SPSS program files (*.sps) are provided for the exercises and examples in the Workbook” (p. ix of the workbook). These statements are a bit misleading. Novices might be led to believe that everything they will need to calculate statistics is included on the CD when, in fact, .sps files are not stand-alone “program files”, but rather are SPSS Syntax Document files that can only be used when SPSS is installed on the computer. In other words, these files will do nothing on a computer unless the SPSS statistical package is already present on the same computer. Since the SPSS examples and SPSS exercises are essential to the workbook and since they will have considerably more meaning if the students can actually run them on SPSS, the workbook can only be said to be fully functional if the students do, in fact, have access to the SPSS program.
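For readers without SPSS, the split-half calculation discussed above can be worked through with nothing more than a general-purpose language. What follows is only a minimal sketch in Python, not something taken from the workbook or its CD: the function names and the small right/wrong response matrix are hypothetical, the halves are formed from odd and even items, and the half-test correlation is stepped up with the Spearman-Brown formula, r_full = 2r / (1 + r). Passing either correlation function shows how the Pearson and Spearman versions can differ on the same data.

```python
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation: Pearson applied to the ranks."""
    return pearson(ranks(x), ranks(y))


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half(item_matrix, corr=pearson):
    """Odd-even split-half reliability, corrected with Spearman-Brown."""
    odd = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    return spearman_brown(corr(odd, even))


# Hypothetical data: six examinees (rows) by six dichotomous items (columns).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

if __name__ == "__main__":
    print("Pearson-based split-half: ", round(split_half(responses, pearson), 3))
    print("Spearman-based split-half:", round(split_half(responses, spearman), 3))
```

With only six examinees the two estimates are illustrative at best; the point is simply that both versions take a few lines once the half-test totals are available.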


Another issue related to SPSS is that this statistical program is not designed as a test analysis, test development, or test validation tool. Indeed, it is not particularly well suited to such tasks. SPSS is certainly not my program of choice in doing many of the statistics that are presented in the main book. Indeed, most of the testing statistics (especially those for CR tests) can be done much more conveniently and easily in Excel than in SPSS.

The accompanying CD

The CD has two directories. The first one, labeled “Answer files”, contains files with all the answers to various sorts of exercises in .rtf files, including answer keys for the Conceptual exercises (in all 10 chapters), Hand calculations with small data sets (in Chapters 2–6, & 10), SPSS exercises (in Chapters 2–5, 8, & 10), Illustrative research study (Chapter 6), Short answers to scenarios (Chapter 9) and Research designs (Chapter 9). Generally, these answer keys are complete and useful.

In addition, the CD has a directory labeled “Data sets used in workbook chapters” that contains a number of files for seven of the ten chapters (Chapters 2, 3, 4, 5, 6, 8, & 10). These files come in seven different file formats: .xls Microsoft Excel Worksheet; .doc Microsoft Word Document; .rtf Rich Text Format; .sav SPSS Data Document; .sps SPSS Syntax Document; .spo SPSS Viewer Document; and .txt Text Document. In most cases, the data and other files are for examples and exercises in the workbook, but in other cases they are also for examples in the main book. Notice also that some interesting supplementary material is provided in such files as: Description of data set ESLPE-F96.doc; Test booklet in ESLPE D2 F96.doc; Listening passage in ESLPE D2 Listening-Lectures.doc; Description of CTCS data set.doc; fisher’s z transformationformula.doc; p-z-equivs.doc, and so forth. Unfortunately, many of the other file names are cryptic and not particularly helpful in describing what the associated file contains. In addition, the files are not consistently labeled across chapters, nor are the different file formats used consistently for the same kinds of data, information, or analysis.

The main book and workbook/CD package

Other mismatches occur in the ways the main book and workbook/CD package work together. Notice in Table 1 that four chapters (1, 6, 7 & 9) provide less of the sorts of material shown in the last three columns of the table. Chapter 1 does offer an SPSS exercise that gives introductory practice in naming variables and data entry, which is about all that is necessary for a first chapter. However, Chapter 6 correctly says that “SPSS provides no programs for computing CR dependability indices” (p. 102), and the CD does provide data files in .xls and .sav formats for the hand calculations, but the workbook does not provide any means for learning how to use a computer to calculate key CR dependability estimates like phi and phi (lambda). Chapter 7, because it is predominantly conceptual, provides sufficient practice in the Conceptual exercises. However, Chapter 9 offers no computer calculations at all. This seems odd to me, given that all of the statistical procedures described in Chapter 9, except for confirmatory factor analysis, can be calculated in SPSS.

Conclusion

The level of detail involved in writing a workbook like the one being reviewed here often leads to situations where some topics are covered in one book that are not covered in the other. However, a problem more specific to this workbook is the extent to which it relies on the SPSS statistical program. Naturally, the degree to which the mismatch of material or the reliance on SPSS is a critical issue will depend on who is using the workbook and for what purpose. On the positive side of the ledger, this workbook is closely and clearly linked to the main book on most main topics, and shares the positive features of the main book, including its clear layout, organization, and writing style. In addition, the exercises and answer keys are wide-ranging and useful, and perhaps more importantly, conceptual questions are scattered throughout the various exercises that regularly encourage students to think carefully about the meaning of all the statistics they are calculating. In short, despite a few problems, this workbook, along with the accompanying CD, should prove very useful for students learning the material in the Bachman (2004) main book.

References

Bachman, L.F. 2004: Statistical analyses for language assessment. Cambridge: Cambridge University Press.

Kunnan, A.J. 1992: An investigation of a criterion-referenced test using G-theory, and factor and cluster analysis. Language Testing 9, 30–49.

James Dean Brown
University of Hawai’i at Manoa