Item Response Theory and Computerized Adaptive Testing

Author: Osborne Lucas

Item Response Theory and Computerized Adaptive Testing
Hands-on Workshop, day 2

John Rust, Iva Cek, Luning Sun, Michal Kosinski

www.psychometrics.cam.ac.uk

Goals

• General understanding of IRT and CAT concepts
• No equations!
• Acquire the necessary technical skills (R)
• Tomorrow: build your own IRT-based CAT tests using Concerto

Introduction to IRT

Some materials and examples come from the ESRC RDI in Applied Psychometrics, run by Anna Brown (University of Cambridge), Jan Böhnke (University of Trier) and Tim Croudace (University of Cambridge).

Classical Test Theory

• Observed test score = true score + random error
• Item difficulty and discrimination
• Reliability
• Limitations:
  - Single reliability value for the entire test and all participants
  - Scores are item dependent
  - Item statistics are sample dependent
  - Bias towards average difficulty in test construction
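The classical statistics listed above can be computed in a few lines of base R. This is only a sketch on simulated responses: the data, the item locations used to simulate them, and all variable names are assumptions made for illustration, not workshop material.

```r
# Simulated 0/1 responses: 200 respondents x 5 items (illustrative data only).
set.seed(1)
ability <- rnorm(200)
item_location <- c(-1.5, -0.5, 0, 0.5, 1.5)  # hypothetical, used only to simulate
prob <- plogis(outer(ability, item_location, "-"))
responses <- matrix(rbinom(length(prob), 1, prob), nrow = 200)

total_score <- rowSums(responses)

# CTT item difficulty: proportion of respondents answering the item correctly.
ctt_difficulty <- colMeans(responses)

# CTT item discrimination: correlation of each item with the total of the other items.
ctt_discrimination <- sapply(seq_len(ncol(responses)), function(j) {
  cor(responses[, j], total_score - responses[, j])
})

# Reliability: Cronbach's alpha, one value for the whole test and sample.
k <- ncol(responses)
cronbach_alpha <- (k / (k - 1)) *
  (1 - sum(apply(responses, 2, var)) / var(total_score))

print(round(ctt_difficulty, 2))
print(round(ctt_discrimination, 2))
print(round(cronbach_alpha, 2))
```

Re-running with a different simulated sample changes every one of these numbers, which is the sample dependence named in the limitations.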

Figure: ratio of correct responses to an item at different levels of total score. Horizontal axis: measured concept (ability). Vertical axis: probability of getting the item right (up to 1).

Please note that these and many other graphs presented here are Excel-based mock-ups created for presentation purposes; they do not represent actual data.

Item Response Function: binary items

Figure: probability of getting the item right (up to 1) against the measured concept (theta), with difficulty, guessing and inattention marked on the curve.

Parameters:
• Difficulty
• Discrimination
• Guessing
• Inattention

Models:
• 1 Parameter
• 2 Parameter
• 3 Parameter
• 4 Parameter
• unfolding

One-Parameter Logistic Model/Rasch Model (1PL)

7 items of varying difficulty (b)

Two-Parameter Logistic Model (2PL)

5 items of varying difficulty (b) and discrimination (a)

Three-Parameter Model (3PL)

One item showing the guessing parameter (c)
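For those who prefer code to curves, the logistic models above can be sketched as one base-R function. The function name `irf`, the argument names and the parameter values are assumptions chosen for illustration; `upper` stands for the ceiling of the curve (1 minus inattention) in a four-parameter model.

```r
# Item response function for binary items (a sketch, not a library API).
#   b        difficulty: where the curve sits on the theta scale
#   a        discrimination: how steep the curve is
#   guessing lower asymptote (the c parameter)
#   upper    upper asymptote (1 minus inattention)
irf <- function(theta, b, a = 1, guessing = 0, upper = 1) {
  guessing + (upper - guessing) * plogis(a * (theta - b))
}

theta <- seq(-3, 3, by = 1)

p_1pl <- irf(theta, b = 0)                           # 1PL: only difficulty varies
p_2pl <- irf(theta, b = 0, a = 2)                    # 2PL: adds discrimination
p_3pl <- irf(theta, b = 0, a = 2, guessing = 0.2)    # 3PL: adds guessing

print(round(rbind(theta, p_1pl, p_2pl, p_3pl), 2))
```

With guessing at 0.2 the curve never drops below 0.2, so a respondent whose theta equals the item difficulty has a probability of 0.6 rather than 0.5.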

Option Response Function: binary items

Figure: probability (0.0 to 1.0) against theta (-3.0 to 3.0), with one curve for the correct response and one for the incorrect response.

Probability of Correct + Probability of Incorrect = 1

Graded Model (example of a model with polytomous items – e.g. Likert Scales)

“I experience dizziness when I first wake up in the morning”
Response options: (0) “never”, (1) “rarely”, (2) “some of the time”, (3) “most of the time”, (4) “almost always”

Category Response Curves for an item representing the probability of responding in a particular category conditional on trait level
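Category response curves of this kind can be sketched in base R using the usual graded-model construction: a logistic curve for "this category or higher" at each threshold, then differences between neighbouring curves. The function name, the discrimination and the four thresholds below are hypothetical values for the dizziness item, not estimates from data.

```r
# Category probabilities for one graded (ordered polytomous) item at a given theta.
grm_category_probs <- function(theta, a, thresholds) {
  # Probability of responding in category k or higher, for k = 1 .. K-1.
  p_at_or_above <- plogis(a * (theta - thresholds))
  # Pad with 1 (lowest category or higher) and 0 (above the top category),
  # then take differences between neighbours.
  -diff(c(1, p_at_or_above, 0))
}

thresholds <- c(-1.5, -0.5, 0.5, 1.5)  # hypothetical thresholds
probs <- grm_category_probs(theta = 0, a = 1.5, thresholds = thresholds)
names(probs) <- c("never", "rarely", "some of the time",
                  "most of the time", "almost always")

print(round(probs, 3))
print(sum(probs))  # the five category probabilities sum to 1
```

Evaluating the function over a grid of theta values gives one curve per response category.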

Fisher Information Function

Figure: vertical axis probability (0.0 to 1.0), horizontal axis theta (-3.0 to 3.0).
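As a rough sketch of what an item information curve is made of, the snippet below assumes a two-parameter logistic item, for which information is the squared discrimination times P times (1 - P). The function name and the parameter values are illustrative assumptions.

```r
# Fisher information of a two-parameter logistic item at theta (a sketch).
item_information <- function(theta, a, b) {
  p <- plogis(a * (theta - b))
  a^2 * p * (1 - p)
}

theta <- seq(-3, 3, by = 0.5)
info <- item_information(theta, a = 2, b = 0)  # hypothetical item

print(round(rbind(theta, info), 3))

# Information peaks where theta equals the item's difficulty.
print(theta[which.max(info)])
```

An item is most informative for respondents whose theta is close to its difficulty, and a steeper item gives a taller, narrower peak.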

(Fisher) Test Information Function: three items

Figure: vertical axis probability (0.0 to 1.0), horizontal axis theta (-3.0 to 3.0).

TIF and Standard Error (SE)

• Error of measurement is inversely related to information
• The standard error (SE) is an estimate of measurement precision at a given theta
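A minimal sketch of this relationship, assuming three two-parameter logistic items with made-up parameters: the test information is the sum of the item informations, and the standard error is taken as one over its square root.

```r
# Fisher information of a two-parameter logistic item (a sketch).
item_information <- function(theta, a, b) {
  p <- plogis(a * (theta - b))
  a^2 * p * (1 - p)
}

# Three hypothetical items.
items <- data.frame(a = c(1.0, 1.5, 2.0), b = c(-1, 0, 1))
theta <- seq(-3, 3, by = 0.5)

# Test information: add up the item information curves at each theta.
tif <- rowSums(sapply(seq_len(nrow(items)), function(i) {
  item_information(theta, items$a[i], items$b[i])
}))

# Standard error: small where information is high, large where it is low.
se <- 1 / sqrt(tif)

print(round(rbind(theta, tif, se), 3))
```

Unlike the single reliability value of classical test theory, the standard error here is different at every theta.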

Scoring

Test:
1. Normal distribution
2. q1 – Correct
3. q2 – Correct
4. q3 – Incorrect

Figure: probability (0.0 to 1.0) against theta (-3.0 to 3.0), with "Most likely score" marked twice.
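The scoring steps on this slide can be sketched on a grid of theta values: start from a normal distribution, multiply in the curve for each response, and read off the most likely score. The one-parameter items and their difficulties below are assumptions made for illustration.

```r
theta <- seq(-3, 3, by = 0.01)

# Hypothetical difficulties for the three questions, and the observed responses.
item_b    <- c(q1 = -1, q2 = 0, q3 = 1)
responses <- c(q1 = 1, q2 = 1, q3 = 0)  # correct, correct, incorrect

# Step 1: start from a normal distribution over theta.
prior <- dnorm(theta)

# Steps 2-4: multiply in the curve for each response
# (P for a correct answer, 1 - P for an incorrect one).
likelihood <- rep(1, length(theta))
for (i in seq_along(item_b)) {
  p <- plogis(theta - item_b[i])
  likelihood <- likelihood * p^responses[i] * (1 - p)^(1 - responses[i])
}

posterior <- prior * likelihood
posterior <- posterior / sum(posterior)

# Most likely score with the normal distribution, and from the responses alone.
map_estimate <- theta[which.max(posterior)]
ml_estimate  <- theta[which.max(likelihood)]

print(c(map = map_estimate, ml = ml_estimate))
```

The normal distribution pulls the estimate towards the average, so with these values the first estimate lies between 0 and the responses-only estimate.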

Classical Test Theory vs. Item Response Theory

                              Classical                               IRT
Modelling / Interpretation    Total score                             Individual items (questions)
Accuracy / Information        Same for all participants and scores    Estimated for each score / participant
Adaptivity                    Virtually not possible                  Possible
Score                         Depends on the items                    Item independent
Item Parameters               Sample dependent                        Sample independent
Preferred items               Average difficulty                      Any difficulty

Why use Item Response Theory?

• Reliability for each examinee / latent trait level
• Modelling on the item level
• Examinee / item parameters on the same scale
• Examinee / item parameter invariance
• Score is item independent
• Adaptive testing
• Also, test development is cheaper and faster!
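To make the adaptive testing point concrete, here is one common selection rule sketched in base R: give next the item that is most informative at the current theta estimate. The item bank, the function names and the choice of rule are assumptions for illustration, not a description of how Concerto selects items.

```r
# Fisher information of a two-parameter logistic item (a sketch).
item_information <- function(theta, a, b) {
  p <- plogis(a * (theta - b))
  a^2 * p * (1 - p)
}

# Hypothetical item bank with discriminations (a) and difficulties (b).
item_bank <- data.frame(a = c(1.0, 1.2, 0.8, 1.5, 1.1),
                        b = c(-2, -1, 0, 1, 2))

# Pick the not-yet-administered item with the highest information at theta_hat.
select_next_item <- function(theta_hat, bank, administered = integer(0)) {
  info <- item_information(theta_hat, bank$a, bank$b)
  info[administered] <- -Inf  # never repeat an item
  which.max(info)
}

first_item  <- select_next_item(theta_hat = 1, bank = item_bank)
second_item <- select_next_item(theta_hat = 1, bank = item_bank,
                                administered = first_item)

print(c(first = first_item, second = second_item))
```

In a full adaptive test the theta estimate would be updated after every response before the next item is chosen.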

ltm package

IRT in R

Suggested Resource: Computerised Adaptive Testing: The State of the Art (November 2010). Dr Philipp Doebler of the University of Münster describes the latest thinking on adaptivity in psychometric testing to an audience of psychologists.

“Mobility” Survey

• A rural subsample of 8445 women from the Bangladesh Fertility Survey of 1989 (Huq and Cleland, 1990).
• The dimension of interest is women’s mobility and social freedom.
• Described in: Bartholomew, D., Steele, F., Moustaki, I. and Galbraith, J. (2002) The Analysis and Interpretation of Multivariate Data for Social Scientists. London: Chapman and Hall.
• The data are available in the R package “ltm”.

“Mobility” Survey

Women were asked whether they could engage in the following activities alone (1 = yes, 0 = no):
1. Go to any part of the village/town/city.
2. Go outside the village/town/city.
3. Talk to a man you do not know.
4. Go to a cinema/cultural show.
5. Go shopping.
6. Go to a cooperative/mothers' club/other club.
7. Attend a political meeting.
8. Go to a health centre/hospital.

ltm package

install.packages("ltm")
require(ltm)
help(ltm)
head(Mobility)
my1pl
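The listing above breaks off after `my1pl`. As a separate sketch, and not a reconstruction of the original code, one way to fit one- and two-parameter models to the Mobility data uses ltm's `rasch()` and `ltm()` functions. The object names `fit_rasch` and `fit_2pl` are chosen here for illustration, and the ltm package must already be installed.

```r
library(ltm)

# One-parameter (Rasch) model: the discrimination is fixed at 1 for all eight items.
fit_rasch <- rasch(Mobility, constraint = cbind(ncol(Mobility) + 1, 1))

# Two-parameter model: one latent trait (z1), a separate discrimination per item.
fit_2pl <- ltm(Mobility ~ z1)

# Item parameters: difficulty and discrimination for each item.
print(coef(fit_rasch))
print(coef(fit_2pl))

# Likelihood ratio test: does the two-parameter model fit better?
print(anova(fit_rasch, fit_2pl))
```

`plot(fit_2pl)` draws the item characteristic curves for the fitted model.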