MODERATION AND SCALING POLICY

Author: Willa Thompson
Cross-FLTSEC 16 October 2014 Document G

This policy should be read in conjunction with the University’s examination conventions and policy of moderation and scaling.

Purpose and Objectives:
This policy and guidance document has three specific objectives:
1. To clarify the University core requirements on moderation and scaling practices and provide a framework to ensure consistently impartial and equitable marking practices;
2. To designate specific responsibilities for moderation and scaling within INTO Newcastle;
3. To provide clear guidance on suggested moderation and scaling practices and to establish effective practices where appropriate.

Scope:
‘Moderation’ applies broadly to a range of processes whereby assessment tasks, assessment ‘component’ marks and/or module marks are scrutinised to ensure that the assessment criteria are applicable and consistently applied and that there is a shared understanding of the academic standards students are expected to meet. In a narrower sense, ‘moderation’ is also used to distinguish between two types of shared grading: moderation, in which samples of work are validated by a second reader; and second marking, in which all pieces of work are marked by a second reader. This document addresses ‘moderation’ both in the broadest sense, as a range of processes, and as a specific form of shared grading.

The University’s Policy on Moderation and Scaling supplements the requirements outlined in the Undergraduate and Postgraduate Taught Examination Conventions. This policy and guidance applies to all INTO Pathway Programmes (Foundation, Diploma and Graduate Diploma) and English-only programmes.

The policy requirements on assessment tasks and ‘component’ marks apply to all summative assessments that contribute to modules that count for award classification purposes (subject to the limiting clauses noted below). They do not apply to formative assessments. The requirements on module marks apply to all modules, including those that are pass/fail.

Exemptions from the requirements established by this policy may only be granted by the Dean acting as the Chair of the Examination Board for INTO Newcastle.

Responsibilities:
INTO Newcastle is responsible for devising and maintaining the local implementation of this policy. Relevant Deans/Chairs of FLTSECs are responsible for reviewing local moderation and scaling policies. The duty to moderate assessments and to scale or calibrate marks resides with INTO Newcastle University.

Policy Requirements:

Moderation of Assessment Tasks:
Internal moderators (Programme Manager (PM) or Assistant Programme Manager (APM)) must review drafts of all summative assessment tasks. Such internal moderation is intended to ensure the appropriate design of assessment tasks and largely to remove the need for scaling at a later point.

External examiners should review and evaluate draft examination papers and other coursework tasks that contribute significantly to students’ results. The selection of assessment tasks to be reviewed and the date(s) by which such drafts will be received should be agreed between the Chair of the Board of Examiners and the external examiner(s) at the beginning of each academic year. INTO Newcastle University has scope for negotiating a selection of tasks with external examiners, but it should acknowledge that external scrutiny can improve or clarify all types of assessment tasks and that coursework should be considered equally with exam papers. The following areas should be taken into consideration:

Moderation of Assessment Components:
All summative assessments should be moderated (by a 20% sample) or double marked to confirm the fairness and validity of marking processes and standards. It is also good practice to moderate (by a 20% sample) lesser-weighted and/or pass/fail assessments. The mechanism for moderation will necessarily depend on the type of assessment involved and may include ‘checking’ (in the case of objective tests), the moderation of a 20% sample of exam scripts/submitted work, and/or double marking. For oral exams and presentations, two assessors should be present (and/or provided with a recording) to ensure fairness in the moderation process. All extended pieces of writing (e.g. the Research Essay in Graduate Diploma) must be blind double marked.
Blind double marking is defined as the marking of an assessment by two separate markers, in which the second marker cannot see the comments or mark given by the first marker. The rationale for blind double marking is not the length of the work, but its individualised nature, and academic units must decide which assessment tasks are characterised in this way.

Although students should normally receive only agreed marks and not evidence of moderation, INTO Newcastle will be transparent in its procedures and provide students with timely explanations of moderation and/or scaling processes. All internal moderation should take place within the 20 working day turnaround time and before agreed provisional marks are returned to students. Students must be informed, however, that provisional marks are subject to additional review and potential moderation prior to Boards of Examiners. Under the Data Protection Act, students do have the legal right to request and obtain any comments made on their work, including comments made by a moderator or second marker.

The scaling of marks on an assessment component must be considered by the module leader, the Programme Manager and the Chair of the Board of Examiners if marks fail to reflect student performance adequately and/or fail to map onto the standard University marking scale. All scaling should be applied to the entire cohort affected; normally, this will be the cohort taking a given module, but it might be a specific subset of that cohort (e.g. in the case of circumstances affecting only one of a number of exam rooms). Scaling is distinct from calibration, defined as the regular and systematic numerical adjustment of marks to ensure that they map onto the University marking scale (i.e., when raw marks total to a number that does not accurately reflect student performance on the marking scale).

Calibration of EAP (an exception to standard University policy):
INTO EAP tests are somewhat different from Academic examinations and assessments. For all INTO EAP assessments, there is a requirement that the score for the assessment meets the expectation that the mark is ‘broadly comparable’ to the IELTS scale. For objective tests, this means a conversion of a raw score into a 0-90 range, to fit with the INTO English Language Proficiency scale.

Scaling:
The need for scaling might arise from an issue in the assessment process (e.g., an error or ambiguity in a question) or if the assessment turns out to be easier or harder for candidates than anticipated. The need for scaling will typically be detected if the marks as a whole appear to be too high or too low; however, scaling is not always linear, and a discrepancy may appear only at one end of the scale. Scaling should usually take place before provisional marks are returned to students, and all instances of scaling must be reported, via the INTO Academic Director, to the External Examiner before the Board of Examiners meeting. If the module leader and the Chair of the Board of Examiners cannot agree on the scaling of marks, the question should be referred to at least one other senior colleague (e.g., another Dean).

Moderation of Module Marks:
INTO Newcastle has procedures in place to review performance both historically and across modules taken in a given year. Such procedures should include the investigation of any anomalies within a specific module or any unusually high or low mark distributions in a given cohort, as well as a general consideration of any concerns raised in the marking process. Module and Programme Moderation Boards typically have responsibility for this review.
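As an illustration only, the kind of review described under Moderation of Module Marks could be supported by a short script that summarises each module cohort and flags unusually high or low distributions. The sketch below is not part of the policy: the function names, the choice of statistics and the expected band for a module mean (50 to 70) are all assumptions made for the example.

```python
from statistics import mean, pstdev, quantiles


def summarise_marks(marks):
    """Summary statistics for one module cohort (needs at least two marks)."""
    q1, _, q3 = quantiles(marks, n=4)  # quartiles, default 'exclusive' method
    return {
        "n": len(marks),
        "mean": mean(marks),
        "min": min(marks),
        "max": max(marks),
        "stdev": pstdev(marks),  # population standard deviation
        "iqr": q3 - q1,
    }


def flag_modules(module_marks, expected_mean=(50.0, 70.0)):
    """Return the modules whose mean lies outside a band.

    The band is a hypothetical placeholder, not a University figure.
    """
    low, high = expected_mean
    flagged = []
    for module, marks in module_marks.items():
        if not low <= summarise_marks(marks)["mean"] <= high:
            flagged.append(module)
    return flagged


if __name__ == "__main__":
    cohort = {"MOD1": [48, 55, 62, 67], "MOD2": [82, 85, 88, 91]}
    for name in flag_modules(cohort):
        print(name, summarise_marks(cohort[name]))
```

A flag produced this way would only prompt investigation; whether marks are scaled remains a decision for the people named in this policy.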
Local Monitoring of Moderation Procedures:
INTO Newcastle must ensure that there are written records of all moderation and scaling, including any notes from markers, any explanation of how disparate marks have been reconciled (if appropriate), and any minutes from Module Moderation Boards (if held). Records of moderation and scaling should be made available annually to all external examiners. INTO Newcastle must develop local policies on moderation and scaling, to be reviewed by the relevant Dean/Chair of FLTSEC and supplied to the relevant external examiner(s). Local policies and procedures on moderation and scaling will be monitored by external examiners and through Internal Subject Review.

Operational Organisation within INTO Newcastle University:
The Programme Manager ensures that all work subject to the policy is moderated using the following organisation:
• When more than one marker is involved, standardisation sessions are run before marking commences.
• All work is moderated either by blind double marking or by the allocation of moderators for a particular assessment.
• Moderators chosen may be other markers of the assessment, the module leader or the Assistant Programme Manager.
• The timescale for moderation is agreed between module leader and moderators so that the 20 day turnaround period is maintained.

Internal Moderation of Draft Assessments:
All assessment documentation is checked before it is sent for external examiner approval. A permutation of Assistant Programme Manager, Module Leader, Programme Manager and Coordinator may constitute the group for this purpose. Assessments are checked for
• Adherence to formatting guidelines from University
• Accuracy of grammar and vocabulary
• Question wording and what is expected from an answer
• Matching of assessment to learning outcomes being assessed
• Topic coverage of assessment and variation between cohorts and year sets

Coverage:
All summative assessed work is subject to moderation of its marking.

Written Examinations and Objective Tests:
For objective test exams and written exams, the same process is followed: A sample of marked scripts/papers is reviewed by a moderator. See below for information on the approach to sampling. Objective tests are moderated by reviewing the spread of marks achieved and considering whether calibration or scaling of the results might be required. Where a question has been answered correctly by very few students, then a decision may be taken to ignore the item completely.
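By way of illustration, identifying objective-test questions answered correctly by very few students can be sketched as a simple per-item calculation. The policy does not define "very few", so the `min_facility` threshold below is a hypothetical value, and the function names and the data layout (one list of True/False results per student) are assumptions made for the example.

```python
def item_facility(responses):
    """Proportion of students answering each item correctly.

    responses: one list per student, with True where the item was correct.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        sum(student[i] for student in responses) / n_students
        for i in range(n_items)
    ]


def flag_items(responses, min_facility=0.2):
    """Indices of items whose facility is below a hypothetical threshold."""
    return [
        i for i, facility in enumerate(item_facility(responses))
        if facility < min_facility
    ]


def rescore(responses, ignored):
    """(raw score, maximum score) per student with the ignored items removed."""
    results = []
    for student in responses:
        kept = [ok for i, ok in enumerate(student) if i not in ignored]
        results.append((sum(kept), len(kept)))
    return results
```

Flagged items would still need human review, since a low facility value may reflect a faulty question or simply a hard one.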

Essays and Reports:
A sample of marked scripts is reviewed by the moderator. See below for information on the approach to sampling.

Presentations/performances:
Oral presentations/performances are moderated by one of these methods:
• Recording the presentation/performance by video or audio so as to allow internal moderators and external examiners to test marking standards. Recordings are stored until after the beginning of the next academic year.
OR
• Oral presentations/performances are viewed and assessed by at least two members of staff.

Projects – Study Skills / Study and Research Skills:
Written projects are blind double marked. This means that the second marker does not know what mark the first marker has allocated and cannot see the first marker’s comments. All Architectural Portfolios / Design Projects are assessed by two or three markers in order to ensure fair and appropriate marking. This is blind in that none of the markers can see what the other markers are giving while the assessment is taking place.

Selection of Samples:
Samples of work for moderation are selected so as to test the security of standards across the full marking range and where the candidate has failed. Apart from fail candidates, other class borderlines do not apply to individual assessments. Work is also moderated where a candidate fails to follow the rubric or is penalised for failing to answer the question. The normal sample size is 20% of the number of pieces of work, but where cohorts are below 12 all work is moderated.

Outcomes of moderation/double marking:
Where all the work for a module is blind double marked, a rounded average of the two marks given is the final agreed mark. When the two marks differ by more than 5% (more than 5 for English for Academic Purposes), the assessment is referred to a third marker. The third marker’s mark is set against the two other marks, and the average of the third marker’s mark and the nearer of the other two marks is the final agreed mark.

Where a sample of work is moderated, individual marks will not be changed. Where the moderator agrees that the marking is in accordance with the marking criteria for the school/subject, the marks are confirmed. Where the moderator disagrees with the marking by more than 5% (more than 5 for English for Academic Purposes), then consideration will be given to total double marking of all work, where marking is deemed unreliable, or scaling of the marks, where the marking shows a systematic error.

Recording the moderation/double marking process:
As a recorded audit trail of the moderation process, the following procedure is followed:
• Where all the work for a module is blind double-marked, the two marks given for each script are recorded and retained for external examiner inspection as evidence that double-marking has taken place. Each script is numbered, and the number is cross referenced to a list of first and second markers.
• The final agreed mark between the two markers (see above) is then calculated and retained on the script and in a separate document – again for external examiner inspection.
• The markers involved are noted on each occasion. Significant discrepancies and/or inaccuracies are forwarded to the Programme Manager for consideration of staff development input for the marker(s) involved.
• Where a sample of work is moderated, the moderator signs and dates the scripts.

Review of module performance across modules and over time:
At the Module Review Meeting a review of performance across modules will be undertaken looking at means, range, standard deviation [and others e.g. interquartile range] for all module cohorts with a view to identifying modules where student performance does not match normal expectations or where marks do not map to the common marking scale. [Module performance is also compared to module performance in previous years]. In such cases, it should be considered whether marks ought to be scaled. (see below)

Calibration of EAP Tests:
When a new EAP test is developed, it is piloted in accordance with accepted language testing methodology. For a pilot, the scores achieved in the new test are compared with the scores the same sample have achieved either in a SELT (IELTS, TOEFL) if available or an existing INTO test which has been calibrated accordingly. Once this is done, the marks are plotted so that the new test scores broadly fit with these previous scores. A conversion scheme is devised, piloted and adjusted.
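The policy does not prescribe the form of the conversion scheme. Purely as an illustration, one possible form is a table of anchor points, read off the pilot plot, with linear interpolation between them. In the sketch below the function name, the anchor values and the choice of linear interpolation are all assumptions; a real scheme would come from the piloting described above.

```python
from bisect import bisect_right


def convert(raw, anchors):
    """Map a raw test score onto the reporting scale.

    anchors: (raw_score, scale_score) pairs sorted by raw_score, with
    distinct raw scores. Scores outside the anchored range are clamped
    to the first or last scale value.
    """
    raws = [r for r, _ in anchors]
    if raw <= raws[0]:
        return anchors[0][1]
    if raw >= raws[-1]:
        return anchors[-1][1]
    i = bisect_right(raws, raw)          # first anchor strictly above raw
    (r0, s0), (r1, s1) = anchors[i - 1], anchors[i]
    return s0 + (s1 - s0) * (raw - r0) / (r1 - r0)


# Hypothetical anchor points for a 40-item objective test.
EXAMPLE_ANCHORS = [(0, 0), (20, 40), (40, 90)]

if __name__ == "__main__":
    for raw in (10, 20, 30, 40):
        print(raw, convert(raw, EXAMPLE_ANCHORS))
```

Keeping the scheme as an explicit table makes it straightforward to adjust individual anchor points after each pilot without changing the conversion logic.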

Availability of this Policy: This policy will be made available to all staff and students in the School/Subject and also to external examiners. It will also be referred to and provided in the relevant degree programme handbooks.
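Worked illustration of the numeric rules: the sample-size rule under Selection of Samples and the agreed-mark rule under Outcomes of moderation/double marking can be expressed as a short sketch. This is not an official tool. The function names are hypothetical, and three details the policy leaves open are filled with assumptions: halves are rounded up, the sample size is rounded up to a whole piece of work, and if the third mark is equally close to both original marks the first marker's mark is used.

```python
import math


def round_half_up(value):
    """Round to the nearest whole mark, with .5 rounded up (assumption)."""
    return math.floor(value + 0.5)


def sample_size(cohort_size, percent=20, small_cohort=12):
    """Number of pieces of work to moderate for a cohort.

    Cohorts below `small_cohort` are moderated in full; otherwise the
    percentage is applied and rounded up (the rounding is an assumption).
    """
    if cohort_size < small_cohort:
        return cohort_size
    return (cohort_size * percent + 99) // 100  # integer ceiling


def agreed_mark(first, second, third=None, tolerance=5):
    """Final agreed mark for blind double-marked work.

    Marks within the tolerance are averaged. Otherwise a third mark is
    required and is averaged with the nearer of the two original marks.
    """
    if abs(first - second) <= tolerance:
        return round_half_up((first + second) / 2)
    if third is None:
        raise ValueError("marks differ by more than the tolerance; third mark needed")
    # On a tie, min() keeps the first marker's mark (assumption).
    nearest = min((first, second), key=lambda mark: abs(mark - third))
    return round_half_up((third + nearest) / 2)
```

For example, marks of 50 and 70 with a third mark of 66 would give an agreed mark of 68, the average of 66 and the nearer original mark of 70.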
