TEACHER SELF-EFFICACY BELIEFS TOWARD MEASUREMENT AND EVALUATION PRACTICES

Author: Vivien Chambers

A THESIS SUBMITTED TO THE GRADUATE SCHOOL OF SOCIAL SCIENCES OF MIDDLE EAST TECHNICAL UNIVERSITY

BY

FATMA RANA CEYLANDAĞ

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN THE DEPARTMENT OF EDUCATIONAL SCIENCES

SEPTEMBER 2009


Approval of the Graduate School of Social Sciences

_________________________ Prof. Dr. Sencer Ayata Director

I certify that this thesis satisfies all the requirements as a thesis for the degree of Master of Science.

_________________________ Prof. Dr. Ali Yıldırım Head of Department

This is to certify that we have read this thesis and that in our opinion it is fully adequate, in scope and quality, as a thesis for the degree of Master of Science.

_________________________ Assist. Prof. Dr. Yeşim Çapa Aydın Supervisor

Examining Committee Members

Assoc. Prof. Dr. Oya Yerin Güneri (METU, EDS) ________________________

Assoc. Prof. Dr. Jale Çakıroğlu (METU, ELE) ________________________

Assist. Prof. Dr. Yeşim Çapa Aydın (METU, EDS) ________________________


I hereby declare that all information in this document has been obtained and presented in accordance with academic rules and ethical conduct. I also declare that, as required by these rules and conduct, I have fully cited and referenced all material and results that are not original to this work.

Surname, Name: Ceylandağ, F. Rana

Signature:

ABSTRACT

TEACHER SELF-EFFICACY BELIEFS TOWARD MEASUREMENT AND EVALUATION PRACTICES

Ceylandağ, F. Rana
M.S., Department of Educational Sciences
Supervisor: Assist. Prof. Dr. Yeşim Çapa Aydın

September 2009, 94 pages

Teacher self-efficacy refers to teachers’ beliefs in their abilities to perform an action. In the present study, a new scale, the “Teacher Self-Efficacy toward Measurement and Evaluation Practices Scale” (TEMES), was developed to measure teacher self-efficacy beliefs toward measurement and evaluation practices. The purpose of this study was to test a model of relationships among teacher self-efficacy toward measurement and evaluation practices, teachers’ sense of efficacy, year in teaching, and frequency of using traditional and alternative measurement and evaluation tools. Three hundred ninety-four teachers participated in the study. Confirmatory Factor Analysis (CFA), Multivariate Analysis of Variance (MANOVA), Canonical Correlation Analysis, and Structural Equation Modeling (SEM) were conducted to answer the research questions. CFA provided evidence for a five-factor structure of the TEMES. Cronbach’s alpha coefficients of these five factors were satisfactory, ranging from .76 to .87. Teachers reported more frequent use of traditional measurement and evaluation tools than of alternative tools. Separate MANOVAs yielded a non-significant effect of gender on the factors of the TEMES but a significant effect of teaching level. In addition, the findings of the canonical correlation analysis indicated that the factors of the TEMES were correlated with the factors of the Turkish Teachers’ Sense of Efficacy Scale (TTSES). Results of the SEM indicated that teacher self-efficacy toward measurement and evaluation practices was positively correlated with the frequency of using traditional and alternative measurement and evaluation tools. Year in teaching was found to be a non-significant predictor of teachers’ sense of efficacy, teacher self-efficacy toward measurement and evaluation practices, and frequency of using traditional and alternative measurement and evaluation tools.

Keywords: Self-efficacy, Teacher Self-efficacy, Measurement and Evaluation Practices


ÖZ

TEACHER SELF-EFFICACY TOWARD MEASUREMENT AND EVALUATION PRACTICES

Ceylandağ, F. Rana
M.S., Department of Educational Sciences
Supervisor: Assist. Prof. Dr. Yeşim Çapa Aydın

September 2009, 94 pages

Teacher self-efficacy is a teacher’s belief in being able to fulfill the requirements of the profession. In this study, a new scale was developed to measure teacher self-efficacy toward measurement and evaluation practices, and it was named the Teacher Self-Efficacy toward Measurement and Evaluation Practices Scale. The aim of the study is to test a model explaining the relationships among teacher self-efficacy toward measurement and evaluation practices, general teacher self-efficacy, years in the profession, and frequency of using alternative and traditional measurement and evaluation tools. A total of 394 teachers participated in the study. Confirmatory Factor Analysis, Multivariate Analysis of Variance, Canonical Correlation Analysis, and Structural Equation Modeling (SEM) were used to answer the research questions. Confirmatory Factor Analysis showed that the Teacher Self-Efficacy toward Measurement and Evaluation Practices Scale has a five-factor structure. The Cronbach’s alpha coefficients of these five factors are satisfactory, ranging from .76 to .87. Teachers reported using alternative measurement and evaluation tools more frequently than traditional measurement and evaluation tools. The Multivariate Analyses of Variance revealed that the effect of gender on the five factors of the new scale was not statistically significant, but that teaching level made a difference. In addition, the results of the Canonical Correlation Analysis showed that the factors of the new scale are related to the factors of the Teachers’ Sense of Efficacy Scale. The results of the SEM analysis indicated that teacher self-efficacy toward measurement and evaluation practices has a positive relationship with the frequency of using alternative and traditional measurement and evaluation tools. However, no statistically significant relationship was found between the years teachers have spent in the profession and teacher self-efficacy, teacher self-efficacy toward measurement and evaluation, frequency of using alternative measurement and evaluation tools, and frequency of using traditional measurement and evaluation tools.

Keywords: Self-Efficacy, Teacher Self-Efficacy, Measurement and Evaluation Practices


To my parents and lovely sister


ACKNOWLEDGMENTS

I am heartily thankful to my supervisor, Assist. Prof. Dr. Yeşim Çapa Aydın, for her guidance and support from the beginning to the end of this research. She was very generous and patient in answering my endless questions and in identifying and correcting any gaps. Attending her graduate-level courses and working with her during my thesis research was a pleasure for me.

I would like to thank Assoc. Prof. Dr. Oya Yerin Güneri and Assoc. Prof. Dr. Jale Çakıroğlu for their contribution to this study and the motivating discussions during the thesis defense.

I wish to express my deep gratitude to Assist. Prof. Dr. Finlay McQuade, who supported me in every respect during the time I had the opportunity to assist him in his work. He helped me broaden my perspective on my goals along with my assessment and problem-solving skills.

My deepest thanks to my dearest sister, Gökce Girgin, and my brother-in-law, Murat Girgin, for their encouragement and warmth. I had the enormous pleasure of spending time with you in the United States, and you have given me motivation by your passion-filled speeches. Thank you for being there despite the distance.

I extend particular thanks to Özer Özaydın for his genuineness, faithfulness, unconditional love, endless support, and empathic understanding.

I am also indebted to my friends Funda and Ayşegül, who have persistently urged me to stay in the library for long hours despite craving coffee and chocolate.

Special thanks to my dearest friend, Elif Sürer Köse, for her sincerity and unconditional positive regard. Thank you for being a true friend, who is distant in miles but close at heart.

Finally, and most importantly, thanks to the giants of statistics in the social sciences, Barbara G. Tabachnick, Linda S. Fidell, Andy Field, Hair et al., and others whom I have not mentioned. Thank you for writing the thickest books of all time.


TABLE OF CONTENTS

PLAGIARISM
ABSTRACT
ÖZ
DEDICATION
ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS

CHAPTER

1. INTRODUCTION
1.1 Background of the Study
1.2 Purpose of the Study
1.3 Significance of the Study
1.4 Definition of the Terms

2. LITERATURE REVIEW
2.1 Self-Efficacy
2.1.1 Four Sources of Self-Efficacy
2.1.2 Self-Efficacy and Other Self Constructs
2.1.3 Measurement of Self-Efficacy Beliefs
2.2 Teachers’ Sense of Efficacy Beliefs
2.2.1 Measurement Studies of Teachers’ Sense of Efficacy Beliefs
2.2.2 Measurement Studies of Teachers’ Sense of Efficacy Beliefs in Turkey
2.2.3 Research on the Relationship between Teachers’ Sense of Efficacy and Other Variables
2.2.3.1 The Relationship between Teacher Self-Efficacy and Year in Teaching
2.2.3.2 The Relationship between Teacher Self-Efficacy and Frequency of Using Different Measurement and Evaluation Tools
2.3 Summary of the Related Studies

3. METHOD
3.1 Research Design
3.2 Research Questions
3.3 Description of Variables
3.4 Participants
3.5 Data Collection Instruments
3.5.1 Demographic Information
3.5.2 TEMES (Teacher Self-Efficacy toward Measurement and Evaluation Practices Scale)
3.5.2.1 Instrument Development
3.5.2.2 Pilot Study
3.5.3 Scale for Measuring Frequency of Using Different Measurement and Evaluation Tools
3.5.4 Turkish Teachers’ Sense of Efficacy Scale
3.6 Data Collection Procedure
3.7 Data Analysis
3.8 Limitations

4. RESULTS
4.1 Confirmatory Factor Analysis
4.2 Reliability
4.3 Additional Validity Evidence
4.3.1 Canonical Correlation between TTSES and TEMES
4.3.2 Multivariate Analysis of Variance: Investigation of TEMES by Gender and Teaching Level
4.4 Structural Equation Modeling
4.5 Summary

5. DISCUSSION
5.1 Discussion of the Study Results
5.2 Implications for Practice
5.3 Recommendations for Further Research

REFERENCES

APPENDICES
A. DEMOGRAPHIC DATA FORM
B. TEMES
C. FMES
D. TTSES


LIST OF TABLES

TABLES

Table 2.1 Items from Some Teacher Self-Efficacy Scales
Table 2.2 Items from Some Teacher Self-Efficacy Scales Adapted or Developed in Turkey
Table 3.1 Demographic Information of the Participants
Table 4.1 Results of Descriptive Statistics for TEMES, TTSES, and FMES
Table 4.2 Reliability Coefficients of TEMES Factors and Related Items
Table 4.3 Results for Canonical Correlation Analysis between the Factors of TTSES and TEMES
Table 4.4 Results of Descriptive Statistics
Table 4.5 MANOVA for TEMES Factors by Gender
Table 4.6 Results of Descriptive Statistics
Table 4.7 MANOVA for TEMES Factors by Teaching Level
Table 4.8 Zero-order Correlations, Means, and Standard Deviations for Study Variables
Table 4.9 Unstandardized Estimates for Latent and Manifest Variables


LIST OF FIGURES

FIGURES

Figure 2.1 Theoretical Model of Triadic Reciprocal Determinism
Figure 3.1 Scree Plot
Figure 3.2 Structural Model Displaying the Relationship between Variables
Figure 4.1 Five Factor CFA Model of TEMES with Standardized Estimates
Figure 4.2 Factor Structure of TTSES with Standardized Estimates
Figure 4.3 Canonical Correlation Representation between the Factors of TTSES and TEMES
Figure 4.4 Structural Model Representing the Relationship between Teachers’ Sense of Efficacy toward Measurement and Evaluation Practices, Year, Frequency of Using Different Measurement and Evaluation Tools and Teachers’ Sense of Efficacy


LIST OF ABBREVIATIONS

ABBREVIATIONS

TTKB: The Authority of Turkish Board of Education
MoNE: Turkish Ministry of National Education
ERDHO: Educational Research and Development Head Office
HSEC: METU Human Subjects Ethics Committee
TEMES: Teacher Self-Efficacy toward Measurement and Evaluation Practices Scale
FMES: Frequency of Using Different Measurement and Evaluation Tools Scale
Alternative-ME: Frequency of Using Alternative Measurement and Evaluation Tools
Traditional-ME: Frequency of Using Traditional Measurement and Evaluation Tools
TTSES: Turkish Teachers’ Sense of Efficacy Scale
SPSS: Statistical Package for the Social Sciences
EFA: Exploratory Factor Analysis
M: Mean
SD: Standard deviation
ANOVA: Analysis of Variance
MANOVA: Multivariate Analysis of Variance
AMOS: Analysis of Moment Structures
CFA: Confirmatory Factor Analysis
SEM: Structural Equation Modeling
NNFI: Non-normed Fit Index
CFI: Comparative Fit Index
RMSEA: Root Mean Square Error of Approximation


CHAPTER I

INTRODUCTION

The following sections describe the reasons for the researchers’ decision to study teachers’ efficacy beliefs toward measurement and evaluation practices, the purpose and significance of the study, and the definitions of the terms.

1.1. Background of the Study

Measurement and evaluation are important because they include the activities through which teachers obtain information to modify or improve their instructional strategies (Boston, 2002). If teachers know about students’ progress and learning needs, they can decide whether to try alternative methods, use additional teaching materials, or persist with their current approach. What if a teacher thinks that she or he is not good enough at assessing student learning and evaluating the results of assessment?

It has been suggested that there are problems in measurement and evaluation applications in public schools (Ministry of National Education, 2005, 2006). Moreover, most teachers lack sufficient background in the student assessment techniques proposed in the new educational program. Teachers also reported having difficulty in preparing and administering assessment tools and in making use of the results of student assessment (Gelbal & Kelecioğlu, 2007). An extensive study conducted by the Turkish Ministry of National Education (MoNE) and the Educational Research and Development Head Office (ERDHO) examined general teacher qualifications in different teaching activities, e.g., knowing the student, developing instructional strategies, measurement and evaluation, and communication with parents and other stakeholders. One of the striking results of this study was that the mean qualification ratings of teachers were lowest in measurement and evaluation practices and in communicating with parents or other teachers in the school, compared with other areas (such as use of instructional strategies, development of the educational program, and content knowledge). In addition, participants stated that they needed help in developing their skills in using alternative assessment methods, analyzing the results of student assessment, and giving feedback to students and their parents about student evaluation. In light of these results, the researchers concluded that teachers strongly need in-service training in measurement and evaluation practices and that teachers’ perceptions of measurement and evaluation practices may change in a positive way with this support (MoNE & ERDHO, 2006).

These studies have led researchers to conduct studies on teacher self-efficacy toward measurement and evaluation practices. In the study conducted by MoNE and ERDHO, teachers were asked about their perceptions of their qualifications in teaching. However, perception can be shaped by interacting factors, such as past experiences and culture (Chalmers, 1997). Since self-efficacy differs from perception in that people judge themselves only with respect to a particular action, it can be practical and meaningful to examine teachers’ efficacy beliefs rather than their perceptions of measurement and evaluation practices.

1.2. Purpose of the Study

First of all, the researchers intended to examine teacher self-efficacy toward measurement and evaluation practices. Since there was no instrument to measure teachers’ efficacy beliefs toward measurement and evaluation practices, a new scale was developed and validated in this study. During the literature search, the researchers realized that year in teaching can be an important variable influencing teacher self-efficacy toward measurement and evaluation practices. In addition, another variable, frequency of using different measurement and evaluation tools, was considered as one that can distinguish the teachers who are efficacious in measurement and evaluation practices from the teachers who are not.

All in all, there were two main purposes of this study: one was to develop an instrument to measure teacher self-efficacy toward measurement and evaluation practices, and the other was to test a model of relationships among teacher self-efficacy toward measurement and evaluation practices, teachers’ sense of efficacy, year in teaching, and frequency of using traditional and alternative measurement and evaluation tools.

1.3. Significance of the Study

Teacher self-efficacy has been studied for almost 30 years, and many scales have been developed to assess it during these studies (Henson, 2002). There are also research studies examining the relationship between teacher self-efficacy and variables such as student self-efficacy, student achievement, and teacher behavior. Further, many scales were developed to assess teacher self-efficacy in different fields such as classroom management, student engagement, and science teaching (Tschannen-Moran, Woolfolk Hoy & Hoy, 1998). On the other hand, efficacy items related to measurement and evaluation practices appear in small numbers (Karaca, 2008). In one of the previous studies conducted in Turkey, Çakan (2004) reported that teachers perceive themselves as inadequate in measurement and evaluation practices and that most teachers from various teaching grades prefer to use traditional methods of measurement and evaluation. Given the results of Çakan’s study, developing an instrument that assesses teacher self-efficacy specifically in measurement and evaluation practices can contribute to what is known about teachers’ efficacy beliefs about these practices.

It has been proposed that as teachers gain experience in teaching, they may develop self-efficacy toward measurement and evaluation practices. Bandura (1997) also suggested that mastery experiences (people’s own performances) are the most important source for developing self-efficacy. Thus, year in teaching was considered an important variable in the present study. In addition to year in teaching, frequency of using different measurement and evaluation tools was included as another variable in this study to investigate the proposition that teachers who have higher self-efficacy tend to try new methods in measurement and evaluation. Similarly, Gibson and Dembo (1984) found that efficacious teachers are open to new ideas; therefore, in the present study it was expected that efficacious teachers may have a tendency to try alternative measurement and evaluation tools rather than traditional ones.

1.4. Definition of the Terms

Self-efficacy: Belief in one’s capabilities to organize and execute the courses of action required to produce given attainments (Bandura, 1997, p. 3).

Teachers’ sense of efficacy: Teacher’s belief in his or her capability to organize and execute courses of action required to successfully accomplish a specific teaching task in a particular context (Tschannen-Moran et al., 1998, p. 22).

Teacher self-efficacy toward measurement and evaluation practices: Teacher’s belief in his or her ability in measurement and evaluation practices.

CHAPTER II

REVIEW OF THE LITERATURE

In this chapter, the theoretical framework for the study is presented along with the leading studies on self-efficacy, teachers’ sense of efficacy, and measurement of self-efficacy beliefs. First, the construct of self-efficacy is introduced within the framework of Social Cognitive Theory. This is followed by a section describing how self-efficacy beliefs have been measured and the psychometric properties of the existing self-efficacy scales. Lastly, teachers’ sense of efficacy is defined, and measurement studies related to teacher self-efficacy in Turkey and other countries are presented in chronological order.

2.1. Self-Efficacy

In his book Self-Efficacy: The Exercise of Control, Bandura (1997) defined self-efficacy as “beliefs in one’s capabilities to organize and execute the courses of action required to produce given attainments” (p. 3). The concept of self-efficacy arose from Bandura’s Social Cognitive Theory in 1977. According to this theory, human behavior, environment, and personal factors interact and influence each other through the process of reciprocal determinism, presented in Figure 2.1 (Bandura, 1997). In this theory, reciprocal causality implies that there is a bidirectional interaction between personal factors, behavioral patterns, and environmental influences. For example, a person’s self-efficacy (personal factor) can be an indicator of how he or she self-regulates performance (behavior), and that performance can in turn affect his or her future self-efficacy beliefs (Bandura, 1997).

[Figure: PERSON, ENVIRONMENT, and BEHAVIOR as the three interacting components]

Figure 2.1 Theoretical Model of Triadic Reciprocal Determinism
Source: Bandura (1986, p. 24)

Bandura (1997) emphasized that perceived self-efficacy contributes to the acquisition of knowledge structures related to possessed skills by influencing motivation and the choice of activities. Therefore, perceived self-efficacy has an important role in Social Cognitive Theory. Bandura, Caprara, Barbaranelli, Gerbino, and Pastorelli (2003) found that people with high self-efficacy tend to display cooperativeness, helpfulness, sharing with others, and caring for others’ welfare. The most important characteristic of self-efficacy is that it is task and situation specific (Bandura, 1997). That is, people’s self-efficacy beliefs may differ according to the task they are responsible for and the situation in which they perform. For example, one may feel comfortable with writing an essay but not with speaking in public.

In addition, Bandura (1997) reported that efficacy beliefs differ in level, generality, and strength. People’s self-efficacy may differ in level depending on the difficulty of the task demands. For example, when athletes are asked to judge their high-jumping efficacy, they consider whether or not they can jump over bars set at different heights. Regarding the generality dimension, Bandura (1997) stated that people may judge themselves efficacious across many activities or in only a few of them. Moreover, efficacy beliefs vary in strength; that is, a stronger sense of efficacy increases perseverance in the face of difficulties and hence the likelihood of success.

2.1.1. Four Sources of Self-Efficacy Beliefs

Bandura (1997) proposed that self-efficacy beliefs develop through four sources of influence: enactive mastery experiences (one’s own performances), vicarious experiences (observing others’ performance on a particular task), social persuasion (being approved by someone who is professional in the area, such as a supervisor or a colleague), and physiological and emotional states (e.g., physical accomplishments, health functioning, coping with stress).

Bandura (1997) noted that the most influential source of efficacy is enactive mastery experiences, since they give the most realistic evidence of whether an individual can perform whatever it takes to succeed. If people succeed only in easy tasks, they come to expect quick results and are easily discouraged by failure. According to Bandura (1997), successful performances do not always raise self-efficacy, and failures do not always lower it. The contribution to the development of self-efficacy depends on how people judge their capability in light of a success or failure. Further, mastery experiences contribute to one’s self-efficacy beliefs with respect to the level, strength, and generality dimensions. While simple tasks may lead people to believe that they can succeed only in easy tasks but not in difficult ones, tasks requiring perseverance contribute much more to self-efficacy beliefs.

Regarding vicarious experience, Bandura (1997) suggested that mastery experiences cannot be the only source of information about people’s capabilities. Efficacy beliefs are also influenced by the experiences of other people, which are called vicarious experiences. When people see or hear that others have succeeded at a task, especially a hard one, they may start to believe that they can perform the same task as well. Bandura (1997) gave the example that high jumpers can compare their proficiency and their improvement with the heights previously reached by other athletes. He concluded that people assess their capability in comparison with their peers or colleagues.

Another source of efficacy judgments is verbal persuasion. If other people make someone believe that he or she is capable of doing something, it can be easier to struggle with difficulties in performing an action (Bandura, 1997). Therefore, people who are persuaded verbally and are capable of performing an action will show greater effort and keep on trying.

Finally, affective states can have considerable influence on people’s self-efficacy beliefs. In this respect, enhancing physical status and decreasing the effect of stress and emotional tendencies can be a way of developing positive self-efficacy beliefs (Bandura, 1991).

Considering Bandura’s four sources of efficacy information, Gist and Mitchell (1992) suggested that there are major questions to ask when people judge their capabilities. These questions are: What do different tasks require? How much does an individual attribute a failure or achievement to himself or herself? How does each performance contribute to self-efficacy? Furthermore, Gist and Mitchell (1992) proposed three strategies to change self-efficacy beliefs: providing the individual with information to understand the task attributes, providing the individual with information about how efficacy beliefs develop (i.e., the sources of self-efficacy), and providing the individual with guidelines about how much effort he or she should make to develop self-efficacy beliefs.

2.1.2. Self-Efficacy and Other “Self” Constructs

When self-efficacy is compared with other “self” concepts such as self-concept, self-confidence, self-esteem, and self-worth, it differs from them in being specific to a particular task (Tschannen-Moran, Woolfolk Hoy & Hoy, 1998). Bandura (1997) described how self-concept is measured and how it differs from self-efficacy. In Bandura’s words, self-concept contributes “understanding of people’s attitudes toward themselves and how these attitudes may affect their general outlook on life” (p. 11). In addition, Bandura (1997) stated that self-concept is measured by asking people how many appreciable characteristics they attribute to themselves. In light of self-concept measurement studies, Bandura (1997) concluded that the predictive value of self-concept decreases when the influence of self-efficacy on a person’s beliefs is taken into account.

Another similar concept, self-confidence, is defined as believing in oneself (Benabou & Tirole, 2002). In that sense, this construct seems to be a general view of a person about himself or herself, not an opinion about characteristics specific to a situation as in self-efficacy.

While differentiating self-efficacy from self-esteem, Bandura (1997) reported that “perceived self-efficacy is concerned with judgments of personal capability; whereas self-esteem is concerned with judgments of self-worth.” (p. 11). That is, self-esteem deals with how much an individual appreciates himself or herself, whereas the question of self-efficacy is how well people can act in different task situations. Similarly, Pajares (1996) pointed out that self-esteem and self-efficacy differ in the questions used to assess them. For example, “how I define myself” and “how I feel about myself” are questions referring to self-esteem; self-efficacy concerns questions like “how well can I solve this science problem?” or “how well can I write a bestseller book?”

Other than the “self” concepts discussed in the literature, one more distinction needs to be made between self-efficacy and outcome expectancy, since both have a relationship with self-regulation. Gist and Mitchell (1992) reported that “self-efficacy is one of several cognitive processes frequently considered in self-regulation.” (p. 186). Self-efficacy has been considered in relation to outcome expectancy, which Bandura (1997) defined as the expectancy about the consequences of a performance. People who are self-efficacious tend to show more effort to attain their expectations when they face a difficulty in performing an action (Bandura & Cervone, 1986). However, self-efficacy differs from outcome expectancy in that it is a belief in one’s ability to perform a particular action. For example, Zimmerman (2000) stated that a student’s belief about getting an A grade is a kind of self-efficacy belief, whereas considering this grade a useful indicator for getting a good job refers to outcome expectancy.

In connection with the definition of outcome expectancy, locus of control, that is, whether people have control of their behavior, should be defined at this point. According to Rotter (1966), locus of control concerns how people relate internal and external factors to their outcomes. Internal locus of control refers to a belief in self-responsibility for failure or success, while external locus of control means that a person relates his or her failure or success to external factors, such as fate, luck, or external circumstances (Rotter, 1966). Bandura (1997) also reported that locus of control is an inconsistent predictor of different behaviors that can be uniquely explained by self-efficacy.

2.1.3 Measurement of Self-efficacy Beliefs

Self-efficacy is a construct that has attracted many researchers in the social sciences, and this has led to the development of several instruments measuring it. Many self-efficacy scales assess people’s self-efficacy in different fields, such as alcohol resistance (Rychtarik, Prue, Rapp, & King, 1992), parenting (Bandura, Caprara, Barbaranelli, & Pastorelli, 2001), career decision (Betz, Klein & Taylor, 1996), teaching (Tschannen-Moran & Woolfolk Hoy, 2001), computer usage (Thatcher & Perrewé, 2002), and geometry (Cantürk-Günhan & Başer, 2007).

There are some points to consider when developing instruments to measure self-efficacy. According to Bandura (1997), there has been a discussion about what a scale measuring self-efficacy should ask: should it ask about beliefs in performing an action rather than about personal qualities? Later, Bandura (2006) offered guidelines to be considered in developing a self-efficacy scale. First, the items of the instrument should include “can” or “will”, as a judgment of capability and a statement of intention, respectively. This is because self-efficacy is a judgment of how well a person can perform a specific task (Bandura, 1997). Second, the scale should be unipolar. That is, it should not include negative values such as -1, -2, or -3, because there is no meaningful gradation below zero. Third, participants should be assured that their answers will not be shared with others. Otherwise, they may feel uncomfortable about others judging their views of themselves. Lastly, it is very important that self-efficacy scales have predictive validity, since self-efficacy concerns people’s future performance on a given task (Bandura, 2006).

2.2. Teachers’ Sense of Efficacy

It is possible to derive the definition of teacher self-efficacy from the description of self-efficacy as “teacher’s belief in his or her capability to organize and execute courses of action required to successfully accomplishing a specific teaching task in a particular context” (Tschannen-Moran et al., 1998, p.22). Some researchers defined teacher self-efficacy as teachers’ beliefs in their abilities to affect student performance (Armor et al., 1976; Gibson & Dembo, 1984). In addition to affecting student performance, Dellinger, Bobbett, Olivier and Ellett (2007) emphasized that teacher self-efficacy focuses on the outcome of successful teaching behaviors and student characteristics and behaviors. In addition, Bandura (1997) pointed out that low teacher efficacy beliefs can give rise to low student efficacy and low academic achievement, which may in turn lead to negative teacher self-efficacy beliefs. Furthermore, teachers’ sense of efficacy beliefs have a strong influence not only on student performance but also on how far goals are achieved and how much a teacher changes (Tschannen-Moran et al., 1998).

According to Bandura (1994), self-efficacy beliefs have an impact on how people make their choices, on their level of motivation, their resilience against difficulties or stressors, and their vulnerability to depression. In that sense, it is not very hard to predict which factors would affect teacher self-efficacy. Many research studies show a relationship between student achievement and three kinds of efficacy: students’ self-efficacy, teacher self-efficacy, and collective efficacy (Pajares, 1996; Tschannen-Moran et al., 1998). Gibson and Dembo (1984) reported that teachers with high self-efficacy work longer with a student who has difficulty in learning. Moreover, teachers’ self-efficacy beliefs influence their resilience in difficult situations (Gibson & Dembo, 1984). These results are supported by more recent studies. For example, Ware and Kitsantas (2007) found that efficacious teachers put greater effort into teaching and feel responsible for both their failures and achievements.

2.2.1. Measurement of Teachers’ Sense of Efficacy Beliefs

Some instruments were developed to measure teacher self-efficacy in teaching a particular subject area, such as efficacy in science teaching (Riggs & Enochs, 1990), efficacy in computer teaching (Akkoyunlu, Orhan, & Umay, 2005), and efficacy in geography teaching (Karadeniz, 2005). Other scales on teacher self-efficacy included factors on personal teaching efficacy and general teaching efficacy (Gibson & Dembo, 1984); efficacy to influence decision making, school resources, instruction, and discipline, efficacy to enlist parental involvement and community involvement, and efficacy to create a positive school climate (Bandura, 2001); and teacher self-efficacy in classroom management, instructional strategies, and student engagement (Tschannen-Moran & Woolfolk Hoy, 2001). Measurement studies of teachers’ sense of efficacy beliefs started with the RAND Corporation’s research on student learning and teachers’ characteristics in 1976. Only two items in that research could be classified as measuring teachers’ self-efficacy. However, this study shed light on later studies measuring teachers’ opinions about their personal responsibility for student learning (Guskey & Passaro, 1994).

RAND Items (1976). The first example of assessing teacher efficacy was the RAND Corporation study in 1976. The main purpose of the study was to increase the reading scores of elementary students by identifying the most successful school and classroom policies and other variables (Armor et al., 1976). To determine those, the researchers examined the success of different reading programs and interventions. Two items measured teacher efficacy, and both focused on how teachers may influence student motivation (Tschannen-Moran et al., 1998). The researchers concluded that teacher efficacy was one of the significant factors influencing the reading achievement of elementary students (Armor et al., 1976).

Rose and Medway (1981). This study examined the relationship between teachers’ locus of control and student learning. Locus of control was defined in an earlier study by Rotter (1966). According to Rotter (1966), locus of control concerns how people relate internal and external factors to their outcomes. Internal locus of control refers to a belief in self-responsibility for failure or success, while external locus of control means that a person attributes failure or success to external factors such as fate, luck, or external circumstances (Rotter, 1966). Rose and Medway (1981) found a significant relationship between teachers’ locus of control and student achievement.

Webb Scale (1982). This scale was developed to contribute to the measurement of teacher efficacy by expanding the RAND measure. To keep participants from giving socially desirable responses, Webb and his colleagues used a forced-choice format. The researchers did not report any reliability value or validation study (Tschannen-Moran et al., 2001).

Ashton Vignettes (1984). Ashton, Buhr and Crocker (1984) developed a scale consisting of vignettes that describe situations a teacher might face, each followed by a question on how effective the teacher would be in that situation. The scale had two response versions: self-referenced, ranging from “extremely ineffective” to “extremely effective,” and norm-referenced, ranging from “much less effective than most teachers” to “much more effective than other teachers.” However, the instrument has not been widely accepted or used in the field.

Gibson and Dembo (1984). Gibson and Dembo (1984) stated that teacher self-efficacy beliefs are teachers’ evaluations of how far they are able to create positive student change. Accordingly, they developed a 30-item teacher self-efficacy instrument with two factors, named personal teaching efficacy (PTE, alpha = .75) and general teaching efficacy (GTE, alpha = .79). Gibson and Dembo (1984) concluded that validation studies were needed to stabilize the factor structure. Since the development of this instrument, many research studies have examined teacher self-efficacy and its relationship with teachers’ classroom behaviors, openness to new ideas, and attitudes toward teaching.
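The alpha values quoted for these instruments are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient is obtained from item-level responses, the Python function below applies the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name and the small response matrix are illustrative assumptions, not data from any of the studies reviewed here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 3 respondents x 2 items (made up for illustration)
example = [[1, 2], [2, 3], [3, 5]]
print(round(cronbach_alpha(example), 3))  # 0.947
```

With real data each row would hold one teacher's answers to all items of a subscale, and alpha would be computed separately for each factor.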


Riggs and Enochs (1990). Another important study measuring teacher self-efficacy beliefs was conducted by Riggs and Enochs in 1990. They developed a 25-item instrument called the Science Teaching Efficacy Belief Instrument (STEBI) to measure classroom teachers’ self-efficacy beliefs toward science teaching. The instrument included two factors, named personal science teaching efficacy belief (alpha = .92) and science teaching outcome expectancy (alpha = .77). Riggs and Enochs (1990) reported that their scale produces valid and reliable scores indicating teachers’ beliefs toward science teaching and learning.

Bandura (2001). Bandura developed a teacher self-efficacy scale which included 30 items on a nine-point scale with seven subscales: efficacy to influence decision making, efficacy to influence school resources, instructional efficacy, disciplinary efficacy, efficacy to enlist parental involvement, efficacy to enlist community involvement, and efficacy to create a positive school climate. However, Bandura has not reported any finding regarding validity or reliability for his instrument.

Tschannen-Moran and Woolfolk Hoy (2001). Tschannen-Moran and Woolfolk Hoy (2001) reported that most teacher self-efficacy scales did not include items on personal competence and on the tasks involved in the teaching process. Moreover, Tschannen-Moran, Woolfolk Hoy and Hoy (1998) argued for the necessity of a valid and reliable teacher self-efficacy scale. In light of these arguments, Tschannen-Moran and Woolfolk Hoy (2001) developed a new scale with 52 items and named it the Teachers’ Sense of Efficacy Scale (TSES), originally known as the Ohio State Teacher Self-Efficacy Scale (OSTES). To validate the scores obtained from this scale, they conducted three different studies with 624 participants, including pre-service and in-service teachers. At the end of these studies, the resulting scale had 24 items in the long form and 12 items in the short form. To make sure that both versions of the scale provided evidence for construct validity, Tschannen-Moran and Woolfolk Hoy (2001) examined the correlations between their scales and previously developed teacher self-efficacy scales, namely the RAND items and Hoy and Woolfolk’s (1993) 10-item adaptation of the Gibson and Dembo TES. Among the resulting correlation coefficients, the highest were obtained with the scale measuring personal teaching efficacy. To indicate that both forms of the TSES measured the same construct, Tschannen-Moran and Woolfolk Hoy (2001) reported that the intercorrelations between the short and long forms ranged from .95 to .98. Moreover, they conducted principal-axis factoring with varimax rotation and concluded that the TSES had a three-factor structure. The factors were named efficacy for student engagement (ESE), efficacy for instructional strategies (EIS), and efficacy for classroom management (ECM). Reliability analysis indicated that total scale reliability was .94 and that the three subscales had high Cronbach’s alpha coefficients: .87 for ESE, .91 for EIS, and .90 for ECM (Tschannen-Moran & Woolfolk Hoy, 2001). The alpha values and the validation study indicated that the Teachers’ Sense of Efficacy Scale was a valid and reliable measure of teachers’ sense of efficacy in student engagement, instructional practices, and classroom management (Tschannen-Moran & Woolfolk Hoy, 2001).

Schmitz and Schwarzer (2005). Based on Bandura’s Social Cognitive Theory, Schmitz and Schwarzer (2005) developed a 27-item instrument with a 4-point response scale and administered it to 300 German teachers. They reported test-retest reliability values of .67, .76, and .65 for the instrument in a three-year study. Further, the scale was reported to be more closely related to personal attitudes than a general self-efficacy scale was, and this was emphasized as evidence for discriminant validity.

Dellinger, Bobbett, Olivier and Ellett (2007). The latest measure of teacher self-efficacy beliefs was developed by Dellinger, Bobbett, Olivier and Ellett and named the Teachers’ Efficacy Beliefs System-Self Form (TEBS-Self). The scale consisted of 30 items on a 4-point rating scale [weak belief in my capabilities (1), moderate belief in my capabilities (2), strong belief in my capabilities (3), and very strong belief in my capabilities (4)]. The researchers used this scale in three distinct studies and did not reach a consensus on its factor structure (Dellinger et al., 2007).

Aforementioned instruments are summarized in Table 2.1.

Table 2.1 Items from Some Teacher Self-Efficacy Scales

| Authors | Sample Items | Type of Rating Scale | Number of items in the scale |
| --- | --- | --- | --- |
| Armor et al. (1976) | If I really try hard, I can get through to even the most difficult or unmotivated students. | 5-point Likert scale | 2 items on teacher self-efficacy |
| Rose & Medway (1981) | When the grades of your students improve, it is more likely a. because you found ways to motivate the students, or b. because the students were trying harder to do well. | A forced-choice format | 28 |
| Ashton et al. (1982) | A teacher should not be expected to reach every child; some students are not going to make academic progress. | A forced-choice format | 7 |
| Ashton et al. (1984) (Ashton Vignettes) | Your school district has adopted a self-paced instructional program for remedial students in your area. How effective would you be in keeping a group of remedial students on task and engaged in meaningful learning while using these materials? | 5-point Likert scale | 50 |
| Gibson & Dembo (1984) | If a student masters a new math concept quickly, this might be because I knew the necessary steps in teaching that concept. | 6-point Likert scale | 30 |
| Riggs & Enochs (1990) | I understand science concepts well enough to be effective in teaching elementary science. | 5-point Likert scale | 25 |
| Bandura (2001) | How much can you do to get children to follow classroom rules? | 9-point Likert scale | 30 |
| Tschannen-Moran & Woolfolk Hoy (2001) | To what extent can you craft good questions for your students? | 9-point Likert scale | 24 |
| Schmitz and Schwarzer (2005) | Even if I get disrupted while teaching, I am confident that I can maintain my composure and continue to teach well. | 4-point Likert scale | 27 |
| Dellinger, Bobbett, Olivier & Ellett (2007) | 1. Weak belief in my capabilities. 2. Moderate belief in my capabilities. 3. Strong belief in my capabilities. 4. Very strong belief in my capabilities. Effective manage routine and procedures for learning tasks... | 4-point response scale | 31 |

2.2.2. Measurement Studies of Teachers’ Sense of Efficacy Beliefs in Turkey

In Turkey, studies on teacher self-efficacy do not have a long history, beginning only in the 2000s. Researchers have mostly adapted previously established instruments in their studies. Examples of instrument adaptation studies are those of Yılmaz, Köseoğlu, Gerçek and Soran (2004), Bıkmaz (2004), and Çapa, Çakıroğlu, and Sarıkaya (2005). Further, Erdem and Demirel (2007), Akkoyunlu, Orhan, and Umay (2005), Karadeniz (2005), and Karaca (2008) conducted development and validation studies of instruments assessing teacher self-efficacy in different fields.

Yılmaz, Köseoğlu, Gerçek, and Soran (2004). Yılmaz et al. adapted the Teacher Self-Efficacy Scale, which was developed by Schmitz and Schwarzer in 2000 in Germany. The researchers translated the original survey and reported reliability and validity findings after administering the instrument to Turkish teachers. Yılmaz and his colleagues (2004) reported a Cronbach’s alpha of .79 for the adapted scale. Moreover, they found two factors and decided to keep eight items, whereas the original scale included 10 items. The factors of the adapted instrument were coping behavior (başa çıkma davranışı) and reformist behavior (yenilikçi davranış).

Bıkmaz (2004). Bıkmaz adapted the Science Teaching Efficacy Belief Instrument (STEBI), developed by Riggs and Enochs to measure teacher self-efficacy beliefs toward science teaching. The purpose of the study was to provide evidence for the validity and reliability of the scale for classroom teachers in Turkey. Bıkmaz (2004) reported that the adapted instrument had two factors comprising 20 items. Cronbach’s alpha coefficient was .78 for the first factor, named self-efficacy belief, and .60 for the second factor, outcome expectancy. In addition, the reliability coefficient for the whole instrument was .71.

Çapa, Çakıroğlu, and Sarıkaya (2005). Çapa and her colleagues (2005) stated that a valid measure of teachers’ efficacy beliefs had not been developed in Turkey. Accordingly, Çapa et al. (2005) adapted the Teachers’ Sense of Efficacy Scale (TSES), which was developed by Tschannen-Moran and Woolfolk Hoy in 2001. The purpose of the study was to adapt the TSES into Turkish, examine reliability values for the subscales and the whole scale, and provide construct-related evidence for the adapted version. Çapa, Çakıroğlu, and Sarıkaya (2005) ran confirmatory factor and Rasch analyses to examine the factor structure and to report reliability coefficients of the factors. The analyses yielded the following reliability indices: .82 for the first factor, student engagement; .86 for the second factor, instructional strategies; and .84 for the third factor, classroom management. Çapa et al. (2005) confirmed the three-dimensional structure of the Turkish Teachers’ Sense of Efficacy Scale (TTSES) using data from 628 Turkish pre-service teachers.

Akkoyunlu, Orhan, and Umay (2005). Akkoyunlu et al. developed a teacher self-efficacy scale for computer teachers in 2005. Before developing the instrument, Akkoyunlu and her colleagues (2005) consulted ten experts who were instructors in the Faculty of Education of Hacettepe University to ensure that the instrument had content validity. The final version of the instrument was a 5-point Likert scale consisting of 12 items, named the Teacher Self-Efficacy Scale for Computer Teachers (Bilgisayar Öğretmenliği Özyeterlik Ölçeği). The data were collected from 315 senior students in computer education and instructional technologies departments of eight different universities in Turkey. Findings yielded one dimension. The alpha coefficient of the instrument was very high, with a value of .93.

Karadeniz (2005). The instrument assessing teacher efficacy in teaching geography was established by Karadeniz (2005). She developed a geography self-efficacy scale for pre-service teachers of social sciences. The instrument had 19 items, grouped under three factors. The factors and their alpha reliability values were reported as follows: .86 for transforming geography knowledge into life skills (coğrafyayı yaşam becerilerine dönüştürebilme), .76 for self-efficacy beliefs (yeterlik algısı), and .63 for awareness of behaviors in geography (coğrafya alanındaki davranışlarda farkındalık). In addition, the split-half reliability coefficient was reported as .79.

Erdem and Demirel (2007). Erdem and Demirel (2007) developed and validated a new instrument to assess pre-service teachers’ self-efficacy beliefs toward teaching. They worked with 346 student teachers attending six different departments of a faculty of education. The instrument was established as a 5-point Likert scale with a single-factor model, and the reliability coefficient for the whole scale was reported as .92.

Karaca (2008). The purpose of this study was to investigate the perceptions of primary and high school teachers toward measurement and evaluation in education in Turkey. To measure teachers’ perceptions toward assessment practices, Karaca (2008) constructed a 5-point Likert scale with 75 items. The scale was reported to measure teachers’ perceived levels of efficacy. However, the items lacked one of the important properties an efficacy scale should have: they did not include “can” or “will” as a judgment of capability and a statement of intention (Bandura, 2006). Karaca (2008) collected data from 225 primary and high school teachers who worked in Eskisehir, Turkey. An independent-samples t-test yielded a non-significant difference between male and female teachers’ perceived levels of efficacy toward measurement and evaluation practices. In addition, an independent-samples t-test showed that high school teachers’ perceived levels of efficacy were higher than primary teachers’. The results of a one-way ANOVA indicated no significant difference in teachers’ perceived levels of efficacy toward measurement and evaluation practices according to year in teaching.

Sample items from the instruments which were adapted and developed in these studies are summarized in Table 2.2.

Table 2.2 Items from Some Teacher Self-Efficacy Scales Adapted or Developed in Turkey

| Developers | Researchers who adapted the instrument to Turkish | Sample Items | Type of Rating Scale |
| --- | --- | --- | --- |
| Schmitz & Schwarzer (2000) | Yılmaz, Köseoğlu, Gerçek & Soran (2004) | Zor durumlarda bile ebeveynlerle iyi bir iletişim kurabilirim. [I can communicate well with parents even in difficult situations.] | 4-point Likert scale |
| Riggs & Enochs (1990) | Bıkmaz (2004) | Öğrencilerin fen dersindeki başarılarından öğretmen sorumludur. [The teacher is responsible for students’ achievement in science class.] | 5-point Likert scale |
| Tschannen-Moran and Hoy (2001) | Çapa, Çakıroğlu and Sarıkaya (2005) | Öğrencileri okulda başarılı olabileceklerine inandırmayı ne kadar sağlayabilirsiniz? [How much can you do to get students to believe they can succeed in school?] | 9-point scale |
| Akkoyunlu, Orhan & Umay (2005) | Akkoyunlu, Orhan & Umay (2005) | No item was reported | 5-point Likert scale |
| Karadeniz (2005) | Karadeniz (2005) | Coğrafya konularına yönelik grafik ve tabloları yorumlayabilirim. [I can interpret graphs and tables related to geography topics.] | 5-point Likert scale |
| Erdem & Demirel (2007) | Erdem & Demirel (2007) | I can ensure my students trust me by expressing my ideas and behaviors clearly. | 5-point Likert scale |
| Karaca (2008) | Karaca (2008) | Öğretim hedeflerine ve hedef davranışlara uygun ölçme araçlarını belirleyebilme. Her bir maddenin ayırt ediciliğini hesaplayabilme. [Being able to select measurement tools appropriate to instructional objectives and target behaviors. Being able to calculate the discrimination of each item.] | 5-point Likert scale |

2.2.3. Research on the Relationship between Teachers’ Sense of Efficacy and Other Variables

The relationship between teacher self-efficacy and many different variables, such as commitment to teaching, developing instructional strategies, classroom management, student achievement, and motivation, has been examined in various research studies (Tschannen-Moran & Woolfolk Hoy, 2001). For example, teacher self-efficacy was found to be related to student achievement (Ross, 1992), planning and organization in teaching (Friedman & Kass, 2002), enthusiasm for teaching (Guskey, 1984), and meeting the needs of students (Guskey, 1988).

2.2.3.1. The Relationship between Teacher Self-Efficacy and Year in Teaching

Teacher self-efficacy was found to be related to year in teaching (Hoy & Woolfolk, 1993), grade level (Çapa, 2005), teaching area of specialization (Ross, Cousins, Gadalla & Hannay, 1999), education level (Friedman, 2003), and student achievement (Lee, Dedrick & Smith, 1991). Among these variables, an increase in year in teaching was found to have a positive impact on teaching efficacy in the study of Hoy and Woolfolk (1993). However, some researchers concluded that teacher self-efficacy decreased with increasing years of teaching experience (Dembo & Gibson, 1985; Ghaith & Yaghi, 1997). Other studies showed differences in teacher efficacy among teachers with varying levels of teaching experience. For example, year in teaching was reported to be positively correlated with teacher self-efficacy in the study of Tschannen-Moran and Woolfolk Hoy (2007). In addition, Tschannen-Moran et al. (1998) suggested that the self-efficacy beliefs of expert teachers are resistant to change. In line with this suggestion, Woolfolk Hoy and Burke-Spero (2005) reported that self-efficacy is more changeable in the early years of teaching. Furthermore, they reported that novice teachers with positive self-efficacy beliefs develop a positive attitude toward teaching and experience less stress in their first year of teaching. On the contrary, Karaca (2008) reported that teachers’ perceptions of efficacy toward measurement and evaluation practices do not differ significantly by years of teaching. Çakan (2004) found a similar result: experienced teachers’ perceptions of their qualification levels did not differ from novice teachers’ perceptions. In this context, it is important to understand what influences teacher self-efficacy, and which factors are affected by it, as years of teaching experience change. In the present study, to clarify the relationship between year in teaching and teacher self-efficacy toward measurement and evaluation practices, the researchers examined whether teacher self-efficacy toward measurement and evaluation practices is correlated with year in teaching, and whether this relationship is positive or negative.

2.2.3.2. The Relationship between Teacher Self-Efficacy and Frequency of Using Different Measurement and Evaluation Tools

Given the inference of Gibson and Dembo (1984) that efficacious teachers tend to be open to trying new methods and are not opposed to alternative methods in teaching, using different measurement and evaluation tools is expected to be a characteristic of teachers who have positive self-efficacy in teaching. In addition, Vitali (1993) reported that efficacious teachers prefer performance-based assessment, which is a kind of alternative assessment method, to traditional tests. Similar results were also found by Ross, Cousins and Gadalla in 1996. Ross and his colleagues (1996) examined whether the effect of different teaching tasks on teacher self-efficacy was moderated by between-teacher variables (i.e., subject, experience, gender, preference for student-centered instruction and alternative assessment techniques). Ross et al. (1996) specified the different teaching tasks as feelings of past success, feelings of being well-prepared, and student engagement. The study concluded that when perceived success was positively correlated with teacher self-efficacy, teachers tended to use traditional assessment techniques more, whereas teachers preferred alternative assessment techniques when teacher self-efficacy was related to feelings of preparedness. Ross and his colleagues (1996) attributed the use of alternative assessment techniques to teachers’ ability to take risks and try new methods. Correspondingly, the finding of Gibson and Dembo (1984) about efficacious teachers’ tendency to be open to new methods supports the view of Ross and his colleagues (1996).

2.3. Summary of Related Studies

In the previous sections, the definition of self-efficacy, the sources contributing to self-efficacy development, the definition of teacher self-efficacy, and measurement studies on teacher self-efficacy and related factors were reported in chronological order. In this way, the researchers clarified when teacher self-efficacy began to be considered an important construct, how teachers’ sense of efficacy has been measured, and which constructs or variables were thought to be related to it.

Related literature indicated that there was a relationship between year in teaching and teaching efficacy (Dembo & Gibson, 1985; Hoy & Woolfolk, 1993; Ghaith & Yaghi, 1997; Tschannen-Moran & Woolfolk Hoy, 2007). In addition, according to Woolfolk Hoy and Burke-Spero (2005), teaching efficacy is more open to change in the early years of teaching. They concluded that efficacious novice teachers tend to develop a positive attitude toward teaching and have less trouble in the first year of teaching. However, Çakan’s (2004) finding that teachers’ perceptions of their qualification levels were not correlated with year in teaching contradicts these findings. Karaca (2008) supported this result by reporting a non-significant relationship between teachers’ perceived levels of efficacy in measurement and evaluation practices and year in teaching. This contradiction in the literature encouraged the researchers to conduct a study examining the relationship between year in teaching and teacher self-efficacy toward measurement and evaluation practices.

Because efficacious teachers were found to take risks in teaching (Gibson & Dembo, 1984), they were expected to develop and administer alternative teaching methods without hesitation (Ross et al., 1996). In that sense, the researchers intended to investigate whether teachers who have positive self-efficacy toward measurement and evaluation practices have a tendency to prefer alternative measurement and evaluation tools to traditional ones.


CHAPTER III

METHOD

This chapter presents the research methodology of the study. The research design, research questions, description of variables, participants’ demographic information, and instruments used in the study are presented in that order. The last section introduces the data analysis employed in this study.

3.1. Research Design

This study was associational research, since it examined the relationships among years in teaching, frequency of using different kinds of measurement tools, and teachers’ efficacy beliefs toward measurement and evaluation tools. In associational research, relationships among two or more variables are investigated without manipulating the variables. Moreover, the relationship between variables can be represented numerically (Fraenkel & Wallen, 2008). To measure teachers’ efficacy beliefs toward measurement and evaluation practices, a 9-point scale with 24 items was developed. The necessary permissions to administer the instrument were obtained from the METU Human Subjects Ethics Committee (HSEC) and the Educational Research and Development Head Office (ERDHO) in Ankara. Data were collected between May and June of 2008 from 394 experienced teachers who worked in public primary and high schools in Ankara, Samsun, and Istanbul.


3.2. Research Question

In order to measure teacher self-efficacy toward measurement and evaluation practices, an instrument was developed. By using this instrument, a model was tested in which the following main and sub-research questions were addressed:

What is the best model explaining the relationship between teacher self-efficacy in measurement and evaluation practices, years of teaching experience, teachers’ sense of efficacy, and frequency of using alternative and traditional measurement and evaluation tools?

1. How well do years of teaching experience predict frequency of using alternative and traditional measurement and evaluation tools?
2. How well do years of teaching experience and teachers’ sense of efficacy predict the teacher self-efficacy in measurement and evaluation practices?
3. How well does teacher self-efficacy in measurement and evaluation practices predict frequency of using alternative and traditional measurement and evaluation tools?

3.3. Description of Variables

This section provides the operational definitions of variables investigated in this study:

Years in teaching: This independent variable corresponds to the number of years the participant teacher has been teaching. It was a continuous variable and the level of measurement was considered as ratio.


Teachers’ sense of efficacy: A mean score was computed for the Turkish Teachers’ Sense of Efficacy Scale (TTSES). A high score indicates a high sense of efficacy. The level of measurement for this variable was considered as interval.

Frequency of using different measurement and evaluation tools: This variable was measured on a 5-point rating scale (1 referred to “never” and 5 referred to “always”), and scores out of 5 were obtained by taking the mean of 17 items. Items were generated from the measurement and evaluation tools that the Turkish Ministry of National Education (MoNE) proposed in the latest curriculum (Erdoğan, 2007). To examine whether efficacious teachers prefer alternative or traditional methods more, the researchers divided this variable into two distinct variables: frequency of using alternative measurement and evaluation tools (Alternative-ME) and frequency of using traditional measurement and evaluation tools (Traditional-ME). Alternative-ME was measured by 10 items and Traditional-ME by 7 items.

Teacher self-efficacy beliefs toward measurement and evaluation practices: The dependent variable, assessing teachers’ beliefs in their abilities to perform tasks related to measurement and evaluation practices, was measured by an instrument developed by the researchers. It included 24 items on a 9-point rating scale ranging from “nothing” (1) to “a great deal” (9). The mean score of each participant was computed out of 9. The level of measurement for this variable was considered as interval.

3.4. Participants

The target population of the study was public school teachers working in elementary and secondary schools in three cities of Turkey: Ankara (the districts of Çankaya and Sincan), Samsun (Center district), and İstanbul (the districts of Zeytinburnu, Bakırköy and Eyüp). Convenience sampling was used within this target population. The three cities, which are in three different regions of Turkey, were selected because they were convenient for the researchers. Data were collected from 44 elementary and secondary schools; 47% were secondary schools and the rest (53%) were elementary schools. Table 3.1 displays the participating teachers’ background data on gender, teaching level, branch, and graduation history.

Three hundred and ninety-four teachers from public elementary and secondary schools participated in the study. Of the participants, 57.11% were female and 42.89% were male. Participants’ ages ranged from 22 to 63, with a mean of 40. Year in teaching ranged from 1 to 40, with an average of 16. The percentage of teachers working in elementary schools was 53.05%, and in secondary schools 46.95%. Twenty-two percent of participating teachers had a science (e.g., teaching Physics or Chemistry) or mathematics major, while 78% had a social science major (e.g., teaching Turkish, English, or Geography). Among these teachers, 4.3% had graduated from a teacher school, 11.7% had graduated from a pre-undergraduate program (two-year university program), 77.9% had a bachelor’s degree, and 6.1% had a master’s or Ph.D. degree. Approximately fifty-nine percent (58.9%) of all participants had graduated from a faculty of education, whereas 41.1% had graduated from other faculties. The percentage of those who had taken a course on measurement and evaluation during their university education was 86.5%, and 13.5% of all participants had never taken such a course. Lastly, 35.3% of all participants had joined an in-service training program, while 64.7% had not (Table 3.1).


Table 3.1 Demographic Information of the Participants

Variable               Category             Percentage    N
Gender                 Female               57.11         225
                       Male                 42.89         169
Teaching Level         Elementary           53.05         209
                       Secondary            46.95         185
Branch                 Science              22            87
                       Social Science       78            307
Graduation             Teacher School       4.3           17
                       Pre undergraduate    11.7          46
                       Undergraduate        77.9          307
                       Graduate             6.1           24
Faculty of Education   Yes                  58.9          232
                       No                   41.1          162
Course                 Yes                  86.5          341
                       No                   13.5          53
In service Training    Yes                  35.3          139
                       No                   64.7          255
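The percentages in Table 3.1 follow directly from the counts and the reported sample size of 394. The short Python sketch below recomputes a few of them as a consistency check; the function name and the selection of rows are illustrative choices, not part of the original analysis.

```python
# Counts are taken from Table 3.1; N_TOTAL is the reported sample size.
N_TOTAL = 394

counts = {
    "Female": 225,
    "Male": 169,
    "Elementary": 209,
    "Secondary": 185,
}


def percentage(count, total=N_TOTAL, digits=2):
    """Return count as a percentage of total, rounded to `digits` places."""
    return round(100 * count / total, digits)


# Recompute the reported percentages from the raw counts.
percentages = {label: percentage(n) for label, n in counts.items()}

for label, value in percentages.items():
    print(f"{label}: {value}%")
```

The same function can be applied to the remaining rows of the table.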


3.5. Data Collection Instruments

Data were collected with an instrument consisting of four sections. The first section collected demographic information.

Section II included the Teacher Self-Efficacy toward Measurement and Evaluation Practices Scale (TEMES), which was developed by the researchers. The questionnaire used a 9-point scale ranging from “nothing” to “a great deal.” The scale included items generated from the teaching qualifications in measurement and evaluation practices developed by MoNE and ERDHO. The scale development procedure is presented in detail in section 3.5.2.

Section III included the Frequency of Using Different Measurement and Evaluation Tools Scale (FMES), developed by the researchers as a 5-point Likert scale covering 17 measurement and evaluation tools suggested by the Turkish Ministry of National Education (MoNE) in the latest curriculum (Erdoğan, 2007). This scale was developed to measure the frequency of using different measurement and evaluation tools. Two variables were derived from this scale to measure the frequency of using alternative and traditional methods, named Alternative-ME and Traditional-ME. Alternative-ME, namely the frequency of using alternative measurement and evaluation tools, was measured by ten items, and seven items assessed Traditional-ME, the frequency of using traditional measurement and evaluation tools. The score for each of these two variables was computed by summing the item scores and dividing the total by the number of items. For example, the mean score of Alternative-ME equals the total score of the ten items divided by ten. Therefore, both Alternative-ME and Traditional-ME corresponded to a score out of five.

Section IV included the Turkish Teachers’ Sense of Efficacy Scale (TTSES). The scale was originally developed by Tschannen-Moran and Woolfolk Hoy in 2001 and was adapted to Turkish by Çapa, Çakıroğlu, and Sarıkaya (2005). The items use the “how well can you…?” and “how much can you…?” patterns to meet the criteria of Bandura (2005) for developing self-efficacy scales. TTSES includes 24 items on a 9-point scale ranging from (1) “nothing” to (9) “a great deal,” and these items measure teacher self-efficacy beliefs in three domains: classroom management, instructional strategies, and student engagement.

3.5.1. Demographic Information

In the original instrument, after the purpose of the study and the confidentiality of the results were stated, eleven questions were included in the demographic information section to determine the characteristics of the participating teachers in detail. The categorical variables were gender, graduate degree (teacher school, pre undergraduate, undergraduate, graduate, and doctorate), teaching branch (science and social science), teaching level (primary and secondary), school type (public primary and public high school), whether the teacher had taken any course on measurement and evaluation during undergraduate education (yes or no), and whether the teacher had taken any in-service training on measurement and evaluation (yes or no). Age and year in teaching were continuous variables. In addition, the names of the faculty and the program from which the teachers graduated were asked as open-ended questions and coded as one variable with two levels: being a graduate of a faculty of education or not.


3.5.2. Teacher Self-Efficacy toward Measurement and Evaluation Practices Scale (TEMES)

In order to examine how efficacious teachers are with respect to measurement and evaluation practices, the researchers decided to develop a new scale in light of the teaching qualifications that the Turkish Ministry of National Education issued in 2007. Before item construction, resources on the measurement of self-efficacy, available teacher self-efficacy scales (e.g., teachers’ general efficacy, teachers’ efficacy toward mathematics and science teaching), and the validity and reliability evidence for these scales were examined in detail.

3.5.2.1. Instrument Development

During the development of the instrument, the following steps were followed: deciding the dimension of the proposed instrument, generating items from different sources including the qualifications that the Turkish Ministry of National Education proposed, determining the rating scale of the instrument, having experts review the items, validating the items, administering the items to a development sample (i.e., conducting the pilot study), evaluating the items, and deciding on the length of the scale (DeVellis, 2003, p. 60-100).

An item pool was generated from the literature in this field. The primary source was the report on qualifications in teaching prepared by the Turkish Ministry of National Education (MoNE) and the Educational Research and Development Head Office (ERDHO). Under the sub-heading of Observing Student Development and Evaluation, there are 24 qualifications. These qualifications were written in question format starting with the pattern “how much can you…?” or “how well can you…?” In addition, 9 more items were constructed by examining earlier teacher efficacy scales. During the 2007 fall semester, the draft scale was reviewed by graduate students of the Test Construction course at Middle East Technical University and by five experts from the educational sciences, elementary education, and measurement and evaluation departments of Middle East Technical University and Hacettepe University. They mostly focused on the wording of the items and suggested how the items could be revised to become clearer and more understandable. The expert review contributed to the content validity of the instrument in terms of agreement on the content to be covered to measure the intended construct, which is teacher self-efficacy toward measurement and evaluation practices. The experts’ suggestions led the researchers to decrease the number of items from 33 to 24, because some items were related to each other and seemed redundant in measuring the same construct.

After the items were generated, a 9-point rating scale ranging from “nothing” to “a great deal” was chosen. The 9-point scale was selected on the basis of Bandura’s “Guide for Constructing Self-efficacy Scales.” According to Bandura (2001, p. 7), “People usually avoid the extreme positions so a scale with only few steps may, in actual use, shrink to one or two points. Including too few steps loses differentiating information because people who use the same response category may differ if intermediate steps were included.” Therefore, the 9-point scaling of the new instrument assessing teacher efficacy in measurement and evaluation practices fell in between the alternatives (i.e., it was neither a 100-point format nor a 5-point Likert scale).

3.5.2.2. Pilot Study

The pilot study was conducted by administering the instrument to 118 elementary and secondary school teachers in Ankara. Twenty-three percent (23%) of these teachers were teaching at the elementary level, while 77% were working at the secondary level. There were 65 female teachers and 53 male teachers. The average age and teaching experience in years were 40 and 16, respectively. Nearly half of the sample (49.2%) was composed of graduates of faculties of education. Approximately 24% of the participating teachers had a science (e.g., physics, biology, and chemistry) or mathematics major, whereas 76% were teaching social sciences (e.g., history, languages such as Turkish or English, or classroom teaching). Among all participants, 12% had taken a course on measurement and evaluation during their university education and 68% had participated in an in-service training on measurement and evaluation.

To examine the factor structure of TEMES, Exploratory Factor Analysis (EFA) was performed with SPSS 15.0. Before the analysis, the researchers checked the assumptions of Exploratory Factor Analysis: metric variables, correlations above .30, Bartlett’s Test of Sphericity, a KMO (Kaiser-Meyer-Olkin) value above .60, multivariate normality, and absence of outliers (Hair, Anderson, Tatham, & Black, 2006). The instrument was a 9-point scale and the responses were regarded as efficacy scores (a metric variable) for each participant. No correlation coefficient was less than .30. Bartlett’s Test was significant, which meant that the correlation matrix differed significantly from an identity matrix, i.e., the correlations between the items were not all zero (Tabachnick & Fidell, 2007). Moreover, the KMO value (.93) exceeded the criterion value of .60. Before examining multivariate normality, univariate normality was checked by inspecting skewness and kurtosis values, the significance of the Kolmogorov-Smirnov and Shapiro-Wilk Tests, and histograms with normal curves. The skewness and kurtosis values were between +3 and -3 (Tabachnick & Fidell, 2007), but the Kolmogorov-Smirnov and Shapiro-Wilk Tests were significant, which indicated that the distribution differed from normality. Because the Kolmogorov-Smirnov and Shapiro-Wilk Tests are conservative, the researchers continued to examine univariate normality by checking histograms with normal curves, and these indicated that univariate normality was not violated.

In addition to univariate normality, multivariate normality was tested by running the norm test macro in SPSS 15.0. This analysis yielded a significant Small’s Test result, indicating a violation of multivariate normality, but this test is a kind of Chi-Square Test and is sensitive to sample size. Cases with Mahalanobis Distance values larger than the critical value (45.51 for α = .05 and df = 24) were checked to detect multivariate outliers. Only three out of 118 cases were extreme cases. Boxplots were also examined to determine whether there were any univariate outliers. No serious outliers were seen in any of the cases. These results showed that it was possible to continue with factor analysis.

Factor analysis indicated that the new instrument had two factors, which were named developing measurement and evaluation tools, and applying and analyzing the results of measurement and evaluation tools. Approximately 62% of the variance in teachers’ efficacy toward measurement and evaluation tools was explained by these two factors. The scree plot, which also suggested two factors, is presented in Figure 3.1. Based on the findings of the pilot study, none of the items were eliminated. To report on the reliability of the two factors, Cronbach’s alpha coefficients were calculated and were .95 and .93, respectively.
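The reliability coefficients reported above are Cronbach’s alpha values. As a minimal sketch of how such a coefficient is obtained, the Python function below applies the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), to a small made-up response matrix. The data and the function name are illustrative assumptions; the study itself computed alpha in SPSS on the real responses.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item-score lists.

    responses: list of rows, one per respondent, each with k item scores.
    """
    k = len(responses[0])
    items = list(zip(*responses))  # transpose: one tuple per item
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 9-point responses of five teachers to three items.
toy_responses = [
    [7, 8, 7],
    [5, 5, 6],
    [9, 9, 8],
    [4, 5, 4],
    [6, 7, 7],
]

alpha = cronbach_alpha(toy_responses)
print(round(alpha, 3))
```

Because the toy items rise and fall together across respondents, the resulting alpha is high; weakly related items would pull it down.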


[Scree plot: Eigenvalue (vertical axis, 0 to 14) plotted against Factor Number (horizontal axis, 1 to 24)]

Figure 3.1 Scree Plot

Items loaded on the related factors with high values, which indicated that exploratory factor analysis was appropriate for the instrument. Some items of the first factor, with their factor loadings, were as follows: “How well can you develop appropriate questions for instructional content?” (-.95) and “How well can you gauge student comprehension of what you have taught?” (-.80). Factor loadings for some of the items of the second factor were: “How well can you prepare individual measurement and evaluation activities (e.g., performance evaluation, project)?” (.81) and “How well can you develop alternative measurement and evaluation tools (e.g., concept maps, constructed grid)?” (.87).

Reliability analysis yielded coefficients of .89 for Alternative-ME and .69 for Traditional-ME. Item-total correlations ranged from .41 to .76 for Alternative-ME and from .34 to .68 for Traditional-ME, indicating that all the items were working as intended.
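An item-total correlation of the kind reported above can be sketched as follows. The source does not say whether the reported values were corrected (the item removed from the total) or uncorrected, so the corrected form used here is an assumption, as are the toy data and function names.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses):
    """Correlate each item with the sum of the remaining items."""
    k = len(responses[0])
    correlations = []
    for j in range(k):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]  # total without item j
        correlations.append(pearson_r(item, rest))
    return correlations


# Hypothetical 5-point frequency ratings of five teachers on three tools.
toy_ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]

item_total = corrected_item_total(toy_ratings)
print([round(r, 2) for r in item_total])
```

Items with low or negative values in such a check would be candidates for revision or removal.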


3.5.3. Scale for Measuring Frequency of Using Different Measurement and Evaluation Tools

The purpose of developing a scale covering all measurement and evaluation tools was to measure how frequently teachers use different tools. Accordingly, the researchers constructed a 5-point Likert scale (ranging from never to always) including 17 measurement and evaluation tools proposed by the Turkish Ministry of National Education (MoNE) in the latest curriculum (Erdoğan, 2007). Tools were classified as traditional or alternative in this scale. To see how the use of alternative and traditional measurement and evaluation methods differed in relation to teachers’ sense of efficacy toward measurement and evaluation practices, the researchers derived two variables from this scale: Traditional-ME (the mean score of the items on traditional assessment methods) and Alternative-ME (the mean score of the items on alternative assessment methods). Teachers were asked to indicate how frequently they used the listed tools by choosing among five options: never, rarely, sometimes, frequently, and always. The ten items measuring Alternative-ME asked about the frequency of using word matching, written reports, interview with students and observation, drama, portfolio, concept map, constructed grid, performance evaluation, self-report, and peer evaluation. Traditional-ME was measured by seven items asking how frequently teachers used open-ended questions, short-answer questions, multiple-choice tests, true/false questions, matching questions, fill-in items, and the question-answer technique.

In the pilot study, EFA was conducted to determine whether items measuring the frequency of using alternative tools could be differentiated from those measuring the frequency of using traditional tools. EFA findings indicated that this scale had two factors, with the expected items loading separately on the alternative and traditional factors. Reliability analysis revealed coefficients of .89 and .69 for the frequency of using alternative and traditional tools, respectively.

3.5.4. Turkish Teachers’ Sense of Efficacy Scale

The instrument (previously called the Ohio State Teacher Efficacy Scale, now known as the Teachers’ Sense of Efficacy Scale), which was developed by Tschannen-Moran and Woolfolk Hoy (2001), included three factors: efficacy for student engagement, efficacy for instructional strategies, and efficacy for classroom management. Tschannen-Moran and Woolfolk Hoy (2001) examined their scale in three studies with different pre-service and in-service teachers (the sample sizes were 224, 217, and 410, respectively). On the basis of the factor loadings, some items were removed, and the researchers decided to continue with a 32-item scale after the first study. In the second study, the number of items decreased to 18 and factor analysis resulted in a 3-factor structure, with 8 items in efficacy for student engagement (ESE), 7 items in efficacy for instructional strategies (EIS), and 3 items in efficacy for classroom management (ECM). Tschannen-Moran and Woolfolk Hoy (2001) designed one more study with 410 participants to refine the Teachers’ Sense of Efficacy Scale. The final reported reliability coefficients for the 3-factor scale were .81 for ESE, .86 for EIS, and .86 for ECM. Each factor has 8 items.

Çapa, Çakıroğlu, and Sarıkaya (2005) adapted the Teachers’ Sense of Efficacy Scale to Turkish by administering the translated version to 628 pre-service teachers in six faculties of education in Turkey. Çapa and her colleagues (2005) found that the adapted version of TSES was also composed of three factors (ESE, EIS, and ECM) with similar reliability estimates ranging from .82 to .86.


3.6. Data Collection Procedure

After the scale was developed, the necessary documents were submitted to the METU Human Subjects Ethics Committee (HSEC). While waiting for the decision of the committee, the researchers made a random list of schools from the complete school list of the Turkish Ministry of National Education. The study was conducted in three cities: Ankara (the districts of Çankaya and Sincan), Samsun (Center district), and İstanbul (the districts of Eyüp, Bakırköy, and Zeytinburnu). After the METU HSEC approved the instrument as applicable and found no ethical problem with the design, the instrument and proposal were submitted to the Educational Research and Development Head Office (ERDHO). Questionnaires were prepared in an optical format to make both data collection and data entry easier and quicker. The researchers visited the listed schools in Ankara, Samsun, and İstanbul, and the teachers filled in the questionnaires. During data collection, the researchers observed the participants to see whether they responded to the instrument independently, and answered participants’ questions to prevent missing data. Data collection lasted 10 to 15 minutes.

3.7. Data Analysis

The following points suggested by Meyers, Gamst, and Guarino (2006) were considered before the data analysis: Is there any missing or incorrect data entry? Is there a pattern in the missing data? Are there any extreme values that may affect the results of the study? Are the assumptions of the intended multivariate statistical techniques met? What can be done if any of these assumptions is violated?


First, data were screened for missing values and incorrect data entry. No incorrect entry was detected, but both the demographic variables and the scale items had some missing values, not exceeding 5 percent. Little’s MCAR Test (Little & Rubin, 1987) indicated that missingness followed a random pattern. Therefore, the researchers decided to impute the missing values by using the Expectation Maximization (EM) algorithm (Tabachnick & Fidell, 2007). Tabachnick and Fidell (2007) reported that this method is commonly used when values are missing at random. Expectation Maximization follows two steps: estimation of the missing values, followed by estimation of the parameters by regression analysis (Hair et al., 2006). In addition, Allison (2002) reported that EM was practical because it used all appropriate variables to impute missing values.

Second, after the missing value analysis was completed, composite scores were computed for each scale, i.e., the Teacher Self-Efficacy toward Measurement and Evaluation Practices Scale (TEMES), the Frequency of Using Different Measurement and Evaluation Tools Scale (FMES), and the Turkish Teachers’ Sense of Efficacy Scale (TTSES). Four mean scores were calculated for the participants: SE-Mean for self-efficacy toward measurement and evaluation practices, Alternative-ME for frequency of using alternative measurement and evaluation tools, Traditional-ME for frequency of using traditional measurement and evaluation tools, and TTSES-Mean for teacher efficacy.

Third, data were collected from teachers who were teaching at elementary and secondary schools in Ankara, Samsun, and İstanbul. Therefore, before further analyses, one-way Analysis of Variance (one-way ANOVA) was conducted to examine whether teachers’ responses differed by city. In this study, the researchers set the level of significance (α) at .05.
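The scale scores described in the second step are simple item means. The Python sketch below illustrates the computation for Alternative-ME and Traditional-ME. The positions of the items within the 17-item FMES are not given in the source, so the index ranges are placeholders, and the response vector is made up.

```python
from statistics import mean

# Assumed item positions within the 17-item FMES (placeholders only).
ALTERNATIVE_ITEMS = range(0, 10)    # 10 alternative tools
TRADITIONAL_ITEMS = range(10, 17)   # 7 traditional tools


def subscale_mean(responses, item_indices):
    """Mean of the selected item scores, i.e., a score out of 5."""
    return mean(responses[i] for i in item_indices)


# Hypothetical ratings of one teacher (1 = never, 5 = always).
teacher = [2, 3, 1, 2, 4, 3, 2, 1, 3, 2,   # alternative tools
           4, 5, 4, 3, 4, 5, 4]            # traditional tools

alternative_me = subscale_mean(teacher, ALTERNATIVE_ITEMS)
traditional_me = subscale_mean(teacher, TRADITIONAL_ITEMS)
print(alternative_me, round(traditional_me, 2))
```

The same pattern, with 24 items on a 9-point scale, would give the SE-Mean and TTSES-Mean scores.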


Fourth, to provide validation evidence for TEMES, Confirmatory Factor Analysis (CFA) was conducted with Analysis of Moment Structures (AMOS) 4.1. CFA takes a deductive approach in that the aim is to test the factorial structure that the theoretical framework supports (Meyers et al., 2006). Bollen and Long (1993) summarized the steps of CFA as model specification, model identification, model estimation, model evaluation, and model respecification. In the first step, model specification, researchers develop a model based on theory, and then check whether the model can be identified in the model identification step. Model identification compares the number of known values (the unique variances and covariances of the observed variables) with the number of parameters estimated by the model. The difference between these two is the degrees of freedom (df); this value should be positive, which indicates that the model is over-identified and its fit can be tested. In the third step, model estimation, the specified (theoretical) model is compared with what the data represent (the observed model) by the statistical program, AMOS 4.1 in this study. Model evaluation then involves deciding whether the model fits the data by evaluating what the analysis yields, i.e., fit indices (e.g., NNFI, CFI, and RMSEA), chi-square goodness-of-fit test results, and unstandardized and standardized parameter estimates. According to these values, researchers can change or maintain the estimated model. Adding or deleting connections in the model is called model respecification.

Next, Cronbach’s alpha coefficient was computed to check the internal consistency of TEMES, TTSES, Alternative-ME, and Traditional-ME. Estimated scale reliabilities if any item were deleted were also examined to check whether there was any problem with the items.
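The counting behind the model identification step can be sketched in a few lines of Python. The sketch assumes a simple-structure CFA in which each item loads on one factor, one loading per factor is fixed to 1, and all factors are allowed to correlate. These are illustrative assumptions; the exact specification used in AMOS is not described in this section, so the resulting numbers should not be read as the degrees of freedom of the reported models.

```python
def unique_moments(n_items):
    """Distinct variances and covariances among n_items observed variables."""
    return n_items * (n_items + 1) // 2


def cfa_free_parameters(n_items, n_factors, n_error_covariances=0):
    """Free parameters of a simple-structure CFA with correlated factors.

    Assumes one loading per factor is fixed to 1 to set the factor's scale.
    """
    loadings = n_items - n_factors
    error_variances = n_items
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2
    return (loadings + error_variances + factor_variances
            + factor_covariances + n_error_covariances)


def degrees_of_freedom(n_items, n_factors, n_error_covariances=0):
    """Known values minus estimated parameters."""
    return unique_moments(n_items) - cfa_free_parameters(
        n_items, n_factors, n_error_covariances)


# Illustration: 24 items and 5 correlated factors, under the stated assumptions.
print(unique_moments(24), degrees_of_freedom(24, 5))
```

Each parameter freed during respecification, such as an error covariance, uses up one degree of freedom, which the third argument of `degrees_of_freedom` makes explicit.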
To examine whether TEMES is an appropriate instrument to measure teacher self-efficacy toward measurement and evaluation practices, Canonical Correlation Analysis was conducted to examine the relationship between the factor scores of TTSES (Turkish Teachers’ Sense of Efficacy Scale) and the factor scores of TEMES (Teacher Self-Efficacy toward Measurement and Evaluation Practices Scale). In canonical correlation analysis, correlations between variables within and between the two sets are examined to understand the relationships within and between the sets. In each set, variables load on a related canonical variate, and canonical correlations above .30 are of interest to the researcher. Then, to examine the effect of gender and teaching level on the factors of TEMES, Multivariate Analysis of Variance (MANOVA) was run. These analyses were performed using SPSS 15.0.

Finally, to answer the research questions, Structural Equation Modeling (SEM) was conducted with AMOS 4.1. The structural model was specified according to the theoretical framework derived from the related literature on teachers’ sense of efficacy. The corresponding variables were year in teaching and frequency of using alternative and traditional measurement and evaluation tools in this study. The model is represented in Figure 3.2. SEM is advantageous in terms of assessing and controlling measurement errors (Meyers et al., 2006). The analysis involves two main models, the structural model and the measurement model. While the measurement model specifies the relationships between the latent (unobserved) and manifest (observed) variables, the structural model identifies the relationships among the latent variables (Byrne, 2001). SEM uses the maximum likelihood method, which estimates the parameter values that make the observed data most likely under the theoretical model. In SEM analysis, the theoretical model is compared with the model represented by the observed data. This comparison is carried out by examining the fit indices, the chi-square test, and the correlational estimates to conclude whether the theoretical model fits the collected data (Meyers et al., 2006).

In this study, the researchers checked the chi-square statistic (Hoyle, 1995) and the root mean square error of approximation (RMSEA; Steiger & Lind, 1980), known as absolute fit indices, in addition to the comparative fit index (CFI; Bentler, 1990) and the non-normed fit index (NNFI, also known as the Tucker-Lewis Index; Bentler & Bonett, 1980), which are categorized as incremental fit indices (Hair et al., 2006). For both the absolute and the incremental fit indices, there are criteria for evaluating model fit. If the chi-square statistic is significant, the specified model differs from the observed data; that is, the model does not fit the data. However, the chi-square measure depends on sample size. Therefore, it is better to check other fit indices to understand the model fit (Hair et al., 2006). Browne and Cudeck (1993) reported that close fit is indicated by RMSEA values lower than .05, reasonable fit by values between .05 and .08, and poor fit by values over .10. Later, the criteria were refined: values between .08 and .10 are evidence of mediocre fit, and values higher than .10 indicate poor fit (MacCallum, Browne, & Sugawara, 1996). In addition to these criteria for absolute fit indices, CFI and NNFI generally range between 0 and 1 (Hair et al., 2006) and should be greater than .95 to indicate a good-fitting model (Hu & Bentler, 1999).
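The RMSEA cut-offs cited above (Browne & Cudeck, 1993; MacCallum, Browne, & Sugawara, 1996) can be expressed as a small Python helper, shown below together with a commonly used point-estimate formula for RMSEA. The formula variant (dividing by N - 1), the verbal labels, the handling of values that fall exactly on a boundary, and the example numbers are assumptions of this sketch, not values or procedures taken from the study.

```python
from math import sqrt


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA from chi-square, degrees of freedom, and N.

    Uses sqrt(max(chi2 - df, 0) / (df * (N - 1))); some programs use N
    instead of N - 1.
    """
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def describe_rmsea(value):
    """Label an RMSEA value; boundary handling is an assumption."""
    if value < 0.05:
        return "close fit"
    if value < 0.08:
        return "reasonable fit"
    if value <= 0.10:
        return "mediocre fit"
    return "poor fit"


# Hypothetical model: chi-square = 300 with df = 100 and N = 401.
example = rmsea(300, 100, 401)
print(round(example, 3), describe_rmsea(example))
```

Because the chi-square term is divided by the sample size, RMSEA is less affected by large samples than the chi-square test itself, which is one reason it is reported alongside it.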


Figure 3.2 Structural Model Displaying the Relationship between Variables

3.8. Limitations

The following limitations are associated with this study:

1. Correlational research was used in this study; therefore, no causal inference can be drawn about the relationships between the research variables.

2. The present study is limited to the relationship between year in teaching, frequency of using different measurement and evaluation tools, and teacher self-efficacy toward measurement and evaluation tools. There may be other variables related to teacher self-efficacy toward measurement and evaluation practices.

3. The present study relies on self-report data. Resources such as observation reports, interview reports, or peer evaluation were not used, because of the quantitative nature of the study.

4. The present study is limited to teachers who have the characteristics defined in section 3.4. Data were collected from teachers who work in public primary and high schools in the Çankaya and Sincan districts of Ankara, the city center of Samsun, and the Eyüp, Zeytinburnu, and Bakırköy districts of İstanbul. Because convenience sampling was used, the results do not represent all teachers in Turkey.


CHAPTER IV

RESULTS

In this chapter, results of data analysis are presented under the following headings: descriptive statistics of scale scores (for the TEMES, FMES, and TTSES scales), examination of related assumptions for further analyses, results of the one-way ANOVAs, separate Confirmatory Factor Analyses for TEMES and TTSES, reliability coefficients, additional validity evidence including results of MANOVA and Canonical Correlation Analysis, and results of Structural Equation Modeling.

The purpose of this study was to explore the relationship among teacher self-efficacy beliefs toward measurement and evaluation practices, teachers’ sense of efficacy, year in teaching, and frequency of using different measurement and evaluation tools. Before further analyses, the researchers examined whether item responses differed significantly by city, treating city as an independent variable in a one-way ANOVA for each item of the three scales. One-way ANOVAs allowed the researchers to evaluate mean differences among the data from the three cities. To make sure that the data were appropriate for running separate one-way ANOVAs, the researchers checked the corresponding assumptions: independence of observations, normality, and equality of population variances (i.e., homogeneity of variance) (Gravetter & Wallnau, 2007). The researchers were present where the data were collected to prevent participants’ responses from affecting each other; therefore, the assumption of independent observations was considered met. To check the normality assumption, skewness and kurtosis values for each item of the three scales and histograms with normal curves were examined. The researchers concluded that there was no problem with the normality assumption, because only two items had kurtosis values outside the range of -3 to +3 (Tabachnick & Fidell, 2007) and the normal curves indicated no skewed distributions. Moreover, Levene’s Test yielded nonsignificant values, indicating no difference between error variances across the data from the different cities.

After the preliminary analysis, a one-way ANOVA was run separately for each item, and results indicated that only three of the 65 items differed significantly, with small effect sizes (ranging from .02 to .03). Therefore, the data from the three cities were pooled, and a total of 394 cases were analyzed in this study.

Means, standard deviations, and minimum and maximum values for the study scales (TEMES, TTSES, Alternative-ME, and Traditional-ME) were computed and are displayed in Table 4.1.

Table 4.1 Results of Descriptive Statistics for TEMES, TTSES, and FMES

Variables         M      SD    Min   Max
TEMES             6.83   .98   1     9
TTSES             6.96   .82   1     9
Alternative-ME    2.85   .84   1     5
Traditional-ME    3.48   .69   1     5
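The item-level one-way ANOVAs by city described before Table 4.1 can be illustrated with a small Python sketch that returns the F ratio together with eta squared as the effect size. The source does not name the effect size measure it reports, so the use of eta squared is an assumption, and the scores below are made up.

```python
def one_way_anova(groups):
    """Return (F, eta_squared) for a list of groups of scores."""
    all_scores = [x for group in groups for x in group]
    n_total = len(all_scores)
    grand_mean = sum(all_scores) / n_total

    # Between-groups and within-groups sums of squares.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    ss_within = sum(
        (x - sum(group) / len(group)) ** 2
        for group in groups
        for x in group
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_value, eta_squared


# Hypothetical 9-point responses to one item in three cities.
city_a = [5, 6, 7]
city_b = [6, 7, 8]
city_c = [7, 8, 9]

f_value, eta_squared = one_way_anova([city_a, city_b, city_c])
print(f_value, eta_squared)
```

In the toy data the cities differ by design, so eta squared is large; values such as the .02 to .03 reported in the text mean that city accounts for only a small share of the variance in an item.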


Descriptive statistics indicated that the mean score of teachers’ sense of efficacy (MTTSES = 6.96), which was assessed by the factors of efficacy in student engagement, instructional strategies, and classroom management, and the mean score of teacher self-efficacy toward measurement and evaluation practices (MTEMES = 6.83) were close to each other. TEMES (SD = .98) and TTSES (SD = .82) scores had approximately the same variation. Moreover, traditional (M = 3.48) and alternative (M = 2.85) measurement and evaluation tools were used with different frequency by the teachers who work in public elementary and secondary schools. Traditional measurement and evaluation tools were reported to be used more frequently than alternative ones. The variation of Alternative-ME scores (SD = .84) was slightly higher than the variation of Traditional-ME scores (SD = .69).

4.1. Confirmatory Factor Analysis

The researchers proposed a five-factor structure for TEMES based on the levels of measurement and evaluation practices. These factors were determining assessment goals and techniques, developing measurement and evaluation tools, administering measurement and evaluation tools and evaluating the results, analyzing the results, and using and sharing results in other courses. CFA resulted in a significant chi-square value (= 221.42), a CFI value of .99, and an NNFI value of .97, but the RMSEA value (= .095) was close to .10, which approached the threshold for poor fit (MacCallum, Browne, & Sugawara, 1996). Therefore, the researchers checked the modification indices for the error covariances and detected the ones with high values, i.e., the most striking values among all (Arbuckle, 1999). The pairs with high error covariances were ε6-ε22, ε8-ε13, ε9-ε10, ε9-ε15, ε15-ε16, and ε20-ε21. The items related to these errors were examined in terms of belonging to the same factor or measuring related tasks in measurement and evaluation practices. The following item pairs loaded on the same factors: items 8 and 13 loaded on the second factor; items 9 and 15, and items 15 and 16, loaded on the fourth factor; and items 20 and 21 loaded on the fifth factor. Although two of the item pairs, 9-10 and 6-22, did not load on the same factors, these items measured similar or consecutive tasks in measurement and evaluation practices. For example, both item 6 and item 22 asked about determining and developing alternative measurement and evaluation tools. Accordingly, the related error pairs were connected in the model and the analysis was run again. After this change, the RMSEA value decreased to .08, which indicated mediocre fit (MacCallum, Browne, & Sugawara, 1996). In addition, the resulting NNFI (.98) and CFI (.98) values supported a good-fitting model because they were higher than .95 (Hu & Bentler, 1999). Moreover, chi-square statistics resulted in a significant value of 870.60 (p
