http://textproject.org/assets/tds/text-complexity-and-the-ccss/module-5/Module%25205-Qualitative%2520Measures-Instructor.pdf

CCR tools, standards, perspectives, indicators, solutions, concerns
Book Study Monday: Teaching Numeracy, Critical Habits 1 & 2

When we monitor our comprehension in literacy, we pay attention to whether or not we are understanding what we are reading. The same must happen in a mathematics classroom.

Teacher Evaluation: What's Fair? What's Effective?: Observing Classroom Practice

November 2012 | Volume 70 | Number 3
Teacher Evaluation: What's Fair? What's Effective? Pages 32-37

Observing Classroom Practice

Charlotte Danielson

Classroom observations can foster teacher learning—if observation systems include crucial components and observers know what to look for.

Jennifer Lopez looks up quickly as Ms. Anderson, the principal, steps into her 5th grade classroom. She glances around nervously. What might this look like to Ms. Anderson?

At the beginning of the lesson—an introduction to the topics of buoyancy and density—the students pushed their desks together to make tables. On each table is a dishpan full of water. (Jennifer always hopes for the best on days like this; she's notorious with members of the custodial staff for various "adventures" in her classroom. But today all is well.) The students each have a lump of clay, and they've weighed their lumps on a pan balance to satisfy themselves that they all have roughly the same amount, or mass, of clay.

The students put their clay in the water and watch it sink. They're challenged to make it float, which they discover they can do if they fashion their clay into the shape of a boat. They find this exciting, and they immediately tackle the next challenge: Can they make a good boat—good meaning one that will hold a lot of cargo in the form of paper clips? They explore various questions: Should the boat have thin or thick sides? (Thin sides; it's possible to enclose more volume with the same amount of material.) Should they shape it like a bowl or like a canoe? (Like a bowl, for the same reason; canoes are for rapids.)

The students have become quite proficient. They're constructing boats with paper-thin walls and even tops so the water won't rush in, and they're sketching their designs on the board, showing the number of paper clips each one will hold. Jennifer is impressed and expresses her admiration. The boats hold 14, then 27, then 36, and finally more than 50 paper clips!

But here is Ms. Anderson in the room to do an (unannounced) observation. What must she be thinking? Now she comes over to Jennifer, motioning her aside and saying in a whisper, "I'll come back when you're teaching."

Instant Replay: The Principal's Point of View

Ms. Anderson pauses at Jennifer Lopez's door before stepping in. The classroom looks a little chaotic. What are these students doing? They look busy, for sure, and they seem to be having a good time. But Jennifer—what is she doing? The district is supposed to be getting started with the Common Core State Standards. Is this what they're supposed to look like? But it seems as though the teacher's not doing anything. Ms. Anderson decides to come back later, when Jennifer is teaching. And she hopes that when she does, she can be sure she is evaluating Jennifer's teaching correctly.

The Crux of the Problem

This episode gets to the heart of the issue facing both teachers and supervisors in this new era of high-stakes teacher evaluation. After all, for a system of teacher evaluation to be defensible (either professionally or legally) it must be fair—that is, the judgments that are made about a teacher's practice must accurately reflect the teacher's true level of performance. And because the quintessential skill of teaching is teaching, and it can be observed, we should conduct those observations with integrity and skill.

Identifying good practice through observation is less feasible with other job roles in education. For example, if you're trying to assess the skills of a principal, school nurse, or mentor, there's not one single place—such as a classroom—you could go to observe the essential skills embodied in that role; they're spread out over many locations. Principals interact with many different individuals—teachers, students, parents, and community members—and they engage in many different types of activities—conducting meetings, organizing the schedule, planning a budget, and so on—with such variety that no single item can be a stand-in for the entire job. In contrast, the work of teachers is easier to characterize as that which happens in their classrooms with students.

It's true that teaching is supported by a lot of behind-the-scenes work, but nevertheless, we can observe the interactive work with students, and this is the heart of teaching. Therefore, classroom observation is a crucial aspect of any system of teacher evaluation. No matter how skilled a teacher is in other aspects of teaching—such as careful planning, working well with colleagues, and communicating with parents—if classroom practice is deficient, that individual cannot be considered a good teacher.

Clear Standards of Practice

Precisely what the observer (supervisor, mentor, or coach) looks for in an observation is a function of the instructional framework that the school district or state has adopted. Unless there is a clear and accepted definition of good teaching, teachers won't know how their performance will be evaluated, and observers won't know what to look for.

For example, in the Danielson Framework for Teaching (see endnote 1), two of the four domains of teaching (the classroom environment and instruction) are observable in a teacher's classroom practice. Each of those two domains contains five smaller components, which show observers exactly what to look for when they step into a classroom, such as whether the teacher has established an environment of respect and rapport, managed classroom procedures, used various questioning and discussion techniques, or engaged students in learning.

Research-Based and Validated

These teaching practices are grounded in a solid research base. Empirical studies have shown that each component of the Framework for Teaching is associated with improved student learning. It's also validated, as any instrument used for high-stakes teacher evaluation should be. That is, high levels of teacher performance on the instructional framework as a whole should predict high levels of student learning.

This imperative imposes significant demands on the developers of evaluation instruments because such research should be conducted by independent, disinterested parties using respected psychometric techniques. For example, the Danielson framework has been subjected to a number of such studies, including those conducted by the Measures of Effective Teaching (MET) Project and the Consortium on Chicago School Research (see endnote 2).

Highly Evolved

Any evaluation system used for high-stakes personnel decisions should be highly evolved. For example, does it clarify what will serve as evidence for each item in the instructional framework, such as observations, planning documents, or conferences? Are the words in the rubric clear enough to enable both teachers and supervisors to differentiate one level of proficiency from the next? The language must be sufficiently precise to enable observers to link specific teachers' or students' words or actions to specific elements or components of the instructional framework.

In defining good teaching, educators must also take into account major developments in state and national policy, such as the Common Core State Standards, which 45 states and the District of Columbia have formally adopted. The standards relate primarily to what students will learn and consequently have their greatest impact on issues of curriculum and student assessment. However, because the standards emphasize reasoning and problem-solving skills as well as developing deep conceptual understanding, they have implications for instruction. The methods a teacher uses to help students learn the techniques of argumentation, for example, are different from the methods he or she uses to teach low-level knowledge and skills by rote. Learning facts (such as Spanish vocabulary words or the multiplication tables) demands instruction that focuses on memorizing and using mnemonic devices. But teaching students to formulate and test hypotheses and to take and defend a position requires a broad repertoire of teaching strategies.

A definition of teaching that's responsive to evolving conditions in the field will impose different challenges for observers of practice. It's a dynamic environment in which the aspects of teaching deemed important to student learning and to a particular type of student learning—namely, high-level skills—evolve over time.

Having Clear Levels of Performance

Levels of performance describe how a teacher's practice progresses from inexperienced and inexpert to experienced and expert. With respect to the standards of practice, it's not that teachers either do them or don't do them—it's that they do them well or poorly. The levels of performance describe that continuum.

Because the levels of performance describe a teacher's skill in the various aspects of teaching, it's essential that observers be able to distinguish one level from the next. This, in turn, makes it more likely that any two trained observers will agree with each other. This is first a matter of clarity of language; the language used in the different levels should permit focused training for observers so their levels of agreement and accuracy are high.

Thus, in the Danielson Framework for Teaching, a statement at the proficient level in Component 3c (engaging students in learning) states that "the learning tasks and activities are designed to challenge student thinking, inviting students to make their thinking visible." This is more advanced than what the language describes at the basic level: "The learning tasks and activities require only minimal thinking by students and little opportunity for them to explain their thinking." These differences are clear and may be illustrated by specific examples during observer training.

There are other challenges concerning clarity of language. Some rubrics use the language of frequency; teachers do a certain thing "never," "occasionally," "frequently," or "always." This language suggests that an evaluator can observe the same teacher multiple times; it's not suitable for a single observation of teaching. For rubrics to apply to individual lessons, the language in the different levels of performance must be qualitatively, not quantitatively, different. For instance, in the example cited, learning tasks at the proficient level are "designed to challenge student thinking" whereas those at the basic level "require only minimal thinking by students." These are qualitative differences.

Finally, the rubrics must be robust enough to withstand the demands placed on the system as a whole; in particular, it must be possible to train observers to make accurate judgments regarding what they see and hear. For example, as a consequence of participating in the MET study, we found we had to tighten the language in the rubrics of the Framework for Teaching to attain high enough rates of inter-rater agreement and accuracy. It was not sufficient to say that a teacher demonstrated a "deep" understanding of the content; rather, revised language specifies that a teacher must be able to articulate connections between the topic being taught and other topics within and outside the discipline. Further, we discovered that providing teacher examples of the levels of performance facilitated observer training.

The Skills Observers Need

Observers need to acquire a number of skills to conduct fair and reliable observations of teaching. They need training, and possibly an assessment of their skills, to ensure they can conduct these observations with fidelity. Several states now require that evaluators be certified as observers before being permitted to evaluate teachers for high-stakes personnel decisions. This requirement makes good sense. After all, you can't obtain a driver's license without passing a test. Why should a supervisor be able to make high-stakes personnel decisions without demonstrating the skill to do so accurately?

So what are those necessary skills?

Collecting Evidence

When observing in a classroom, evaluators must note what they see and hear there. It's important that what they write down actually is evidence—and not opinion, interpretation, or bias. This is not a simple matter; it's challenging to record "just the facts, ma'am."

There are three types of evidence: words spoken by the teacher or students, such as, "Can anyone think of another idea?"; actions, such as, "The students took 45 seconds to line up by the door"; and the appearance of the classroom, such as, "Backpacks are strewn in the middle of the floor."

But it's difficult to record only evidence. Virtually all educators find they include some interpretation or opinion in their notes. For example, an observer might note that "the students are engaged" during the science lesson on buoyancy and density, but that's not, strictly speaking, evidence. It's not what the students or teacher said or did. Instead, it's an interpretation of what the observer heard and saw.

What the observer actually saw was students fashioning their clay into different shapes, leaning forward in their discussions with one another, and drawing sketches of their designs on the board. Those items would be the evidence, which the observer (probably correctly) has interpreted as student engagement. This distinction is important because when observers disagree about a teacher's level of performance, it's essential to know whether the differences stem from a difference in the evidence collected or in how the observer has interpreted that evidence.

Interpreting Evidence Against Levels of Performance

The evidence an observer collects in the classroom is not in itself good or bad. What leads to a judgment about the quality of teaching is interpreting that evidence against the rubric, or the levels of performance. The question for the observer is not what happened (that's the evidence), but what does it mean? That is, which collection of words in the rubric best summarizes or characterizes what the observer observed?

This question is at the heart of observer training. It's essential that different individuals, using the same framework, can agree on the level of quality of what they observe—that is, that they select the same level of performance for what they observed for the same reason. If the students, on their own initiative, pushed their desks together to make tables and gathered the materials they needed (the clay, the tubs of water), these items would be evidence of high levels of performance on Component 2e in the Danielson Framework (organizing physical space: the arrangement of furniture and use of physical resources) and Component 2c (managing classroom procedures: management of materials and supplies). Because the students did these things on their own, their actions would provide evidence of distinguished practice on the part of the teacher because the teacher would have established these routines and taught the students to follow them.

Of course, for low-inference items, it's easy to get high levels of inter-rater agreement. Observers can probably agree on whether the class started on time. But for anything more significant, such as whether the teacher used questioning and discussion to deepen understanding, there's likely to be less consensus among observers, even after some degree of training.
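
One way to make "inter-rater agreement" concrete is to compute it. The sketch below is illustrative only: the article does not say which statistic observer training programs use, so exact agreement and Cohen's kappa are assumed here as two common choices. The ratings and the names `observer_a`, `observer_b`, `exact_agreement`, and `cohens_kappa` are hypothetical.

```python
from collections import Counter

# The four levels of performance named in the Framework for Teaching.
LEVELS = ["unsatisfactory", "basic", "proficient", "distinguished"]

# Invented ratings: two observers score the same ten lessons on one component.
observer_a = ["basic", "proficient", "proficient", "distinguished", "basic",
              "proficient", "unsatisfactory", "proficient", "basic", "distinguished"]
observer_b = ["basic", "proficient", "basic", "distinguished", "basic",
              "proficient", "basic", "proficient", "proficient", "distinguished"]


def exact_agreement(a, b):
    """Share of lessons on which both observers chose the same level."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and equal in length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(a)
    observed = exact_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability both observers pick the same level
    # if each simply followed their own overall rating frequencies.
    expected = sum(count_a[level] * count_b[level] for level in LEVELS) / (n * n)
    if expected == 1:
        return 1.0  # both observers used a single identical level throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    print("exact agreement:", exact_agreement(observer_a, observer_b))
    print("Cohen's kappa:  ", round(cohens_kappa(observer_a, observer_b), 3))
```

Kappa comes out lower than raw agreement because some matches would occur by chance. That is one reason a training program might track both numbers rather than raw agreement alone.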

Conducting Professional Conversations with Teachers

Many supervisors, even when adequately trained to conduct classroom observations, confess to not knowing what to do next. "What now?" they say. "How do I have a conversation with a teacher that will result in learning and improved practice?"

Clearly, there's a role for feedback, as in "I noticed you directed two-thirds of your questions toward the right-hand side of the room. Were you aware of that?" But the overwhelming focus of a conversation following a lesson should be dialogue, with a sharing of views and perspectives. After all, teachers make hundreds of decisions every day. If we accept that teaching is, among other things, cognitive work, then the conversations between teachers and observers must be about the cognition.

Rather than being an opportunity for a supervisor to simply tell a teacher what he or she thought about the lesson ("I really liked the way you did X"), the conversations following an observation are the best opportunity to engage teachers in thinking through how they could strengthen their practice. Therefore, a comprehensive approach to observer training should include attention to the interactive skills of professional conversation, inviting teachers to reflect on their practice and strengthen it in ways described by the instructional framework they use.

The Teacher as Learner

Many teachers have been victims of an observation, supervision, and evaluation process in which the observation was something done to, rather than with, them. This is a shame and represents an enormous missed opportunity.

Although few teachers typically require remediation, the vast majority of teachers can strengthen their performance. In fact, because teaching is so demanding and complex, all teaching can be improved; no matter how brilliant a lesson is, it can always be even better. And unless we use the observation process for that purpose, it's fair to inquire why educators even engage in it. Compliance with state law may be an important legal reason, reflecting the acknowledged need to identify the few truly underperforming teachers. But if we don't use the observation process to strengthen practice overall, the system can't be called educative.

So how do schools make an observation process as educational as possible for teachers?

In answering this question, it's important to recognize that professional learning is learning—and that learning requires the learner to be an active participant in the process. With this in mind, it's instructive to review the typical observation scenario. Here, the observer goes to the classroom, takes notes on the events of the lesson, goes back to his or her office, writes up the notes, and then returns to the classroom and tells the teacher about the lesson. Sometimes the supervisor doesn't even talk with the teacher but simply leaves the observation report in the teacher's mailbox.

In this scenario, the teacher is doing nothing—except teaching the class, which he or she is under contract to do. In the observation process, the teacher is completely passive. So it's hardly surprising that teachers rarely learn much from the process.

Changing the Script

How could we strengthen the process so the teacher plays an active role? Let's go back to the classroom observation described at the beginning of this article and see how it looks when it's based in a clear framework and its goal is to strengthen the teacher's practice.

Ms. Anderson is taking detailed notes about what she sees the students doing in the classroom: They're creating different shapes for their boats, respectfully challenging one another to try different designs, adding paper clips until the boats sink, and drawing their designs on the board. She's also noting what the teacher is doing: Jennifer is circulating among the students, challenging them to consider other alternatives. Because they both understand the instructional framework, they know that the students' boat designs are evidence of Component 3c (engaging students in learning) and that Jennifer's circulating among the students offering insights and feedback is evidence of Component 3d (using assessment in instruction).

Afterward, Ms. Anderson gives a copy of her notes to Jennifer. Jennifer looks them over and points out that after making a new design for a boat, the students were also predicting how many paper clips it would hold before sinking. Ms. Anderson adds this piece of information to her notes.

Each of them then takes her notes and aligns each piece of evidence to components in Domains 2 and 3 of the Framework for Teaching, determining which level of performance—unsatisfactory, basic, proficient, or distinguished—they think the evidence reflects, linking the evidence for each component to the language of the rubric and the crucial attributes. They do this work independently, in preparation for their conversation, highlighting the words in the rubric they think best characterize the evidence.

The next day, the two meet for their post-observation (or reflection) conference, in which they compare their highlights and discuss the rationale for having selected the particular levels of performance. For example, they agree that because virtually all the students were intellectually engaged in the activity, Jennifer's performance for Component 3c is at the distinguished level.

The framework document represents a third point between the teacher and the observer. That is, the observer is not merely reporting to the teacher what he or she thought about the lesson but is also relating specific evidence from the lesson to specific words and phrases in the levels of performance.

Further, the observer must be sufficiently open-minded to adjust his or her interpretation of the evidence if the teacher makes a convincing case for an alternative view. After all, the observer can't be there every day, doesn't know what happened the day before, and doesn't know how a certain student usually behaves. For example, Jennifer knows that the students' respectful feedback to one another represents a big step for them in Component 2a (creating an environment of respect and rapport), and she points this out to Ms. Anderson.

What's Possible

Virtually every state requires observations of teaching as a significant contributor to high-stakes judgments about teacher quality. To be defensible, the systems that yield these observations must have clear standards of practice, instruments and procedures through which teachers can demonstrate their skill, and trained and certified observers who can make accurate and consistent judgments based on evidence.

In addition, it's possible to design approaches to classroom observation that yield important learning for teachers by incorporating practices associated with professional learning—namely, self-assessment, reflection on practice, and professional conversation. When these practices are put into place, classroom observation can make a dramatic contribution to the culture of a school.

Endnotes

1 For a complete discussion of the framework, see my book Enhancing Professional Practice: A Framework for Teaching (ASCD, 2007).

2 These studies include Rethinking Teacher Evaluation in Chicago: Lessons Learned from Classroom Observations, Principal-Teacher Conferences, and District Implementation (Consortium on Chicago School Research at the University of Chicago Urban Education Institute, November 2011) and Gathering Feedback for Teaching: Combining High-Quality Observations with Student Surveys and Achievement Gains (Bill and Melinda Gates Foundation, 2012).

Myth-Busting Differentiated Instruction: 3 Myths and 3 Truths

With delivery of instruction, one size does not fit all. John McCarthy launches his differentiated instruction series by busting three common myths about DI.

CCSSO Announces a New Webinar Series for Next Generation Learning Challenges this Fall

CCSSO invites you to participate in a "Back to Campus" series of webinars focusing on the underlying challenges in college readiness and completion and the role of technology as an enabler of change. The webinar series will provide background discussions on topics relevant to the upcoming Next Generation Learning Challenges grant program, so interested parties may get different perspectives framing these important issues in learning technologies that span across K-20.

Tips for supporting instructional coaches

Have you ever been coached? Coaches can impact form, structure, content and meaning, regardless of the field in which they coach.

The ACT® | Innovation and Continuous Improvement | ACT

ACT Test to Improve Readiness and Help Students Plan for Success

June 5, 2014

New Elements Include STEM Score, Career Readiness Indicator, and an Enhanced Writing Test

IOWA CITY, IOWA—Starting in 2015, students who take the ACT®, the nation’s leading college readiness assessment, will receive new scores and indicators designed to improve readiness and help students plan for the future in areas important to success after high school, such as STEM (science, technology, engineering and mathematics) and career readiness. These new indicators are among several innovations that ACT will introduce, including an enhanced writing test.

The new indicators will be reported to students in addition to the traditional ACT scores and ACT College Readiness Benchmarks. The indicators will describe student performance and predicted readiness levels in categories such as STEM, career readiness, English language arts and text complexity, giving students a greater and more specific understanding of both their preparation for success after high school and how to better meet their goals. This will, in turn, better inform teaching, planning and decision making.

In addition, ACT plans to enhance the scoring and approach of the optional ACT Writing Test, offering more insights to help students become college and career ready. Students’ essays will be evaluated on four domains of writing competency: ideas and analysis, development and support, organization, and language use. The test will measure students’ ability to evaluate multiple perspectives on a complex issue and generate their own analysis based on reasoning, knowledge and experience. This will allow students to more fully demonstrate their analytical writing ability.

“We are constantly seeking ways to bring new and innovative features to our customers,” said Jon Erickson, ACT president of education and career solutions. “Our goal is to continuously improve. These research- and evidence-based enhancements are designed to keep our products relevant and helpful. They will be introduced gradually and thoughtfully, so our customers don’t experience radical changes.”

ACT has previously announced other new developments the organization plans to deliver over the next few years, including a computer-based version of the ACT test and optional constructed-response questions. The digital ACT and constructed-response tests will be offered as options to select schools that participate in state and district testing starting in 2015. The computer-based ACT was successfully piloted in April with approximately 4,000 high school students across the United States.

The familiar 1-to-36 scores on the ACT will not change and will still be reported. The new readiness indicators will supplement those scores, giving students, parents and educators more detailed insights so that they may better plan for future success.

STEM Score: This score will represent the student’s overall performance on the science and math portions of the exam. The ACT is the only national college admission exam to measure science skills. This new score can help students connect their strengths to career and study paths that they might not otherwise have considered, particularly when used with their results from the ACT Interest Inventory.

Progress Toward Career Readiness Indicator: This measure will help students understand their progress toward career readiness and help educators prepare their students for success in a variety of career pathways. It will provide an indicator of future performance on the ACT National Career Readiness Certificate™ (ACT NCRC®), an assessment-based credential that certifies foundational work skills important for job success across industries and occupations.

English Language Arts Score: This score will combine achievement on the English, reading and writing portions of the ACT for those who take all three test sections, enabling students to see how their performance compares with others who have been identified as college-ready.

Text Complexity Progress Indicator: This measure will tell students if they are making sufficient progress toward understanding the complex texts they will encounter in college and during their careers. The information will help students plan future study to improve their readiness.

ACT will also add new reporting categories in 2016. The new categories will align with the Common Core State Standards domains and conceptual categories.

“The ACT will continue to be the tried-and-true achievement exam that students, colleges and states have trusted for more than 50 years,” said Wayne Camara, ACT senior vice president of research. “We are simply expanding the information that we provide to give students a better, clearer map of the road to success. We are focused on helping the individual student. Our ongoing goal is to offer a wider range of relevant, personalized insights to each test taker.”

Christopher Emdin: Teach teachers how to create magic - YouTube

What do rap shows, barbershop banter and Sunday services have in common? As Christopher Emdin says, they all hold the secret magic to enthrall and teach at t...

Shanahan on Literacy: The New Bane of Beginning Reading Instruction: Phony Rigor

Tuesday, June 24, 2014

Wisdom from Tim Shanahan....

Shanahan on Literacy: Teaching My Daughters to Read -- Part II, Print Awareness

Open Access News from the RSP team

Open Access News items

To Inspire Learning, Architects Reimagine Learning Spaces

As schools refocus on team-based, interdisciplinary learning, they're moving away from standardized, teach-to-test programs that assume a one-size-fits-all approach to teaching. Instead, there is a growing awareness that students learn in a variety of ways, and the differences should be supported. With that in mind, here's how one architecture firm is redesigning learning spaces.

Preschoolers Outsmart College Students In Figuring Out Gadgets

When researchers asked young children to figure out an experiment using cause and effect, they did a much better job than young adults. That may be because their thinking is more flexible and fluid.

The 10 Most Important Work Skills in 2020 - Infographic

Source: Top10OnlineColleges.org

The 6 Drivers of Change
○ All of the 10 skil(...)

DPG plc's curator insight:

An interesting look at the skills that are going to be needed in 2020 or are these skills needed right now? Where are the current gaps?

José Antônio Carlos - O Professor Pepe's curator insight:

Much of what will be needed to work in the coming years is not even noticed, much less taught. This excellent infographic shows some of these emerging skills. Check it out and get ahead.

Maureen Greenbaum's curator insight:

Based on a 19-page report by the Institute for the Future (IFTF), an independent, nonprofit strategic research group with more than 40 years of forecasting experience, for the University of Phoenix Research Institute.

Implications for educational institutions: Institutions at the primary, secondary, and post-secondary levels are largely the products of the technology infrastructure and social circumstances of the past.

The landscape has changed and educational institutions should consider how to adapt quickly in response. Some directions of change might include:

»Placing additional emphasis on developing skills such as critical thinking, insight, and analysis capabilities

»Integrating new-media literacy into education programs

»Including experiential learning that gives prominence to soft skills—such as the ability to collaborate, work in groups, read social cues, and respond adaptively

»Integrating interdisciplinary training that allows students to develop skills and knowledge in a range of subjects

Teacher Evaluation: Reducing Error in Teacher Observation Scores

November 2012 | Volume 70 | Number 3
Teacher Evaluation: What's Fair? What's Effective?
Pages 82-83

Art and Science of Teaching / Reducing Error in Teacher Observation Scores

Robert J. Marzano

Given current trends in teacher evaluation, one of teachers' main concerns relates to the accuracy of their scores. They have the right to be concerned, given the low reliabilities commonly reported in studies of various observation systems (Bill and Melinda Gates Foundation, 2011). Error is inherent in any type of observation system. Indeed, error is inherent in any type of measurement.

One type of error found in teacher observation scores is measurement error. This occurs when the person observing and scoring a teacher doesn't adequately understand or use the observation system. We can correct this type of error through rigorous observer training.

Another type of error is sampling error. This occurs when the rater observes a class that doesn't represent a teacher's usual behavior. For example, a teacher might typically ask a great many questions of all students but not on the day he or she is observed. Sampling error is more difficult to address than measurement error.

What to Do About Sampling Error

The obvious way to eradicate sampling error is to observe teachers every day they teach, which, of course, is impossible. The current convention is to do unannounced, random observations. Some districts and schools now require supervisors to do about five observations of each teacher. But because day-to-day lessons require different instructional strategies, far more than five observations are required to obtain an accurate representation of a teacher's pedagogical skill.

In the teacher evaluation model based on The Art and Science of Teaching (2007), I've identified three types of lessons: (1) those in which a teacher introduces new content, (2) those in which students practice and deepen their understanding of previously introduced content, and (3) those that require students to apply what they've learned. Each involves different instructional strategies.

This fact alone might add sampling error to an observation. If an observer is required to look for a long list of instructional strategies during every observation, but some strategies typically occur only in a specific type of lesson, he or she would have to note the absence of various strategies during the observed lesson even when those strategies wouldn't have been suitable.

Videos of classroom teachers have shown that teachers use lessons that introduce new content 60 percent of the time, lessons that help students practice and deepen their understanding 35 percent of the time, and lessons that ask students to apply what they've learned 5 percent of the time. If an observer made five random observations of a teacher's classes, the probability of seeing at least one lesson of each type would be only 18 percent (see the endnote). In other words, chances are good that teacher scores based on five random observations would contain a great deal of sampling error.
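
As a rough check of the 18 percent figure, the following Python sketch sums multinomial probabilities over every way five observations can include at least one lesson of each type. It assumes that each observation is an independent random draw and that the 60/35/5 percent shares are exact. The function and variable names are illustrative and do not come from the article.

```python
from itertools import product
from math import factorial

# Lesson-type shares quoted in the text (assumed exact for this sketch).
LESSON_SHARES = {"introduce": 0.60, "practice": 0.35, "apply": 0.05}


def prob_all_types_seen(shares, n_obs):
    """Probability that n_obs independent observations include every lesson type.

    Sums the multinomial probability of each count vector in which
    every lesson type appears at least once.
    """
    probs = list(shares.values())
    total = 0.0
    # Each type must be seen at least once, so counts start at 1.
    for counts in product(range(1, n_obs + 1), repeat=len(probs)):
        if sum(counts) != n_obs:
            continue
        # Multinomial coefficient n! / (c1! * c2! * ...), kept as an integer.
        coef = factorial(n_obs)
        for c in counts:
            coef //= factorial(c)
        term = float(coef)
        for p, c in zip(probs, counts):
            term *= p ** c
        total += term
    return total


p_five = prob_all_types_seen(LESSON_SHARES, 5)
print(f"Chance of seeing all three lesson types in 5 visits: {p_five:.1%}")
```

Under these assumptions the result is roughly 0.18, which agrees with the figure in the text. Calling the same function with a larger `n_obs` shows how slowly the probability rises, because the application lessons are so rare.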

Five Steps That Help

At some point, K–12 evaluators might be able to conduct sufficient teacher observations to reduce sampling error. In the interim, I recommend five steps.

Use Teacher Self-Evaluation

Although having teachers rate themselves introduces the possibility of teachers scoring themselves too high, it can provide a useful reference point. In fact, in two of three possible outcomes, teacher self-evaluations help decrease the error in the observer's rating.

For example, if the teacher's self-rating is the same as the observer's, that's a good indication that the observer rating is accurate. If the teacher's self-rating is lower than the observer's, it's possible that the teacher has underrated his or her skill level, but it's more likely that the observer's rating is inflated; teachers will likely be more aware of their tendencies over the years than will observers. Finally, if the teacher's self-rating is higher than the observer's, the teacher may have an inflated view of his or her pedagogical skills, or the observer's score may be low as a result of sampling error or measurement error. In this case, the remaining strategies can provide additional information.

Use Announced Observations for Different Lesson Types

It's wise to schedule three announced observations, in each of which the observed teacher demonstrates a different one of the three types of lessons. This procedure ensures that observers will see examples of instructional strategies specific to the different lesson types.

Of course, this might introduce another type of error—the teacher attempting to impress the observer by using strategies during announced observations that he or she typically doesn't use. If the rating scale describes specific levels of development for each instructional strategy (Marzano, 2012), the teacher will probably score low in terms of his or her skill in these rarely used strategies, thus defeating his or her purpose of using those strategies.

Use Brief Walk-Throughs as Unannounced Observations

Many schools routinely use brief, unannounced walk-throughs during which observers spend 3 to 5 minutes in a teacher's classroom. Observers can use these visits to collect information that resolves uncertainties in teacher scores. For example, if a teacher's self-rating is higher than an observer's rating, ratings from walk-throughs might reconcile the differences.

Record Teachers' Classes on Video

Random recordings of teachers' classes are both easy and inexpensive to do using modern digital video cameras. Raters can score the recordings independently or in teams, and teachers can be included in scoring their own recordings.

Let Teachers Challenge Scores

Teachers should be allowed to challenge their final summative scores on specific elements by providing evidence—such as classroom videos, student artifacts, or student responses to survey questions—that shows they have effectively used those elements in the classroom. This gives teachers a say in the scores they receive.

A Useful Tool

Teacher observation is a useful and valid part of teacher evaluation. By incorporating some of the strategies I suggest, schools can reduce sampling error without requiring a great deal of additional resources.

References

Bill and Melinda Gates Foundation. (2011). Learning about teaching: Initial findings from the Measures of Effective Teaching project. Bellevue, WA: Author. Retrieved from www.gatesfoundation.org/college-ready-education/Documents/preliminary-findings-research-paper.pdf

Marzano, R. J. (2007). The art and science of teaching: A comprehensive framework for effective instruction. Alexandria, VA: ASCD.

Marzano, R. J. (2012). Evaluations that help teachers improve. Educational Leadership, 70(3), 14–19.

Endnote

1. I derived this probability by using the multinomial distribution to compute the probability of each possible way that five observations could include at least one instance of each lesson type, and then summing these probabilities.

