Revolution Lullabye

August 27, 2014

Newton, Value-added Modeling of Teacher Effectiveness

Newton, Xiaoxia A., et al. “Value-Added Modeling of Teacher Effectiveness: An Exploration of Stability across Models and Contexts.” Education Policy Analysis Archives 18.23 (2010). Web.

Newton et al. investigate measures of teacher effectiveness based on VAM (value-added modeling) to show that these measures, based in large part on measured student learning gains, are not stable and can vary significantly across years, classes, and contexts. The study focused on 250 mathematics and ELA teachers and the approximately 3,500 students they taught at six high schools in the San Francisco Bay Area. The researchers argue that measures of teacher effectiveness based solely on student performance scores (measures that don’t take into account student demographics and other differences) cannot be relied on for a true understanding of a teacher’s effectiveness, because so many other unstable variables impact those student test scores. Models of teacher evaluation that rely heavily on student performance scores can therefore penalize teachers who work in high-need areas, especially those who teach disadvantaged students or students with limited English proficiency.

Quotable Quotes

“Growing interest in tying student learning to educational accountability has stimulated unprecedented efforts to use high-stakes tests in the evaluation of individual teachers and schools. In the current policy climate, pupil learning is increasingly conceptualized as standardized test score gains, and methods to assess teacher effectiveness are increasingly grounded in what is broadly called value-added analysis. The inferences about individual teacher effects many policymakers would like to draw from such value-added analyses rely on very strong and often untestable statistical assumptions about the roles of schools, multiple teachers, student aptitudes and efforts, homes and families in producing measured student learning gains. These inferences also depend on sometimes problematic conceptualizations of learning embodied in assessments used to evaluate gains. Despite the statistical and measurement challenges, value-added models for estimating teacher effects have gained increasing attention among policy makers due to their conceptual and methodological appeal” (3).

Differences in teacher effectiveness in different classes: “An implicit assumption in the value-added literature is that measured teacher effects are stable across courses and time. Previous studies have found that this assumption is not generally met for estimates across different years. There has been less attention to the question of teacher effects across courses. One might expect that teacher effects could vary across courses for any number of reasons. For instance, a mathematics teacher might be better at teaching algebra than geometry, or an English teacher might be better at teaching literature than composition. Teachers may also be differentially adept at teaching new English learners, for example, or 2nd graders rather than 5th graders. It is also possible that, since tracking practices are common, especially at the secondary level, different classes might imply different student compositions, which can impact a teacher’s value-added rankings, as we saw in the previous section.” (12)

“the analyses suggested that teachers’ rankings were higher for courses with “high-track” students than for untracked classes” (13).

“These examples and our general findings highlight the challenge inherent in developing a value-added model that adequately captures teacher effectiveness, when teacher effectiveness itself is a variable with high levels of instability across contexts (i.e., types of courses, types of students, and year) as well as statistical models that make different assumptions about what exogenous influences should be controlled. Further, the contexts associated with instability are themselves highly relevant to the notion of teacher effectiveness” (16).

“The default assumption in the value-added literature is that teacher effects are a fixed construct that is independent of the context of teaching (e.g., types of courses, student demographic compositions in a class, and so on) and stable across time. Our empirical exploration of teacher effectiveness rankings across different courses and years suggested that this assumption is not consistent with reality. In particular, the fact that an individual student’s learning gain is heavily dependent upon who else is in his or her class, apart from the teacher, raises questions about our ability to isolate a teacher’s effect on an individual student’s learning, no matter how sophisticated the statistical model might be” (18).

“Our correlations indicate that even in the most complex models, a substantial portion of the variation in teacher rankings is attributable to selected student characteristics, which is troubling given the momentum gathering around VAM as a policy proposal. Even more troubling is the possibility that policies that rely primarily on student test score gains to evaluate teachers – especially when student characteristics are not taken into account at all (as in some widely used models) — could create disincentives for teachers to want to work with those students with the greatest needs” (18).

“Our conclusion is NOT that teachers do not matter. Rather, our findings suggest that we simply cannot measure precisely how much individual teachers contribute to student learning, given the other factors involved in the learning process, the current limitations of tests and methods, and the current state of our educational system” (20). 

Notable Notes

The problem of variables impacting the calculation of teacher effectiveness: the students’ background (socioeconomic, cultural, disability, language diversity), the effects of the school environment, how teachers perform year-to-year, the curriculum

VAM makes assumptions that schools, teachers, students, parents, curriculum, class sizes, school resources, and communities are similar.

The variables the researchers collected and measured included CST math or ELA scaled test scores, students’ prior test scores (for both average and accelerated students), students’ race/ethnicity, gender, and ELL status, students’ parents’ educational level, participation in free or reduced-price school lunch, and individual school differences. The study tries to look at the issue longitudinally by accounting for students’ prior achievement (7). The researchers were able to link students to teachers (8).
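A minimal sketch of the kind of value-added regression these variables describe, written in Python (the column names and input file are hypothetical placeholders; the study’s actual models are more elaborate and are compared across several specifications):

```python
# Simplified value-added model in the spirit of the study: regress current
# test scores on prior achievement plus student/school covariates, then treat
# the teacher fixed effects as the "teacher effectiveness" estimates.
# All column names and the file name are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("student_year_records.csv")  # hypothetical student-year data

model = smf.ols(
    "cst_score ~ prior_score + C(race_ethnicity) + C(gender) + ell_status"
    " + parent_education + free_lunch + C(school_id) + C(teacher_id)",
    data=df,
).fit()

# Estimated teacher effects are the coefficients on the teacher dummies.
teacher_effects = model.params.filter(like="C(teacher_id)")
print(teacher_effects.sort_values(ascending=False).head())
```

Newton et al.’s instability finding can be probed in this frame by refitting with covariates dropped (as in some widely used models) or with a different year’s or course’s data, and then comparing the resulting teacher rankings across specifications.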


January 30, 2013

Hayles, How We Read

Hayles, N. Katherine. “How We Read: Close, Hyper, Machine.” ADE Bulletin 150 (2010): 62-79.

Hayles defines three kinds of reading – close, hyper, and human-assisted machine – and argues that all three, used synergistically, can help students and literary studies scholars discover patterns, meaning, and context in and across texts.  Her argument is written in response to the widely held notion that digital, onscreen reading has a lasting detrimental effect on reading comprehension skills, as seen through K-12 testing scores, cognitive research on the brain, and anecdotal evidence.

Hayles uses her own definition of hyperattention (as opposed to deep attention) to explain how hyperreading is different from close reading, which she argues is one of literary studies’ central values and practices.  Instead of condemning hyperreading, she argues that it is a valuable reading practice, helping students and scholars alike scan and skim large amounts of information quickly, thus identifying the most helpful sources and texts to use. 

Hayles also challenges the idea that human-assisted machine or computer reading (the use of algorithms to detect patterns across a large corpus) is inherently anti-humanistic; she cites Moretti’s application of distant reading principles and argues that the challenge for literary studies scholars is to find useful ways to integrate close, hyper, and machine reading in their interpretive work.

Notable Notes

critiques Carr’s argument that the Internet is changing brain structure (67, 71)

readers scan digital texts differently than print ones (an F-shaped scan for digital texts, a more linear back-and-forth for print texts; Jakob Nielsen) (66)

current students are ‘digitally native’ (62)

close reading defined the discipline of literary studies in the 1970s/1980s; it was a way for the field to congregate around a common value.  Does digital reading change that? Is close reading sufficient? (63)

problem: our classrooms don’t reflect the kinds of reading practices our students engage in – there’s a divide that is probably affecting how much our students learn (63; 65). Connection to Vygotsky’s learning theories.

James Sosnoski (1999) – hyperreading: “Examples include search queries (as in a Google search), filtering by keywords, skimming, hyperlinking, ‘pecking’ (pulling out a few items from a longer text), and fragmenting (163-172)” (66). – Hayles’ update includes juxtaposition (comparing across several open windows) and scanning (66).

what do we make of the distractions of hyperreading? clicks, navigating, short bursts of info like tweets, tons of material (67)

hyperreading affects long-term memory (67-68), but is long-term memory a necessary part of the research process? Does every bit of information need to be committed to long-term memory for it to be valuable? (my response)

reference to Moretti as adopting machine reading-like characteristics to literary studies (73-74)

machine reading – relies on visualization, algorithms, mapping, diagramming  (73-75)

definition of a pattern: similarities as well as differences: “I therefore propose the following definition: a pattern consists of regularities that appear through a series of related differences and similarities” (74).
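A toy illustration of machine reading in this sense (not Hayles’s or Moretti’s actual tooling): a short Python sketch that surfaces both sides of this pattern definition across a small corpus – terms shared by every text (regularities) and terms distinctive to one text (related differences). The file names are hypothetical:

```python
# Toy "machine reading": find regularities (terms common to all texts) and
# differences (terms distinctive to one text) across a small corpus.
import re
from collections import Counter

def term_counts(path):
    """Lowercased word counts for one text file."""
    with open(path, encoding="utf-8") as f:
        return Counter(re.findall(r"[a-z']+", f.read().lower()))

# Hypothetical corpus files.
corpus = {name: term_counts(name)
          for name in ("novel_a.txt", "novel_b.txt", "novel_c.txt")}

# Regularities: terms that appear in every text.
shared = set.intersection(*(set(c) for c in corpus.values()))
print("shared terms:", sorted(shared)[:20])

# Related differences: terms far more frequent in one text than in the rest.
for name, counts in corpus.items():
    rest = Counter()
    for other, other_counts in corpus.items():
        if other != name:
            rest.update(other_counts)
    distinctive = [t for t, n in counts.items() if n >= 5 * (rest[t] + 1)]
    print(name, "distinctive:", sorted(distinctive)[:10])
```

Even this crude word-counting illustrates the point about scale in the next note: the “pattern” that emerges depends on what the algorithm is set to notice.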

scale (close/hyper/machine) affects pattern, context, meaning (74) – the emphasis changes with the scale (74).

pedagogical examples of teaching hyper/machine reading (75-77), e.g., the analysis of Time magazine covers.

Quotable Quotes

“I argue that we cannot do this effectively [convey to students our engagement with complex literary texts] if our teaching does not take place in the zone of proximal development, that is, if we are focused exclusively on print close reading. Before opinion solidifies behind a new version of close reading, I want to argue for a disciplinary shift to a broader sense of reading strategies and their interrelation” (65).

“In digital environments, hyperreading has become a necessity. It enables a reader quickly to construct landscapes of associated research fields and subfields; it shows ranges of possibilities; it identifies texts and passages most relevant to a given query; and it easily juxtaposes many different texts and passages” (66).  Scholars use these techniques; we need to teach them to students.

“Hyperattention is useful for its flexibility in switching between different information streams, its quick grasp of the gist of the material, and its ability to move rapidly among and between different kinds of texts” (72).

“The problem, as I see it, lies not in hyperattention/hyperreading as such, but rather in the challenges the situation presents for parents and educators to ensure that deep attention and close reading continue to be vibrant components of our reading cultures and interact synergistically with the kind of Web and hyperreading in which our young people are increasingly immersed” (72).

“Indeed, skimming, scanning, and pattern identification are likely to occur with all three reading strategies; their prevalence in one or another is a matter of scale and emphasis rather than clear-cut boundary” (72).

“The large point is that close, hyper, and machine reading each have distinctive advantages and limitations; nevertheless, they also overlap and can be made to interact synergistically with one another” (75).
