Annotated Bibliography on Machine Grading of Essays, Part 1

Prepared by the NCTE Task Force on Writing Assessment

The following annotated bibliography on machine scoring and evaluation of essay-length writing is based on the 2012 published bibliography in the Journal of Writing Assessment 5 (compiled by Richard Haswell, Whitney Donnelly, Vicki Hester, Peggy O’Neill, and Ellen Schendel).

The bibliography was compiled by reviewing recent scholarship on machine scoring of essays, also referred to as automated essay scoring (AES), using databases such as ERIC and CompPile. Entries were selected for their attention to machine scoring of essays and publication in peer-reviewed venues (with exceptions noted). We also endeavored to cover the breadth of the issues addressed in the research without being overly redundant. We avoided publications that were very narrowly focused on highly technical aspects of assessment. The earliest research — such as Ellis Page’s 1966 piece in Phi Delta Kappan, “The Imminence of Essay Grading by Computer” — is not included because many more recent entries provide a review of the early development of machine scoring.

The bibliography is organized by publication date, with the most recent entries appearing first. Entries that have been excerpted from the published JWA bibliography are indicated by an asterisk.

Klobucar, Andrew, Deane, Paul, Elliot, Norbert, Ramineni, Chaitanya, Deess, Perry & Rudniy, Alex. (2012). Automated essay scoring and the search for valid writing assessment. In Charles Bazerman et al. (Eds.) International Advances in Writing Research: Cultures, Places, Measures (pp. 103-119). Fort Collins, CO: WAC Clearinghouse & Parlor Press.

This chapter reports on an ETS and New Jersey Institute of Technology research collaboration that used Criterion, an integrated instruction and assessment system that includes automated essay scoring. The purpose of the research was “to explore ways in which automated essay scoring might fit within a larger ecology as one among a family of assessment techniques supporting the development of digitally enhanced literacy” (105). The study used scores from multiple writing measures, including the SAT-W, beginning-of-semester impromptu essays scored by Criterion, an essay written over an extended timeline and scored by faculty, end-of-semester portfolios, and course grades. The researchers compare the scores and conclude that, when embedded in a course, AES can be used as “an early warning system for instructors and their students.” The authors also note concerns that over-reliance on AES could result in a fixation on error and on surface features such as length.

Perelman, Les. (2012). Construct validity, length, score, and time in holistically graded writing assessments: The case against automated essay scoring (AES). In Charles Bazerman et al. (Eds.) International Advances in Writing Research: Cultures, Places, Measures (pp. 121-150). Fort Collins, CO: WAC Clearinghouse & Parlor Press.  

An accessible critique of the writing tasks (the timed impromptu) and the automated essay scoring process. The author argues that while “the whole enterprise of automated essay scoring claims various kinds of construct validity, the measures it employs substantially fail to represent any reasonable real-world construct of writing ability” (p. 121). He explains how length affects scoring: for short impromptus, length correlates with scores, but once more time is given to write and subjects are known in advance, the influence of length on scores diminishes. He also explains how AES differs from holistic scoring even though both produce a single number: the AES number is generated by a set of analytical measures. These individual measures (e.g., word length, sentence length, grammar, and mechanics) do not add up to the construct that AES purports to measure (writing ability). The AES program discussed is primarily the ETS e-rater 2.0 system because ETS has been more transparent about it than other AES developers. Perelman draws on his own research into AES, many ETS technical reports, and peer-reviewed research in making his argument.

Machine Scoring Fails the Test

Approved by the NCTE Executive Committee, April 2013

[A] computer could not measure accuracy, reasoning, adequacy of evidence, good sense, ethical stance, convincing argument, meaningful organization, clarity, and veracity in your essay. If this is true I don’t believe a computer would be able to measure my full capabilities and grade me fairly. – Akash, student

[H]ow can the feedback a computer gives match the carefully considered comments a teacher leaves in the margins or at the end of your paper? – Pinar, student

(Responses to New York Times The Learning Network blog post, “How Would You Feel about a Computer Grading Your Essays?”, 5 April 2013)

Writing is a highly complex ability developed over years of practice, across a wide range of tasks and contexts, and with copious, meaningful feedback. Students must have this kind of sustained experience to meet the demands of higher education, the needs of a 21st-century workforce, the challenges of civic participation, and the realization of full, meaningful lives.

As the Common Core State Standards (CCSS) sweep into individual classrooms, they bring with them a renewed sense of the importance of writing to students’ education. Writing teachers have found many aspects of the CCSS to applaud; however, we must be diligent in developing assessment systems that do not threaten the possibilities for the rich, multifaceted approach to writing instruction advocated in the CCSS. Effective writing assessments need to account for the nature of writing, the ways students develop writing ability, and the role of the teacher in fostering that development.

Research (1) on the assessment of student writing consistently shows that high-stakes writing tests alter the normal conditions of writing by denying students the opportunity to think, read, talk with others, address real audiences, develop ideas, and revise their emerging texts over time. Often, the results of such tests can affect the livelihoods of teachers, the fate of schools, or the educational opportunities for students. In such conditions, the narrowly conceived, artificial form of the tests begins to subvert attention to other purposes and varieties of writing development in the classroom. Eventually, the tests erode the foundations of excellence in writing instruction, resulting in students who are less prepared to meet the demands of their continued education and future occupations. Especially in the transition from high school to college, students are ill-served when their writing experience has been dictated by tests that ignore the ever-more complex and varied types and uses of writing found in higher education.

Note: (1) All references to research are supported by the extensive work documented in the annotated bibliography attached to this report. The bibliography is drawn from a body of independent and industry research that supports other critiques of machine scoring, such as the Professionals Against Machine Scoring Of Student Essays In High-Stakes Assessment Petition Initiative.

These concerns — increasingly voiced by parents, teachers, school administrators, students, and members of the general public — are intensified by the use of machine-scoring systems to read and evaluate students’ writing. To meet the outcomes of the Common Core State Standards, various consortia, private corporations, and testing agencies propose to use computerized assessments of student writing. The attraction is obvious: once programmed, machines might reduce the costs otherwise associated with the human labor of reading, interpreting, and evaluating the writing of our students. Yet when we consider what is lost because of machine scoring, the presumed savings turn into significant new costs — to students, to our educational institutions, and to society.

Here’s why:

Open Letter from Robert Meister, CUCFA, to Daphne Koller, Founder of Coursera

On May 10th, CUCFA President Robert Meister sent the following open letter to Coursera founder Daphne Koller:


Can Venture Capital Deliver on the Promise of the Public University?

An Open Letter to Daphne Koller,
Co-Founder and Co-President of Coursera and Professor of Computer Science at Stanford University

Dear Professor Koller,

Because I share your vision of creating a world in which all have access to an excellent and empowering education, I would like to propose a new online course for you to make freely available through the Coursera platform. Its title is “The Implications of Coursera’s For-Profit Business Model for Global Public Education.”

The goal of the course will be for the students enrolled in it to understand the real relation between Coursera’s visionary mission, “to offer courses, in partnership with the world’s top universities, to everyone for free,” and the logic of the strategic business plan that led Coursera to be named “The Best Startup of 2012” by TechCrunch last January.

The compelling pitch that you and your company make to consumers suggests that the private sector (that is, venture capitalists and not taxpayers) can deliver a more equal world in which income will be based on the skills and knowledge people actually acquire rather than the artificial scarcity of credentials for which they are eligible and can afford to pay. It is natural to hope that in this more equal, and also more productive, world incomes could rise for everyone willing to acquire the necessary academic knowledge and take the tests to prove it. This, in fact, was exactly what was promised by the original California Master Plan for Higher Education using taxpayers’ money when it was adopted in 1960.

My proposed Coursera course will ask students to discover for themselves how and why John Doerr, and your other Venture Capitalists, are willing to provide an even greater abundance of knowledge in the service of greater economic and social equality than is the State of California, which clearly has the means to spend much more than it has cost your company to reach a worldwide enrollment in the millions.

Recent Graduate Testifies before Ohio Senate on Voter-Suppression Measure Affecting College Students

In an earlier post, “Please Sign Petitions Supporting the Voting Rights of College Students in Ohio and North Carolina” [http://academeblog.org/2013/05/11/please-sign-petitions-supporting-the-voting-rights-of-college-students-in-ohio-and-north-carolina/], I asked readers to sign a petition protesting against an attempt to discourage more than 32,000 out-of-state students attending Ohio universities from casting their ballots in Ohio.

What follows is the testimony of Stuart McIntyre, representing the Ohio Student Association to the Finance Committee of the Ohio Senate on this provision in the state’s budget bill.


Honorable Chairman Gardner and members of the committee,

My name is Stuart McIntyre. I am a Columbus native, an Ohio State graduate, and an organizer with the Ohio Student Association. The Ohio Student Association works with a diverse group of students at a dozen campuses across the state of Ohio to build power for young people so that we can be our own advocates for our social, economic and political well-being. We are a non-partisan organization, and in the fall we registered more than 4,000 voters around the state, and knocked on more than 12,000 doors educating young people about critical issues, and encouraging them to participate in our democracy. The post-election statistics speak for themselves. Young voter turnout has reached an all-time high, and voters from 18-27 are quickly increasing as a share of the total electorate. Our generation, which will be the largest in the history of the United States, is beginning to assert itself as citizens.

The Mirror Is a Harsh Mistress

Want to do something to show that faculty can come together? Contact colleagues at the City University of New York and ask them to vote “no confidence” in the Pathways curriculum initiative, the abrogation of shared governance foisted upon the system by a central administration with no interest in working with–and no respect for–the faculty. If we can show that an absolute majority of CUNY faculty members agree that Pathways is poorly conceived and abysmally implemented, perhaps we can start up our own path to faculty solidarity and positive impact.

Most all I hear, these days, from faculty members, are complaints about others, outsiders… administrators, union leaders, politicians, even others on the faculty… anyone but ourselves. But, though we generally refuse to look in that mirror, we are as much to blame as anyone for any current mess in higher education. Commenting on my post Growing Timid: The Faculty in the 21st Century, “professor_at_large” wrote:

the senior faculty who could use their academic freedom to advocate for the preservation of the AAUP principles are instead by and large both self-absorbed and self-preserving. The collective nature of our enterprise, having been undermined by administration and the senior professoriate alike, is slowly unraveling into “single course” pieces of knowledge “delivery”

There’s one hell of a lot in those few lines, and I agree with it all.


Sexual Assaults on Campus

In a very recent post on guns on campus, I selectively surveyed the statistics on violent crime in the 2012 report on crimes reported on college campuses.

I cited the statistics on sexual assaults but noted that those crimes have apparently been very under-reported, at least on some campuses.

Female students on four campuses in particular—Amherst College, the University of North Carolina, Occidental College, and the University of Southern California—have organized formal protests against the ways in which their institutions have recently handled cases of rape and sexual assault.

I have read several dozen news articles on the protests at the four campuses, and I have to say that the issues have not been much clarified by that reading.

Here is what I believe most people would assume would occur.

The Truth about the IRS “Scandal”

The IRS scrutiny and delays aimed at Tea Party nonprofit groups have received enormous media and political attention. There’s nothing illegal, or scandalous, about these disturbing investigations of political groups. It’s been IRS policy and practice to do so for a long time, and more often targeted against liberal groups (and with much less justification).

Back in 2008, I wrote about the threat to freedom of speech posed by an IRS probe (under the Bush Administration) aimed at Barack Obama’s church for the thought crime of inviting Barack Obama to speak at his own church.


Talking Points: No. 1

As our chapters and conferences confront major issues, we often create “toolkits” that include sample letters to other constituencies within our institutions (administrators, staff, and especially students), to groups that may be potential allies, to legislators, and to newspapers and other online media sites.

But, beyond those salient issues, there is typically a multitude of issues that present themselves on a weekly, if not a daily, basis and that we might address to the benefit of our faculties, our institutions, and our profession–if we only had the time or, more precisely, if doing so did not consume quite so much time.

One possible solution is to devise ways of sharing not just ideas but succinct expressions of those ideas. As we read opinion pieces, we might get into the habit of taking special note of the effective arguments that their authors present.  Ideally, we might begin to create a store of carefully and cleverly expressed points on which we can draw as needed.

I hope to use a series of posts to this blog to serve this purpose.

I will begin with a recent letter to the editor that appeared in the Detroit Free Press.

The Kent/Jackson Massacres and the Coming Discontent

May 14 marks the 43rd anniversary of the bloody massacre at Jackson State University. On this day in 1970, Mississippi cops fired a deadly barrage of over 450 bullets at unarmed black students in a women’s dormitory. Two were left dead and at least 12 wounded.

The murdered Jackson students were Phillip Gibbs, the son of a sharecropper with a wife and infant son, and James Earl Green, a 17-year-old high school senior.

Just three days earlier, Charles Oatman, a 16-year-old retarded black youth, had been burned and tortured by white jail guards in Augusta, Georgia. His ordeal sparked a rebellion in that city that left six black men dead, all shot in the back.

I learned of these atrocities while helping to organize the national student strike sparked by the shootings at Kent State University in Ohio. Ten days earlier, on May 4, 1970, I was an eyewitness to the National Guard assault at Kent State that left four dead – Allison Krause, Bill Schroeder, Sandy Scheuer and Jeffrey Miller. Nine were wounded.

* * *


Guns on Campus, Discouraging News

Although guns may not be allowed on Montana campuses (See “Several Indications of Common Sense on Guns on Campus,” http://academeblog.org/2013/05/13/several-indications-of-common-sense-related-to-guns-on-campus/#more-3089), five state universities in Pennsylvania are now allowing guns to be carried on their campuses.

The five universities are Edinboro University, Kutztown University, Millersville University, Shippensburg University, and Slippery Rock University.

One wonders what statistics the presidents of those universities had been looking at when contemplating this decision. Either those campuses are extraordinarily violent places—in which case, this is not the best way to reassure prospective students about their safety—or the statistics simply do not support the need for such action.

In Fall 2011, there were 19.7 million students enrolled in U.S. colleges and universities.

Between 2009 and 2011, the years covered in the 2012 report on campus crime, there were 49 murders on college and university campuses nationwide: 18 in 2009, 15 in 2010, and 16 in 2011.