Title: Short Answer Scoring at ETS
Speaker: Chris Brew (Educational Testing Service)
Place: Science Center, Rm 4102, CUNY Graduate Center, 5th Ave & 34th St.


The core capability of c-rater (ETS's automated short answer scoring
engine) is to detect specified elements of the content of a student
response. It does this by aligning concepts represented in a template
provided by the test designer with concepts found in the response.  Not
all students will express these concepts in the same way, so c-rater
must be able to align words with their synonyms and phrases and
sentences with alternative ways of saying essentially the same thing.

In the talk I will describe c-rater and present results of an
evaluation. Taken as a whole, the system is complex, and the quality of
the results depends on multiple factors, including design choices made
before items were even considered for automated scoring. The analysis
reveals that (a) while conventional NLP considerations do affect the
results, they are not the primary reason for variability in the results,
and a simple baseline system can achieve broadly comparable results; and (b)
by contrast, a process that incorporates information from a corpus of
scored responses does make a substantial difference. The methods used
for the analysis of c-rater are from a recent paper presented at NCME,
and extend work in educational measurement by Tryon and Lewis. On the
basis of this analysis I make tentative recommendations about how to
make significant progress in short-answer scoring and its applications.


Chris Brew is a Senior Research Scientist at the Educational Testing
Service, specializing in Natural Language Processing (NLP).  He 
previously held positions at the University of Edinburgh, Sussex and
Sharp Laboratories of Europe, and was most recently an Associate Professor
in Computer Science and Engineering at Ohio State University. His
interests include semantic role labeling, speech processing, information
extraction, and psycholinguistics.  At ETS, Chris leads the c-rater
project as well as co-directing the content scoring vision for the NLP