Computer-graded essays?

PJS
Posts: 1185
Joined: Sun Jul 22, 2007 5:05 pm
Platform: Mac + Windows
Location: Upstate New York

Tue Apr 17, 2012 11:03 pm Post

A recent article from the Associated Press elaborates on a topic that's already been kicked around some, but never yet, so far as I know, as hard as it's being kicked right now. Kaggle <http://www.kaggle.com/c/asap-aes/details/Background> is offering thousands upon thousands of dollars for people who can come up with an algorithm "that can automatically grade student essays."

Is this a good idea? Is the next step a tool for editors to evaluate submissions? And after that, an algorithm for writing essays. Poems. Stories. Screenplays (well, some of those already are the work of robots).

The whole Kaggle enterprise is a bit scary -- they're looking also for a way to determine the profiles of those persons most likely to end up in the hospital. As if the HMOs didn't already have one hand around your neck and the other at your...

It sounds more Orwellian than I like to think about.

Ideas?

ps
You can't conquer stupid — or cure it — with more stupid.

Jaysen
Posts: 6235
Joined: Mon Dec 17, 2007 4:00 am
Platform: Mac + Windows
Location: East-Be-Jesus-Nowhere SC, USA

Wed Apr 18, 2012 12:08 am Post

You assume this isn't already happening in some shape or form.
Jaysen

I have a wife and 2 kids that I can only attribute to a wiggle, a giggle, and the realization that she was out of my league so I might as well be happy with her as a friend. 26 years marriage later, I can't imagine life without her. -Me 10/7/09


PJS
Posts: 1185
Joined: Sun Jul 22, 2007 5:05 pm
Platform: Mac + Windows
Location: Upstate New York

Wed Apr 18, 2012 12:45 am Post

No, I know it (the essay thing) happens already, on a relatively small scale, and in a few places. It's small and few for now, but with a hundred grand being offered for someone to develop a successful algorithm, it's liable to take off. What concerns me is that this -- the bland assumption that it can be done -- is not only accepted, it is applauded, encouraged, rewarded.

As for the HMO thing, and the other money-grubbers looking for yet another edge, soso.

(Please note I said it "concerns" me. This is not an anti-tech rant.) (Yet.)

As a not-quite-irrelevant aside, I got a nice letter from Bank of America yesterday, asking me to open a "high yield" savings account. Their definition of "high yield" rolled in at 0.75%. Yes, they want to give me three-quarters of a percent interest each and every year for letting them hold my money. I sent back the form, politely declining, and asked if, on the other hand, they would like to let me hold some of their money at the same rate.

I don't expect an answer.

ps

Jaysen
Posts: 6235
Joined: Mon Dec 17, 2007 4:00 am
Platform: Mac + Windows
Location: East-Be-Jesus-Nowhere SC, USA

Wed Apr 18, 2012 2:18 am Post

How is this concept different from face detection in iPhoto or Google Picasa, or any other photo software? What about autocorrect? Voice recognition? Retinal detected auto focus? The auto-parking feature on the wife's new car?

My point is that "advancement" is predetermined by those who define what we need. The assumption that "it will be done" is simply following on the established history that technology has followed since the stone age. We want better and someone figures it out. Stone tools give way to bronze, which gives way to iron, which gives way to steel, which gives way to silicon. The pattern that we will build it once we think it up is well established. Why not encourage it?

To me the larger issue here is the obvious move toward a more realized version of artificial intelligence. Maybe I am too close to the systems to see what everyone else sees as so great in AI, but the very idea of it terrifies me. Once the computers start to "think" in a decisive manner, what use do humans really serve?

And yes, I am serious.
Jaysen



Siren
Posts: 759
Joined: Mon Mar 12, 2007 11:29 am
Platform: Mac + iOS
Location: U.K.

Wed Apr 18, 2012 7:07 am Post

That is a really depressing idea. Its underlying assumption appears to be that there is no "art" involved in creating essays or, indeed, in writing generally. An algorithm would be unable to recognise novelty of approach or argument, so could not reward those aspects of essay-writing that push the writer into the highest mark band.

Admittedly, there is little room for novelty in secondary-school essays, which I imagine will be the software's target market -- but that raises questions of its own. We hear complaints already about "teaching to the test", with pupils so geared towards specific exams that they end up knowing little about their subject; how will it be when they are trained to pass an essay-algorithm test and don't even know how to string their own words together independently?

In a way, though, I sympathise with the concept of standardising essay marking. My daughter has recently been through a remark/appeal process over a history exam, through which we had access to the marking comments from the exam board, and I have been shocked at the subjectiveness and non-standardisation of the marking and re-marking. In the absence of adequately consistent examiners, perhaps an algorithm might be preferable!
Literature & Latte support team

pigfender
Posts: 2818
Joined: Tue Oct 12, 2010 10:25 am
Platform: Mac, Win + iOS
Location: I share a head with a great many personalities

Wed Apr 18, 2012 8:31 am Post

Siren wrote: Its underlying assumption appears to be that there is no "art" involved in creating essays or, indeed, in writing generally. An algorithm would be unable to recognise novelty of approach or argument, so could not reward those aspects of essay-writing that push the writer into the highest mark band.


I would argue that anything below University level that has novelty of approach or original thought is a fail. I would be very surprised if human readers would give any conscious thought or credit to 'style points' or 'art' in marking academic papers. Content, by which I mean the regurgitation of facts learnt and opinions already established, is the key to exam success at these levels.

Basically, the secondary school system in the UK is essentially designed to teach kids to write acceptable Wikipedia articles. I have no problem with Autobots reviewing Wikipedia articles. At least that way the grading is fair, based on content, knowledge and application of the correct facts to the question at hand, and not subject to an unconscious bias towards language or style that suggests a wealthier upbringing or a private school education.

Jaysen wrote:Once the computers start to "think" in a decisive manner, what use do humans really serve?

Well, if you want to put it that way, what purpose do humans really serve now?

And, yes, I am serious too.
Life is its own purpose. I think one has to take that as res ipsa loquitur.
http://www.pigfender.com | http://www.novelinaday.com
"Some dice only have sixes." nom, 19 Oct 2013

Siren
Posts: 759
Joined: Mon Mar 12, 2007 11:29 am
Platform: Mac + iOS
Location: U.K.

Wed Apr 18, 2012 10:44 am Post

pigfender wrote:I would be very surprised if human readers would give any conscious thought or credit to 'style points' or 'art' in marking academic papers.

I agree with you that novelty of thought doesn't really apply at secondary school, and that facts are important. Secondary school education should be about building a foundation of actual knowledge and the skills to use it. But, assuming that the base is there, why ignore (or, worse, penalise) novelty of expression? As long as all the key arguments/facts are there, they don't have to be strung together in a set way as long as the essay makes sense internally and as a whole. It is that aspect that is most subjective.

Marking an essay is never going to be objective, algorithm or no algorithm. Even with an algorithm, the subjectivity is still there, built in to reflect the designer's preconceptions. My children are doing A-levels (final year of secondary school), and every syllabus I have seen (across a range of science, arts, humanities and technology subjects) acknowledges the impact of writing style, in that "quality of written communication" is taken into account. The exact terminology of the assessment category varies according to the exam board, but the principle is the same. And having seen examiners' actual comments on a real paper, defended by the exam board, it is very clear that assessment is not solely subject to conscious thought, and other intangibles play a significant (and apparently defensible) role.

pigfender wrote:At least that way the grading is fair, based on content, knowledge and application of the correct facts to the question at hand, and not subject to an unconscious bias towards language or style that suggests a wealthier upbringing or a private school education.

Now that really is bleak. It is fair only in that everybody is assessed by the same "individual", not in that the assessment criteria themselves are fair. What about budding writers with a flair for language, or for communication, or for imaginative linking of material and ideas? What about avid readers, influenced by things beyond the exam board? There are loads of writers from disadvantaged backgrounds who would never have flourished if restricted by their school to formulaic modes of expression. (There are plenty who succeeded despite their schooling, as well, but that is hardly the point of education.) The lowest common denominator of realistic attainment should not mark the top end of the mark scheme. Unless we're talking about multiple-choice questions, of course, or something with a straight right/wrong answer -- not an essay, in other words.

It's probably an ideological position. School is where writing/communication skills should be taught, or at least fostered and encouraged, or at the very least not beaten out of you (I'd settle for the last one). There is insufficient time for teaching outside the syllabus, and schools tend to value their league table position too much to jeopardise their exam results by adding more work that won't count. If the syllabus says "write like this or you won't pass the algorithm-administered test", pupils are most unlikely to be encouraged to write in any other way -- except, of course, in the private schools and wealthy or better educated homes which your argument seeks to level out. In fact, marking by algorithm seems ideal for schools/families with sufficient time and resources: they will be able to "crack the code", or pay someone else to do it, so that their children are guaranteed top marks by the simple application of a few algorithm-compliant rules.

Besides, if you have all the creativity bashed out of you when you're young, how are you going to do well at university, where a Pass 1 or Distinction requires "Special signs of excellence, for example: unusual clarity; excellence of presentation; originality of argument" (from the assignment booklet of an Open University undergraduate course)? Universities already argue that the school system isn't producing the right calibre of student, so they are having to dumb down first year classes to get students to catch up. Marking essays by algorithm is a slippery slope to shifting the burden of teaching flexible thinking/writing skills onto the higher education sector instead of positioning them as the life skills that they really are.

Bah! I'm depressed now. Time for a coffee. :D
Literature & Latte support team

kewms
Posts: 6212
Joined: Fri Feb 02, 2007 5:22 pm
Platform: Mac

Wed Apr 18, 2012 4:34 pm Post

See, we're all writers here. Most of us probably could (and did) aspire to "art" in secondary school essays. Creating equal measures of joy and frustration for our teachers.

But most of us could also structure literate sentences and assemble coherent paragraphs. With very little nudging, we could construct an extended argument over several pages. For many of us, those skills came almost instinctively, the way a musician with perfect pitch can feel a wrong note.

Most secondary students aren't us. In most cases, correcting secondary school essays involves slogging through a morass of errors in everything from basic mechanics to the fundamentals of exposition. As pigfender said, in most cases getting a student to write a decent Wikipedia article is quite a challenging goal by itself.

I'm perfectly happy to let machines score the Wikipedia articles. It will give the teachers more time to actually read the few nuggets of art that they might encounter.

Katherine
Scrivener Support Team

pigfender
Posts: 2818
Joined: Tue Oct 12, 2010 10:25 am
Platform: Mac, Win + iOS
Location: I share a head with a great many personalities

Wed Apr 18, 2012 5:07 pm Post

I think it's a good idea to take the 'art' part out of the grading. A student can struggle at writing fragrant prose without it impacting on their ability to understand history or geography.

I *do* think that creative writing - hell, just writing let's be honest - should be an important part of the general curriculum, but it should be assessed in the English Language class, not imposed double jeopardy style across every subject a student studies.

pigfender
Posts: 2818
Joined: Tue Oct 12, 2010 10:25 am
Platform: Mac, Win + iOS
Location: I share a head with a great many personalities

Wed Apr 18, 2012 5:11 pm Post

On an unrelated point:

Dear Microsoft,

It would appear that your popular program, MS WORD, has over the past ten years slowly eaten away at my previously good standard of spelling to the point now where I struggle to spell the word February without checking for red squiggly underlining. And I was born in February.

I will accept $1,000,000 as without prejudice compensation in lieu of a civil suit.

Love and hugs,

Rog

xiamenese
Posts: 4370
Joined: Mon Jan 29, 2007 1:32 am
Platform: Mac
Location: London or Exeter, UK.

Thu Apr 19, 2012 3:31 pm Post

All of this is very depressing! I'll admit, I hate marking essays. And it's not because they are original and therefore demanding of a more subjective judgement ... that is difficult to do, time consuming and nerve-racking. But what is soul-destroying is being faced with a pile of 120+ essays, all of which you know are going to trot out the same well-worn ... it's not as if I feel I can even call them "ideas", though they must have been that at some point in the mists of time, which all fundamentally share the same source, usually a coursebook full of truisms. So they all say the same things, and the only way to differentiate them is on their writing, the quality of expression. Then you're up against how to evaluate one piece full of grammatical errors and misspellings against another piece equally full, but of a different set and distribution of grammatical and spelling errors. And this is not a writing class ...

So there's the side of me that thinks, "I'd love an app that ran an algorithm that would relieve me of these end of semester nightmares". Then I think of one of the MA students I had at the University of Westminster. For her paper in the exam at the end of one of my courses — traditional 3-hour, closed book exam; this was in the late 80s — I gave her 85%, when the ceiling for a good distinction grade was 75%. Why? Because under exam conditions, she had written better answers than I myself could have done, sitting at my desk and taking my time over it with reference books handy, and certainly better than anything I would have given as the criteria to look for for the purposes of an automated marking system. So the automated system would have given her a low mark for not including my requirements and for writing a whole load of different stuff.

For the last few years, I've been teaching within a culture that for the last 3000 years has basically adhered to a line the equivalent of "You can only think independently when you have got your PhD. Until then, your job is to read, learn, reproduce the accepted wisdom of the masters". (Congratulations, Dr Nom, you are now allowed some original thoughts! ;) ). We in the west are already moving rapidly along that line, as is basically expressed in some of the previous posts.

But I think it's moving outside education. I had been thinking of writing a rant in the ANFTL forum ... "Professionally Designed Templates". It's not that I object to templates, or to sharing any template I have created if someone else thinks they will find it useful. It's the growing number of "creative" apps that take virtually all the creation out by offering "Professionally Designed Templates" ... iPhoto, iMovie ... "84 (or however many) Professionally Designed Templates for Pages, on special offer for only $49.99!" There's even a web-hosting company advertising here in the UK that is pushing, not just "professionally" designed pages but even that they have the text appropriate to your industry that they have already produced for you to choose from and enter in their template. What does that "Professional" mean? To me, it means "We work in the computer industry, therefore we are professionals, therefore our designs/texts are professional, therefore you should use them rather than using/writing your own ... therefore you too can have a website that looks like the website of all the other companies that have used our designs and texts."

I've never used a Pages template, I've never used a Numbers template, I've created my own Nisus templates ... and heavily modified Nisus New Page. I have tried a couple of Keynote templates ... I can't imagine ever finding a use for the majority of them, and those that I have used didn't really work for my needs. I ended up taking the plainest of them and modifying the background, the font and text size, text boxes on the slides ... virtually everything. I could have done it as quickly starting from "Blank", but it was only afterwards that I realised that.

Please note, I'm not disparaging the templates that come with Scrivener. I think, in a real sense, they serve a different purpose, at least what I think is the important bit. That for me is largely the compile options ... if/when I write a paper which I want to submit to a journal or journals, and the publishers specify they want submissions in a specific style, Chicago, APA6, Harvard, whatever ... then it makes sense to use a template set up by Keith et al. which has all that set up already, rather than have to go through pages of style manual in order to set up the compile options myself. Or producing an epub ... same thing. But the point about those is that they are set up so you can meet someone else's requirements without trouble. The Keynote templates, the Pages templates ... they are not there to meet requirements set by others; they are there to take the creativity out of being creative, and I'm sure that the text most people insert into those templates is frequently as uninspired as merely taking someone else's design.

And before anyone jumps down my throat, I know that many of those templates and themes that so annoy me have actually been designed by people whose profession is design. But to me, design is as transient as fashion, and often as vacuous as Brit Art. And I admit that I couldn't code a website in HTML4 or 5 — I did code by hand in HTML3, but that was a decade and a half ago — and so would use Rapid Weaver or similar software ... that forces me into using a template; I'd rather do it all myself, but even after I retire, I'm not sure I'll have the energy and time to learn that. If I put together a little movie in iMovie — I have done that ... I liked iMovie HD, but no version since — I would hope it would stand on its own, not need to be wrapped in a "theme". But in no way is designing a layout in Pages or Keynote as daunting a task as coding a site in HTML5 ... mind you, if one hand coded XML or RTF or whatever underlies the page, it would be equally daunting.

[/RANT]

Jaysen wrote:... Once the computers start to "think" in a decisive manner, what use do humans really serve?

And yes, I am serious.


Jaysen, you must be. That, it seems to me, must be a disturbing thought for you. :)

X
The Scrivenato sometimes known as Mr X.
iMac 27" (late 2015) 10.15.4, 24GB RAM, 512GB SSID
MBP17" (late 2011) 10.13.6, 16GB RAM, 2TB SSID
2017 iPad, iPadOS 13.3, 128GB, Apple Pencil
Scrivener, Scapple, Nisus Writer Pro, Bookends …

pigfender
Posts: 2818
Joined: Tue Oct 12, 2010 10:25 am
Platform: Mac, Win + iOS
Location: I share a head with a great many personalities

Thu Apr 19, 2012 4:08 pm Post

xiamenese wrote:I gave her 85%, when the ceiling for a good distinction grade was 75%. Why? Because under exam conditions, she had written better answers than I myself could have done, sitting at my desk and taking my time over it with reference books handy


But she was still 15% wrong?

Hugh
Posts: 2444
Joined: Thu Mar 08, 2007 12:05 pm
Platform: Mac
Location: UK

Thu Apr 19, 2012 4:20 pm Post

pigfender wrote:
xiamenese wrote:I gave her 85%, when the ceiling for a good distinction grade was 75%. Why? Because under exam conditions, she had written better answers than I myself could have done, sitting at my desk and taking my time over it with reference books handy


But she was still 15% wrong?


Everyone is always at least 15 per cent wrong.
'Listen, some quiet night, when you've shirked your work that day. Do you hear
that distant, almost inaudible clicking sound? That's one of your
competitors, working away in the night in
Paris or London or Erie, PA.'

PJS
Posts: 1185
Joined: Sun Jul 22, 2007 5:05 pm
Platform: Mac + Windows
Location: Upstate New York

Thu Apr 19, 2012 4:44 pm Post

Hugh wrote:Everyone is always at least 15 per cent wrong.


Corollary 2.b of Sturgeon's Law.

ps

Jaysen
Posts: 6235
Joined: Mon Dec 17, 2007 4:00 am
Platform: Mac + Windows
Location: East-Be-Jesus-Nowhere SC, USA

Thu Apr 19, 2012 4:55 pm Post

xiamenese wrote:
Jaysen wrote:... Once the computers start to "think" in a decisive manner, what use do humans really serve?

And yes, I am serious.


Jaysen, you must be. That, it seems to me, must be a disturbing thought for you. :)

X

And a painful one too! I don't like this "serious" stuff at all. Way too scary.

As I see it, the problem (yours, mine, PJS's, Siren's, maybe kewms's; I always hate to lump her in with me) isn't about creativity but about a false sense of equality. All these efforts to standardize claim a fairness that is, by its very definition, unfair. Allow me to ramble for a bit.

Is it unfair that Michael Jordan is a better basketball player than me? Is it fair that I am a better "computer nerd" than Mr K? Is it fair that Mr X is a better linguist than my daughter? Is it fair that Mr K is a better humorist than … You get my point.

The real value in education, art, science, and everything I can think of that isn't economics, is inequality as demonstrated in an individual's ability to excel or fail. By being better than me in the art and science of linguistics, Mr X establishes a unique identity that makes him of value. My seemingly instinctual understanding of computer systems makes me a unique value in my sphere, in some ways of more value than Mr X, but in other ways of less value. Neither one of us would be of any value on a basketball court, especially compared with Mr Jordan, but then how would he talk to folks in China or design a complex integration between an internet front end and a mainframe?

All these attempts at standardized grading through AI reduce mankind to a base point that is nothing more than a parrot taught to repeat the mantra of some ruling body (school councils in this particular case). The very complexity of mankind would seem to scream "WE CAN NOT BE LUMPED TOGETHER AS A HOMOGENEOUS MASS OF GREY MATTER!!!", and that seems so obvious to me. Either I am truly missing the bigger picture or there is a real problem with a global society that wants us all to be "the same".

For those that want to bring up "factual instruction", I would counter that facts are only of value as a basis for complex mental exercise. All the important "facts" should be learned by 3rd, maybe 4th year. After that point education should become an abstract analysis and modeling of our world as seen through history, art, science and mathematics. Once you venture into the idea that we are no longer teaching "facts", the very idea that you can standardize the grading of essays becomes a cruel joke.

Notice that I haven't even started on the AI problem yet?

I can sum it up in a quick paraphrase of every AI doomsday film: The most destructive creature on earth is mankind; AI would need to protect us and itself from mankind. Thus AI would arrive at the conclusion that mankind would need to be contained or eliminated. Try to prove that wrong.
Jaysen

