Friday, June 15, 2012

Massive Online Collaboration

I've been watching TED talks over lunch at work lately (thank you, wireless!), and this one in particular caught my eye because of the way it combines necessary, repetitive tasks with the cultural benefits of crowdsourcing.




There are some good museum projects out there that use crowdsourcing to enhance their records - this post from the Center for the Future of Museums blog talks about a great project at the United States Holocaust Memorial Museum - but nothing on this scale. Another of my favorites is Find a Grave, which combines the efforts of really hardcore researchers - who will spend entire days recording cemeteries - with casual genealogists, who may know specific information about a few people. I know I have used it to my benefit in researching my own ancestors, and have felt the pull to contribute.

In my current work at an archive, I can't help but wonder if there isn't some kind of automated system that could help us transcribe a recently digitized correspondence collection. Transcription of handwritten documents is a difficult skill, and requires more effort than the Captcha system that von Ahn describes, but I believe that some of his principles can be carried over. (Many institutions have already done this, in fact - here's a good review of a panel at AHA this year that addressed the question.)

In particular, I liked the way he addressed safeguards. The great fear in relying on the unnamed public to produce historical information is that they won't do it "right" - fill in the definition of right according to your institution. The Captcha project requires ten correct identifications of a word, plus a confirmed correct transcription of a known Captcha.

I believe the greatest hurdle to accomplishing a project with this level of reliance on public aid is that fear. Will they do it right? What if they're better at it than we are? What will they need us for? In particularly sensitive collections, housed in institutions that are affiliated with a movement or with a person - what if they don't interpret it right? What if they use these words for ill?

My ideal system would have each document transcribed multiple times, by multiple people, with a back-end system that compares those transcriptions and flags disagreements for a more experienced reviewer. More experienced reviewers could be users who have demonstrated a high level of accuracy through previous interactions with the system, or paid staff of the institution.
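To make that a little more concrete, here is a minimal sketch in Python of what the comparison step might look like. It is only an illustration of the idea, not a description of any existing system: the function names, the pairwise-similarity approach, and the 0.9 threshold are all my own assumptions, and a real project would need to tune them against its own documents.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Collapse whitespace and case so trivial differences are ignored."""
    return " ".join(text.lower().split())


def review_transcriptions(transcriptions, threshold=0.9):
    """Compare every pair of transcriptions of one document.

    Returns (needs_review, lowest_similarity). The 0.9 threshold is an
    arbitrary placeholder, not a recommended value.
    """
    cleaned = [normalize(t) for t in transcriptions]
    if len(cleaned) < 2:
        # A single transcription cannot be cross-checked, so flag it.
        return True, 0.0
    lowest = min(
        SequenceMatcher(None, a, b).ratio()
        for a, b in combinations(cleaned, 2)
    )
    return lowest < threshold, lowest


if __name__ == "__main__":
    # Hypothetical volunteer submissions for the same letter.
    submissions = [
        "Dear Mary, I arrived safely on Tuesday.",
        "Dear Mary, I arrived safely on Tuesday.",
        "Dear Mary, I arrived safely on Thursday.",
    ]
    needs_review, score = review_transcriptions(submissions)
    print("Send to reviewer:", needs_review, "lowest similarity:", round(score, 2))
```

Anything flagged this way would go into a queue for the experienced reviewers; anything that passes could be accepted as is, or spot-checked at whatever rate the institution is comfortable with.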

My own preference is to throw everything wide open, invite everyone to a big huge party, get them excited, and thank them for their help, showing them the incredible contribution they're making to history. I love amateur historians. I love reenactors. There are so many people who are so enthusiastic about history but who are blown off because they don't do it "right." At the same time, I am always mindful that my attitude is not shared, and that I can alienate colleagues whose help I need when I push hard to open things up so dramatically.
