Source: http://davideharrington.com/?p=594
Nearly everything we buy nowadays is electronically scanned to ensure that we paid for the items in our bags—and under our coats. Stores began using sensor tags and security screens in the early 1970s. According to the New York Times, the market for anti-theft systems grew rapidly because they were viewed as “more reliable and less expensive” than having employees watch customers.
Students are being scanned as well to make sure that the words in their papers were not swiped from other sources. Scanning papers began a decade ago when anti-plagiarism software was created to compare the phrases of student papers with other sources. The leading anti-plagiarism software is Turnitin, which compares student papers with academic journals, Internet web pages and its library of previously submitted papers. On its home page, Turnitin quotes an instructor as saying, “I used to spend hours on Google searching for unusual wording when I suspected that the paper was not written by the student. Now, I can search quickly with Turnitin!”
Scanning store customers and scanning student papers are both touted as substitutes for labor, so that clerks and instructors can spend less time guarding against thievery and more time doing what they do best: serving customers and teaching students. Sounds great; sounds efficient; sounds easy!
In the years before sensor tags and security screens, the battle against shoplifters was waged with security guards and convex mirrors. An expert on store security—quoted in Shoplifting: A Social History—argues that hiring security guards may have actually increased shoplifting because other employees were likely to think, “Pete is here, so I don’t have to watch out for shoplifters.”
Today, store clerks may think, “our stuff is tagged, so I don’t have to watch out for shoplifters.” Indeed, stores have begun to question whether the substitution of security systems for labor has gone too far, reenlisting labor by having employees greet customers. On a recent shopping trip, my daughter Emma and I were greeted by a handsome teenager at Abercrombie, a well-dressed woman at J. Crew and an elderly guy at Wal-Mart. Sure, the first two were modeling clothes. (I hope the last guy wasn’t modeling them for me!) But, their real job is to make eye contact with customers to deter shoplifters.
Similarly, teachers might think, “I’m using Turnitin, so I don’t have to watch out for plagiarists.” The instructor quoted on Turnitin’s website certainly thinks so, implicitly arguing that Turnitin is a perfect substitute for her own investigations using Google. Not surprisingly, Turnitin encourages this belief. On its website—right next to her quote—Turnitin advertises that it has crawled and indexed “14+ billion web pages.” Choosing between Turnitin and instructor investigations seems like a no-brainer.
But wait, how many web pages are there on the Internet?
A few years ago, Google announced that it had crawled and indexed a trillion web pages. That makes Turnitin’s crawlers look puny: 14 billion pages is only 1.4 percent of the trillion that Google’s crawlers have indexed.
I wanted to test Turnitin but needed a suspicious manuscript. I had one in my hands—Shoplifting: A Social History. I suspected Kerry Segrave of plagiarism when I heard echoes of his book while reading New York Times articles he cites. He cites a lot of them—35 in the first 14 pages! To investigate my suspicions, I created a document containing the 14 pages stripped of direct quotations and another one containing the New York Times articles. I began by searching for identical phrases (of at least 6 words) in the two documents using the open source software Copyfind, which highlighted the matches it found in each document and produced the metric that 15 percent of the early pages of Shoplifting were taken verbatim from the New York Times. (Here is the document that highlights the matches.)
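The verbatim measure described above can be sketched in a few lines of Python. This is only an illustration of the idea (shared runs of at least six words), not Copyfind's actual algorithm; the function names and the sample sentences in the usage note are hypothetical, and real tools handle punctuation, near-matches and reporting far more carefully.

```python
import re

def words(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def shingles(tokens, n):
    """Return the set of every run of n consecutive words."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def verbatim_share(suspect, source, n=6):
    """Fraction of the suspect's words that sit inside a run of
    at least n words that also appears, unchanged, in the source."""
    s_tokens = words(suspect)
    source_runs = shingles(words(source), n)
    covered = [False] * len(s_tokens)
    for i in range(len(s_tokens) - n + 1):
        if tuple(s_tokens[i:i + n]) in source_runs:
            for j in range(i, i + n):
                covered[j] = True
    return sum(covered) / len(s_tokens) if s_tokens else 0.0
```

Calling `verbatim_share(book_pages, newspaper_articles)` on two strings returns a number between 0 and 1, comparable in spirit to the 15 percent figure above. A single changed word inside a run breaks the match, which is why this kind of measure catches only copy-and-paste plagiarism.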
But this measure captures only the most flagrant form of plagiarism, where passages are copied from one document and pasted unchanged into another. Just as shoplifters slip the goods they steal under coats or into pocketbooks, most plagiarists tinker with the passages they copy before claiming them as their own. In other words, they cloak their thefts by scrambling the passages and right-clicking on words to find synonyms. This isn’t writing; it is copying, cloaking and pasting; and it’s plagiarism.
Kerry Segrave is a right-clicker, changing “cellar of store” to “basement of shop.” Similarly, he changes goods to items, articles to goods, accomplice to confederate, neighborhood to area, and women to females. He is also a scrambler, changing “accidentally fallen” to “fallen accidentally,” “only with” to “with only,” and “Leon and Klein” to “Klein and Leon.” And he scrambles phrases within sentences; in other words, the phrases of his sentences are sometimes scrambled.
I spent hours comparing the two documents, matching phrases and highlighting the ones that were copied, cloaked and pasted into Shoplifting. My estimate is that 32 percent of the early pages of Shoplifting are taken nearly verbatim from the New York Times. (Here is the document that highlights the matching phrases.)
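I did this comparison by hand, but the two cloaking tricks described above, synonym swaps and reordering, can be partly undone in code. The sketch below is a rough illustration under stated assumptions, not the method I used and not Turnitin's: the `CANON` table is a hypothetical, hand-built synonym list drawn from the examples in this post, and word order is ignored within short windows of four words.

```python
import re

# Hypothetical synonym table built from the swaps noted in this post.
# Each cloaked word maps back to one canonical form.
CANON = {
    "basement": "cellar", "shop": "store",
    "items": "goods", "articles": "goods",
    "confederate": "accomplice", "area": "neighborhood",
    "females": "women",
}

def canonical_words(text):
    """Lowercase, tokenize, and replace known synonyms."""
    return [CANON.get(w, w) for w in re.findall(r"[a-z']+", text.lower())]

def bag_keys(tokens, n):
    """Every window of n words, sorted so that word order is ignored."""
    return {tuple(sorted(tokens[i:i + n])) for i in range(len(tokens) - n + 1)}

def cloaked_share(suspect, source, n=4):
    """Fraction of the suspect's words lying in an n-word window that
    matches a source window after synonym and order normalization."""
    s = canonical_words(suspect)
    source_keys = bag_keys(canonical_words(source), n)
    covered = [False] * len(s)
    for i in range(len(s) - n + 1):
        if tuple(sorted(s[i:i + n])) in source_keys:
            covered[i:i + n] = [True] * n
    return sum(covered) / len(s) if s else 0.0
```

With these assumptions, “had fallen accidentally into the basement of shop” is fully matched to “had accidentally fallen into the cellar of store.” A short window and a loose synonym list will also flag innocent overlaps, so a human reader still has to judge each match.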
To test Turnitin’s crawlers, I uploaded the document containing the New York Times articles to my website a few months ago. Google now matches many of the plagiarized phrases from Shoplifting to the New York Times articles on my website and some of the phrases to articles in the archives of the paper. Google also matches them to Shoplifting itself, which has been scanned into Google Books.
Turnitin fails to match the plagiarized phrases to any of these sources. I e-mailed Turnitin’s help desk, essentially asking, “What’s going on? Why can’t Turnitin find these things?”
A few hours later, a guy at Turnitin’s product support sent me a detailed answer that boils down to three basic points—the Internet is a big place and it takes our crawlers time to scan it; we can’t scan the New York Times because it requires a subscription; and, we can’t scan images of text like those used by Google Books. In other words, our crawlers are puny compared to Google’s.
I decided to give Turnitin a little help, so I submitted the document containing the New York Times articles as a student paper, causing the file to be catalogued in Turnitin’s library of student papers. This enabled Turnitin to find the file and then to compare Shoplifting with the New York Times articles. It produced an originality report that highlighted matched phrases and concluded that 25 percent of the phrases of Shoplifting were very similar to those of the newspaper articles. (Here is the document that highlights the matching phrases.)
Nearly all the passages highlighted by Turnitin are also highlighted by me. I highlight a few more because my algorithm, embedded in my brain, casts a wider net than the one used by Turnitin. Still, the differences are relatively minor: both present compelling evidence that Shoplifting is an example of Wordlifting.
But Turnitin needed my help to find the original sources of the plagiarized phrases, making it a poor substitute for instructors who are willing to “spend hours on Google searching for unusual wording.” It needs the help of instructors who are willing to investigate suspicious papers; otherwise, greater reliance on Turnitin could lead to more plagiarism.
There are other ways that instructors may change their behavior if they believe that anti-plagiarism software insures them against the risk that their students are plagiarizing. Economists have a fancy name for changes in behavior induced by insurance: moral hazard.
One instructor told me that he used to devote an hour to discussing plagiarism with his class—what it is; why it’s wrong; and, where students go when they get caught. Now, he just tells them that he uses Turnitin and lets them infer that plagiarizing is not worth the penalty. He lauded the change, saying it saved him valuable class time.
Relying on students to weigh the benefits and costs of plagiarism in this way assumes that they are good stewards of their future selves. Just as some shoplifters may give too much weight to the thrill of shoplifting, some students may give too much weight to starting their weekends early.
Instructors may also change the way they write their essay assignments. One of the best ways to suppress plagiarism is to come up with creative assignments that are literally one-of-a-kind. For example, I like to rip mine from the headlines by asking my students to write op-eds on current legislative proposals. If I felt insured against plagiarism, I might not spend hours looking for unusual proposals and instead tell students to write their essays on any topic they found interesting.
Instructors, like all human beings, look for excuses to avoid doing things they don’t want to do. Grading essays is hard—often discouraging—work, so instructors look for excuses to avoid assigning them. One plausible excuse is that plagiarism is rampant, making in-class exams better measures of students’ performance. Anti-plagiarism software may make this excuse less credible, nudging some instructors to assign more essays. Hence, moral hazard can work in the opposite direction, something akin to moral security. Feeling insured against plagiarism, instructors may decide to do the right thing and assign more essays.
Turnitin is also being used to teach the wrong lessons concerning how to write well. Searching Google, I found syllabi of instructors who use Turnitin to teach students how to paraphrase well. In particular, they ask students to check the originality reports of their rough drafts and make any necessary changes to improve their paraphrasing of sources prior to submitting the essays to be graded. In the hands of a skilled instructor, it might teach students how to paraphrase well. But, I think it is more likely to teach students how to right-click words and scramble phrases to get acceptable scores on Turnitin.
I want to teach my students how to write well, not simply paraphrase well. I also fear that copying, cloaking and pasting is endemic. Hence, I would not allow my students to use originality reports to revise their drafts.
But I would have no choice because Turnitin offers another product called WriteCheck that allows students to “check [their] work against the same database as Turnitin.” I signed up and submitted the early pages of Shoplifting. WriteCheck matched many of Shoplifting’s phrases to those of the New York Times articles in its library of student papers. Remember, I submitted them as a student paper to help Turnitin find them; now WriteCheck has them too! WriteCheck warned me that “a significant amount of this paper is unoriginal” and advised me to revise it. After a few hours of right-clicking and scrambling, I resubmitted it and WriteCheck said it was okay, being cleansed of easily recognizable plagiarism.
Turnitin is playing both sides of the fence, helping instructors identify plagiarists while helping plagiarists avoid detection. It is akin to selling security systems to stores while allowing shoplifters to test whether putting tagged goods into bags lined with aluminum thwarts the detectors.
I am not a Luddite. I use an online homework system in many of my courses and I plan to experiment with student response systems. And, I think that anti-plagiarism software is a useful tool, but should be used as a complement to, not a substitute for, instructor investigations of suspicious language, class conversations on plagiarism, and creative essay assignments.
This fall, I plan to say to people, “I’m using anti-plagiarism software, but I’m still watching out for plagiarists.”