Beyond Turnitin and anti-plagiarism software
THATCamp Leadership 2013 (The Humanities and Technology Camp), Text Mining. Posted Sun, 06 Oct 2013. http://leadership2013.thatcamp.org/2013/10/06/beyond-turnitin-and-anti-plagiarism-softwares/

In my university, the European University Institute (EUI) in Florence, Italy, the Dean of Studies and the Academic Service recently decided to introduce the systematic use of anti-plagiarism software. The aim is for individual Ph.D. researchers to check the chapters and drafts of their dissertation throughout the four-year research and writing process and to verify the originality of the contents. We want to avoid having researchers shamed and expelled from the community of scholars, like this student in Norway!

So, at the end of the process, when the thesis is submitted, each supervisor should perform this new anti-plagiarism check directly on the manuscript of his or her supervisee. This task, and the instruments available to perform it, are evidence of the worldwide shift towards the digital. It is taken for granted that everything we write is somewhere in virtual space and can be retrieved and analyzed, so that we avoid using someone else’s ideas without acknowledgment. This is an extraordinary shift in the humanities towards “other” humanities. In a way, it introduces a bit of digital humanities for everybody!

At the EUI, this task, which used to be performed by the staff of the Dean of Studies and the Academic Service, now has to be performed directly by the thesis supervisor before the department officially decides that a candidate may submit a thesis for discussion with the jury. Turnitin has been chosen as the software, and new administrative rules have been introduced on how to use it. Now scholars on both sides of the Ph.D. writing process, the one who writes the thesis and the one who supervises it, are involved with digital tools. This is something that never happened before.

Introductory courses on plagiarism, originality checking, good academic practice and, finally, Turnitin itself have been organized for the first time this academic year (2013-2014) for all new doctoral researchers. As History Information Specialist, I was asked to contribute both to the general discussion about plagiarism and to guidance on the correct use of quotations in one’s own research and writing. As far as the history department is concerned, I am helping to prepare all its members (researchers, fellows and professors) to understand how they should proceed with the software, and I will teach some Atelier Multimédia courses about it. To do so, I would like to have the input of THATCamp Leadership. The first introductory course, on the 8th of October, will be about Good Academic Practice and the Avoidance of Plagiarism. But it is not this specific contribution, in the EUI context, that I would like to question here. I would like to bring to the attention of THATCamp Leadership participants the many queries and reflections on the use of such software that challenged, at least for me, a “simple” task: showing how to use Turnitin. This task became more complicated than I thought. I started to think beyond plagiarism and to look at what an “originality check” means in a new digital scholarly process in the Humanities and History. What could we all do with Turnitin? And, taking for granted that all EUI scholars will have to use it, what should I tell those who have never used any software before?

So my proposal to THATCamp Leadership is to look at this software (and other similar software) from a different viewpoint. When teaching how to use plagiarism software, is it possible to let our community of humanists and social scientists integrate “text-mining”, one of the most important methods to have enriched document retrieval and document analysis in the Digital Humanities? Here are some possible issues to discuss during THATCamp:

  • Turnitin is anti-plagiarism software. Are there any other programs you would recommend, and why? Anything in the OA/OS world?
  • Do you use this software only for originality checking and fighting plagiarism?
  • Which other tasks could it perform? Does it let us learn more, and more easily, about the contents of the deep web? And if so, how and why?
  • How could we trace the originality of translated texts (from English into other languages and vice versa), using corpora in different languages?
  • Could we use Turnitin to understand who is quoting what and in which contexts, alongside the many other ways we interact with big commercial online textual databases like EEBO, ECCO, MOMW I & II, etc., or with open access web databases like Rousseau online?
  • To what extent would these textual databases, accessed through Turnitin, allow contextualized keyword searching, similarity searching, frequency searching, etc., so that we can find out whether a quotation we plan to use has already been used, entirely or partially, in other writings, and how, where and by whom?
  • Could we perform with Turnitin a much more complex citation search than the one we have been able to perform for years with the Web of Knowledge (ISI), where, looking at the footnotes in a scholarly paper, we deduce that somebody who uses the same quotations may be researching in the same field and have similar ideas?
  • Which text-mining activities does this software allow, if we accept that Turnitin is a good Digital Humanities tool, able to perform some of the most important tasks on large amounts of data: distant/close reading, searching for contexts, tracing the origin of quotations, locating words in millions of documents?
  • And, as a consequence, could we discuss whether this is not only about plagiarism, and whether this kind of software may become a vector for introducing wider communities, not only the digital humanities community, to new ways of carrying out their research? In daily research activity, and even without our knowing it, does it address some characteristics of both the linguistic turn and the digital turn, if we may use big concepts?
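
To make the “similarity searching” mentioned in the list above more concrete for colleagues who have never used such tools, here is a minimal Python sketch of one common text-mining technique: comparing two texts by the word sequences (n-grams) they share. This is only an illustration under my own assumptions; it is not how Turnitin works internally, which I do not know. The function names and the two sample texts are hypothetical.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of lowercase word n-grams found in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_passages(text_a, text_b, n=5):
    """Return the n-grams that occur in both texts, as readable strings."""
    common = word_ngrams(text_a, n) & word_ngrams(text_b, n)
    return sorted(" ".join(gram) for gram in common)


def jaccard_similarity(text_a, text_b, n=5):
    """Share of n-grams common to both texts (0.0 = none, 1.0 = identical sets)."""
    a, b = word_ngrams(text_a, n), word_ngrams(text_b, n)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical sample texts: a draft sentence and a source it quotes.
draft = "As Rousseau wrote, man is born free, and everywhere he is in chains."
source = ("Man is born free, and everywhere he is in chains. "
          "One thinks himself the master of others.")

for passage in shared_passages(draft, source):
    print(passage)
print(jaccard_similarity(draft, source))
```

The shared n-grams point to where a quotation is reused, and the similarity score gives a rough measure of overlap. A sketch like this only matches identical wording within one language; it would not detect translated or paraphrased passages, which is exactly the difficulty raised in the questions above.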

Turnitin seems to be an instrument that allows new digital experiments, unfortunately with some technical limitations. Our session could take a critical look at the systematic introduction of these tools in universities worldwide: now that you know how to use it and what is in it, which tasks do you think you could perform with such a tool? In what ways could this instrument become useful to you? And, perhaps the most important question: in a global world where digital documents and primary sources are not all written in English, how could these experiments with digital texts take account of different cultural and linguistic frameworks?
