The “Show Me Your Data” session proposal from THATCamp CHNM 2012 summarized the open research notes movement this way:
There has been some move in research to not just publish papers with the final results but to also release the raw data sets and even software for other researchers to verify the results and further discovery. There are even some futuristic claims that the data sets will be viewed as the ultimate results of research and the actual paper will be a secondary product.
That session focused on institutional repositories as ways to present data, but we’d like to focus more on the challenges posed by releasing research data to the public.
What happens when data collected for a monograph is removed from context? Are there different scholarly and interpretive requirements for data presented at a single-record level? When the same data is of interest to scholars and the general public, but the goals of each constituency are radically different, what happens?
We’d like to kick off the conversation by discussing a collaboration in progress. In research for Take Care of the Living, Jeffrey McClurken compiled a database of census and Civil War service records for Pittsylvania County, Virginia. Because this database is of tremendous interest to local historians and genealogists, and because his own family is connected to that county, Ben Brumfield offered to put that database online. However, the process has been challenging: the interests and expectations of the public may be quite different from those of peer researchers, and a database compiled in support of a particular scholarly project turns out to be very different from a general-purpose database compiled for public use.
As a historian and chair of the editorial board of the SC Historical Magazine, I have seen considerable disagreement between what professional historians want published in the journal and what local historians and genealogists would like to see printed. I would very much like to be part of this discussion.
I won’t be at THATCamp but would love to follow this discussion from afar. I’ve been thinking about similar problems arising from my own open notebook experiment, which is developing in the opposite direction: from notes into an article, a book, and other potential outputs.
I definitely want to hear about this, too, at the very least in the Dork Shorts (lightning talks) in the first session. I guess I wonder why the data necessarily has to be removed from its context; can’t it be plastered with indications of where it came from and why it was collected? Sort of like watermarking a photograph?