In:
Queue, Association for Computing Machinery (ACM), Vol. 12, No. 7 (2014-07), p. 30-41
Abstract:
Open data has tremendous potential for science, but, in human subjects research, there is a tension between privacy and releasing high-quality open data. Federal law governing student privacy and the release of student records suggests that anonymizing student data protects student privacy. Guided by this standard, we de-identified and released a data set from 16 MOOCs (massive open online courses) from MITx and HarvardX on the edX platform. In this article, we show that these and other de-identification procedures necessitate changes to data sets that threaten replication and extension of baseline analyses. To balance student privacy and the benefits of open data, we suggest focusing on protecting privacy without anonymizing data by instead expanding policies that compel researchers to uphold the privacy of the subjects in open data sets. If we want to have high-quality social science research and also protect the privacy of human subjects, we must eventually have trust in researchers. Otherwise, we’ll always have the strict tradeoff between anonymity and science illustrated here.
Type of Medium:
Online Resource
ISSN:
1542-7730, 1542-7749
DOI:
10.1145/2639988.2661641
Language:
English
Publisher:
Association for Computing Machinery (ACM)
Publication Date:
2014
ZDB ID:
2105039-9, 2105043-0