Common methods of deidentification do not adequately protect data privacy, according to a study by University of Chicago computer scientist Aloni Cohen. The most common techniques rely on k-anonymity: redacting or generalizing just enough data that each individual in the dataset is indistinguishable from at least k-1 others.
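As a rough illustration (not taken from the study), the sketch below shows the idea behind k-anonymity: quasi-identifiers such as ZIP code and age are generalized until every combination of values appears at least k times. The columns, values, and generalization rules are invented for this example.

    # Minimal k-anonymity sketch using pandas; data and rules are illustrative.
    import pandas as pd

    records = pd.DataFrame({
        "zip":   ["60637", "60615", "60637", "60615"],
        "age":   [23, 27, 34, 38],
        "grade": ["A", "B", "C", "A"],
    })

    # Generalize the quasi-identifiers: truncate ZIP codes, bucket ages.
    anon = records.assign(
        zip=records["zip"].str[:3] + "**",
        age=(records["age"] // 10 * 10).astype(str) + "s",
    )

    # A table is k-anonymous when every combination of quasi-identifier
    # values is shared by at least k records.
    k = anon.groupby(["zip", "age"]).size().min()
    print(anon)
    print("k =", k)  # 2 here: each (zip, age bucket) pair covers two records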
Cohen describes two attacks: downcoding, which reverse-engineers the transformations performed on the data, and predicate singling-out, which undermines the data anonymization standard set by the European Union's General Data Protection Regulation. He also combined deidentified data from the edX massive open online course platform with data extracted from resumes posted on LinkedIn to identify people who took edX courses without completing them, a potential violation of federal privacy law.
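In miniature, the linkage step looks like the following sketch: if a combination of attributes is unique in both the deidentified release and a public dataset, a simple join ties a name to the supposedly anonymous record. The tables and column names are invented for illustration; this is not the actual edX or LinkedIn data, nor Cohen's exact procedure.

    # Toy linkage (re-identification) attack with pandas; all data is made up.
    import pandas as pd

    deidentified = pd.DataFrame({
        "country": ["US", "US", "IN"],
        "year_of_birth": [1990, 1985, 1992],
        "course": ["CS101", "CS101", "STAT20"],
        "completed": [False, True, False],
    })

    public_profiles = pd.DataFrame({
        "name": ["A. Smith", "B. Jones"],
        "country": ["US", "IN"],
        "year_of_birth": [1990, 1992],
        "course": ["CS101", "STAT20"],
    })

    # Where the (country, year_of_birth, course) combination is unique in
    # both tables, the join attaches a name to the "anonymous" record.
    linked = deidentified.merge(
        public_profiles, on=["country", "year_of_birth", "course"]
    )
    print(linked[["name", "course", "completed"]])
    # Both matched records show completed == False, i.e., non-completion.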
"If what you want to do is take data, sanitize it, and then forget about it — put it on the Web or give it to some outside researchers and decide that all your privacy obligations are done — you can't do that using these techniques," Cohen says.
From University of Chicago
Abstracts Copyright © 2022 SmithBucklin, Washington, DC, USA