Researchers at Vanderbilt University have created an algorithm designed to protect patient privacy while preserving researchers' ability to analyze vast amounts of genetic and clinical data. Such analyses are used to find links between diseases and specific genes, or to understand why patients can respond so differently to treatments.
Medical records hold all kinds of information about patients, from age and gender to family medical history and current diagnoses. The increasing availability of electronic medical records makes it easier to group patient files into huge databases where they can be accessed by researchers trying to find associations between genes and medical conditions--an important step on the road to personalized medicine. While the patient records in these databases are "anonymized," or stripped of identifiers such as name and address, they still contain the numerical codes, known as diagnosis codes or ICD codes, that represent every condition a doctor has detected.
The problem is, it's not all that difficult to follow a specific set of codes backward and identify a person, says Bradley Malin, an assistant professor of biomedical informatics at Vanderbilt University and one of the algorithm's coauthors. In a paper published online today in the Proceedings of the National Academy of Sciences, Malin and his colleagues found that they could identify more than 96 percent of a group of patients based solely on their particular sets of diagnosis codes. "When people are asked about privacy priorities, their health data is always right up there with information about their finances," says Malin--and for good reason.
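The re-identification risk Malin describes comes down to how many patients have a combination of diagnosis codes that no one else in the database shares. The following is a minimal Python sketch of that uniqueness check, not the method used in the paper. The function name `unique_fraction`, the record IDs, and the toy code sets are all made up for illustration.

```python
from collections import Counter


def unique_fraction(records):
    """Return the fraction of records whose set of diagnosis codes is unique.

    `records` maps a hypothetical record ID to a list of diagnosis codes.
    """
    # A frozenset ignores code order and repeated codes within one record.
    signatures = [frozenset(codes) for codes in records.values()]
    counts = Counter(signatures)
    # A record is uniquely identifiable here if no other record shares its set.
    unique = sum(1 for sig in signatures if counts[sig] == 1)
    return unique / len(signatures) if signatures else 0.0


# Toy data for illustration only; not taken from the study.
records = {
    "r1": ["250.00", "401.9"],
    "r2": ["401.9", "250.00"],            # same set as r1, so neither is unique
    "r3": ["250.00", "401.9", "493.90"],
    "r4": ["272.4"],
}

print(unique_fraction(records))
```

In this toy example two of the four records have a code set shared by no other record, so anyone who knows a patient's diagnoses could pick out that patient's "anonymized" record.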
From Technology Review