
Genetic databases: when your relatives give you away


People who have not undergone genetic testing can often be identified through their DNA using data from open genealogy databases - data which is increasingly used by law enforcement agencies to track down criminals. A US study has found more than 60 per cent of people in the US with European ancestry can be traced through relatives, suggesting a need for rules to ensure genetic privacy and prevent the misuse of information, the authors say. The authors found that once a genetic database covers roughly 2 per cent of a target population, nearly any person within that group could be matched at least at a third cousin level.

Journal/conference: Science

Organisation/s: MyHeritage | Columbia University, USA

Media Release

From: AAAS

Crime and Privacy: Using Consumer Genomics to Identify Anonymous Individuals

Over 60% of individuals in the U.S. with European ancestry – including those who have not undergone genetic testing themselves – can be identified through their DNA using data from open genetic genealogy databases, a new study reports. The results underscore the power of rapidly growing consumer genomic databases and suggest a need for policies designed both to ensure people’s genetic privacy and to prevent the misuse of publicly available genetic information.

Direct-to-consumer genetic testing and related third-party services, particularly those that offer genetic genealogy (the identification of relatives through shared DNA), have seen a meteoric rise in popularity. However, these services are increasingly being used by law enforcement agencies for forensic purposes. Perhaps the most notable recent example is the “Golden State Killer” case, in which a suspect was identified by using crime scene DNA to track down the suspect's genetic relatives in an open consumer genomic database.

To better understand the forensic power such methods have in identifying unknown individuals, Yaniv Erlich and colleagues analyzed a dataset of over 1.2 million anonymous individuals who had undergone commercial sequencing with the consumer genetic provider MyHeritage (a company for which Erlich is the Chief Science Officer). For over 60% of the individuals within the dataset, the dataset also contained a relative whose matching DNA segments corresponded roughly to a third cousin or closer.

Furthermore, using publicly available genealogical records, Erlich et al. demonstrate that once one or more relatives are found, the identity of an individual can be determined through family lineages combined with specific demographic information, such as approximate age or area of residence.

To illustrate this potential, the authors used the method to reconstruct the identity of an anonymous woman whose DNA information was publicly available on the internet. The authors note that their results raise significant privacy concerns and they suggest that reevaluation of current DNA data practices is necessary at both commercial and federal levels. While the data used represents only a small portion of the U.S. population, Erlich et al. found that once a genetic database covers roughly 2% of a target population, nearly any person within that group could be matched at least at a third cousin level. Given the rapid growth of consumer genomics, such possibilities are likely achievable in the near future, according to the authors.
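
The intuition behind the 2% figure can be sketched with a back-of-the-envelope calculation. The Python snippet below is an illustration only and is not the model used by Erlich et al. It assumes that database members are a random sample of the population and that each person has a fixed number of relatives at the third cousin level or closer. The function name `match_probability` and the value of 200 relatives are hypothetical choices made for this example.

```python
def match_probability(coverage: float, relatives: int) -> float:
    """Chance that at least one of a person's relatives is in the database.

    Toy model (an assumption, not the study's method): each relative is
    in the database independently with probability `coverage`.
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    if relatives < 0:
        raise ValueError("relatives must be non-negative")
    # Probability that none of the relatives is in the database,
    # subtracted from 1 to get "at least one".
    return 1.0 - (1.0 - coverage) ** relatives


if __name__ == "__main__":
    # Hypothetical count of third-cousin-or-closer relatives per person.
    assumed_relatives = 200
    for coverage in (0.005, 0.01, 0.02, 0.05):
        p = match_probability(coverage, assumed_relatives)
        print(f"coverage {coverage:.1%}: match probability {p:.1%}")
```

Under these assumptions, the match probability climbs steeply as coverage grows, which is consistent with the authors' point that a database covering a small share of a population can reach nearly everyone in it. The real number of detectable relatives varies from person to person, so the exact values printed here should not be read as results from the study.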

Attachments:

  • AAAS
    Web page
    Paper is open access and available here
  • Cell Press
    Web page
    Related study (paper available here)

News for:

International
