Communications of the ACM

ACM News

DarkBERT AI Was Trained Using Dark Web Data

By Tom's Guide
May 18, 2023

While the researchers don’t have any plans to release DarkBERT to the public, they are accepting access requests for academic purposes.

Credit: Shutterstock

Following the success of OpenAI's ChatGPT, Microsoft's Bing Chat and Google Bard, researchers have created a new AI model with a much darker twist.

While the large language models (LLMs) that power ChatGPT and Google Bard were trained on data from the open web, DarkBERT was trained exclusively on data from the dark web. Yes, you read that correctly: this new AI model was trained on data from hackers, cybercriminals and other scammers.

A team of South Korean researchers has released a paper detailing how they built DarkBERT using data from the Tor network, which is often used to access the dark web. By crawling the dark web and then filtering the raw data, they created a dark web database that they used to train DarkBERT.

From Tom's Guide