An algorithm developed by IBM researchers uses a person's last 200 Twitter posts to predict their home city with nearly 70 percent accuracy.
The researchers filtered the Twitter channel for tweets that were geotagged with any of the 100 largest U.S. cities between July and August 2011, until they had pinpointed 100 different users in each location. They then downloaded the last 200 tweets posted by each user, rejecting privately posted messages, which left more than 1.5 million geotagged postings from almost 10,000 people. This dataset was divided in two, with 90 percent of the tweets used to train the algorithm and the remaining 10 percent used to test it.
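The 90/10 division described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_tweets`, the random shuffle, and the fixed seed are assumptions, since the abstract does not say how the researchers chose which tweets went into each portion.

```python
import random


def split_tweets(tweets, train_fraction=0.9, seed=0):
    """Shuffle tweets and split them into training and testing portions.

    Hypothetical helper: the shuffle and the seed are assumptions made
    for illustration, not details reported by the researchers.
    """
    shuffled = list(tweets)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Calling `split_tweets(all_tweets)` returns a pair of lists, the first holding roughly 90 percent of the items and the second the rest, with no tweet appearing in both.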
The algorithm's underlying concept is that tweets contain key details about the user's likely whereabouts. The researchers also say that the distribution of tweets throughout the day is roughly consistent across the country, so the timing of a user's tweets can offer solid clues to that user's time zone.
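The time-zone clue can be illustrated with a small Python sketch. It assumes a reference profile of tweet volume by local hour and picks the U.S. time zone whose offset best aligns a user's posting hours with that profile. The profile values, the standard-time offsets, the L1 distance measure, and the name `guess_time_zone` are all assumptions for illustration; the abstract does not describe the researchers' actual method at this level of detail.

```python
from collections import Counter

# Hypothetical reference profile: relative tweet volume for each local
# hour (0-23). These numbers are invented for illustration only.
REFERENCE_PROFILE = [2, 1, 1, 1, 1, 2, 4, 6, 8, 9, 9, 10,
                     11, 10, 9, 9, 10, 11, 12, 13, 14, 12, 8, 4]

# Standard-time UTC offsets for the four contiguous U.S. time zones
# (daylight saving time is ignored to keep the sketch simple).
US_UTC_OFFSETS = {"Eastern": -5, "Central": -6, "Mountain": -7, "Pacific": -8}


def normalize(counts):
    """Scale a list of counts so that it sums to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one tweet to build a histogram")
    return [c / total for c in counts]


def hourly_histogram(utc_hours):
    """Return the fraction of tweets posted in each UTC hour (0-23)."""
    counts = Counter(h % 24 for h in utc_hours)
    return normalize([counts.get(h, 0) for h in range(24)])


def guess_time_zone(utc_hours, profile=REFERENCE_PROFILE, offsets=US_UTC_OFFSETS):
    """Pick the zone whose offset best aligns the user with the profile."""
    user = hourly_histogram(utc_hours)
    reference = normalize(profile)

    def distance(offset):
        # Local hour = UTC hour + offset; compare with an L1 distance.
        return sum(abs(user[u] - reference[(u + offset) % 24]) for u in range(24))

    return min(offsets, key=lambda name: distance(offsets[name]))
```

Given a list of the UTC hours at which a user posted, `guess_time_zone` returns one of the zone names in `US_UTC_OFFSETS`. A real system would need a profile estimated from data and would combine this signal with the content of the tweets.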
Testing the algorithm on the remaining data showed that it correctly predicted tweeters' home cities 68 percent of the time, their home states 70 percent of the time, and their time zones 80 percent of the time once obvious travelers were excluded.
From Technology Review
Abstracts Copyright © 2014 Information Inc., Bethesda, Maryland, USA