Cell Phone Data Mining Using Synthetic Records: Is It Really Safe?

Data Mining

Surprisingly, “One in three consumers now regard their personal information as a tradable commodity, according to stats from a DMA survey of 1,020 adults. These consumers are prepared to share their details for marketing purposes, as long as they trust the brand in question, while others would ‘sell’ their data for a discount” (Charlton, 2012). However, for those of us who do not like to have our private data mined from any source, including mobile phone data, there is an alternative data collection method that aims, at least, to keep our personal information private. It's called synthetic records data mining.

In a recent MIT Technology Review article, How to Mine Cell-Phone Data Without Invading Your Privacy, writer David Talbot gives a thorough explanation of what synthetic records are and how they could play a major role in protecting mobile phone users' private information while still allowing the use of data mining tools. “Researchers at AT&T, Rutgers University, Princeton, and Loyola University have devised a way to mine cell-phone data without revealing your identity, potentially showing a route to avoiding privacy pitfalls that have so far confined global cell-phone data-mining work to research labs” (Talbot, 2013).

So, what are synthetic records and how will they protect your mobile privacy? “The new approach starts by aggregating traces of real human movements, then identifying common locations that might indicate home, work, or school. Next, it creates a set of transportation models. These models generate route tracks of people that the researchers call ‘synthetic,’ because they are merely representative of the aggregate data, and not of actual people” (Talbot, 2013).
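To make the idea more concrete, here is a minimal sketch in Python of the general steps described in the quote: find likely home and work locations, aggregate them, and then generate made-up people from the aggregate. This is not the researchers' actual method. The sample pings, the night-time rule, and every function name below are assumptions invented for illustration, and a toy like this offers no privacy guarantee on its own.

```python
import random
from collections import Counter, defaultdict

# Hypothetical input: (user_id, hour_of_day, cell_id) pings, invented for this sketch.
PINGS = [
    ("u1", 2, "A"), ("u1", 3, "A"), ("u1", 10, "C"), ("u1", 14, "C"),
    ("u2", 1, "A"), ("u2", 23, "A"), ("u2", 11, "D"), ("u2", 15, "D"),
    ("u3", 4, "B"), ("u3", 5, "B"), ("u3", 9, "C"), ("u3", 16, "C"),
]


def is_night(hour):
    """Assumed rule: 22:00 to 05:59 counts as night (likely at home)."""
    return hour >= 22 or hour < 6


def infer_anchor_points(pings):
    """Guess each user's home (most common night cell) and work (most common day cell)."""
    night = defaultdict(Counter)
    day = defaultdict(Counter)
    for user, hour, cell in pings:
        (night if is_night(hour) else day)[user][cell] += 1
    anchors = {}
    for user in night.keys() & day.keys():
        home = night[user].most_common(1)[0][0]
        work = day[user].most_common(1)[0][0]
        anchors[user] = (home, work)
    return anchors


def aggregate(anchors):
    """Drop the user IDs and keep only counts of home cells and work cells."""
    homes = Counter(home for home, _ in anchors.values())
    works = Counter(work for _, work in anchors.values())
    return homes, works


def synthesize(homes, works, n, seed=0):
    """Sample n made-up people whose routes follow the aggregate counts."""
    rng = random.Random(seed)
    home_cells, home_weights = zip(*sorted(homes.items()))
    work_cells, work_weights = zip(*sorted(works.items()))
    people = []
    for i in range(n):
        home = rng.choices(home_cells, weights=home_weights)[0]
        work = rng.choices(work_cells, weights=work_weights)[0]
        # A trivial "transportation model": home, then work, then home again.
        people.append({"id": f"synthetic-{i}", "route": [home, work, home]})
    return people


anchors = infer_anchor_points(PINGS)
homes, works = aggregate(anchors)
synthetic_people = synthesize(homes, works, n=5)
```

Note that the synthetic people are drawn only from the counts, so none of them carries a real user ID. Home and work are also sampled independently here, which is a simplification chosen for the sketch.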


What this means for the consumer is that personal identity remains anonymous while data miners are still able to collect relevant data. However, the use of synthetic records is still vulnerable, and privacy cannot be guaranteed. “But building in guaranteed privacy protections represents the toughest hurdle to the growing number of research efforts that tap CDRs. Even if such records are stripped of names and numbers, the identity of the person can often be revealed through other means. For example, a single cell-tower ping at 4:12 a.m. could be connected to a public tweet made at 4:12 a.m. that includes the location and identity of the tweeter. Similar risks crop up for data belonging to people who live in a remote area or have unusual home-work commuting patterns” (Talbot, 2013).
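The tweet example in the quote can be sketched in a few lines of Python. This is only an illustration of the linking risk, not a real attack tool: the records, the pseudonyms, the account names, and the assumption that a post's location has already been mapped to a cell tower are all made up for the example.

```python
from datetime import datetime, timedelta

# Hypothetical "anonymized" pings: names and numbers removed, pseudonyms kept.
ANON_PINGS = [
    {"pseudonym": "x91", "time": datetime(2013, 5, 13, 4, 12), "cell": "tower-17"},
    {"pseudonym": "k02", "time": datetime(2013, 5, 13, 9, 30), "cell": "tower-03"},
]

# Hypothetical public posts, with location assumed to be mapped to a tower already.
PUBLIC_POSTS = [
    {"author": "@night_owl", "time": datetime(2013, 5, 13, 4, 12), "cell": "tower-17"},
    {"author": "@commuter", "time": datetime(2013, 5, 13, 18, 5), "cell": "tower-03"},
]


def link(pings, posts, window=timedelta(minutes=1)):
    """Match a pseudonym to an author when exactly one post shares its tower and time."""
    matches = {}
    for ping in pings:
        candidates = {
            post["author"]
            for post in posts
            if post["cell"] == ping["cell"]
            and abs(post["time"] - ping["time"]) <= window
        }
        if len(candidates) == 1:  # an unambiguous match re-identifies the record
            matches[ping["pseudonym"]] = candidates.pop()
    return matches


linked = link(ANON_PINGS, PUBLIC_POSTS)
```

Here the 4:12 a.m. ping lines up with exactly one public post, so the pseudonym "x91" is tied to a named account. The second ping is at the same tower as a post but at a different time, so it stays unlinked.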

The use of synthetic records seems like a viable alternative to the way personal data is mined at present. Although there are some privacy risks that seem beyond the data miners' control, the process appears to be safer than what is being used today.



Charlton, G. (2012, June 20). Econsultancy. [Infographic]. Retrieved from Consumer attitudes to data privacy: http://econsultancy.com/us/blog/10153-consumer-attitudes-to-data-privacy-infographic

Mente Errabunda. (2011, January 17). [Image]. Retrieved from Minería de datos en la inteligencia de negocios: http://menteerrabunda.blogspot.com/2011_01_01_archive.html

Riberio, R. (2012, July 06). Biztech. [Infographic]. Retrieved from There’s a Thin Line Between Data Love and Hate: http://www.biztechmagazine.com/article/2012/07/theres-thin-line-between-data-love-and-hate-infographic

Talbot, D. (2013, May 13). MIT Technology Review. Retrieved from How to Mine Cell-Phone Data Without Invading Your Privacy: http://www.technologyreview.com/news/514676/how-to-mine-cell-phone-data-without-invading-your-privacy/