Wednesday, February 15, 2006

US Plans Massive Data Sweep

Thursday, February 9, 2006. Excerpted from the Christian Science Monitor

by Mark Clayton

The US government is developing a massive computer system that can collect huge amounts of data and, by linking far-flung information from blogs and e-mail to government records and intelligence reports, search for patterns of terrorist activity. The system - parts of which are operational, parts of which are still under development - is already credited with helping to foil some plots. It is the federal government's latest attempt to use broad data-collection and powerful analysis in the fight against terrorism. But by delving deeply into the digital minutiae of American life, the program is also raising concerns that the government is intruding too deeply into citizens' privacy.

"We don't realize that, as we live our lives and make little choices, like buying groceries, buying on Amazon, Googling, we're leaving traces everywhere," says Lee Tien, a staff attorney with the Electronic Frontier Foundation. "We have an attitude that no one will connect all those dots. But these programs are about connecting those dots - analyzing and aggregating them - in a way that we haven't thought about. It's one of the underlying fundamental issues we have yet to come to grips with." The core of this effort is a little-known system called Analysis, Dissemination, Visualization, Insight, and Semantic Enhancement (ADVISE). Only a few public documents mention it. ADVISE is a research and development program within the Department of Homeland Security (DHS), part of its three-year-old "Threat and Vulnerability, Testing and Assessment" portfolio. The TVTA received nearly $50 million in federal funding this year. DHS officials are circumspect when talking about ADVISE. "I've heard of it," says Peter Sand, director of privacy technology. "I don't know the actual status right now. But if it's a system that's been discussed, then it's something we're involved in at some level."

A major part of ADVISE involves data-mining - or "dataveillance," as some call it. It means sifting through data to look for patterns. If a supermarket finds that customers who buy cider also tend to buy fresh-baked bread, it might group the two together. To prevent fraud, credit-card issuers use data-mining to look for patterns of suspicious activity.

What sets ADVISE apart is its scope. It would collect a vast array of corporate and public online information - from financial records to CNN news stories - and cross-reference it against US intelligence and law-enforcement records. The system would then store it as "entities" - linked data about people, places, things, organizations, and events, according to a report summarizing a 2004 DHS conference in Alexandria, Va. The storage requirements alone are huge - enough to retain information about 1 quadrillion entities, the report estimated.
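The supermarket example above (cider and fresh-baked bread) is the classic case of co-occurrence mining. The sketch below is my own illustration, not anything from the article or from ADVISE: the baskets are invented, and `pair_counts` and `confidence` are hypothetical helper names. It simply counts which items show up together and how often one purchase predicts another.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented purely for illustration.
baskets = [
    {"cider", "bread", "cheese"},
    {"cider", "bread"},
    {"milk", "bread"},
    {"cider", "bread", "apples"},
    {"milk", "eggs"},
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order, e.g. ("bread", "cider").
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def confidence(baskets, item_a, item_b):
    """Share of baskets containing item_a that also contain item_b."""
    with_a = [b for b in baskets if item_a in b]
    if not with_a:
        return 0.0
    return sum(1 for b in with_a if item_b in b) / len(with_a)

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
print(confidence(baskets, "cider", "bread"))
```

In this toy data every cider buyer also bought bread, which is the kind of pattern a shop would act on. The same counting works on any records that can be grouped into "baskets", which is why scope matters so much.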

But ADVISE and related DHS technologies aim to do much more, according to Joseph Kielman, manager of the TVTA portfolio. The key is not merely to identify terrorists, or sift for key words, but to identify critical patterns in data that illumine their motives and intentions, he wrote in a presentation at a November conference in Richland, Wash.

While privacy laws do place some restriction on government use of private data - such as medical records - they don't prevent intelligence agencies from buying information from commercial data collectors. Congress has done little so far to regulate the practice or even require basic notification from agencies, privacy experts say.

Indeed, even data that look anonymous aren't necessarily so. For example: With name and Social Security number stripped from their files, 87 percent of Americans can be identified simply by knowing their date of birth, gender, and five-digit Zip code, according to research by Latanya Sweeney, a data-privacy researcher at Carnegie Mellon University.

In a separate 2004 report to Congress, the GAO cited eight issues that need to be addressed to provide adequate privacy barriers amid federal data-mining. Top among them was establishing oversight boards for such programs.
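The point above about date of birth, gender, and Zip code is easy to try on a small scale. This is a rough sketch of my own, not Sweeney's method or data: the records are made up, and `unique_fraction` and `QUASI_IDENTIFIERS` are hypothetical names. It measures what share of "anonymized" records are the only one with their particular combination of those three fields, and are therefore open to re-identification by anyone who knows those facts about a person.

```python
from collections import Counter

# Hypothetical "anonymized" records: names and Social Security numbers
# are already stripped. All values are invented for illustration.
records = [
    {"dob": "1970-03-14", "gender": "F", "zip": "15213"},
    {"dob": "1970-03-14", "gender": "F", "zip": "15213"},
    {"dob": "1982-11-02", "gender": "M", "zip": "15213"},
    {"dob": "1965-07-30", "gender": "F", "zip": "99352"},
    {"dob": "1990-01-21", "gender": "M", "zip": "22314"},
]

QUASI_IDENTIFIERS = ("dob", "gender", "zip")

def unique_fraction(records, keys=QUASI_IDENTIFIERS):
    """Fraction of records whose combination of `keys` appears exactly once."""
    if not records:
        return 0.0
    # Group records by their quasi-identifier values.
    groups = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if groups[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

print(unique_fraction(records))                    # all three fields
print(unique_fraction(records, keys=("gender",)))  # gender alone
```

Gender alone singles out nobody in this toy set, but the three fields together single out three of the five records. No single field gives anyone away; the combination does.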

3 comments:

Baconeater said...

I just found out what my site meter can do. It is fascinating, but I didn't realize how much info we divulge just by surfing. For example, I know the links that people use to get to my blog, where their server is, their IP addy (not that I have a clue what to do with it, but I'm sure many people do), and what time they viewed my blog.
You should get one; it doesn't look like you have one. You don't have to show how many hits you have: http://www.sitemeter.com/?a=home

Up until a few days ago, I just thought it was a pretty good estimate of my blog readers, and that is it.

JR said...

About ten years ago I met some people who lived "off the grid." They had children and never registered the births or enrolled them in school. They only worked for cash and bartered, and they lived on other people's property so that they wouldn't have a paper trail for rents/mortgages, utilities, etc. They didn't pay taxes or possess driver's licenses or insurance, among a whole host of other things. I thought they were terribly paranoid about the government knowing too much about its citizens and being able to pinpoint where individuals were in this country. Okay, I still think they're extreme and paranoid, but I must admit that my doubts are beginning to grow.

CyberKitten said...

V V said: I must admit that my doubts are beginning to grow.

Mine too. Have you read 'The Traveller' by John Twelve Hawks? If you weren't paranoid before you read it - you will be afterwards. Recommended.