As they walked along the busy, yellow-lit tiers of offices, Anderton said: “You’re acquainted with the theory of precrime, of course. I presume we can take that for granted.” — Philip K. Dick, The Minority Report
From COINTELPRO to the illegal targeting of antiwar activists and Muslim-Americans, the FBI is America’s premier political police agency. And now, from the folks who brought us Wi-Fi hacking, viral computer spyware and al-Qaeda triple agent Ali Mohamed comes the Bureau’s Department of Precrime!
A chilling new report by the Electronic Frontier Foundation (EFF) reveals the breadth and scope of the FBI’s Investigative Data Warehouse (IDW), the Bureau’s massive data-mining project.
With more than a billion records “many of which contain information on American citizens,” EFF is calling on Congress to demand FBI accountability and strict oversight of this Orwellian project. By all accounts IDW is huge and growing at a geometric pace. According to the Bureau’s own narrative,
The IDW received its initial authority to operate in September 2005, and successfully completed a Federal Information Security Management Act audit in May 2007. As of September 2008, the IDW had: 7,223 active user accounts; 3,826 FBI personnel trained on the system, and 997,368,450 unique searchable documents. The IDW transitioned to the operations and maintenance phase during FY 2008. (Federal Bureau of Investigation, “Investigative Data Warehouse,” no date)
EFF notes that “the Library of Congress, by way of comparison, has about 138 million (138,313,427) items in its collection.”
Kurt Opsahl, EFF’s Senior Staff Attorney and the author of the new report said: “The IDW includes more than four times as many documents as the Library of Congress, and the FBI has asked for millions of dollars to data-mine this warehouse, using unproven science in an attempt to predict future crimes from past behavior. We need to know all of what’s in the IDW, and how our privacy will be protected.”
In 2008, the National Academy of Sciences’ National Research Council issued a stinging report that questioned the efficacy of data-mining as an investigative tool for combatting terrorism.
That report, “Protecting Individual Privacy in the Struggle Against Terrorists: A Framework for Assessment,” concluded that automated programs such as IDW that collect and mine data should be evaluated for their impact on the privacy rights of citizens. An NRC press release stated candidly:
Far more problematic are automated data-mining techniques that search databases for unusual patterns of activity not already known to be associated with terrorists, the report says. Although these methods have been useful in the private sector for spotting consumer fraud, they are less helpful for counterterrorism precisely because so little is known about what patterns indicate terrorist activity; as a result, they are likely to generate huge numbers of false leads. Such techniques might, however, have some value as secondary components of a counterterrorism system to assist human analysts. Actions such as arrest, search, or denial of rights should never be taken solely on the basis of an automated data-mining result, the report adds.
The committee also examined behavioral surveillance techniques, which try to identify terrorists by observing behavior or measuring physiological states. There is no scientific consensus on whether these techniques are ready for use at all in counterterrorism, the report says; at most they should be used for preliminary screening, to identify those who merit follow-up investigation. Further, they have enormous potential for privacy violations because they will inevitably force targeted individuals to explain and justify their mental and emotional states. (National Academy of Sciences, National Research Council, “All Counterterrorism Programs That Collect and Mine Data Should Be Evaluated for Effectiveness, Privacy Impacts,” Press Release, October 7, 2008)
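The NRC’s warning about “huge numbers of false leads” is at bottom a base-rate argument: when real targets are vanishingly rare, even a very accurate screen flags far more innocent people than guilty ones. A minimal sketch of that arithmetic follows. Every number in it (population, number of targets, hit rate, false-alarm rate) is an illustrative assumption, not a figure from the report or from the FBI, and the `screening_outcomes` helper is hypothetical.

```python
def screening_outcomes(population, true_targets, hit_rate, false_alarm_rate):
    """Return (true_hits, false_leads, precision) for a pattern-matching screen."""
    innocents = population - true_targets
    true_hits = true_targets * hit_rate            # real targets correctly flagged
    false_leads = innocents * false_alarm_rate     # innocent people wrongly flagged
    flagged = true_hits + false_leads
    precision = true_hits / flagged if flagged else 0.0
    return true_hits, false_leads, precision


# Hypothetical inputs (assumptions, not sourced figures): 300 million people,
# 1,000 actual targets, a screen that catches 99% of targets and wrongly
# flags 0.1% of everyone else.
hits, false_leads, precision = screening_outcomes(300_000_000, 1_000, 0.99, 0.001)
print(f"true hits: {hits:,.0f}")
print(f"false leads: {false_leads:,.0f}")
print(f"share of flags that are real: {precision:.2%}")
```

Under these assumed rates the screen produces roughly 300 false leads for every true hit, which is the sense in which such techniques can swamp human analysts rather than assist them.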
Noting that the Bureau is withholding critical information from public scrutiny, and that mining data gleaned from dozens of disparate sources is at the heart of IDW, EFF reports that the FBI “has identified only 38 of the 53 ‘data sources’ that feed into the IDW,” and has refused to hand over the remaining documents sought in a 2006 Freedom of Information Act request.
In a subsequent court action over the Bureau’s document stonewall, the civil liberties group reported that the Department of Justice told the court that “no additional material will be disclosed,” despite Obama administration assertions that it has “new policies on open government.”
Indeed, a May 12, 2005 email obtained by EFF from “an unidentified employee in the FBI’s Office of the General Counsel to FBI General Counsel Valerie Caproni” notes that the author was “nervous about mentioning PIA [Privacy Impact Assessment] in context of national security systems.”
The author admitted that “It is true the FBI currently requires PIAs for NS [national security] systems as well as non-NS systems.” EFF reports that the author “thought that the policy might change.” Accordingly the anonymous writer “recommend[ed] against raising congressional consciousness levels and expectations re NS PIAs.” Caproni’s response is short: “ok.”
However, “congressional consciousness levels” were raised after an August 30, 2006 Washington Post piece exposed the intrusive nature of the IDW system.
The Bureau’s response? Several emails revealed the FBI’s desire to play down privacy concerns, noting cynically: “I’m with [Redacted] in view that if everyone [Redacted] starts running around with their hair on fire on this, they will just be pouring gas on something that quite possibly would just fade away if we just shrug it off.”
Given the corporate media’s snail-like attention span when it comes to anything other than puppies trapped in a well or the shenanigans of various “celebrities,” it’s a sure-fire bet something as mundane as the rights of ordinary citizens “would just fade away.”
IDW: A Web-Based Panopticon and Cash Cow for Corporate Spooks
The Electronic Frontier Foundation’s report, citing the Bureau’s own description, characterizes the Investigative Data Warehouse as “the FBI’s single largest repository of operational and intelligence information.”
In 2005, FBI Section Chief Michael Morehart said that “IDW is a centralized, web-enabled, closed system repository for intelligence and investigative data.” Unidentified FBI agents have described it as “one-stop shopping” for FBI agents and an “uber-Google.” According to the Bureau, “[t]he IDW system provides data storage, database management, search, information presentation, and security services.”
Documents released to EFF show that the FBI began spending funds on IDW in 2002 and that “system implementation was completed in FY 2005.” Version 1.1 was released in July 2004 “with enhanced functionality, including batch processing capabilities.”
But as with all things related to “national security,” early on in the game the FBI forged a “public-private partnership” with spooky corporations in the defense and security industry, including Science Applications International Corporation (SAIC), Convera and Chiliad to develop the project.
As the Project on Government Oversight (POGO) notes in their Federal Contractor Misconduct Database, the San Diego-based SAIC has paid out some $14.5 million in fines on $5.3 billion in revenue largely derived from contracts in the defense, intelligence and security fields.
Misconduct ranged from false claims and defective pricing to conflict of interest violations. Last August, SAIC was forced to drop its bid with the Federal Emergency Management Agency (FEMA) for the agency’s TOPOFF 5 national disaster drill “after allegations of improprieties in the contracting process” were uncovered, according to Washington Technology.
Indeed, SAIC had been hired by the FBI to build an early version of IDW known as the Virtual Case File (VCF). According to Washington Technology, SAIC was contracted by the Bureau in 2001 to build VCF “but pulled the plug in 2005 after realizing the system would not work.”
The 2007 appropriations bill directed the Bureau to “retrieve as much as $104 million from the defaulted VCF contract” and in unusual language for the Senate, “expects FBI to use all means necessary, including legal action, to recover all erroneous charges from the VCF contractor,” Washington Technology revealed.
Federal Computer Week reported in 2005 that Aerospace, an independent contractor hired to evaluate the system, concluded that SAIC “did a poor coding job” and that it was “virtually impossible to update the system.”
Despite these revelations, the San Diego defense and security giant has cornered billions of dollars in new contracts from the Defense, Homeland Security and Justice Departments.
Convera, a Vienna, Virginia corporation, describes itself as “the leading technology provider of intelligent search” and claims it is “an established leader in the business of search technologies.” Apparently, the company is less than sanguine about trumpeting its products for the FBI. A search of their website returned zero hits on the term “FBI-IDW.”
However, as Washington Technology revealed in 2004, Convera won a contract worth more than $2 million to “provide an agency-wide search and discovery platform for the FBI.”
The contract “covers a perpetual license for the company’s RetrievalWare software as the search technology.” The 2004 award was “a follow-on from an earlier contract worth approximately $1.5 million … for search and categorization software for the FBI’s Investigative Data Warehouse,” the technology insider publication reported.
On the other hand Chiliad avers that they will help “organizations realize the full business value of all of their disparate information resources,” and their innovative products “in enterprise search and analysis technology, and virtual information sharing” will “help organizations ‘Connect the Dots’ and arrive at truly actionable intelligence.” In this spirit, Chiliad boasts that the FBI as the lead agency for “domestic counterterrorism” has purchased a “worldwide enterprise license to Chiliad’s software.”
Founded in 1999, the Washington, D.C.-based firm’s customer base includes such spooky corporations as defense giant BAE; Booz Allen Hamilton, described by investigative journalist Tim Shorrock in Spies For Hire as a “revolving door” connecting the corporate security world and agencies such as NSA; General Dynamics; ITT; Northrop Grumman; SAIC and many, many more!
According to EFF, the FBI is busily putting these products to the test.
In addition to storing vast quantities of data, the IDW provides a content management and data mining system that is designed to permit a wide range of FBI personnel (investigative, analytical, administrative, and intelligence) to access and analyze aggregated data from over fifty previously separate datasets included in the warehouse. Moving forward, the FBI intends to increase its use of the IDW for “link analysis” (looking for links between suspects and other people–i.e. the Kevin Bacon game) and to start “pattern analysis” (defining a “predictive pattern of behavior” and searching for that pattern in the IDW’s datasets before any criminal offence is committed–i.e. pre-crime). (Kurt Opsahl, “Report on the Investigative Data Warehouse,” Electronic Frontier Foundation, April 2009)
Accordingly, EFF revealed that then-Assistant Director for the Counterterrorism Division, Willie Hulon said in 2004 that the FBI was “introducing advanced analytical tools” that would “make the most” of IDW data.
Hulon went on to state that when IDW is completed, “Agents, JTTF [Joint Terrorism Task Force] members and analysts,” using the new data-mining technology “will be able to search rapidly for pictures of known terrorists and match or compare the pictures with other individuals in minutes rather than days. They will be able to extract subjects’ addresses, phone numbers, and other data in seconds, rather than searching for it manually. They will have the ability to identify relationships across cases. They will be able to search up to 100 million pages of international terrorism-related documents in seconds.” EFF notes that since 2004, “the number of records has grown nearly ten-fold.”
According to an April 1 press release from the American Civil Liberties Union, FBI Joint Terrorism Task Forces and the related national nexus of Fusion Centers, comprised of the FBI, local police, the military (U.S. Northern Command) and private outfits in the corporate security world, all of which rely heavily on data-mining and link analysis, “have experienced a mission creep in the last several years, becoming more of a threat than a security device.”
Indeed, the ACLU noted that Fusion Centers have routinely targeted activists across the political spectrum, relying on specious data-mining technologies as well as paid provocateurs and informants (HUMINT) that label any and all government critics as “extremists” to be monitored and indexed in national security databases. The civil liberties group averred: “From directing local police to investigate non-violent political activists and religious groups in Texas to advocating surveillance of third-party presidential candidate supporters in Missouri, there have been repeated and persistent disclosures of troubling memos and reports from local fusion centers.”
Since 2004, EFF has identified 38 separate data sources feeding the FBI’s Investigative Data Warehouse. In addition to the FBI’s Automated Case System (ACS), soon to be replaced by the Sentinel Case Management System after the $170 million “Virtual Case File” fiasco briefly described above, IDW compiles information from the following sources:
Secure Automated Messaging Network (SAMNet). SAMNet consists of all message traffic sent to the FBI by the CIA and the Defense Intelligence Agency, including Intelligence Information Reports (IIRs) and Technical Disseminations (TD). This traffic includes information classified Secret but not material designated Top Secret and above, including Sensitive Compartmented Information (SCI), the highest security classification.
Joint Intelligence Committee Inquiry (JICI) Documents of “all FBI documents related to Islamic extremist networks between 1993 and 2002.”
Open Source News, collected from the MiTAP system run by San Diego State University. EFF describes MiTAP as a “system that collects raw data from the internet, standardizes the format, extracts named entities, and routes documents into appropriate newsgroups. This dataset is part of the Defense Advanced Research Projects Agency (DARPA) Translingual Information Detection, Extraction and Summarization (TIDES) Open Source Data project.”
Violent Gang and Terrorist Organization File (VGTOF), provided by the FBI National Crime Information Center (NCIC). It includes “biographical data and photos” of individuals “who the FBI believes to be associated with violent gangs and terrorism.” However, numerous abuses of the VGTOF classification system have been uncovered by the ACLU. According to the ACLU of Colorado, the FBI’s JTTF added anarchists and eight separate categories of “extremists” to the VGTOF, including “environmental extremist” and “Black extremist.” Indeed, Colorado antiwar activist Bill Sulzman, a campaigner against the weaponization of space, was listed in the VGTOF as a “terrorist,” according to an article in the Colorado Springs Independent.
CIA Intelligence Information Reports (IIR) and Technical Disseminations (TD), “designed to provide the FBI with the specific results of classified intelligence collected on internationally-based terrorist suspects and activities, chiefly abroad.”
Eleven IntelPlus scanned document libraries “related to FBI’s major terrorism-related cases.”
Eleven Financial Crimes Enforcement Network (FinCEN) Databases.
Selectee List: Copies of a Transportation Security Administration (TSA) “list of individuals that the TSA believes warrant additional security attention prior to boarding a commercial airliner.”
Terrorist Watch List (TWL): according to EFF, the “FBI Terrorist Watch and Warning Unit (TWWU) list of names, aliases, and biographical information regarding individuals submitted to the Terrorist Screening Center (TSC) for inclusion into VGTOF and TIPOFF watch lists. Also called the Terrorist Screening Database (TSDB), the database ‘contained a total of 724,442 records as of April 30, 2007’.” The TWL has ballooned to 1,192,000 names as of May 3, 2009.
According to the ACLU, “members of Congress, nuns, war heroes and other ‘suspicious characters’ … have become trapped in the Kafkaesque clutches of this list, with little hope of escape.” Barry Steinhardt, director of the ACLU Technology and Liberty Project said last summer: “Putting a million names on a watch list is a guarantee that the list will do more harm than good by interfering with the travel of innocent people and wasting huge amounts of our limited security resources on bureaucratic wheel-spinning. I doubt this thing would even be effective at catching a real terrorist.” While true enough as far as it goes, perhaps the list’s true intent is not to prevent terrorism but rather to terrorize the American people.
At the heart of these systems is data mining, that is, the deployment of a vast infrastructure capable of receiving, processing, managing and analyzing data flowing into the system from disparate sources. Indeed, documents released to EFF disclosed that the Bureau’s 2008 budget justification explained that “[t]he Investigative Data Warehouse (IDW), combined with FTTTF’s [Foreign Terrorist Tracking Task Force] existing applications and business processes, will form the backbone of the NSB’s [National Security Branch] data exploitation system.” The FBI also requested “$11,969,000 … for the National Security Branch Analysis Center (NSAC).” The FBI claimed:
Once operational, the NSAC will be tasked to satisfy unmet analytical and technical needs of the NSB, particularly in the areas of bulk data analysis, pattern analysis, and trend analysis. … The NSAC will provide subject-based “link analysis” through the utilization of the FBI’s collection datasets, combined with public records on predicated subjects. “Link analysis” uses datasets to find links between subjects, suspects, and addresses or other pieces of relevant information, and other persons, places, and things. This technique is currently being used on a limited basis by the FBI; the NSAC will provide improved processes and greater access to this technique to all NSB components. The NSAC will also pursue “pattern analysis” as part of its service to the NSB. “Pattern analysis” queries take a predictive model or pattern of behavior and search for that pattern in datasets. The FBI’s efforts to define predictive models and patterns of behavior will improve efforts to identify “sleeper cells.”
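The budget language describes “link analysis” only in general terms. As a rough illustration of the idea, and not a description of the FBI’s or NSAC’s actual method (which the released documents do not spell out), link analysis can be modeled as a graph search: entities that appear in the same record are connected, and the search walks outward from a starting subject. All records, names and helper functions below are invented for the example.

```python
from collections import defaultdict, deque

# Hypothetical records: each one ties together entities (people, phones, addresses).
RECORDS = [
    {"person:A", "phone:555-0101"},
    {"person:B", "phone:555-0101", "address:12 Elm St"},
    {"person:C", "address:12 Elm St"},
    {"person:D", "phone:555-0199"},
]


def build_graph(records):
    """Connect every pair of entities that appear in the same record."""
    graph = defaultdict(set)
    for record in records:
        for entity in record:
            graph[entity] |= record - {entity}
    return graph


def find_links(graph, start, max_hops):
    """Breadth-first search: map each entity reachable within max_hops to its distance."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[current]:
            if neighbor not in hops:
                hops[neighbor] = hops[current] + 1
                queue.append(neighbor)
    del hops[start]  # the starting subject is not a "link" to itself
    return hops


graph = build_graph(RECORDS)
links = find_links(graph, "person:A", max_hops=4)
people = sorted(e for e in links if e.startswith("person:"))
print(people)
```

In this toy dataset, person C has no direct connection to person A at all; C surfaces only because C shares an address with B, who shares a phone number with A. That is how association-based searches can sweep in people several steps removed from any predicated subject.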
When this request was submitted to Congress, NSAC said it would “bring together nearly 1.5 billion records created or collected by the FBI and other government agencies,” a total expected to quadruple by 2012. The House Science and Technology Committee was so alarmed that it demanded that the Government Accountability Office investigate the National Security Branch Analysis Center.
ABC News’ Brian Ross reported that lawmakers are “questioning whether a proposed FBI anti-terrorist program is worth the price, both in taxpayer dollars and the possible loss of Americans’ privacy.”
Noting that the FBI has a history “of improperly–even illegally–gathering personal information on Americans, most recently through the widespread abuse of so-called National Security Letters,” ABC reported that congressional investigators are demanding to know “whether there are protections in place to make sure all the data in the program was legally collected.”
Given the track record of the Bureau when it comes to targeting political opponents, I wouldn’t hold my breath.
Two years later, EFF notes in a letter to Senator Patrick Leahy (D-VT) that the FBI has refused to release documents requested under the Freedom of Information Act and that the Bureau “has published neither a ‘system of records notice’ (as required by the Privacy Act) nor a ‘privacy impact assessment’ (as required by the E-Government Act) for the IDW, thus depriving the public of the kind of accountability that usually comes with the creation and maintenance of large database systems containing sensitive personal information.”
Citing Leahy’s own assertion that the IDW is a “system ripe for abuse,” EFF has called on the Judiciary Committee to examine IDW closely and “provide the public with needed assurances concerning its potential impact on the privacy rights of citizens.”