As part of the John S. and James L. Knight Foundation’s News Challenge on open government, the Center has submitted an idea for a project called “Hidden in Plain Sight,” in which we would pore over the Congressional Record to find instances of waste, fraud and abuse.
The Congressional Record is an enormous collection of the commitments, promises and action steps of members of Congress. While it is publicly available, its contents aren’t stored in a way that makes it easy for a machine to search them and find patterns that could point to waste, fraud and abuse. Organizations could, of course, hire large teams of researchers to pore over the 30,000 pages a year, but funding a long-term team to do that would be very expensive.
Our proposal is to:
“Build a text mining tool with the capacity to analyze statements in the Congressional Record. Over the next two years, we will use the 112th Congressional Record and then, in real time, the 113th. We will analyze the Federal Register and related government transcripts and documents. We will make the resulting information available to an open community of readers through our award-winning website. We will make the methodology and tools available openly, so that researchers, journalists, and digital experts can use and adapt it.”
There’s about one day left to give feedback on our proposal before it moves on to the next round, where we will use your feedback to refine the proposal. Go to Knight’s News Challenge site to comment on our proposal and on the more than 800 other entries submitted. Additionally, feel free to weigh in on this page directly.