Is Search Too Hard?

August 26, 2014

I find the readers who send me links to the UK Daily Mail stories helpful. Are these referrers easily fooled?

The story in question has a Google friendly headline:

“‘It’s all been a big lie!’ Obama administration lawyer now admits ‘missing’ Lois Lerner emails WERE backed up but claims it’s too hard to search for them”

The US government is a busy beaver when it comes to search. You can explore USA.gov at your leisure or seek information on myriad dot gov Web sites without my input.

Here’s a passage from the write up. You determine if it is on the money:

‘The Department of Justice attorney told the Judicial Watch attorney on Friday,’ Fitton said during a Monday afternoon Fox News broadcast, ‘that it turns out the federal government backs up all computer records in case something terrible happens in Washington and there is a catastrophe, so the government can continue operating.’ The catch, he added, is that the DOJ attorney also claimed ‘it would be too hard to go and get Lois Lerner’s emails from that backup system.’

This search and retrieval stuff seems to be difficult. Perhaps these folks should turn to a real expert like Dave Schubmehl, the Arnold surfer for real insight?

Stephen E Arnold, August

Short Honk: Surveillance Database Report

August 26, 2014

I wanted to document a report that ICREACH exists. For information, see The Intercept’s report. No further comment from Beyond Search.

Stephen E Arnold, August 26, 2014

Let Google Do It: Outsource the Government

August 1, 2014

I saw a discussion thread describing a proposed action to allow Google to become the National Technical Information Service. Although I served on the Board of NTIS years ago, I don’t think too much about the operation. Apparently Google does. Apparently there are some folks who ignore the repository aspect of NTIS. Google is search, and pretty darned good, the argument goes.

The idea is presented in S 2206, but the officials elected by the people are occupied with a number of weighty issues.

I did note this item about the efficacy of US government management. Take a look at “Poorly Managed HealthCare.gov Construction Cost $840 Million, Watchdog Finds.” A billion here and a billion there may not be a big deal as the economy improves according to some pundits.

What Google does not do, perhaps USA.gov will? What about the Library of Congress, various government document repositories, and, of course, the funding entities themselves?

Let Google do it? Why not?

Stephen E Arnold, August 4, 2014

Hidden from Google: Interesting but Thin

July 15, 2014

I learned about the Web site Hidden from Google. You can check out the service and maybe submit some results that have disappeared. You may not know if the deletion or hiding of the document is a result of the European Right to Be Forgotten action, but if content disappears, this site could be a useful checkpoint.

Here’s what the service looks like as of 9:21 am Eastern on July 15, 2014.

[Screenshot of the Hidden from Google Web site]

According to the Web site:

The purpose of this site is to list all links which are being censored by search engines due to the recent ruling of “Right to be forgotten” in the EU. This list is a way of archiving the actions of censorship on the Internet. It is up to the reader to decide whether our liberties are being upheld or violated by the recent rulings by the EU.

I noticed that dear old BBC appeared in the list, a handful of media superstars, and some Web sites unknown to me. The “unknown” censored search term is intriguing, but I was not too keen on poking around when I was not sure what I was seeking. Perhaps one of the fancy predictive search engines can provide the missing information or not.

When I clicked on the “source” link, sometimes I got a story that seemed germane; for example, http://bbc.in/1xhjKyK linked to one of those tiresome banker misdeed stories. Others pointed to stories that did not seem negative; for example, a Guardian article that redirected to a story in Entrepreneur Magazine (http://bit.ly/1jukI7T). Teething pains, I presume, or my own search ineptness.

I did some clicking around and concluded that the service is interesting but lacks in depth content. I looked for references to the US health care Web sites. I am interested in tracking online access to RFPs, RFQs, and agreements with vendors. These contracts are fascinating because the contractors extend the investigative capabilities of certain US law enforcement entities. Since I first researched the RAC, MIC, and ZPIC contractors, among others, I have noticed that content has become increasingly difficult to find. Content I could pinpoint in 2009 and 2010 now eludes me. Of course, I may be the problem. There could be latency issues when spiders come crawling. There can be churn among the contractors maintaining Web sites. There can be many other issues, including a 21st century version of Adam Smith’s invisible hand. The paw might be connected to an outfit like Xerox or some other company providing services to these programs.

Several questions:

First, if the service depends on crowdsourcing, I am not sure how many of today’s expert searchers will know when a document has gone missing. Unless I had prior knowledge of a Medicare Integrity Contractor statement of work, how would I know I could not find it? Is this a flaw the site will be able to work around?

Second, I am not sure the folks who filled out Google’s form and sent proof of their identity want an archive of information that was to go into the waste basket. Is there some action a forgotten person will take when he or she learns he or she is remembered?

Third, the idea is a good one. What happens when Google makes it uncomfortable to provide access to data that Google has removed? Maybe Mother Google is toothless and addled with its newfound interest in Hollywood and fashionable Google Glass gizmos. On the other hand, Google has lots of attorneys in trailers not too far from where the engineers work.

Stephen E Arnold, July 15, 2014

Steps Offered to Improve Government Data Sites

July 8, 2014

The article on FlowingData titled How to Make Government Data Sites Better uses the Centers for Disease Control website to illustrate measures the government should take to make its data more accessible and manageable. The first suggestion is to provide files in a useable format. By avoiding PDFs and providing CSV files (or even raw data), the publisher puts the user in a much better position to work with the data. Another suggestion is simply losing or simplifying the multipart form that makes search nearly impossible. The author also proposes clearer and more consistent annotation, using the following scenario to illustrate the point:

“The CDC data subdomain makes use of the Socrata Open Data API,… It’s weekly data that has been updated regularly for the past few months. There’s an RSS feed. There’s an API. There’s a lot to like… There’s also a lot of variables without much annotation or metadata … When you share data, tell people where the data is from, the methodology behind it, and how we should interpret it. At the very least, include a link to a report in the vicinity of the dataset.”
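To make the CSV suggestion concrete, here is a minimal Python sketch. The sample rows, the column names, and the METADATA note are all invented for illustration; none of them come from the CDC site or the Socrata API. The point is only that a CSV file with a short annotation can be read and summarized in a few lines, which is not true of a table locked inside a PDF.

```python
import csv
import io

# Hypothetical weekly dataset, invented for illustration only.
SAMPLE_CSV = """week_ending,region,cases
2014-06-28,Northeast,12
2014-06-28,South,30
2014-07-05,Northeast,9
2014-07-05,South,27
"""

# Hypothetical annotation of the kind the author asks publishers to include.
METADATA = {
    "source": "placeholder agency name",
    "methodology": "placeholder link to a methods report",
    "cases": "count of reported cases per region per week",
}


def total_by_region(csv_text):
    """Sum the 'cases' column for each region in a CSV string."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["region"]] = totals.get(row["region"], 0) + int(row["cases"])
    return totals


if __name__ == "__main__":
    for region, cases in sorted(total_by_region(SAMPLE_CSV).items()):
        print(region, cases)
    print("cases column means:", METADATA["cases"])
```

A real dataset would be read from a downloaded file rather than an inline string, but the parsing step would look much the same.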

Overall, the author makes many salient points about transparency, consistency and clutter. But there is an assumption in the article that the government actually desires to make data sites better, which may be the larger question. If no one implements these ideas, perhaps that will be answer enough.

Chelsea Kerwin, July 08, 2014

Sponsored by ArnoldIT.com, developer of Augmentext

Bill Suggests Replacing NTIS with Google Search

May 15, 2014

The article titled There’s a ‘Let Me Google That For You’ Bill on Talking Points Memo relates the substance of a bipartisan bill (sponsored by Tom Coburn and Claire McCaskill). The bill’s purpose is to save the taxpayer money by resorting to Google and eliminating the National Technical Information Service (NTIS). The article states,

“The bill is meant to cut down on “the collection and distribution of government information” by prioritizing using Google over spending money to obtain information from the National Technical Information Service (NTIS). NTIS, run by the Department of Commerce, is a repository of 3 million scientific, technical, engineering, and business texts. The bill would abolish the NTIS and move essential functions of the agency to other agencies like the National Archives.”

If the bill’s name sounds familiar, you have probably heard of the website it is named after, which redirects you to Google. The bill is put forward to prevent waste by federal agencies in obtaining government documents for money when they are available online free of charge. Sounds like a no-brainer, especially since NTIS was founded in 1950, decades before the Internet was even a possibility. You can read the full bill here.

Chelsea Kerwin, May 15, 2014

India: The Future of Search

April 28, 2014

I read “New RTI Search Engine Makes the Task Tougher.” A number of government sites have made changes that seem to make finding information more difficult. In some cases, locating information may be almost impossible. When I lived in Washington, DC, as a grade school student, I remember my father stopping at a government agency and walking in to obtain some information. I am not sure how my father’s approach would be received today.

In the Pune Mirror article, I noted this passage about India’s Right to Information finding system:

a new search engine has been put in place that makes it mandatory for visitors to know the specific date, topic, category and sub-category in order to track a particular circular. Also, information like mode of payment for RTI fees, circulars, advertisements and office memorandums, that were up front as per their date of issuance from the year 2005, have gone missing.

In my experience, most users are not able to provide sufficiently narrow terms or provide key details about a needed item of information. As a result, it is now trivially easy for a governmental entity to drop an old-school photographer’s cloak over some information. I noted this comment in the article:

“With the new system in place, you need to know the exact date, topic, category and sub category in order to find the circular. Considering the level of literacy in this country, who will know all details?” he demanded. “We are all stake-holders and they should have asked before making these changes. All political parties have opposed the RTI Act.”

The article points to an opinion that the new Indian search system is designed to “harass” users. I don’t agree. More commercial and governmental entities are fearful of user access to some information.

Is the use of the word “transparency” a signal that finding information is not in the cards? For me, I am not too concerned. I have developed a turtle-like approach to these “retrieval enhancements.” I no longer look for information online as often as I did when I was but a callow lad.

I am pulling my head in my shell now. There. That’s better. Predictive search delivers pizza and sports scores. What more does a modern person require?

Stephen E Arnold, April 28, 2014

Google Promptly and Quietly Erases Lists of Government Partners

April 21, 2014

A pair of articles at PandoDaily tell an interesting story. First they published a piece titled, “Google Distances Itself from the Pentagon, Stays in Bed with Mercenaries and Intelligence Contractors.” In that article, reporter Yasha Levine reveals that, despite Google’s attempts to dissociate itself from the military-industrial complex after last year’s NSA kerfuffle, the search giant is still working closely with several of those agencies, and their contractors. He writes:

“In some cases — like the company’s dealings with the NSA and its sister agency, the NGA — Google deals with government agencies directly. But in recent years, Google has increasingly taken the role of subcontractor: selling its wares to military and intelligence agencies by partnering with established military contractors. It’s a very deliberate strategy on Google’s part, allowing it to more effectively sink its hooks into the nepotistic, old boy government networks of America’s military-intelligence-industrial complex.

“Over the past decade, Google Federal (as the company’s DC operation is called) has partnered up with old school establishment military contractors like Lockheed Martin, as well as smaller boutique outfits — including one closely connected to the CIA and former mercenary firm, Blackwater.”

Levine goes into detail, and that article is an interesting read. However, it was his follow-up piece, “Google Apparently Scrubs Military Contractor Partner Listing, After Pando Report,” that really caught our attention. This story shares screenshots taken before and after the revelatory article was posted a couple of days earlier. The first shows Google’s Enterprise Government page displaying lists of government partners. The second shows a page in perpetual-load mode. Levine tells us:

“Later [on the day the first article was posted], I noticed a strange thing: The official Google ‘Enterprise Government’ webpage that had listed some of the company’s military contractor partners no longer loaded. The page worked just fine less than a week ago, but now all it shows is some text up top telling government agencies to ditch their dinosaur IT services and get with Google — ‘Help your agency move fast and innovate’! — and then nothing but empty white space….

“I’ve asked several people to access the page from different parts of the United States and they all come back with the same answer: the page framework partially loads, but all the information is missing. It appears to be the only Google Enterprise page that does not load. I’ve looked around, but could not find this missing list of contractors displayed anywhere else on the Google Enterprise website.”

So, was this glitch purposeful? Well, as of this writing, the page is functioning. However, it no longer includes lists of partners, just links to more info for potential customers. Like Levine, I can find no such list elsewhere on the site. (The closest I found is a page where city reps laud Google for use in running local governments — much less controversial.) Good catch, Pando.

Cynthia Murrell, April 21, 2014

Secrecy News Talks Declassification

April 15, 2014

Declassified records are of interest to the public. But there is more to declassification than simply putting them out there for the public to find. Findability and search play a role also. Secrecy News focuses on the topic in its blog entry, “Putting Declassified Records to Good Use.”

The article says:

“The final, climactic step in the declassification of government records is not the formal removal of classification markings or even the transfer of the declassified documents to public archives. The culmination of the declassification process is when the records are finally examined by an interested reader and their contents are absorbed into the body of public knowledge.”

Secrecy News is an FAS project on government secrecy. They provide documentary resources on secrecy, intelligence, and national security. Interested readers can subscribe for regular updates. Secrecy is a hot topic due to the Snowden case, but this blog has been in business for years, and offers a steady flow of information, even if not completely original in scope.

Emily Rae Aldridge, April 15, 2014

Government Tackles Acquisition Inefficiencies

April 6, 2014

Given evidence like the vile backlog on veterans’ benefits and the still-operating paperwork bunker in Pennsylvania, one could be forgiven for suspecting that no one in government is even trying to bring our bureaucracy into this century. You may be surprised to know there is a plan in place for at least part of the problem, as evidenced by the Integrated Award Environment: the Path Forward from the U.S. General Services Administration (GSA). That document, which looks suspiciously like a PowerPoint presentation converted to PDF, outlines the GSA’s recommendations for improving the federal government’s acquisition procedures.

Anyone interested in the details should check out the document, but the list of “our principles” summarizes the organization’s targets:

  • Open (source code, data, APIs)
  • Data as an asset
  • Continuous improvement
  • Effective user experience
  • Measurable transactions
  • Security is foundational
  • Build value over maintaining status quo

The paper expounds on each of these points, defining the implications of each goal, a point or two on maintaining balance, and questions workers should ask themselves going forward. For example, the section on “Open” notes that users must balance the stability of, say, Oracle with the agility of open source solutions and security with openness. For the data-enthused among us, the section on “Data as an asset” reads:

“Accurate, timely, complete, and authoritative”

Implies:

*Significant effort to manage data quality; implementers must have data-oriented SLAs

*Change control of the data needs to be transparent

*Will follow the data->information->knowledge chain

Balance:

*Our flexibility has to account for the strong change management of our data

Ask ourselves:

*“How do we ensure that we are providing timely and accurate data?”

*“How are we enabling decision-making through use of our data?”

So, next time you’re tempted to think our government is doomed to be stuck in the 20th century, remember that some folks within the bureaucracy are on the case. Soon, it may be time for them to party like it’s 1999.

Cynthia Murrell, April 06, 2014
