May 18, 2013
I just returned from the UK. On my return I saw this news item: “Google’s Schmidt to Meet Britain’s Cameron as Tax Row Rages.” If the link goes dark, just run a query for “Google tax UK” and you will get some of the information. You can watch a video snippet at “Ed Miliband Accuses Google of Avoiding Fair Share of Tax” as of 6 am Eastern, May 18.
I watched a bit of the discussion between a UK elected official and a Googler on the telly before I had a wonderful flight back to the United States. I thought the discussion was one of those technical misunderstandings. I recall a phrase which suggested that Google was not communicating clearly. Wired Magazine, UK edition, ran this story: “MP to Google: You Do Evil When It Comes to Tax.”
As I understand the issue, Google pays what it owes within the boundaries of the regulations. The UK is struggling economically, which is evident in the number of folks who seem to be wandering about Hounslow without much to do at 10:30 in the morning. My bus ride to Heathrow was an eye opener. The impression I took from the secondary streets to the airport was that High Street Kensington is a different world from the bus route from Hammersmith to Heathrow.
At a Public Accounts Committee hearing on 16 May, chairperson Margaret Hodge accused Google of “deliberately manipulating the reality of their business” and claimed to have whistleblower evidence that UK Google staff had sold advertising and invoiced UK-based customers. “You are a company that says you do no evil,” she told Google vice-president Matt Brittin. “I think that you do do evil in that you use smoke and mirrors to avoid paying tax.”
My view is that if rules and regulations exist, those rules must be followed. Some people are able to interpret the rules one way. Others see the rules differently. I think Google has its view of what is required, and the UK officials have another view.
If the quick trip by Google’s chairman is going to happen, will Google be able to explain its point of view and carry the day? My hunch is that there may be some further discussion about taxes which will require more than Google Glass to get the elected officials to see the world as Google perceives it.
Apparently millions of pounds are the point of the discussions. In my opinion, some countries do not understand how nation states should react to Google. Countries, in some situations, may be less influential than companies. Annoyed officials may be clinging to an outmoded view of what rules and regulations are supposed to do.
What’s clear is that Google’s comments reported on May 16, 2013, have sparked some phone calls and a possible meeting between the highest levels of the British government.
Quick actions such as buying Motorola and meeting with David Cameron can signal some of the consequences of quick thinking and even quicker actions. In my opinion, some countries and their officials don’t understand the Google systems and methods.
Stephen E Arnold, May 18, 2013
May 18, 2013
It is not a surprise that 97 percent of state and local IT professionals expect their data to grow by more than 50 percent over the next two years. However, more than 75 percent of them are only somewhat or not very familiar with the term big data. These findings come from a recent report by MeriTalk, and GCN did a nice write-up on the implications of the study in “Is Big Data Big Trouble for State, Local Governments?”
The respondents in “The State and Local Big Data Gap” were 150 state and local government CIOs and IT managers surveyed in November and December 2012.
The article lists more of the statistics gleaned from the study:
“Seventy-nine percent of responding agencies said it will be at least three years before they are able to take full advantage of big data, even though they see it improving overall efficiency (57 percent); increasing the speed and accuracy of the decision-making process (54 percent); and providing a greater understanding of citizens’ needs (37 percent). And although 79 percent said they were just somewhat or not very familiar with the term, they do report having the kind of problems that big data techniques are intended to solve.”
Are state and local governments able to tap the alleged power of big data? Maybe not yet. That is certainly the conclusion the numbers point to.
Megan Feil, May 18, 2013
May 17, 2013
A recent article from Business Insider reports that there is a large body of evidence supporting the idea that governments are using sophisticated spy software. “This Powerful Spy Software Is Being Abused by Governments Around the World” has the details on the report and findings by The Citizen Lab, a digital research lab at the University of Toronto, which found this software is being used against groups like human rights activists.
The report is called “For Their Eyes Only: The Commercialization of Digital Spying” and zeroes in on surveillance software called FinSpy. This technology remotely scans webmail and social media networks in real time. Additionally it collects encrypted data.
According to the article:
In December 2011 WikiLeaks began publishing FinFisher brochures and videos, which tout the software as enabling governments to monitor targets who ‘regularly change location, use encrypted and anonymous communication channels, and reside in foreign countries.’ Another remarkable thing about the FinSpy, Jean Marc Manach of OWNI notes, is that it can take control of any major operating system while none of the top 40 antivirus systems can recognize it.
There are 36 countries that host FinFisher Command and Control Servers including the United States. During the end of Mubarak’s rule, dissidents found a contract from Gamma mentioning a $380,000 license to run the software for five months. In addition to governments, we wonder what companies use FinFisher as well.
Megan Feil, May 17, 2013
April 19, 2013
Interesting. While some budgets are struggling with sequestration, semantic tech outfit Expert System is increasing its U.S. presence, we learn from the company’s press release, “Expert System Opens Washington, D.C. Metro Area Office and Appoints New Director of Federal Sales.” The move seems to be a natural response to the company’s considerable success in 2012. The new director of federal sales will be one Charlie Breeding. The write-up tells us:
“Located in Rockville, Maryland, the office will serve as the company’s technology headquarters, and reinforces its long-term commitment to serving a growing customer base in both the enterprise and Federal sector with semantic-based solutions for information management, business intelligence and customer service.
“Charlie Breeding brings a depth of Federal and enterprise business development experience that will support the company’s continued growth in strategic markets. He is a graduate of the U.S. Military Academy at West Point, and joins Expert System from Autonomy where he served as Director of Sales for Federal Civilian Agencies. In his new role as Federal Sales Director, he will manage the development of Federal and Fortune 1000 accounts with a focus on expanding the benefits of the Cogito semantic platform to enterprise and government agencies.”
Prominent businesses and government organizations around the world rely on tools from Expert System for data management, collaboration, and customer relationship management. Cogito is the company’s semantic analysis engine that underpins their roster of solutions.
Cynthia Murrell, April 19, 2013
April 16, 2013
Big data is awash in exciting new names, but some tend to stick out more than others. Thetus is a name not many are familiar with yet; however, that is poised to change, especially in government. The small analytics upstart is making big inroads with governmental big data, as we discovered in a recent Thetus Blog post, “Government Big Data Forum.”
The story focuses heavily on explaining the forum:
“This year the forum will “examine emerging technologies and concepts designed to address the full spectrum of agency mission needs for Big Data.” With three keynote speeches and a strong list of vendors exhibiting their products this event will definitely be worth while for anyone interested in government analytical tools. If you’re already attending make sure to stop by our booth to see the latest capabilities in our core product, Savanna!”
Thetus has been busy with government work. In addition to this forum, they recently helped take an analytical look at Kenyan droughts, which brought them heaps of praise. The cornerstone of everything they do is Savanna, which has a fascinating demo available right now. This is a great opportunity for governments looking for big data help. Time to take a look and see how it fits.
Patrick Roland, April 16, 2013
April 4, 2013
With all of the things going on in the federal government world, it seems they can add paperwork overload to the list. The GCN article “How to Get on Top of the Federal Records Tidal Wave” sheds light not only on the increasing amount of paperwork that federal agencies are dealing with but also on how they are exceeding their annual records management budgets by millions. The MeriTalk survey of federal records managers, reported in “Federal Records Management: Navigating the Storm,” found the following:
“A single federal agency currently spends an average of $34.4 million per year on records management, and manages an average of 209 million records. That number is expected to increase as much as 144 percent — to 511 million records — by 2015.”
Survey participants felt that their agencies’ inability to properly manage records posed a hindrance to agency operations. It was estimated that 18 percent of the annual budget is lost to inefficient records management. The number of records is expected to grow from 8.4 billion to 20.4 billion, so the lack of proper paperwork management could snowball if it is not brought under control. Forty-three percent of survey respondents thought that records management personnel needed better training, 33 percent suggested more funding, and 32 percent suggested more support from agency leadership as the best ways to improve managing records. The report recommended that records management be a top priority and that agencies invest in training. In addition, “agency managers should adopt smart digitization methods and timely destruction of records.” The Paperwork Reduction Act was supposed to be the answer to the paperwork overload, but it looks like big data has taken over. Only time will tell if this is really an improvement or just another shot in the dark.
April Holmes, April 04, 2013
March 29, 2013
The photo below shows the goodies I got for giving my talk at Cebit in March 2013. I was hoping for a fat honorarium, expenses, and a dinner. I got a blue bag, a pen, a notepad, a 3.72 gigabyte thumb drive, and numerous long walks. The questionable hotel in which I stayed had no shuttle. Hitchhiking looked quite dangerous. Taxis were as rare as an educated person in Harrod’s Creek, and I was in the same city as Leibniz Universität. Despite my precarious health, I hoofed it to the venue, which was eerily deserted. I think only 40 percent of the available space was used by Cebit this year. The hall in which I found myself reminded me of an abandoned subway stop in Manhattan with fewer signs.
The PPromise goodies. Stuffed in my bag were hard copies of various PPromise documents. The bulkiest of these in terms of paper were also on the 3.72 gigabyte thumb drive. Redundancy is a virtue, I think.
Finally on March 23, 2013, I got around to snapping the photo of the freebies from the PPromise session and reading a monograph with this moniker:
Promise Participative Research Laboratory for Multimedia and Multilingual Information Systems Evaluation. FP7 ICT 2009.4.3, Intelligent Information Management. Deliverable 2.3 Best Practices Report.
The acronym should be “PPromise,” not “Promise.” The double “P” makes searching for the group’s information much easier in my opinion.
If one takes the first letter of “Promise Participative Research Laboratory for Multimedia and Multilingual Information Systems Evaluation” one gets PPromise. I suppose the single “P” was an editorial decision. I personally like “PP” but I live in a rural backwater where my neighbors shoot squirrels with automatic weapons and some folks manufacture and drink moonshine. Some people in other places shoot knowledge blanks and talk about moonshine. That’s what makes search experts and their analyses so darned interesting.
To point out the vagaries of information retrieval, my search to a publicly accessible version of the PPromise document returned a somewhat surprising result.
A couple more queries did the trick. You can get a copy of the document without the blue bag, the pen, the notepad, the 3.72 gigabyte thumb drive, and the long walk at http://www.promise-noe.eu/documents/10156/086010bb-0d3f-46ef-946f-f0bbeef305e8.
So what’s in the Best Practices Report? Straightaway you might not know that the focus of the whole PPromise project is search and retrieval. Indexing, anyone?
Let me explain what PPromise is or was, dive into the best practices report, and then wrap up with some observations about governments in general and enterprise search in particular.
March 28, 2013
The U.K. addresses its workers on the issue of open-source software in the Open Source section of its Government Service Design Manual. The site is still in Beta as of this writing, but the full release is expected in April. These prescriptions for when and how to use open-source resources contain some good advice, pertinent even to those of us who don’t report to a U.K. government office.
For example, on preparing developers, the document counsels:
“Ensure developers have the ability to install and experiment with open source software, have environments to easily publish prototype services on The Web, have convenient access to a wide variety of network connected devices for testing Web sites, and have unrestricted access to collaboration tools such as GitHub, Stack Overflow and IRC.”
It is worth noting that the text goes on to recommend giving back to open source projects, as well as citing any open source code used. The document also notes that, where security is concerned, open source software actually has an advantage:
“Separation of project code from deployed instances of a project is good development practice, and using open source enables developers to easily fork and experiment with multiple development, and operations to quickly spin-up multiple test and integration environments. . . . A number of metrics and models attest to the quicker response to security issues in open source products when compared to closed source equivalents.”
See the clear and concise document for more of its perspective on open-source software. The conclusion also points to the site’s Open Standards and Licensing page for more information.
Cynthia Murrell, March 28, 2013
March 27, 2013
A reader sent me a link to a call for experts issued by one of the European Commission’s entities. The program is called Horizon 2020, and a countdown timer on the Web site reports how many days until Horizon 2020 launches. The program is a “framework” for research and innovation. The Europa.eu Web site says:
The European Commission is widening its search for experts from all fields to participate in shaping the agenda of Horizon 2020, the European Union’s future funding programme for research and innovation. The experts of the advisory groups will provide high quality and timely advice for the preparation of the Horizon 2020 calls for project proposals. The Commission services plan to set up a certain number of Advisory Groups covering the Societal Challenges and other specific objectives of Horizon 2020. To reach the broadest range of individuals and actors with profiles suited to contribute to the European Union’s vision and objectives for Horizon 2020, including striving for a large proportion of newcomers, and to gain consistent and consolidated advice of high quality, the Commission is calling for expressions of interest with the aim of creating lists of high level experts that will participate in each of these groups.
The list of expertise required is wide ranging. What is fascinating is that in the lengthy list of what’s needed there is no call for search, big data, content processing, or analytics. The EC has funded Promise (more accurately PPromise), which has a focus on search from what strikes me as a somewhat traditional approach combined with a quest for “good enough” solutions. I suppose innovation can result from the pursuit of “good enough.” I wonder if the exclusion of search and its related disciplines from this call for experts is a reflection on the role of information retrieval or on the results which have flowed from previous EC support of findability projects. On the other hand, perhaps the assumption is that search is a slam dunk. If so, then those engaged in search and content processing have to do a better job of communicating the dismal state of search and its related disciplines.
Much work remains to be done, and calls for expertise which omit specific remarks about information retrieval trouble me. Maybe the “good enough” notion is more pervasive than I understood.
Stephen E Arnold, March 27, 2013
March 25, 2013
I don’t want to pick on government funding of research into search and retrieval. My goodness, pointing out the modest payoffs from government-funded research into information retrieval would bring down the wrath of the Greek gods. Canada, the European Community, the US government, Japan, and dozens of other nation states have poured funds into search.
In the US, a look at the projects underway at the Center for Intelligent Information Retrieval reveals a wide range of investigations. Three of the projects have National Science Foundation support: Connecting the ephemeral and archival information networks, Transforming long queries, and Mining a million scanned books. These are interesting topics and the activity is paralleled in other agencies and in other countries.
Is fundamental research into search high-level busy work? Researchers are busy, but the results are not having a significant impact on most users, who struggle with modern systems’ usability, relevance, and accuracy.
In 2007 I read “Meeting of the MINDS: An Information Retrieval Research Agenda.” The report was sponsored by various US government agencies. The points made in the report, like the University of Massachusetts’ current research run-down, were excellent. The “recent influences” identified in 2007 remain timely six years later. The questions about commercial search engines, if anything, remain unanswered. The challenges of heterogeneous data also remain. The section on information analysis and organization, which today is associated with analytics and visualization-centric systems, could be reprinted with virtually no changes. I cite one example, now 72 months young, for your consideration:
We believe the next generation of IR systems will have to provide specific tools for information transformation and user-information manipulation. Tools for information transformation in real time in response to a query will include, for example, (a) clustering of documents or document passages to identify both an information group and also the document or set of passages that is representative of the group; (b) linking retrieved items in timelines that reflect the precedence or pseudo-causal relations among related items; (c) highlighting the implicit social networks among the entities (individuals) in retrieved material; and (d) summarizing and arranging the responses in useful rhetorical presentations, such as giving the gist of the “for” vs. the “against” arguments in a set of responses on the question of whether surgery is recommended for very early-stage breast cancer. Tools for information manipulation will include, for example, interfaces that help a person visualize and explore the information that is thematically related to the query. In general, the system will have to support the user both actively, as when the user designates a specific information transformation (e.g., an arrangement of data along a timeline), and also passively, as when the system recognizes that the user is engaged in a particular task (e.g., writing a report on a competing business). The selection of information to retrieve, the organization of results, and how the results are displayed to the user all are part of the new model of relevance.
In Europe, there are similar programs. Examples range from Europa’s sprawling ambitions to Future Internet activities. There is Promise. There are data forums, health competence initiatives, and “impact”. See, for example, Impact. I documented Japan’s activities in the 1990s in my monograph Investing in an Information Infrastructure, which is now out of print. A quick look at Japan’s economic situation and its role in search and retrieval reveals that modest progress has been made.
Stepping back, the larger question is, “What has been the direct benefit of these government initiatives in search and retrieval?”
On one hand, a number of projects and companies have been kept afloat due to the funds injected into them. In-Q-Tel has supported dozens of commercial enterprises, and most of them remain somewhat narrowly focused solution providers. Their work has been suggestive, but none has achieved the breathtaking heights of Facebook or Twitter. (Search is a tiny part of these two firms, of course, but the government funding has not had a comparable winner in my opinion.) The benefit has been employment, publications like the one cited above, and opportunities for researchers to work in a community.
On the other hand, the fungible benefits have been modest. As the economic situation in the US, Europe, and Japan has worsened, search has not kept pace. The success story is Google, which has used search to sell advertising. I suppose that’s an innovation, but it is not one which is a result of government funding. The Autonomy, Endeca, Fast Search type of payoff has been surprising. Money has been made by individuals, but the technology has created a number of waves. The Hewlett Packard-Autonomy dust-up is an example. Endeca is a unit of Oracle and is becoming more of a utility than a technology game changer. Fast Search has largely contracted and has, like Endeca, become a component.
Some observations are warranted.
First, search and retrieval is a subject of intense interest. However, progress in information retrieval is advancing only slowly in my opinion. I think there are fundamental issues which researchers have not been able to resolve. If anything, search is more complicated today than it was when the MINDS agenda cited above was published. One wonders, “Is search more difficult than finding the Higgs boson?” If so, more funding for search and retrieval investigations is needed. The problem is that the US, Europe, and Japan are operating at a deficit. Priorities must come into play.
Second, the narrow focus of research, while useful, may generate insights which affect the margins of larger information retrieval questions. For example, modern systems can be spoofed. Modern systems generate strong user antipathy more than half the time because they are too hard to use or don’t answer the user’s question. The problem is that the systems output information which is quite likely incorrect or not useful. Search may contribute to poor decisions, not improve decisions. The notion that one is better off using more traditional methods of research is something not discussed by some of the professionals engaged in inventing, studying, or selling search technology.
Third, search has fragmented into a mind boggling number of disciplines and sub-disciplines. Examples range from Coveo (a company which has ingested millions in venture funding and support from the province of Québec) which is sometimes a customer support system and sometimes a search system to Palantir (a recipient of venture funding and US government funding) which outputs charts and graphs, relegating search to a utility function.
Net net: I am not advocating the position that search is unimportant. Information retrieval is very important. In many cases, one cannot perform work today unless one can locate a specific digital item.
The point is that money is being spent, energies invested, and initiatives launched without accountability. When programs go off the rails, these programs need to be redirected or, in some cases, terminated.
What’s going on is that information about search produced in 2007 is as fresh today as it was 72 months ago. That’s not a sign of progress. That’s a sign that very little progress is evident. The government initiatives have benefits in terms of creating jobs and funding some start-ups. I am not sure those benefits reach a broader base of people.
With deficit financing the new normal, I think accountability is needed. Do we need some conferences? Do we need giveaways like pens and bags? Do we need academic research projects running without oversight? Do we need to fund initiatives which generate Hollywood type outputs? Do we need more search systems which cannot detect semantically shaped or incorrect outputs?
Time for change is upon us.
Stephen E Arnold, March 25, 2013