The Observatory of Economic Complexity

March 26, 2014

The future of search may just be here, in the form of a specialized search engine courtesy of MIT (quelle surprise!). The Observatory of Economic Complexity (ECI) is the result of a 2010 master’s thesis in Media Arts and Sciences by one Alexander Simoes, and enjoys the continuing support of the MIT Media Lab’s consortia for undirected research. A history of the project’s contributions is available on Github. Some technical details from the project’s FAQ page:

“Where does the data come from?

“The observatory provides access to bilateral trade data for roughly 200 countries, 50 years and 1000 different products of the SITC4 revision 2 classification. For historical SITC classification data, we use data from The Center for International Data from Robert Feenstra. For up to date HS classification data, we use data provided by UN COMTRADE.

“Can I download this data?

“Sure! You can download the latest dump of the entire data (in MySQL format) here. Or if you are looking for data on a particular country or product, you can click the CSV download button on the right-hand side of all explore pages.”

The rest of the FAQ page lets users know how they can help the project improve by contributing translations, correcting errors, and reporting bugs. Besides the search functionality, there’s a Rankings page listing countries by their current ECI values. The site also offers profiles of different countries’ economic activity. As of this writing, though, I can’t seem to pull up a profile of a specific country, but rather click through a series of what seem like randomly presented entries. An interesting way to kill a few minutes of time, but not so good for finding specific information. If that’s a bug, I hope it’s fixed soon. If it’s a feature… I hope it’s fixed soon.

One more thing to note about this project—it has the potential to inform global policy in ways that make life better around the world. Their book “The Atlas of Economic Complexity: Mapping Paths to Prosperity” makes the case, and is free to download. Said a World Bank chief economist in 2011, “The ECI can play a very important role. It can help identify the role for developing countries.” We do hope the Observatory will live up to its potential.

Cynthia Murrell, March 26, 2014

Sponsored by, developer of Augmentext

Ami Enterprise Intelligence Software

March 25, 2014

In a routine update process, one of the goslings came across Ami, a company that offered Ami Enterprise Intelligence 6.0. A quick review of the company’s Web site suggested that the company’s last update took place in 2013.

The flagship product is at Version 6.0. The company says:

Enterprise intelligence 6 is a platform for economic intelligence. Designed by AMI, [the system] includes separate modules for the acquisition, analysis and dissemination of information from external sources or internal company content. AMI Enterprise Intelligence is recognized by the community of business intelligence professionals as one of the platforms that ensures the most comprehensive and most innovative business intelligence.

In April 2013, just about one year ago, the company suggests that it participated in the International II SDV Conference. However, the link to the news item returns a 404 error.

Links to the company’s technology on its Web site are working as of March 25, 2014. The company lists four US patents for its core technology. The AMI patent portfolio consists of:

  • GMIL (Grammatical Markers Independent of Language) (# B-3851)
  • Enhancing online support (# B-3561)
  • Language Interface Natural (# B-3563)
  • Language interface for E-Commerce (# B-3562)

The list on the Ami Web site does not contain hyperlinks, however. The Crunchbase profile for the company and products has not been completed. See, for example,

The company appears to be participating in Documation, March 26 in Paris at CNIT Paris la Défense. The company asserts that it has more than 150 customers.

The company, like Lextek, maintains a low profile, although it reports that it has offices in the United Kingdom and Paris.

Stephen E Arnold, March 25, 2014

Crafting a Customized Search Application to SharePoint

March 25, 2014

A lot of organizations will hire an outside company to customize and implement their SharePoint infrastructure. Others are big enough to have staff onsite devoted to building and maintaining SharePoint. However, either way there are many individuals who have a vested interest in creating customized SharePoint components. Search Content Management covers one “how-to” in their article, “Building a SharePoint 2013 Search-Based Application.”

The article describes its objective:

“While this article doesn’t have the space to cover all aspects of how to build a Microsoft SharePoint 2013 search-based application, we will provide an overview. The key components are list and library structures to store content; metadata and metadata sources, including the Managed Metadata Service; search to crawl the content; user interface elements to surface the content and display templates to render the content with the required formatting.”

Stephen E. Arnold is a longtime follower of all things search, including SharePoint, on his Web site. He focuses on the reality of the situation: how users can get the most out of search solutions. For SharePoint, he often finds that customization is key, so building unique components like this could be the difference between a frustrating deployment and a well-used and well-loved solution.

Emily Rae Aldridge, March 25, 2014

GitHub Search: Handy for Some Amazon Sportiness

March 24, 2014

GitHub, an open sourcey operation, is in the news again. Navigate to “AWS Urges Developers to Scrub GitHub of Secret Keys.” ITNews reports that some math club members—sorry, open source folks—have “inadvertently exposed their log-in credentials.”

The write up points out that a search of GitHub “for AWS keys returns almost 10,000 results.” The article notes:

GitHub is a community site where developers post their code and allow collaboration from other interested devs. The problem is developers aren’t taking enough care to ensure their credentials are properly protected.
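For developers who want to check their own code before pushing it, here is a minimal sketch of the kind of pattern match such a search relies on. It assumes that an AWS access key ID is a 20 character string beginning with "AKIA"; the function name and the sample text are hypothetical, and the key shown is a placeholder, not a live credential. A real audit would also look for secret keys and scan a repository's history, which this sketch does not attempt.

```python
import re

# Assumption: an AWS access key ID is "AKIA" followed by 16 uppercase
# letters or digits (20 characters in total).
ACCESS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_access_key_ids(text):
    """Return (line_number, key_id) pairs for each candidate key ID in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


# Hypothetical config snippet with a placeholder key on its second line.
sample = 'region = "us-east-1"\naws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'
print(find_access_key_ids(sample))
```

Any hit is only a candidate. The safe response to a key that has already been published is to revoke it, not merely to delete the line.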

With the management issues at GitHub, perhaps open source evidences some of the fissures in the open source approach to life, business practices, and, of course, search?

Stephen E Arnold, March 24, 2014

An Egyptian eBooks Search Engine

March 24, 2014

Most people think about the Amazon Kindle, iBooks, and other popular mobile book reading platforms when they hear eBooks. In the Middle East there is fierce competition to dominate eBook sales in the region. Wamda posted the article, “Egyptian eBooks Search Engine Al Kutub Ready To Face The Competition” that gives a rundown about a new player.

Al Kutub is a new book search engine and within twelve days has seen over 10,000 people subscribe. The creator, Mohammed Nemat Allah, designed Al Kutub to be the largest regional database of digital and audio books. Nemat Allah does not host any of the content; instead, Al Kutub searches through online sources.

Nemat Allah only hosts the books’ bibliographic citations and directs the user toward legitimate book sellers, so he does not have to fear legal action:

“The thirty something Nemat Allah seems to believe in spreading knowledge and is confident of his legal stance, according to statements from his counselor. Whoever objects to the presence of any content, the statements say, should remove it from the source where it was originally posted.”

Al Kutub offers four different subscriptions that offer different services and incentives. There is also an internal social network. The eBook application market is booming! The common belief is that people do not read in this digital age. They do read; they just do not read paper.

Whitney Grace, March 24, 2014

Another Week, Another Enterprise Search System

March 21, 2014

Cloud? Check.

Azure chip consultant reference? Check.

Social angle? Check.

Support for distributed information? Check.

Consumerized interface? Check.

Reference to value? Check.

Automatic alerts? Check.

Customer reference? Check.

Big company pedigree? Check.

Open sourciness? Check.

Exotic technology? Check.

There you have the recipe for a new enterprise search system, at least according to eWeek’s “Highspot Brings Machine Learning to Enterprise Search.” Highspot’s Web site describes the company this way:

Built for the cloud era, Highspot uses advanced machine learning to help organizations capture, share, and cultivate their most valuable working knowledge.

The pricing information, omitted from the eWeek story just as azure chip consultants omit enterprise search fees, begins at free and comes out of the gate at $20 per user per month or $240 per user per year. For an organization with 400 users, the annual fee works out to about $96,000 for an open source, machine-learning system, a bargain compared to the Google Search Appliance but more expensive than downloading Solr, Searchdaimon, or Elasticsearch and having one staff member get search up and running. A less expensive option that works reasonably well is dtSearch, but you need to love the color blue for this search system. If you want an appliance, check out Maxxcat’s systems. These are far less expensive than other appliances, and the new systems are easy to set up and deploy. For cloud action, take a look at Blossom Software’s solution. Chances are your state, county, or municipal government is using the Blossom system built by a former Bell Labs whiz kid.
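The arithmetic behind those figures is simple enough to check. The sketch below uses only the $20 per user per month rate quoted above; the function name is hypothetical, and it assumes a flat rate with no volume discount or document-count surcharge, which the fine print may well contradict.

```python
def annual_fee(users, per_user_per_month=20, months=12):
    """Flat-rate yearly cost in dollars, assuming no discounts or surcharges."""
    return users * per_user_per_month * months


print(annual_fee(1))    # one user for a year: 240
print(annual_fee(400))  # the 400 user organization: 96000
```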

Net net: The enterprise search market is flooded with options. With big, waddling outfits like HP and IBM getting increasingly desperate to make their billion dollar bets pay off, you have high end options as well as free downloadable systems from organizations in Denmark, Norway, Russia, and elsewhere.

Will the pricing hold if a business licensee points the system at 50 million documents? My hunch is that there will be some fine print. Google charges about $900,000 for its appliance capable of processing tens of millions of documents with three years of support. You can check the latest US government discount prices. Just search for “Google Search Appliance” and peruse the government’s price. A commercial price may vary.

The key is that the engines of many systems are open source. The “solution” is software wrappers and checklists that hit the marketing hot buttons. Keep up with Highspot via the company’s blog.

Stephen E Arnold, March 21, 2014

Appen Uses Humans to Improve Non-English Search Relevance

March 21, 2014

The Appen explanation titled Query Relevance delves into the work that the language, search and social technology company has done recently to improve natural language search. Linguist PhD Julie Vonwiller founded the company in 1996 with her engineer husband Chris Vonwiller. In 2010, Appen merged with Butler Hill Group and began making strides in language resources, search, and text. The article explores the issues at hand when it comes to natural language search:

“Even a query as seemingly simple as the word “blue” could be looking for any of the following: a description or picture of the color, a television show, a credit card, a misspelling of an electronic cigarette brand, or a rap artist. By analyzing what the most likely user intent is and returning valid and appropriate results in the correct order of relevance, we encourage a relationship whereby the user will return again and again to our client’s search engine.”

Appen has established a “global network” of locals who are trained experts in the language and local culture. This team allows for the most accurate interpretations of queries from regional users. The company is continually working to improve their processes, both through collaboration with users and advances in the program to provide the best possible results.

Chelsea Kerwin, March 21, 2014


Lextek Onix Profile Now Available… Free

March 20, 2014

You may not know that profiles of vendors from IDC-type operations can cost $3,500 or more. Even more impressive is azure chip consulting firms’ penchant for using information from folks who provide reports for free. Hey, there are many former middle school teachers, failed Web masters, and even poetry majors who need a job. Have at it, I say.

If you are interested in search and content processing, you may know that I have been posting 15 to 30 page profiles of information retrieval vendor systems. Today you can snag a PDF report about Lextek International and its Onix search toolkit.

You have not heard of Lextek?

I would wager a cup of tea made from water drawn from Harrods Creek that you have used the search function in Acrobat. If you have, you have experienced the thrills of the Onix toolkit used by Adobe to make it a delight to search a PDF file.

Lextek keeps a low profile. The company operates from a suburban home in Utah. As part of the founder’s diversification effort, the driving force of Onix opened a gourmet chocolate shop. Autonomy bought Verity and Interwoven. Lextek moved into chocolate and did not implement a search system for the new venture’s Web site. Interesting to me.

You can find the report, which is current through late 2008, on my site. There are 12 reports in the series. IDC has taken down the profiles of open source search systems that appeared between 2012 and March 2014. I will be posting the unfiltered versions of these reports in coming months.

My goal is to make the complete collection of more than 50 vendor profiles available without charge. The index to the free reports in the Xenky series is at

If you want to correct or complain about a particular report, please, use the Comments section of Beyond Search for the article announcing the availability of a profile.

Before writing baloney about a vendor’s origin and core technology, I suggest you check out my reports. The misinformation about which company first used the phrase “content intelligence” or “linguistic search” is amazing. My profiles point out which company used a phrase and when. For example, have you heard about “information black holes”? Autonomy used the phrase in a remarkable marketing brochure in 1997. I know that some subsequent users of the phrase assumed it was a product of their fertile minds. Nope.

Enjoy the Lextek write up. You can try the system if you have Acrobat Reader 6 or higher. Did Adobe make optimal use of Onix? In my opinion, not by a long shot.

Stephen E Arnold, March 20, 2014

The HP View of Watson

March 19, 2014

I suppose IBM will respond with more than recipes at South by Southwest. If you enjoy big companies’ analyses of one another, you will want to gobble up “15 Reasons HP Autonomy IDOL OnDemand Beats IBM Watson.” This is not the recipe for making pals with a $100 billion outfit.

What does IBM Watson have as weaknesses? What does the reinvented (sort of) Autonomy technology have as strengths? I cannot reproduce the 15 items, but I can highlight six of the weaknesses and enjoin you to crack open the slideshow that chops up the IBM Watson PR stunt.

Here are the six weaknesses I found interesting:

  1. Reason 3. IBM Watson is a data scientist heavy platform. IDOL is not. My view is that HP paid $11 billion for Autonomy and now has to deal with the write down, legal actions related to the deal, and tossing out Mike Lynch’s revenue producing formula. Set aside the data scientists and the flip side “too few data scientists” and consider the financial mountain HP has to climb. A data scientist or two might help.
  2. Reason 4. HP has “an ultimate partner story.” I find this fascinating. Autonomy grew via acquisitions and an indirect sales model. Now HP wants to make the partner model generate enough revenue to pay off the Autonomy purchase price, grow HP’s top line faster than traditional lines of business collapse, and make partners really happy. This may be a big job. See IBM weaknesses 9, 11, 12, and 14. There is some overlap, which suggests HP is having difficulty cooking up 15 credible weaknesses of Watson. (I can name some, by the way.)
  3. Reason 6. HP offers a “proven power platform for analytics.” I am not sure about the alliteration nor am I confident in my understanding of analytics and search. IBM Watson doesn’t have much to offer in either of these departments. IDOL, at least the pre HP incarnation, had reasonably robust security capabilities. I wonder how these will be migrated to the HP multi cloud environment. IBM Watson is doing recipes, so it too has its hands full.
  4. Reason 10. HP asserts that it offers a “potential app store.” I understand app store. Apple offers one that works well. Google is in the app store business. Amazon has poked its nose into the marketplace as well. I don’t think either HP or IBM has a credible app store for variants of the two companies’ search technologies. Oh, well, it sounds good. “Potential” is a deal breaker for me.
  5. Reason 13. HP “is focused on ramping up the innovation lifecycle.” I think this means coming up with good ideas faster. I am not sure if a service can spark a client’s innovation. Doesn’t lifecycle include death? Since IBM Watson seems a work in progress, I am not sure HP’s just released reinvention of Autonomy has a significant advantage because it too is “ramping up.”
  6. Reason 15. HP has “fired up” engineers. Okay, maybe. IBM has engineers, but I am not sure if they are fired up. My question is, “Is being ‘fired up’ a good thing?” I want engineers to deliver solutions that work, are not “ramping up,” and are not marketing driven.

My take on this slide deck is that it is nothing more than a marketing vehicle. I had to click multiple ads for HP products and services to view the 15 reasons. Imagine my disappointment that five of the IBM weaknesses related to partnering programs. Wow, that must be really helpful to a licensee of cloud Autonomy trying to deal with performance issues on an HP data center. HP is definitely countering IBM Watson’s recipe play with old fashioned cheerleading. Rah, rah.

Stephen E Arnold, March 19, 2014

Improving SharePoint Search Efficiency

March 17, 2014

For many users, search is pretty much the main point of SharePoint, yet many complain of the inefficiency and inaccuracy of the search function. Search Windows Server addresses the issue in a great article that highlights search features from SharePoint 2007 to SharePoint 2013. Read the details in “Five Ways to Make SharePoint Search More Efficient.”

The article begins:

“Admins and end users alike find that using the search feature in SharePoint is helpful, but it can be frustrating . . . We compiled the five best tips to help SharePoint users work through common questions and situations with SharePoint search. Covering multiple versions of SharePoint, these tips highlight how to make searching in SharePoint more efficient, how to improve search functionality and more.”

Stephen E. Arnold has an interest in search; in fact he has made a career of it. His Web site highlights the latest in search, the good and the bad. SharePoint gets a lot of coverage.

Emily Rae Aldridge, March 17, 2014
