Training Your Smart Search System
August 2, 2014
With the increasing chatter about smart software, I want to call to your attention this article, “Improving the Way Neural Networks Learn.” Keep in mind that some probabilistic search systems have to be trained on content that closely resembles the content the system will index. The training is important, and training can be time consuming. The licensee has to create a training set of data that is similar to what the software will index. Then the training process is run, a human checks the system outputs, and makes “adjustments.” If the training set is not representative, the indexing will be off. If the human makes corrections that are wacky, then the indexing will be off. When the system is turned loose, the resulting index may return outputs that are not what the user expected or the outputs are incorrect. Whether the system’s users know enough to recognize incorrect results varies from human to human.
If you want to have a chat with your vendor regarding the time required to train or re-train a search system relying on sample content, print out this article. If the explanation does not make much sense to you, you can document the query result sets that are off, complain to the search system vendor, or initiate a quick fix. Note that quick fixes involve firing humans believed to be responsible for the system, initiating a new search procurement, or pretending that the results are just fine. I suppose there are other options, but I have encountered these three approaches seasoned with either legal action or verbal grousing to the vendor. Even when the automated indexing is tuned within an inch of its life, accuracy is likely to start out in the 85 to 90 percent range and then degrade.
Training can be a big deal. Ignoring the “drift” that occurs when the smart software has been taught or learned something that distorts the relevance of results can produce some sharp edges.
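One way to keep an eye on that degradation is to re-check a small human-reviewed sample on a schedule and compare the score with the accuracy measured at launch. What follows is a minimal sketch, not any vendor's method. The function names, the 0.90 baseline, and the 0.05 tolerance are assumptions made for illustration, and "accuracy" here simply means the share of documents whose assigned index term matches the term a human reviewer picked.

```python
from dataclasses import dataclass


@dataclass
class AuditResult:
    period: str
    accuracy: float
    drifted: bool


def indexing_accuracy(assigned, expected):
    """Share of documents whose assigned term equals the human-checked term."""
    if not expected:
        raise ValueError("expected labels must not be empty")
    if len(assigned) != len(expected):
        raise ValueError("assigned and expected must be the same length")
    hits = sum(1 for a, e in zip(assigned, expected) if a == e)
    return hits / len(expected)


def audit(samples, baseline, tolerance=0.05):
    """Flag any period whose accuracy falls more than `tolerance` below `baseline`.

    `samples` is a list of (period, assigned_terms, expected_terms) tuples.
    The baseline and tolerance are hypothetical values chosen by the licensee.
    """
    results = []
    for period, assigned, expected in samples:
        acc = indexing_accuracy(assigned, expected)
        results.append(AuditResult(period, acc, acc < baseline - tolerance))
    return results


if __name__ == "__main__":
    # Made-up sample data: ten documents checked by a human in each period.
    human = ["finance"] * 10
    at_launch = ["finance"] * 9 + ["sports"]
    six_months = ["finance"] * 7 + ["sports"] * 3
    samples = [("launch", at_launch, human), ("month 6", six_months, human)]
    for result in audit(samples, baseline=0.90):
        print(result.period, round(result.accuracy, 2), "DRIFT" if result.drifted else "ok")
```

A check like this only tells you that the indexing has slipped. Deciding whether to re-train, rebuild the training set, or call the vendor is still a human job.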
Stephen E Arnold, August 2, 2014
Let Google Do It: Outsource the Government
August 1, 2014
I saw a discussion thread describing a proposed action to allow Google to become the National Technical Information Service. Although I served on the Board of NTIS years ago, I don’t think too much about the operation. Apparently Google does. Apparently there are some folks who ignore the repository aspect of NTIS. Google is search and pretty darned good, the argument goes.
The idea is presented in S 2206, but the officials elected by the people are occupied with a number of weighty issues.
I did note this item about the efficacy of US government management. Take a look at “Poorly Managed HealthCare.gov Construction Cost $840 Million, Watchdog Finds.” A billion here and a billion there may not be a big deal as the economy improves according to some pundits.
What Google does not do, perhaps USA.gov will? What about the Library of Congress, various government document repositories, and, of course, the funding entities themselves?
Let Google do it? Why not?
Stephen E Arnold, August 4, 2014
Exorbyte: Yep, eCommerce Search
August 1, 2014
Despite the mid tier consulting firms trying valiantly to convince some that Exorbyte is an enterprise search company, there is some evidence to the contrary. I noticed a tweet from a Twitter fan billed as MarketLinks. The link points to a service with which I was not familiar, Vidyours. On that page, Exorbyte presents a number of videos:
A word of possible interest: This video distribution service wanted to install a suspicious video viewer on my system. I declined.
Net net: Exorbyte has a number of eCommerce videos for what appears to be the purpose of selling licenses to companies looking for an eCommerce search system. Perhaps the enterprise search videos are out there, just not on the dicey Vidyours.com site? Is Exorbyte performing a search pivot or just stretching its system to generate more leads and revenue?
Stephen E Arnold, August 1, 2014
More Knowledge Quotient Silliness: The Florida Gar of Search Marketing
August 1, 2014
I must be starved for intellectual Florida Gar. Nibble on this fish’s lateral line and get nauseous or dead. Knowledge quotient as a concept applied to search and retrieval is like a largish Florida gar. Maybe a Florida gar left too long in the sun.
Lookin’ yummy. Looks can be deceiving in fish and fishing for information. A happy quack to https://www.flmnh.ufl.edu/fish/Gallery/Descript/FloridaGar/FloridaGar.html
I ran a query on one of the search systems that I profile in my lectures for the police and intelligence community. With a bit of clicking, I unearthed some interesting uses of the phrase “knowledge quotient.”
What surprised me is that the phrase is a favorite of some educators. The use of the term as a synonym for plain old search seems to be one of those marketing moments of magic. A group of “experts” with degrees in home economics, early childhood education, or political science sit around and try to figure out how to sell a technology that is decades old. Sure, the search vendors make “improvements” with ever increasing speed. As costs rise and sales fail to keep pace, the search “experts” gobble a cinnamon latte and innovate.
In Dubai earlier this year, I saw a reference to a company engaged in human resource development. I think this means “body shop,” “lower cost labor,” or “mercenary registry,” but I could be off base. The company is called Knowledge Quotient FZ LLC. If one tries to search for the company, the task becomes onerous. Google is giving some love to the recent IDC study by an “expert” named Dave Schubmehl. As you may know, this is the “professional” who used my information and then sold it on Amazon until July 2014 without paying me for my semi-valuable name. For more on this remarkable approach to professional publishing, see http://wp.me/pf6p2-auy.
Also, in Dubai is a tutoring outfit called Knowledge Quotient which delivers home tutoring to the children of parents with disposable income. The company explains that it operates a place where learning makes sense.
Companies in India seem to be taken with the phrase “knowledge quotient.” Consider Chessy Knowledge Quotient Private Limited. In West Bengal, one can find one’s way to Mukherjee Road and engage the founders with regard to an “effective business solution.” See http://chessygroup.co.in. Please, do not confuse Chessy with KnowledgeQ, the company operating as Knowledge Quotient Education Services India Pvt Ltd. in Bangalore. See http://www.knowledgeq.org.
What’s the relationship between these companies operating as “knowledge quotient” vendors and search? For me, the appropriation of names and applying them to enterprise search contributes to the low esteem in which many search vendors are held.
Why is Autonomy IDOL such a problem for Hewlett Packard? This is a company that bought a mobile operating system and stepped away from it. This is a company that brought out a tablet and abandoned it in a few months. This is a company that wrote off billions and then blamed the seller for not explaining how the business worked. In short, Autonomy, which offers a suite of technology that performs as well as or better than any other search system, has become a bit of Florida gar in my view. Autonomy is not a fish. Autonomy is a search and content processing system. When properly configured and resourced, it works as well as any other late 1990s search system. I don’t need meaningless descriptions like “knowledge quotient” to understand that the “problem” with IDOL is little more than HP’s expectations exceeding what a decades old technology can deliver.
Why is Fast Search & Transfer an embarrassment to many who work in the search sector? Perhaps the reason has to do with the financial dealings of the company. In addition to fines and jail terms, the Fast Search system drifted from its roots in Web search and drifted into publishing, smart software, and automatic functions. The problem was that when customers did not pay, the company did not suck it up, fix the software, and renew its efforts to deliver effective search. Nah, Fast Search became associated with a quick sale to Microsoft, subsequent investigations by Norwegian law enforcement, and the culminating decision to ban one executive from working in search. Yep, that is a story that few want to analyze. Search marketers promised and the technology did not deliver, could not deliver given Fast Search’s circumstances.
What about Excalibur/Convera? This company managed to sell advanced search and retrieval to Intel and the NBA. In a short time, both of these companies stepped away from Convera. The company then focused on a confection called “vertical search” based on indexing the Internet for customers who wanted narrow applications. Not even the financial stroking of Allen & Co. could save Convera. In an interesting twist, Fast Search purchased some of Convera’s assets in an effort to capture more US government business. Who digs into the story of Excalibur/Convera? Answer: No one.
What passes for analysis in enterprise search, information retrieval, and content processing is the substitution of baloney for fact-centric analysis. What is the reason that so many search vendors need multiple injections of capital to stay in business? My hunch is that companies like Antidot, Attivio, BA Insight, Coveo, Sinequa, and Palantir, among others, are in the business of raising money, spending it in an increasingly intense effort to generate sustainable revenue, and then going once again to capital markets for more money. When the funding sources dry up or just cut off the company, what happens to these firms? They fail. A few are rescued like Autonomy, Exalead, and Vivisimo. Others just vaporize as Delphes, Entopia, and Siderean did.
When I read a report from a mid tier consulting firm, I often react as if I had swallowed a chunk of Florida gar. An example in my search file is basic information about “The Knowledge Quotient: Unlocking the Hidden Value of Information.” You can buy this outstanding example of ahistorical analysis from IDC.com, the employer of Dave Schubmehl. (Yep, the same professional who used my research without bothering to issue me a contract or get permission from me to fish with my identity. My attorney, if I understand his mumbo jumbo, says this action was not identity theft, but Schubmehl’s actions between May 2012 and July 2014 strike me as untoward.)
Net net: I wonder if any of the companies using the phrase “knowledge quotient” are aware of brand encroachment. Probably not. That may be due to the low profile search enjoys in some geographic regions where business appears to be more healthy than in the US.
Can search marketing be compared to Florida gar? I want to think more about this.
Stephen E Arnold, August 1, 2014
The Perks and Pitfalls of a Yahoo/AOL Merger
August 1, 2014
An article on re/code titled “Two Years and Still Stuck in a Revenue Rut, Will Yahoo’s Mayer Bite the AOL Bullet?” attempts to separate fact from fiction when it comes to the possible merger. The sight of Mayer and AOL’s Armstrong chatting during the Allen & Co. conference in Sun Valley proved too exciting for some reporters, who jumped to the conclusion that the two old friends were making a deal. While making it very clear that the merger is entirely hypothetical at this point, this article makes an argument for the perks of getting Yahoo and AOL together. The article states,
“AOL and Yahoo have very similar businesses and could easily combine them…Internally at both companies, this is not seen as a completely bad idea — both share numerous advertising overlaps, content overlaps, video overlaps and too many employees doing the same thing. In addition, Mayer could certainly use a decent exec she actually knows well like Armstrong at the top (presumably, him CEO, her chairman).”
However, the article also points out some hedging by Mayer who has made some comments about the possible merger being boring and “backward-looking.” On the other hand, if Mayer wants the Huffington Post, she may have to take AOL along with it. So the article is inconclusive. Will AOL and Yahoo finally get married? Can two Xooglers (former Googlers) make one scion named Revenue?
Chelsea Kerwin, August 01, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Niice – a Search Engine Bent on Aesthetic Appeal
August 1, 2014
A new search engine has appeared on the radar called Niice. It is focused on inspiration search and the presentation of quality images to spark ideas. The Niice blog lays out their mission statement (to be a sort of upscale Google for graphic designers, photographers, tasteful people in general) as well as stating their design principles. These include remaining safe for work, not getting in the way, and being restrained in content choices. They also stress embracing serendipity,
“We want to give you results that you don’t expect, presented in a way that inspires your brain to make new connections. Niice isn’t for finding an image that you can copy, it’s for bringing together lots of ideas for you to combine into something new…. The internet is full of inspiration, but since Google doesn’t have a ‘Good Taste’ filter, finding it means jumping back and forth between blogs and gallery sites.”
Somewhat like Pinterest, Niice allows users to create “moodboards” or collections of images which can be saved, collaborated on, or downloaded as JPEGs. When I searched the term “water”, a collage of images appeared that included photographs of ocean waves and a sunset over a lake, a glass sculpture resembling a dew drop, and a picture that linked to a story on an artist who manipulates water with her mind, among many others.
Chelsea Kerwin, August 01, 2014