Interview with Dave Hawking Offers Insight into Bing, FunnelBack and Enterprise Search
December 9, 2014
The article titled To Bing and Beyond on IDM provides an interview with Dave Hawking, an award winner in the field of information retrieval and currently a Partner Architect for Bing. In the somewhat lengthy interview, Hawking answers questions on his own history, his work at Bing, natural language search, Watson, and Enterprise Search, among other things. At one point he describes how he arrived in the field of information retrieval after studying computer science at the Australian National University, where the first search engine he encountered was the library’s card catalogue. He says,
“I worked in a number of computer infrastructure support roles at ANU and by 1991 I was in charge of a couple of supercomputers…In order to do a good job of managing a large-scale parallel machine I thought I needed to write a parallel program so I built a kind of parallel grep… I wrote some papers about parallelising text retrieval on supercomputers but I pretty soon decided that text retrieval was more interesting.”
When asked about the challenges of Enterprise Search, Hawking went into detail about the complications that arise due to the “diversity of repositories” as well as issues with access controls. The importance of Hawking’s work in search technology can’t be overstated, from his contributions to the Text Retrieval Conferences, CSIRO, and FunnelBack to his academic achievements.
Chelsea Kerwin, December 09, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
SharePoint in a Nutshell
December 9, 2014
Business Management Daily specializes in business advice, and in their recent article, “Microsoft SharePoint in a Nutshell,” they focus on how to make the most of SharePoint for business.
The article begins:
“Simply put, SharePoint provides a platform on which to collaborate and share resources. It combines the best of the central repository idea of shared network drives and couples it with the tools that facilitate distribution, communication and the sharing of information. In the video below, Microsoft Certified Trainer Melissa Esquibel explains how to use this Microsoft program effectively.”
This is a good place to start for users who are interested in getting started with SharePoint. For users who need a little more than a brief overview, check out the SharePoint feed on ArnoldIT.com. Stephen E. Arnold has spent his career learning and sharing about search and all things enterprise, including SharePoint. He has dedicated a separate SharePoint feed to collocate all the latest news, tips, and tricks that may come in handy for SharePoint users at every level.
Emily Rae Aldridge, December 9, 2014
Another Good Enough Challenge to Proprietary Enterprise Search
December 8, 2014
The protestations of the enterprise search vendors in hock for tens of millions to venture funders will get louder. The argument is that proprietary search solutions are just better.
Navigate to “Postgres Full-Text Search Is Good Enough!” This has been the mantra of some of the European Community academics for a number of years. I gave a talk at CeBIT a couple of years ago and noted that the proprietary vendors were struggling to deliver a coherent and compelling argument. Examples of too-much-chest-beating came from speakers representing and Exalead and a handful of consultants. See, for example, http://bit.ly/1zicaGw.
The point of the “Postgres Good Enough” article strikes me as:
Search has become an important feature and we’ve seen a big increase in the popularity of tools like Elasticsearch and SOLR which are both based on Lucene. They are great tools but before going down the road of Weapons of Mass Search, maybe what you need is something a bit lighter which is simply good enough! What do I mean by ‘good enough’? I mean a search engine with the following features: stemming, ranking/boost, multiple languages, fuzzy search, accent support. Luckily PostgreSQL supports all these features.
So not only are the proprietary systems dismissed, so are the open source solutions that are at the core of a number of commercialization ventures.
I don’t want to argue with the premise. What is important is that companies trying to market enterprise search solutions now have to convince a buyer why good enough is not good enough.
For decades, enterprise search vendors have been engaged in a Cold War style escalation. With each feature addition from one vendor (Autonomy), other vendors pile on more features (Endeca).
The result is that enterprise search vendors try to push value on customers rather than deliver solutions that customers value.
The “good enough” argument is one more example of a push back against the wild and crazy jumbles of code that most enterprise search vendors offer.
The good news is that good enough search is available, and it should be used. In fact, next generation information access solution vendors are including “good enough” search in robust enterprise applications.
What is interesting is that the venture funding firms seem content to move executives in and out of companies not hitting their numbers. Examples include Attivio and LucidWorks (really?). Other vendors are either really quiet or out of business like Dieselpoint and Hakia. I pointed out that the wild and crazy revenue targets for HP Autonomy and IBM Watson are examples of what happens when marketing takes precedence over what a system can do and how many customers are available to generate billions for these big outfits.
Attention needs to shift to “good enough” and to NGIA (next generation information access) vendors able to make sales, generate sustainable revenue, and solve problems that matter.
Displaying a results list is not high on the list of priorities for many organizations. And when search becomes job one, that is a signal the company may not have diagnosed its technological needs accurately. I know there are many mid tier consultants and unemployed webmasters who wish my statements were not accurate. Alas, reality can be a harsh task master or mistress.
Stephen E Arnold, December 8, 2014
Google Ads: And the Research Means…?
December 8, 2014
I read “5 Viewability Findings from Google.” Frankly I am not certain if the five results are good news or bad news.
Here’s an example:
56.1% of all display ad impressions never appeared on a screen, Google’s research found.
Does this mean that Google needs to do more to get ads viewed? One approach would be to use the incredibly annoying approach that displays an ad, hides the “skip” or “close” option, and uses flashing text to communicate its powerful message. Perhaps soon?
Another example:
Page position isn’t always the best indicator of viewability, Google’s research found. In fact, far from all above-the-fold ad impressions are viewable, and many below-the-fold ones are. The median viewability for ad units above-the-fold was 68%, Google said, compared with 40% below-the-fold. Perhaps counter intuitively, the most “viewable” ads were not placed at the top of publisher pages, but were actually located directly “above-the-fold,” at the bottom of the visible part of a webpage immediately after it loaded.
So ads can be anywhere to be viewed? I like the “counter intuitive angle” because it suggests that Google data are clarifying what users really do. Don’t users look for results that answer their question? I suppose that too is counter intuitive.
Please, work through the other three findings.
It seems to me that Google ads appear to be chugging along as long as the user is accessing search results using a desktop computer. Don’t most folks access Google and other online information via a mobile device? Less screen real estate, right?
Are there other sources of revenue that will replace the difference between the ad power of a dinosaur type of access and the new breed of cat access?
Stephen E Arnold, December 8, 2014
Secret Hiring Tool for Silicon Valley Types
December 8, 2014
It is harder than ever for young graduates and seasoned workers to find a job. Yet according to the FitFrnd blog, Silicon Valley is having trouble finding good employees. The post “Silicon Valley’s Best-Kept Secret: How AngelList Is Slowly Disrupting The Hiring Industry” explains that, compared with “old-fashioned” job search engines, AngelList is proving to be more reliable in finding talent.
AngelList is primarily a crowdfunding Web site used by startups to raise money for new endeavors. AngelList, however, is proving to be a new resource to find a job or locate someone to fill a position. Other career Web sites fail to attract the right talent. The post explains how FitFrnd had trouble finding a blogger/content marketer:
“We finally decided to give AngelList a serious try. We had tried it before, but our efforts had been half-hearted. This time we improved our copy, added information such as why the company is such an amazing place to work (it is!), details about salary and equity ranges, and even screenshots of the app. Within a few days, we have received about 80 resumes, including some really compelling candidates.”
What makes AngelList different is that it allows applicants to apply privately and know the salary up front. It also cuts out the middleman. While the information is searchable, you have to join AngelList. Joining is free for now. It eventually might cost money, but you would be paying for a service that works…for the moment.
Whitney Grace, December 08, 2014
Blast toward the Moon With Rocket Software
December 8, 2014
YouTube informational videos are great. They are short, snappy, and often help people retain more information about a product than reading the “about” page on a Web site. Rocket Software has its own channel and the video “Rocket Enterprise Search And Text Analytics” packs a lot of details into 2.49 minutes. The video is described as:
“We provide an integrated search platform for gathering, indexing, and searching both structured and unstructured data, making the information that you depend on more accessible, useful, and intelligent.”
How does Rocket Software defend that statement? The video opens with a prediction that by 2020 data usage will have increased to forty trillion gigabytes. It explains that data is the new enterprise currency and that it needs to be kept organized, then it drops into a plug for the company’s software. The company compares itself to competitors by saying Rocket Software makes enterprise search and text analytics as simple as a download, after which the system is up and running. Other enterprise search systems require custom coding, but Rocket Software explains it offers these options out of the box. Plus it is a cheaper product that does not sacrifice quality.
Software usage these days is about functionality and ease of use for powerful software. Rocket Software states it offers this. Try putting it to the test.
Whitney Grace, December 08, 2014
Attivio Is Now an Oracle Competitor, Not a Search Vendor
December 7, 2014
I read “Oracle Competitor Attivio Promotes Stephen Baker to CEO.” Quite a surprise because Attivio is a search-and-retrieval company with a layer of analytics wrappers. Founded by former Fast Search & Transfer executives, the company ingested more than $30 million in venture funding and now has to generate a return for the stakeholders.
I am not sure if Oracle perceives Attivio as a competitor. MarkLogic, an XML data management vendor, also positioned itself as an Oracle competitor. After hitting a wall at about $60 million and grinding through some new presidents, MarkLogic is keeping a low profile in the markets I track.
Now Attivio may be following this MarkLogic path. Two of the founders of Attivio are moving up. Below Ali Riaz and Sid Probstein is Stephen Baker. Mr. Baker also was a Fast Search & Transfer professional. He worked at RAMP Holdings after a stint at Reed Elsevier where he was responsible for—wait for it—search.
Attivio co-founder Will Johnson is now the chief technology officer. Mr. Johnson is another Fast Search alum. He has worked at GetConnected as—wait for it—a search architect.
My thought is that saying Attivio is a competitor to Oracle is one way to connect semantically with “Oracle.”
But as MarkLogic’s trajectory has demonstrated, there is more to saying a company is “like” Oracle than generating revenue on the scale of Oracle.
Both Attivio and MarkLogic are information access companies. Both want to generate more revenue for their stakeholders. Perhaps a management shift will do the trick.
My view is that if Oracle thought either Attivio or MarkLogic offered a unique, high value service, Oracle would have acquired these companies. Oracle may buy Attivio and MarkLogic. I think the catalyst would be generating and demonstrating rapid revenue growth, expanding margins, and a track record of sustainable revenues. I look forward to a glowing analysis of each firm by IDC’s “expert” Dave Schubmehl in the next month or so. Maybe saying something does make reality change?
Stakeholders want a payback. Management change is a precursor to even more significant activity to benefit those who pumped tens of millions into what may be an old-school approach to information access.
Stephen E Arnold, December 7, 2014
Pi in the Sky: HP and IBM Race to Catch Up with NGIA Leaders
December 7, 2014
I read “HP Takes Analytics to the Cloud in Comeback to IBM’s Watson.” The write up is darned interesting. Working through the analysis reminded me that HP does not realize that Autonomy’s 1999 customer BAE Systems has been working with analytics from the cloud for—what?—15 years? What about Recorded Future, SAIC, and dozens of other companies running successful businesses with this strategy?
The article points out that two large and somewhat pressured $100 billion companies are innovating like all get out. I learned:
Although it [Hewlett Packard] may not win any trivia contests in the foreseeable future, the hardware maker’s entry into the world of end-to-end analytics does hold up to Watson where the rubber meets the road in the enterprise…But the true equalizer for the company is IDOL, the natural language processing and search it obtained through the $11.7 billion acquisition of Autonomy Corp. PLC in 2011, which reduces the gap between human and machine interaction in a similar fashion to IBM’s cognitive computing platform.
Okay. IBM offers Watson, which was supposed to generate a billion or more by 2015 and then surge to $10 billion in revenue in another four or five years. What is Watson? As I understand it, Watson is open source code, some bits and pieces from IBM’s research labs, and wrappers that convert search into a towering giant of artificial intelligence. Why doesn’t IBM focus on its next generation information access units that are exciting and delivering services that customers want? i2 does not produce recipes incorporating tamarind. Cybertap does not help sick teenagers.
HP, on the other hand, owns the Autonomy Digital Reasoning Engine and the Integrated Data Operating Layer. These incorporate numerical recipes based on the work of Bayes, Laplace, and Markov, among others. The technology is not open source. Instead, IDOL is a black box. HP spent $11 billion for Autonomy, figured out that it overpaid, wrote off $5 billion or so, and launched a global scorched earth policy for its management methods. Recently, HP has migrated DRE and IDOL to the cloud. Okay, but HP is putting more effort into accusing Autonomy of fooling HP. Didn’t HP buy Autonomy after experts reviewed the deal, the technology, and the financial statements? HP has lost years in an attempt to redress a perceived wrong. But HP decided to buy Autonomy.
A New Spin on Search: Enterprise Listening Platforms
December 6, 2014
In my Yahoo Alert this morning, I saw an item which puzzled me.
When I clicked on the link, I was shown this item from Yahoo Finance: “Independent Research Firm Ranks Visible Technologies as a Leader in New Enterprise Listening Platforms Report.”
The article from MarketWired informed me:
In addition to securing a Leadership ranking among a pool of 11 enterprise listening software and service providers, Visible received among the second highest rankings in the Strategy category. Its road map was cited as including “self-service research tools and additional automation of client-specific data.” The report also stated that “Visible marries an intuitive dashboard that enables users to uncover insights and refine search with high-quality consulting.”
Yep, search is part of the Forrester “enterprise listening platform” functionality. I must admit that the azure chip consultants will resonate with this phrase. I am not sure what it means. I think I get the search part, but the mashing up of dashboards, social media, advanced data processing capabilities, and partnership plans amuses me.
Whatever floats one’s boat and boosts one’s revenues is okay with me. I am not sure what Simpson’s mathematics means but it generates revenues and apparently helps sell newspapers.
Stephen E Arnold, December 6, 2014
A Plan for Achieving ROI via Text Analytics
December 6, 2014
ROI is the end goal for many big data and enterprise-related projects, and it is refreshing to see some information published on whether companies achieve it, as we recently saw in a Smart Data Collective article, “Text Analytics, Big Data and the Keys to ROI.” According to a study released last year (further discussed in “Text/Content Analytics 2011: User Perspectives on Solutions and Providers”), the reason many businesses do not get positive returns has to do with the planning phase. Many report that they did not start with a clear plan to get there.
The author shares with us an example from his full-time work in text analytics. One of his clients that was focused on sifting through masses of social media data and data from government applications looking for suspicious activity needed a solution for a text-heavy application. The author responded by suggesting a selective cross-lingual process, one which worked with the text in its native language, and only on the text that was relevant to the topic of interest.
The following happened after the author’s suggestion:
Although he seemed to appreciate the logic of my suggestions and the quality benefits of avoiding translation, he just didn’t want to deal with a new approach. He asked to just translate everything and analyze later – as many people do. But I felt strongly that he’d be spending more and getting weaker results. So, I gave him two quotes. One for translating everything first and analyzing later – his way, and one for the cross-lingual approach that I recommended. When he saw that his own plan was going to cost over a million dollars more, he quickly became very open minded about exploring a new approach.
It sounds like the author could have suggested a number of similar semantic processing solutions. For example, Cogito Intelligence API enhances the ability to decipher meaning and insights from a multitude of content sources including social media and unstructured corporate data. The point is that ROI is out there and there are innovative companies like Expert System and beyond enabling it.
Megan Feil, December 6, 2014