The Future of Visual Search Lies in Surprising Hands
October 12, 2017
Text-based search has its days numbered. At least, that’s what some experts say when they discuss visual search engines. But should we be throwing today’s strongest text-based search giants on the scrap heap? It’s not that easy, according to a new Search Engine Watch article, “Pinterest, Google, or Bing: Who Has The Best Visual Search Engine?”
Historically, video, images, and articles have been cataloged in a text-based system for search. This keyword-based system, which Google has perfected over the last few decades, is more limiting than many anticipated. Static keyword searches ignore a vast swath of search potential, which some surprising players are now tapping in the visual search market.
According to Search Engine Watch:
Already, specific ecommerce visual search technologies abound: Amazon, Walmart, and ASOS are all in on the act. These companies’ apps turn a user’s smartphone camera into a visual discovery tool, searching for similar items based on whatever is in frame. This is just one use case, however, and the potential for visual search is much greater than just direct ecommerce transactions.
After a lot of trial and error, this technology is coming of age. We are on the cusp of accurate, real-time visual search, which will open a raft of new opportunities for marketers.
So, who is going to lead the charge on this visual search frontier? Google, right? They own search today and will probably own it tomorrow, right? Not so fast. According to the piece, Google Lens is still in beta testing and not as robust as the competition. If Google follows its historical trajectory, it will be a leader here. But it’s too early to tell.
Instead, the visual search market is currently led by some surprising players. Pinterest and Bing both have platforms that, with differing levels of accuracy, draw on signals such as your search history and the pictures you take to help search. All these companies are still pretty new at visual search, but we like the odds of Bing and Pinterest staking a serious claim on the future.
Patrick Roland, October 12, 2017
Dow Jones: Fake News As a Training Error
October 11, 2017
In the dead tree edition of the Wall Street Journal, I read an interesting but all too brief article; to wit: “Dow Jones Publishes Errant Headlines in Systems Snafu.” The main point is that Dow Jones pushed out “nearly 2,000 dummy headlines and articles.” The company, of course, is sorry, very sorry. The “false headlines” were disappeared. The small item on page B 5 at the bottom of the page of newsprint included this statement on October 11, 2017:
I take today’s inadvertent and erroneous publication of testing materials extremely seriously.
Fake news. Nah, just a digital flub from the proud Murdoch outfit. Mistakes happen. Perhaps the Dow Jones engine will factor in this human response when it next excoriates Silicon Valley outfits who stub their toes.
Oh, if you are looking for the story online, you have to search Google News for “fake news” and follow the links to everyone except the Wall Street Journal. Google does point to this item on the dowjones.com Web site. The publicist does not include the mea culpa, which I find interesting.
Stephen E Arnold, October 11, 2017
Online Real Time Tracking
October 11, 2017
Online is indeed an interesting business “space.” I noted a portable GPRS GPS real time tracking locator for about $10. The features of the item include a magnet for attaching the device to a vehicle, a built-in microphone, and a Li-polymer battery. The tracker is about 1.5 inches square and a half inch thick. You can find the device listed at this link. I am not sure how long this listing will be online, however. Just two years ago, this type of device was not widely available in this form factor. What are the use cases for this gizmo? Use your imagination.
Stephen E Arnold, October 11, 2017
The Underside of the Internet, Just Slightly Off Base
October 11, 2017
Deutsche Welle ran a story about the Dark Web called “Darknet, The Shady Internet.” I found the approach interesting. Let me mention that I am the author of Dark Web Notebook, a guide for law enforcement and intelligence professionals. (Information about the Notebook is at this link.) I don’t want to work pedantically through the write up, pointing out issues I have with some of the assertions. I do want to highlight the conclusion of the article. DW points out that LE and intel professionals have to use methods which seem to be less than elegant. Here’s the passage I highlighted:
So what can police, federal law enforcement officials, secret police and international crime-fighting networks do to combat the darknet? Some tactics are surprisingly old fashioned. One is to purchase an illegal item from a darknet marketplace and then analyze the package and its contents when it comes in the mail. With enough data, police can hone in on the package’s source. Another tactic is to build rapport with the site’s owner, say a drug dealer, and to request a real-life meeting to exchange the goods.
I would point out that there are a number of companies which offer specialized products and services to assist LE and intel professionals with Dark Web investigations. These range from the Google and In-Q-Tel funded Recorded Future to the less well known Terbium Labs. There are other companies as well, and I profile a number of them in Dark Web Notebook.
I am surprised that DW invested only modest effort in its write up. Dark Web content is a tiny fraction of the data available online. Nevertheless, as censorship increases in some countries and at firms such as Facebook, Google, and Twitter, the Dark Web will experience some growth despite the hurdles it puts in front of users.
I would point out that in the Dark Web Notebook we recount an anecdote involving a German policeman who explored the Dark Web and found himself caught in a digital bear trap. Thus, knowledge of the sophisticated tools available to LE and intel professionals is important. Leaving these out of an article from a respected “news” organization underscores the need for a bit more attention to detail and context.
Stephen E Arnold, October 11, 2017
Enterprise Search: Still Floundering after All These Years
October 11, 2017
Enterprise search conferences once had pride of place. Enterprise search or “search” was the Big Data, artificial intelligence, and cyber intelligence solution from 1998 to 2007.
But by 2007, the fanciful claims of enterprise search vendors were perceived as “big hat, no cattle” posturing. Unable to generate sustainable revenues, the high profile enterprise search systems began looking for a buyer. Those who failed disappeared. Do you know where Convera, Delphes, Entopia, and Siderean are today? What’s the impact of Exalead on Dassault? Autonomy on Hewlett Packard Enterprise? Vivisimo on IBM?
Easy questions to ignore. Time marches on. Proprietary search cost a bundle to keep working. The “fix” to the development, enhancement, and bug fix problems was open source.
A solution emerged. Lucene. That brings us to the title of this blog post: “Enterprise Search: Still Floundering after All These Years.”
The money from license fees is insufficient to make enterprise search work in a good enough way. Open source search, which seems to be largely free of license fees, allows vendors to offer search and highly profitable services to the organizations who want or need an “enterprise search system.”
This means that a vendor who makes more money offering search services can be perceived as a problem by a venture funded company built on promises and tens of millions in venture capital.
The truth of this observation was revealed in an article written by or for Search Technologies, a unit of a Fancy Dan consulting firm. If I understand the Search Technologies write up, Lucidworks (né Lucid Imagination) told Search Technologies that it was not welcome at a conference designed to promote Solr.
Here’s what Search Technologies said in “Why Wasn’t Search Technologies at Lucene/Solr Revolution 2017?”
Lucene/Solr Revolution’s organizer, Lucidworks, informed us that we were no longer welcome to exhibit or speak at the event. Lucidworks considered us a company that:
- Competes with their professional services group (maybe)
- Is not likely to resell Lucidworks’ platform exclusively (we are vendor-agnostic, after all), and,
- Has technology assets that compete with their Fusion platform (partially true)
I don’t care too much about venture funded outfits running conferences to make their “one true way” evident to the attendees. I don’t worry about a blue chip consulting firm’s ability to generate sales leads.
No.
I find that some of enterprise search’s most problematic weaknesses have not been solved after 50 years of flailing. Examples include:
- The cost of moving beyond “good enough” information access
- Revealing that enterprise search systems are expensive to tune and shape to the needs of an organization
- Developing solutions which keep indexes current and searches responsive
- Seamlessly handling different types of content, including video, engineering drawings, and data tucked inside legacy systems
- Keeping the majority of the users happy so bootleg search systems are not installed to meet departmental or operating unit needs
The “search” problem is an illustration of innovation running out of gas. I have zero stake in Lucidworks, Search Technologies, or enterprise search. I am content to be an observer who points out that search vendors, their marketing, the consultants, and the conference organizers are their own worst enemies.
That’s why enterprise search imploded about a decade ago. Search today is pretty much “good enough.” Antidot, Lucene, Solr, dtSearch, X1, Fabasoft, Funnelback, et al. Each does “good enough” search in my opinion.
To make any system better takes consulting and engineering services. These deliver high margins. Users? Well, users want enterprise search to answer questions and work like Google. After 50 years of effort, no company has been able to meet the users’ needs.
That says more than two consulting firms trading digital jabs. What’s at stake is consulting revenue and proprietary fixes. Users? Yes, what about the users?
Stephen E Arnold, October 10, 2017
AI Predictions for 2018
October 11, 2017
AI just keeps gaining steam and is positioned to be extremely influential in the year to come. KnowStartup describes “10 Artificial Intelligence (AI) Technologies that Will Rule 2018.” Writer Biplab Ghosh introduces the list:
Artificial Intelligence is changing the way we think of technology. It is radically changing the various aspects of our daily life. Companies are now significantly making investments in AI to boost their future businesses. According to a Narrative Science report, just 38% percent of the companies surveys used artificial intelligence in 2016—but by 2018, this percentage will increase to 62%. Another study performed by Forrester Research predicted an increase of 300% in investment in AI this year (2017), compared to last year. IDC estimated that the AI market will grow from $8 billion in 2016 to more than $47 billion in 2020. ‘Artificial Intelligence’ today includes a variety of technologies and tools, some time-tested, others relatively new.
We are not surprised that the top three entries are natural language generation, speech recognition, and machine learning platforms, in that order. Next are virtual agents (aka “chatbots” or “bots”), then decision management systems, AI-optimized hardware, deep learning platforms, robotic process automation, text analytics & natural language processing, and biometrics. See the write-up for details on each of these topics, including some top vendors in each space.
Cynthia Murrell, October 11, 2017
Palantir Settlement Makes Good Business Sense
October 11, 2017
Palantir says it settled a labor dispute in order to focus on its work, not to admit guilt. The move has split the industry over what the settlement actually means. We first learned of the $1.66 million settlement in How To Zone’s story, “Palantir Settles Discrimination Complaint with U.S. Labor Agency.”
How did we get here? According to the story:
The Labor Department said in an administrative complaint last year that it conducted a review of Palantir’s hiring process beginning in 2010. The agency alleged that the company’s reliance on employee referrals resulted in bias against Asians. Contracts worth more than $370 million, including with the U.S. Defense Department, Treasury Department and other federal agencies, were in jeopardy if the Labor Department had found Palantir guilty of discrimination.
Serious accusations. But this settlement might not signal what you think it does. Palantir said in a statement:
We settled this matter, without any admission of liability, in order to focus on our work.
This might be the smartest action on the company’s part. Consider what happened to Salesforce when it got wrapped up in a legal battle earlier this year. The suit not only slowed down sales, but some experts feel it may have altered enterprise search for good.
Something tells us Palantir, with its rich government contracts, simply wants to put this behind it and not get caught in a legal web.
Patrick Roland, October 11, 2017
Google: Doing Better?
October 10, 2017
I recall that Google was not able to pull together some salary data. I just read “Google, Facebook and Twitter Scramble to Hold Washington at Bay” and formed the opinion that Facebook and Twitter are equally challenged by data requests. The three companies are, however, able to “scramble,” to use Bloomberg’s loaded word.
The write up states:
It’s a delicate balance for the companies, whose products reached massive scale because of their ability to transact advertising automatically, without much restriction. They must figure out how much responsibility to take and how much change to promise, without succumbing to costly regulation or setting a precedent that might be difficult to follow in other countries. In the context of political advertising, some lawmakers are already proposing new limits.
I also noted this passage:
Google executives expected Congress to be more receptive to its arguments that penalizing knowledge of trafficking might stop smaller internet companies from looking for it at all. They were caught off-guard by negative responses to the company’s lobbying, according to one Washington operative who works for the company.
I thought Google’s analytics capabilities could predict certain actions. Google “off-guard” suggests that more than high school math club antics may be necessary.
Stephen E Arnold, October 10, 2017
Does Google Want to Be Broken Apart?
October 10, 2017
There’s nothing like a 23 hour travel spree to make new thoughts flow. Sitting in a noisy airport in Tuscany, I read “Google CEO Sundar Pichai: ‘I Don’t Know Whether Humans Want Change That Fast,’” and it surprised me.
The “management by ambiguity” leader of the GOOG granted another interview. The story, which appeared on October 7, 2017, and made its way to Tuscany on October 8, 2017, contained some statements I found thought provoking.
Let’s look at four I circled as memorable:
Item 1: Mistakes
“I recognize that, in the Valley, people are obsessed with the pace of technological change,” he says. “It’s tough to get that part right… We rush sometimes, and can misfire for an average person.”
Comment: Yep, there are impacts, particularly when the outfit making the mistake may be the big dog in the kennel.
Item 2: Information control
“Once everybody has access to a computer and connectivity, then search works the same, whether you are a Nobel laureate or just a kid with a computer.”
It’s tough to find information when some of the data are [a] censored, [b] not indexed, and [c] not updated once indexed.
Item 3: We’re no big deal…
No single company or country can change the pace of progress. Nobody is trying to socially engineer anything [here] – we are trying to solve hard problems.
Interesting. I wonder if companies affected by Google’s last 20 years would agree?
Item 4: Break up my company, please!
As a big company, you are constantly trying to foolproof yourself against being big, because you see the advantage of being small, nimble and entrepreneurial. Pretty much every great thing gets started by a small team.
Will governments force Google to bunsha or will Google just break itself up?
Stephen E Arnold, October 10, 2017
Veteran Web Researcher Speaks on Bias and Misinformation
October 10, 2017
The CTO of semantic search firm Ntent, Dr. Ricardo Baeza-Yates, has been studying the Web since its inception. In their post, “Fake News and the Power of Algorithms: Dr. Ricardo Baeza-Yates Weights In With Futurezone at the Vienna Gödel Lecture,” Ntent shares his take on biases online by reproducing an interview Baeza-Yates gave Futurezone at the Vienna Gödel Lecture 2017, where he was the featured speaker. When asked about the consequences of false information spread far and wide, the esteemed CTO cited two pivotal events from 2016, Brexit and the US presidential election.
These were manipulated by social media. I do not mean by hackers – which cannot be excluded – but by social biases. The politicians and the media are in the game together. For example, a non-Muslim attack may be less likely to make the front page or earn high viewing ratings. How can we minimize the amount of biased information that appears? It is a problem that affects us all.
One might try to make sure people get a more balanced presentation of information. Currently, it’s often the media and politicians that cry out loudest for truth. But could there be truth in this context at all? Truth should be the basis but there is usually more than one definition of truth. If 80 percent of people see yellow as blue, should we change the term? When it comes to media and politics the majority can create facts. Hence, humans are sometimes like lemmings. Universal values could be a possible common basis, but they are increasingly under pressure from politics, as Theresa May recently stated in her attempt to change the Magna Carta in the name of security. As history already tells us, politicians can be dangerous.
Indeed. The biases that concern Baeza-Yates go beyond those that spread fake news, though. He begins by describing presentation bias—the fact that one’s choices are limited to that which suppliers have, for their own reasons, made available. Online, “filter bubbles” compound this issue. Of course, Web search engines magnify any biases—their top results provide journalists with research fodder, the perceived relevance of which is compounded when that journalist’s work is published; results that appear later in the list get ignored, which pushes them yet further from common consideration.
Ntent is working on ways to bring folks with different viewpoints together on topics on which they do agree; Baeza-Yates admits the approach has its limitations, especially on the big issues. What we really need, he asserts, is journalism that is bias-neutral instead of polarized. How we get there from here, even Baeza-Yates can only speculate.
Cynthia Murrell, October 10, 2017