Featured

Google Relevance: A Light Bulb Flickers

The Wall Street Journal published “Google Has Chosen an Answer for You. It’s Often Wrong” on November 17, 2017. The story is online, but you have to pay money to read it. I gave up on the WSJ’s online service years ago because at each renewal cycle, the WSJ kills my account. Pretty annoying because the pivot of the WSJ write up about Google implies that Google does not do information the way “real” news organizations do. Google does not annoy me the way “real” news outfits handle their online services.

For me, the WSJ is a collection of folks who find themselves looking at the exhaust pipes of the Google Hellcat. A source for a story like “Google Has Chosen an Answer for You. It’s Often Wrong” is a search engine optimization expert. Now that’s a source of relevance expertise! Another useful source are the terse posts by Googlers authorized to write vapid, cheery comments in Google’s “official” blogs. The guts of Google’s technology is described in wonky technical papers, the background and claims sections of the Google’s patent documents, and systematic queries run against Google’s multiple content indexes over time. A few random queries does not reveal the shape of the Googzilla in my experience. Toss in a lack of understanding about how Google’s algorithms work and their baked in biases, and you get a write up that slips on a banana peel of the imperative to generate advertising revenue.

I found the write up interesting for three reasons:

  1. Unusual topic. Real journalists rarely address the question of relevance in ad-supported online services from a solid knowledge base. But today everyone is an expert in search. Just ask any millennial, please. Jonathan Edwards had less conviction about his beliefs than a person skilled in the use of locating a pizza joint on a Google Map.
  2. SEO is an authority. SEO (search engine optimization) experts have done more to undermine relevance in online than any other group. The one exception are the teams who have to find ways to generate clicks from advertisers who want to shove money into the Google slot machine in the hopes of an online traffic pay day. Using SEO experts’ data as evidence grinds against my belief that old fashioned virtues like editorial policies, selectivity, comprehensive indexing, and a bear hug applied to precision and recall calculations are helpful when discussing relevance, accuracy, and provenance.
  3. You don’t know what you don’t know. The presentation of the problems of converting a query into a correct answer reminds me of the many discussions I have had over the years with search engine developers. Natural language processing is tricky. Don’t believe me. Grab your copy of Gramatica didactica del espanol and check out the “rules” for el complemento circunstancial. Online systems struggle with what seems obvious to a reasonably informed human, but toss in multiple languages for automated question answer, and “Houston, we have a problem” echoes.

I urge you to read the original WSJ article yourself. You decide how bad the situation is at ad-supported online search services, big time “real” news organizations, and among clueless users who believe that what’s online is, by golly, the truth dusted in accuracy and frosted with rightness.

Humans often take the path of least resistance; therefore, performing high school term paper research is a task left to an ad supported online search system. “Hey, the game is on, and I have to check my Facebook” takes precedence over analytic thought. But there is a free lunch, right?

Image result for there is no free lunch

In my opinion, this particular article fits in the category of dead tree media envy. I find it amusing that the WSJ is irritated that Google search results may not be relevant or accurate. There’s 20 years of search evolution under Googzilla’s scales, gentle reader. The good old days of the juiced up CLEVER methods and Backrub’s old fashioned ideas about relevance are long gone.

I spoke with one of the earlier Googlers in 1999 at a now defunct (thank goodness) search engine conference. As I recall, that confident and young Google wizard told me in a supercilious way that truncation was “something Google would never do.”

What? Huh?

Guess what? Google introduced truncation because it was a required method to deliver features like classification of content. Mr. Page’s comment to me in 1999 and the subsequent embrace of truncation makes clear that Google was willing to make changes to increase its ability to capture the clicks of users. Kicking truncation to the curb and then digging through the gutter trash told me two things: [a] Google could change its mind for the sake of expediency prior to its IPO and [b] Google could say one thing and happily do another.

I thought that Google would sail into accuracy and relevance storms almost 20 years ago. Today Googzilla may be facing its own Ice Age. Articles like the one in the WSJ are just belated harbingers of push back against a commercial company that now has to conform to “standards” for accuracy, comprehensiveness, and relevance.

Hey, Google sells ads. Algorithmic methods refined over the last two decades make that process slick and useful. Selling ads does not pivot on investing money in identifying valid sources and the provenance of “facts.” Not even the WSJ article probes too deeply into the SEO experts’ assertions and survey data.

I assume I should be pleased that the WSJ has finally realized that algorithms integrated with online advertising generate a number of problematic issues for those concerned with factual and verifiable responses.

Read more »

Interviews

Bitext: Exclusive Interview with Antonio Valderrabanos

On a recent trip to Madrid, Spain, I was able to arrange an interview with Dr. Antonio Valderrabanos, the founder and CEO of Bitext. The company has its primary research and development group in Las Rosas, the high-technology complex a short distance from central Madrid. The company has an office in San Francisco and a number of computational linguists and computer scientists in other locations. Dr. Valderrabanos worked at IBM in an adjacent field before moving to Novell and then making the jump to his own start up. The hard work required to invent a fundamentally new way to make sense of human utterance is now beginning to pay off.

Antonio Valderrabanos of Bitext

Dr. Antonio Valderrabanos, founder and CEO of Bitext. Bitext’s business is growing rapidly. The company’s breakthroughs in deep linguistic analysis solves many difficult problems in text analysis.

Founded in 2008, the firm specializes in deep linguistic analysis. The systems and methods invented and refined by Bitext improve the accuracy of a wide range of content processing and text analytics systems. What’s remarkable about the Bitext breakthroughs is that the company support more than 40 different languages, and its platform can support additional languages with sharp reductions in the time, cost, and effort required by old-school systems. With the proliferation of intelligent software, Bitext, in my opinion, puts the digital brains in overdrive. Bitext’s platform improves the accuracy of many smart software applications, ranging from customer support to business intelligence.

In our wide ranging discussion, Dr. Valderrabanos made a number of insightful comments. Let me highlight three and urge you to read the full text of the interview at this link. (Note: this interview is part of the Search Wizards Speak series.)

Linguistics as an Operating System

One of Dr. Valderrabanos’ most startling observations addresses the future of operating systems for increasingly intelligence software and applications. He said:

Linguistic applications will form a new type of operating system. If we are correct in our thought that language understanding creates a new type of platform, it follows that innovators will build more new things on this foundation. That means that there is no endpoint, just more opportunities to realize new products and services.

Better Understanding Has Arrived

Some of the smart software I have tested is unable to understand what seems to be very basic instructions. The problem, in my opinion, is context. Most smart software struggles to figure out the knowledge cloud which embraces certain data. Dr. Valderrabanos observed:

Search is one thing. Understanding what human utterances mean is another. Bitext’s proprietary technology delivers understanding. Bitext has created an easy to scale and multilingual Deep Linguistic Analysis or DLA platform. Our technology reduces costs and increases user satisfaction in voice applications or customer service applications. I see it as a major breakthrough in the state of the art.

If he is right, the Bitext DLA platform may be one of the next big things in technology. The reason? As smart software becomes more widely adopted, the need to make sense of data and text in different languages becomes increasingly important. Bitext may be the digital differential that makes the smart applications run the way users expect them to.

Snap In Bitext DLA

Advanced technology like Bitext’s often comes with a hidden cost. The advanced system works well in a demonstration or a controlled environment. When that system has to be integrated into “as is” systems from other vendors or from a custom development project, difficulties can pile up. Dr. Valderrabanos asserted:

Bitext DLA provides parsing data for text enrichment for a wide range of languages, for informal and formal text and for different verticals to improve the accuracy of deep learning engines and reduce training times and data needs. Bitext works in this way with many other organizations’ systems.

When I asked him about integration, he said:

No problems. We snap in.

I am interested in Bitext’s technical methods. In the last year, he has signed deals with companies like Audi, Renault, a large mobile handset manufacturer, and an online information retrieval company.

When I thanked him for his time, he was quite polite. But he did say, “I have to get back to my desk. We have received several requests for proposals.”

Las Rosas looked quite a bit like Silicon Valley when I left the Bitext headquarters. Despite the thousands of miles separating Madrid from the US, interest in Bitext’s deep linguistic analysis is surging. Silicon Valley has its charms, and now it has a Bitext US office for what may be the fastest growing computational linguistics and text analysis system in the world. Worth watching this company I think.

For more about Bitext, navigate to the firm’s Web site at www.bitext.com.

Stephen E Arnold, April 11, 2017

Latest News

Progress: From Selling NLP to Providing NLP Services

Years ago, Progress Software owned an NLP system. I recall conversations with natural language processing wizards from Easy Ask. Larry Harris developed a natural... Read more »

December 11, 2017 | | Comment

Cricket More Popular Than Koran

In the West, we tend to think that Islamic countries spend all waking hours of the day praying, reading the Koran, and doing other religious-based activities. ... Read more »

December 11, 2017 | | Comment

Big Shock: Social Media Algorithms Are Not Your Friend

One of Facebook’s founding fathers, Sean Parker, has done a surprising about-face on the online platform that earned him billions of dollars. Parker has begun... Read more »

December 11, 2017 | | Comment

Cloud Computing Resources: Cost Analysis for Machine Learning

Information about the cost of performing a specific task in a cloud computing set up can be tough to get. Reliable cross platform, apples-to-apples cost analyses... Read more »

December 8, 2017 | | Comment

Twitter Changes API Offerings and Invites Trouble

Twitter has beefed up its API offerings to users, but it comes with an increasing price tag. While that is not a huge issue for many people, it will invite some... Read more »

December 8, 2017 | | Comment

Personalizing a Chromebook Search Takes Some Elbow Grease

Chromebooks are a great laptop and cost a fraction of the price of an Apple or a Microsoft PC.  There is a learning curve for new users to Chromebooks, because... Read more »

December 8, 2017 | | Comment

Google and Amping the Pressure in the Ad Fire Hose

Screen real estate for mobile devices is limited. The number of queries on desktop boat anchor computers has flat lined, even for “real” researchers. What’s... Read more »

December 7, 2017 | | Comment

Some Think the Time Has Come for Government Regulation of Social Media

In this era of fake news and data hacking, some people think it’s time for the government to step in and help. As the stakes get higher, commentators think that... Read more »

December 7, 2017 | | Comment

Neural Network Revamps Search for Research

Research is a pain, especially when you have to slog through millions of results to find specific and accurate results.  It takes time and lot of reading, but neural... Read more »

December 7, 2017 | | Comment

Editorial Excitement: Will Facebook and Google Struggle with Social Responsibility

I noted three interesting news items. Both cast high profile companies as arbiters of social responsibility. The first item is “Facebook Is Banning Women for Calling... Read more »

December 6, 2017 | | Comment


  • Archives

  • Recent Posts

  • Meta