Progress: From Selling NLP to Providing NLP Services

December 11, 2017

Years ago, Progress Software owned an NLP system. I recall conversations with natural language processing wizards from EasyAsk. Larry Harris developed a natural language system in 1999 or 2000. Progress purchased EasyAsk in 2005 if memory serves. I interviewed Craig Bassin in 2010 as part of my Search Wizards Speak series.

The recollection I have is that Progress divested itself of EasyAsk in order to focus on enterprise applications other than NLP. No big deal. Software companies are bought and sold every day.

However, what makes this recollection interesting to me is the information in “Beyond NLP: 8 Challenges to Building a Chatbot.” Progress went from a software company that owned an NLP system to a company that is advising people like me how challenging a chatbot system can be to build and make work. (I noted that the Wikipedia entry for Progress does not mention the EasyAsk acquisition and subsequent de-acquisition. Either small potatoes or a milestone best jumped over, I assume.)

Presumably it is easier to advise and get paid to implement than to fund and refine an NLP system like EasyAsk. If you are not familiar with EasyAsk, the company positions itself in eCommerce site search with its “cognitive eCommerce” technology. EasyAsk’s capabilities include voice enabled natural language mobile search. This strikes me as a capability which is similar to that of a chatbot as I understand the concept.

History is history, one of my high school teachers once observed. Let’s move on.

What are the eight challenges to standing up a chatbot which sort of works? Here they are:

  1. The chat interface
  2. NLP
  3. The “context” of the bot
  4. Loops, splits, and recursions
  5. Integration with legacy systems
  6. Analytics
  7. Handoffs
  8. Character, tone, and persona.

As I review this list, I note that I have to decide whether to talk to a chatbot or type into a box so a “customer care representative” can assist me. The “representative,” I assume, is a smart software robot.

I also notice that the bot has to have context. Think of a car dealer and the potential customer. The bot has to know that I want to buy a car. Seems obvious. But okay.

“Loops, splits, and recursions.” Frankly I have no idea what this means. I know that chatbot centric companies use jargon. I assume that this means “programming” so the NLP system returns a semi-on point answer.

Integration with legacy systems and handoffs seem to be similar to me. I would just call these two steps “integration” and be done with it.

The “character, tone, and persona” seems to apply to how the chatbot sounds; for example, the nasty, imperious tone of a Kroger automated check out system.

Net net: Progress is in the business of selling advisory and engineering services. The reason, in my opinion, is that Progress could not crack the code to make search and retrieval generate expected payoffs. For Progress, as for some Convera executives, selling search related services was a more attractive path.

Stephen E Arnold, December 11, 2017

Filtered Content: Tactical Differences between Dow Jones and Thomson Reuters

December 5, 2017

You may know that Dow Jones has an online search company. The firm is called Factiva, and it is an old-school approach to finding information. The company recently announced a deal with an outfit called Curation. Founded by a former newspaper professional, Curation uses mostly humans to assemble reports on hot topics. Factiva is reselling these services and advertising for customers in the Wall Street Journal. Key point: This is mostly a manual method. The approach is more in line with the types of “reports” available from blue chip consulting firms.

You may also know that Thomson Reuters has been rolling out machine curated reports. These have many different product names. Thomson Reuters has a large number of companies and brands. Not surprisingly, Thomson’s approach has to apply to many companies managed by executives who compete with regular competitors like Dow Jones but also among themselves. Darwin would have loved Thomson Reuters. The point is that Thomson Reuters’ approach relies on “smart” software.

You can read about Dow Jones’ play here.

You can read about Thomson Reuters’ play here.

My take is that these two different approaches reflect the painful fact that there is no clear path forward for professional publishing companies. In order to make money from electronic information, two of the major players are still experimenting. The digital revolution began, what?, about 40 years ago.

One would have thought that leading companies like Dow Jones and Thomson Reuters would have moved beyond the experimental stage and into cash cow land.

Not yet, it seems. The reason for my pointing out these two different approaches is that there are more innovative methods available. For snapshots of companies which move beyond the Factiva and Thomson methods, watch Dark Cyber. A new program is available every Tuesday via YouTube at this link.

Stephen E Arnold, December 5, 2017

A Tale of Two Seattle Outfits: One Zippy, One Not So Zippy

December 1, 2017

I read “Microsoft Corporation Stacks the Deck Against AWS with Azure Stack.” The main idea from my point of view is:

Piper Jaffray analyst Alex Zukin said in a note this week that he believes Azure Stack will play a major role in the growth of Microsoft’s cloud business. He describes Azure Stack as “the first hybrid cloud platform with a direct connection to a pure hyperscale cloud,” which enables developers to “write once and use anywhere.”

Maybe so. I noted that Amazon is democratizing smart software with Sagemaker. (Hopefully it will do better than the company which used the name in the 1990s.) Also, Amazon is nosing into “real time” translation.

Amazon strikes me as having a better business model, more innovative consumer and enterprise products, and richer sustainable revenue streams.

Oh, Microsoft is going to do games which, I assume, someone will play on the wonky Surface desktop computer.

Stephen E Arnold, December 1, 2017

Mitsubishi: Careless Salarymen or Spreadsheet Fever?

November 27, 2017

I read “Mitsubishi Materials Says Over 200 Customers Could be Affected by Data Falsification.” The source of the story is Thomson Reuters, a real news outfit, in my opinion.

The main point of the story is to reveal that allegedly false data were used to obfuscate the fact that 200 customers may have parts which do not meet requirements for load bearing, safety, or durability.

When I was in college, I worked in the Keystone Steel & Wire Company’s mill in Illinois. I learned that the superintendent enforced ongoing checks for steel grades. I learned that there is a big difference between the melt used for coat hanger wire and the melt for more robust austenitic steel. Think weapons or nuclear reactor components made of coat hanger steel.

Mislabeling industrial components is dangerous. Planes can fall from the sky. Bridges can collapse. Nuclear powered submarines can explode. Or back flipping robots can crush Softbank/Boston Dynamics cheerleaders and an awed kindergarten class.

Reuters calls this a “quality assurance and compliance scandal.” That’s a nicer way to explain the risks of fake data, but not even Reuters’ olive oil based soft soap can disguise the fact that distortion is not confined to bogus information in intelligence agency blog posts.

Online credibility is a single tile in a larger mosaic of what once was assumed to be the norm: Ethical behavior.

Without common values regarding what’s accurate and what’s fake, the real world and its online corollary are little more than a video game or a Hollywood comic book film.

Silicon Valley mavens chatter about smart software which will recognize fake news. How is that working out? Now about the crashworthiness of the 2018 automobiles?

I think the problem is salarymen, their bosses, and twiddling with outputs from databases and Excel in order to make the numbers “flow.”

Stephen E Arnold, November 27, 2017

Amazon: The New Old AT&T

November 22, 2017

I read “AWS Launches a Secret Region for the U.S. Intelligence Community.” The write up does a reasonable job of explaining that Amazon has become a feisty pup among the Big Dogs in the upscale Potomac Fever Kennels.

The main idea, as I understand it, is that Amazon is offering online services tailored to agencies with requirements for extra security. Google is trying to play in this dog park as well, but Amazon seems to have the moxie to make headway.

I would point out that there are some facets to the story which a “real” journalist or a curious investor may want to explore; specifically:

  • AT&T of Ashburn fame may be feeling that the attitude of the Amazon youthful puppy AWS is bad news. AT&T with its attention focused on the bright lights of big media may be unable to deal with Amazon’s speed, agility, and reflexes. If this is accurate, this seemingly innocuous announcement with terms like “air gap” may presage a change in the fortunes of AT&T.
  • IBM Federal Systems, the traffic disaster in Gaithersburg, may feel the pinch as well. What happens if the young pup begins to take kibble from that Beltway player? A few acquisitions here and a few acquisitions there, and suddenly Amazon can have its way because the others in the kennel know that an alpha dog with tech savvy can be a problem.
  • The consulting environment may also change. For decades, outfits like my former employer, the Boozer, have geared up to bathe, groom, and keep healthy the old school online giants like AT&T, Verizon, et al. Now new skill sets may be required for the possible Big Dog. Where will Amazon “experts” come from? Like right now, gentle reader.

In short, this article states facts. But like many “real” news stories, there are deeper and possibly quite significant changes taking place. I wonder if anyone cares about these downstream changes.

Leftover telecom turkey anyone?

Stephen E Arnold, November 22, 2017

Google Relevance: A Light Bulb Flickers

November 20, 2017

The Wall Street Journal published “Google Has Chosen an Answer for You. It’s Often Wrong” on November 17, 2017. The story is online, but you have to pay money to read it. I gave up on the WSJ’s online service years ago because at each renewal cycle, the WSJ kills my account. Pretty annoying because the pivot of the WSJ write up about Google implies that Google does not do information the way “real” news organizations do. Google does not annoy me the way “real” news outfits handle their online services.

For me, the WSJ is a collection of folks who find themselves looking at the exhaust pipes of the Google Hellcat. A source for a story like “Google Has Chosen an Answer for You. It’s Often Wrong” is a search engine optimization expert. Now that’s a source of relevance expertise! Other useful sources are the terse posts by Googlers authorized to write vapid, cheery comments in Google’s “official” blogs. The guts of Google’s technology are described in wonky technical papers, the background and claims sections of Google’s patent documents, and systematic queries run against Google’s multiple content indexes over time. A few random queries do not reveal the shape of the Googzilla in my experience. Toss in a lack of understanding about how Google’s algorithms work and their baked in biases, and you get a write up that slips on a banana peel of the imperative to generate advertising revenue.

I found the write up interesting for three reasons:

  1. Unusual topic. Real journalists rarely address the question of relevance in ad-supported online services from a solid knowledge base. But today everyone is an expert in search. Just ask any millennial, please. Jonathan Edwards had less conviction about his beliefs than a person skilled in locating a pizza joint on a Google Map.
  2. SEO is an authority. SEO (search engine optimization) experts have done more to undermine relevance online than any other group. The one exception is the teams who have to find ways to generate clicks from advertisers who want to shove money into the Google slot machine in the hopes of an online traffic pay day. Using SEO experts’ data as evidence grinds against my belief that old fashioned virtues like editorial policies, selectivity, comprehensive indexing, and a bear hug applied to precision and recall calculations are helpful when discussing relevance, accuracy, and provenance.
  3. You don’t know what you don’t know. The presentation of the problems of converting a query into a correct answer reminds me of the many discussions I have had over the years with search engine developers. Natural language processing is tricky. Don’t believe me? Grab your copy of Gramatica didactica del espanol and check out the “rules” for el complemento circunstancial. Online systems struggle with what seems obvious to a reasonably informed human, but toss in multiple languages for automated question answering, and “Houston, we have a problem” echoes.
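
For readers who have not bear hugged precision and recall lately, here is a minimal sketch of the two calculations mentioned in the second point. It assumes a single query for which the relevant documents are already known. The function name and the document IDs are hypothetical, invented for illustration; nothing here comes from Google, the WSJ, or any search vendor.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # documents both retrieved and relevant

    # Guard against division by zero when either set is empty.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical data: the system returned four documents for a query;
# a human judge says three documents in the collection are relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d7"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case, half of what came back was on point, and one relevant document in three never surfaced. The catch is that someone has to judge relevance by hand before the arithmetic means anything.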

I urge you to read the original WSJ article yourself. You decide how bad the situation is at ad-supported online search services, big time “real” news organizations, and among clueless users who believe that what’s online is, by golly, the truth dusted in accuracy and frosted with rightness.

Humans often take the path of least resistance; therefore, performing high school term paper research is a task left to an ad supported online search system. “Hey, the game is on, and I have to check my Facebook” takes precedence over analytic thought. But there is a free lunch, right?

In my opinion, this particular article fits in the category of dead tree media envy. I find it amusing that the WSJ is irritated that Google search results may not be relevant or accurate. There’s 20 years of search evolution under Googzilla’s scales, gentle reader. The good old days of the juiced up CLEVER methods and Backrub’s old fashioned ideas about relevance are long gone.

I spoke with one of the earlier Googlers in 1999 at a now defunct (thank goodness) search engine conference. As I recall, that confident and young Google wizard told me in a supercilious way that truncation was “something Google would never do.”

What? Huh?

Guess what? Google introduced truncation because it was a required method to deliver features like classification of content. Mr. Page’s comment to me in 1999 and the subsequent embrace of truncation make clear that Google was willing to make changes to increase its ability to capture the clicks of users. Kicking truncation to the curb and then digging through the gutter trash told me two things: [a] Google could change its mind for the sake of expediency prior to its IPO and [b] Google could say one thing and happily do another.

I thought that Google would sail into accuracy and relevance storms almost 20 years ago. Today Googzilla may be facing its own Ice Age. Articles like the one in the WSJ are just belated harbingers of push back against a commercial company that now has to conform to “standards” for accuracy, comprehensiveness, and relevance.

Hey, Google sells ads. Algorithmic methods refined over the last two decades make that process slick and useful. Selling ads does not pivot on investing money in identifying valid sources and the provenance of “facts.” Not even the WSJ article probes too deeply into the SEO experts’ assertions and survey data.

I assume I should be pleased that the WSJ has finally realized that algorithms integrated with online advertising generate a number of problematic issues for those concerned with factual and verifiable responses.

Google Tries to Explain How to Make Another Google

November 15, 2017

Here’s the headline which snagged my attention: “How to Build the Next Google, According to a Google Executive.” In my three monographs about Google, I learned that Google was a result of several missteps and circumstances which Sergey Brin and Larry Page were able to seize upon. The exogenous factors I documented included:

  • The Clever method which IBM did nothing to commercialize
  • AltaVista’s unhappy campers who were looking for a new gig
  • Yahoo and other “search” services bumbling and portal craziness
  • An understanding university
  • A vision for making information accessible on Web servers to users with modest expectations for precision, recall, timeliness, etc.

Google was in the right place at the right time, and it was able to obtain some cash from a Silicon Valley money guru. The company’s efforts to sell itself were going nowhere until the bright idea for standing on the shoulders of GoTo, Overture, and Yahoo ignited the online ad money machine. The rest, after the 2004 settlement with Yahoo over an intellectual property issue, has become the success story MBAs love. Well, it was until Facebook came along.

The Fortune article disappointed me. The Google story was not complete in my opinion. The scalable business model referenced in the article was not Google’s. Google emulated the pay for play and perfected putting ads in front of people who used certain key words. As I stated, this was the GoTo (later Overture) revolution.

The write up reports:

The idea of changing the world isn’t at odds with making a buck, Felten (a Googler) said. In fact, the latter is usually necessary. “If you want to solve really large problems in the world, unless it’s a sustainable business, it probably won’t scale,” she said. “So, finding those things where there’s both profit and purpose is sort of our sweet spot.”

Too bad Fortune did not probe into the exogenous factors which allowed Google to generate billions. But in the world of business mythmaking and the “you can do it” advice sought by would be billionaires, cooking up tips which provide the path to success is okay.

By the way, after 20 years, what percentage of Google’s revenues comes from the GoTo, Overture, Yahoo online advertising model? Look it up, gentle reader. That means that Google itself has not been able to move beyond the Steve Ballmer analysis of a “one trick pony.” High school science projects do not seem to become scalable businesses. I admit there may be some buyers for the solution to death. But that seems to be just out of reach like Loon balloons providing comprehensive mobile service to the island of Puerto Rico.

Note to Googlers and Xooglers: Put your comments in the comments section of this blog. Don’t email me unless you have read The Google Legacy, Google Version 2, and Google: The Digital Gutenberg. Just a modest request.

Stephen E Arnold, November 15, 2017

Google Comes with an Olive Branch Because It Is Happy with What It Has and Publishers Should Be Happy Too

November 13, 2017

I read two Google items this morning (November 13, 2017). I found each interesting and useful in plotting Google’s evolution from Backrub to the behemoth it has become by selling ads.

The first item is “Google X’s Chief Business Officer Says You Can Achieve Happiness by Following One Simple Rule.” No, the rule does not mean that one does not reveal whether Google’s super secret Deep Mind is working with the GOOG’s own skunk works. The rule is, if the write up is accurate, “If you really start to appreciate what you have in your life, happiness becomes a much easier task to achieve.”

That’s good to know. I am confident that the people living in vans in Palo Alto are going to enjoy getting cleaned up at the McDonald’s much more. Hey, you can also have an Egg McMuffin after your morning ablutions.

The other article is “Google UK Chief Ronan Harris Says Digital Giant Is Not Stealing Advertising from Publishers Telling Editors: We Come in Peace.” I highlighted this passage from the story. The Googler is one Ronan Harris, who is in charge of Google in the UK:

“Every year we share billions of pounds in revenue with publishers globally. We also drove more than 10 billion clicks a month to publisher websites — for free — from Google Search and Google News.”

He allegedly added:

“And as more and more people interact with news in different ways, we need to take advantage of new digital tools and capabilities to develop new experiences and sustainable business models. We’re eager to partner with you to create them. To work with you to tackle the challenges head on, because having a healthy media ecosystem is crucial to your business, to ours and to society.”

Yep, Google comes in peace to those who have spent 40 days and nights wandering in the wilderness. Let’s party, friends!

Stephen E Arnold, November 13, 2017

A Clever Take on Google and Fake News

November 8, 2017

I noted this story in the UK online publication The Register: “Google on Flooding the Internet with Fake News: Leave Us Alone. We’re Trying Really Hard. Sob.” The write up points out:

Google has responded in greater depth after it actively promoted fake news about Sunday’s Texas murder-suicide gunman by… behaving like a spoilt kid.

The Google response, as presented in the write up, warranted a yellow circle from my trusty highlighter. The Register said:

Having had time to reflect on the issue, the Silicon Valley monster’s “public liaison for search” and former Search Engine Land blog editor Danny Sullivan gave a more, um, considered response in a series of tweets. “Bottom line: we want to show authoritative information. Much internal talk yesterday on how to improve tweets in search; more will happen,” he promised, before noting that the completely bogus information had only appeared “briefly.”

The Register story includes other gems from the search engine optimization expert who seems to thrive on precision and relevance for content unrelated to a user’s query; for example, the article presents some “quotes” from Mr. Sullivan, the expert in charge of explaining the hows and whys of fake news:

  • “Early changes put in place after Las Vegas shootings seemed to help with Texas. Incorrect rumors about some suspects didn’t get in…”
  • “Right now, we haven’t made any immediate decisions. We’ll be taking some time to test changes and have more discussions.”
  • “Not just talk. Google made changes to Top Stories and is still improving those. We’ll do same with tweets. We want to get this right.”

Yep, Google wants to do better. Now Google wants to get “this” right. Okay. After 20 years, dealing with fake content, spoofs, and algorithmic vulnerability is on the to do list. That’s encouraging.

For more Google explanations, check out the Register’s story and follow the logic of the SEO wizard who now has to explain fake news creeping—well, more like flowing—into Google’s search and news content.

Does an inability to deal with fake news hint at truthiness challenges at Googzilla’s money machine? Interesting question from my point of view.

Stephen E Arnold, November 8, 2017

Free Services: What Happens When They Are Killed Off?

November 3, 2017

In the salad days of online, one paid for “time” (the online connection) and one paid for the “content” (the citations, data, full text). Today data are free. Hooray.*

For users of the Google flight information, the news that Google was likely to shut down its flight data feed is bad news. Even worse, those nifty MBA inspired spreadsheets which happily omitted the cost of flight data are going to have to be re-imagined.

And Oath (remember Yahoo?) is, it seems, going to cut off its finance data, if the story in Hacker News is accurate. The write up states:

Yahoo Finance has apparently killed its API. Zero warning. Lots of apps probably use this. Before, you could get stock information by using http://download.finance.yahoo.com/d/quotes.csv Now, you get the following message: “It has come to our attention that this service is being used in violation of the Yahoo Terms of Service. As such, the service is being discontinued. For all future markets and equities data research, please refer to finance.yahoo.com.” What violation of TOS? People have been using this for years without any issues. If you are going to cut this off, how about a warning and heads up? Guess that’s what we should expect from OATH / Verizon.

The comments are interesting.

Net net: The online model from the 1969 to 1995 phase of online may be poking its nose from a Rip Van Winkle snooze.

And those spreadsheets? MBAs are crafty. The numbers will work out—at least in Excel. In real life? Hmmm. Good question.

Stephen E Arnold, November 3, 2017

* Editor’s update: Heads up. Last night (November 3, 2017) I received an impassioned and mom-like communication from a person who wanted confidentiality about the information he was about to impart via Gmail email. (Isn’t that type of email parsed by smart software for the purpose of collecting ad revenue and data?) The alleged former Googler (aka Xoogler) was unaware that I was at dinner with my wife enjoying a grilled squirrel burger with the cheese on the bottom in the approved Google manner. But this write up was an urgent matter in the mind of the agitated Xoogler eager to share confidential information with me. Lucky me! The email included numbers and a statement that I had to rewrite this article because I was, as I have noted on numerous occasions in the course of this 10 year old Beyond Search blog, an “addled goose”. The email made clear that killing Google services and products does no harm, and I was wrong, incorrect, off base, and a Bambi brained deer. Please, check out the source story from Marketwatch. Make up your own mind, gentle reader, because I try to present my opinion whilst separating the giblets from the goosefeathers. My view is that abrupt, unilateral modification of services is a good thing for some developers and users. But I do enjoy confidential communications about the inner workings of my favorite search engine as I munch my burger with cheese on the bottom in the Sundar Pichai approved manner. Plus, I enjoy recalling the Google Reader, Google Talk, Google Health, Knol, Google Buzz, and my favorite and the fave of some Brazilians, Orkut. You don’t? Well, you, unlike me, are not trying to be Googley. To refresh your memory, check out the Google Graveyard. Do you have a problem with terminated services? In my opinion, termination with extreme prejudice is in your best interests. Now put the cheese on the bottom of the meat patty.
