Bing Engineers Serendipity, Not Just Irrelevant Results
May 5, 2018
It is Saturday. Innovation in search never rests. I read “Bing: Search Engines Have a Responsibility to Get People Out of Their Bubbles.” The headline is one guaranteed to give me a headache.
My view is that when I use a search system I expect, want, and need the system to:
- Process my keyword query, accept Boolean logic (AND, OR, and NOT arguments), and generate a list of results that optimize relevance.
- If I need more results, synonyms, Endeca-like “facets”, I want a button or a menu option that allows me to specify what I think I need to get the information I seek.
- I want to have ads, sponsored content, and SEO skewed content flagged in a color which is easily visible and put within a ruled “box.”
- I want to know [a] the date at which the displayed result was indexed, [b] the date assigned by whoever wrote the item to the specific article, and [c] an explicit link to a cache in the event the page indexed has been removed or is otherwise unavailable.
I have other requirements for a commercial search system; for example, Diffeo’s or Recorded Future’s approach. But these are specialized and inappropriate for a Bing style Web index.
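The keyword-plus-Boolean requirement at the top of that list can be sketched in a few lines of Python. This is only a minimal illustration, assuming a toy in-memory inverted index. The sample documents, the names `build_index` and `boolean_search`, and the parameters `must`, `any_of`, and `must_not` are hypothetical, invented for this sketch. It does not describe how Bing or any production engine works.

```python
from collections import defaultdict

# Hypothetical toy corpus; IDs and text are invented for illustration only.
DOCS = {
    1: "loss of coolant accident at a nuclear power plant",
    2: "loca restaurant reviews and coffee menu",
    3: "nuclear reactor coolant pump maintenance",
    4: "coffee health study",
}


def build_index(docs):
    """Map each lowercase term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_search(index, all_ids, must=(), any_of=(), must_not=()):
    """Return sorted IDs that contain every `must` term (AND), at least one
    `any_of` term when any are given (OR), and no `must_not` term (NOT)."""
    hits = set(all_ids)
    for term in must:
        hits &= index.get(term.lower(), set())
    if any_of:
        either = set()
        for term in any_of:
            either |= index.get(term.lower(), set())
        hits &= either
    for term in must_not:
        hits -= index.get(term.lower(), set())
    return sorted(hits)


INDEX = build_index(DOCS)

# coolant AND (nuclear OR accident) NOT pump
print(boolean_search(INDEX, DOCS, must=["coolant"],
                     any_of=["nuclear", "accident"], must_not=["pump"]))
```

In this sketch, the query coolant AND (nuclear OR accident) NOT pump returns only document 1, the loss of coolant accident item. The system does not guess at a “question”; the user states what is wanted and what is not.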
The Bing approach, according to the write up:
Bing has launched a new feature called Intelligent Answers. When you enter a question with several valid answers, the search engine summarizes them all in a carousel to give a balanced overview.
I don’t want answers. I want a list of relevant locations which may contain the information I seek. For example, I needed to identify the term for a penance device and access images of these gizmos. Bing, Google, Yandex, and even lesser known systems like iSeek.com and Qwant.com failed.
The systems returned everything from a church calendar to a correction of penance to pennant. I did not want baseball information.
Now Bing is going to identify from my query my “question” and provide a range of answers. I don’t want this to happen. If I search LOCA, I want information about a loss of coolant accident, not this:
I am happy to add a field code for power, nuclear if such a feature were supported by Bing. I would also add key words to get something close to my term.
The complete and utter silliness of Bing results exists right now. The company which has managed minimal progress in search now expects me to believe that its “smart software” can provide answers.
The write up states:
“Take a simple query like ‘Is coffee good for you?’” said Ribas. “There are plenty of reputable sources that tell you that there are good reasons for drinking coffee, but there are also some very reputable ones that say the opposite. Deep learning allows us to project multiple queries in the passages to what we call the semantic space and find the matches.”
Based on my limited experience with whizzy 2018 search technology, I am not sure if Bing’s innovation will be helpful to me. Where “semantic space” is concerned, the systems with which I am familiar provide a number of other tools and functions to ensure relevance and accuracy.
Even with those tools, including state of the art systems from developers from Madrid to San Carlos, the user has to think, analyze, and run additional queries. Phone calls, interviews, and even visits to libraries are often required to obtain helpful information.
Bing promises “intelligent answers.”
Sounds like MBA infused marketing with a few notes added from engineers with better things to do than explain exactly what a content processing component can do with 80 percent accuracy.
Time out. The referee wants the coach to get the MBA marketers off the field for intellectual fantasizing. This is the same outfit which owns Fast Search & Transfer, created the racist chatbot, and missed the mobile phone business by a country mile. Why not ask Bing a question like, “How did these missteps occur?” Perhaps Watson would be able to take a crack at “intelligent answers”?
Stephen E Arnold, May 5, 2018
You Are Not Missing the Boat. You Cannot Buy a Ticket.
May 4, 2018
I read “New Technology Widening Gap Between World’s Biggest and Smallest Businesses.” The idea is that if one has money, that individual gets the good stuff. On food stamps? No iPhone X for you.
Applied to business, the argument means that a local lawn service has zero chance to compete with the landscaping service maintaining the US government’s Camp David.
The write up asserts:
Companies investing in robotics, among other digital technologies, are seeing productivity and profits increase, but the cost involved risks creating an even wider gap between the world’s top companies and their smaller rivals, new research shows.
If the argument were substantive, a small start up would have zero chance to survive. Why? The big companies win. The little outfits lose.
Access to technology, even in countries with constrained citizens, is visible. I have not visited every country in the world, but I have been in more than a handful.
The barrier is not money. The hurdles are usually knowledge centric. Bad decisions at big companies can neutralize technology. The Cambridge Analytica matter illustrates the importance of knowing what to do. Get it wrong and the company suffers.
Technology is a tool and an enabler. Technology is not an automatic slam dunk just because a company is big and has money. The ingredients for success are information, timing, judgment, and luck.
The big versus small argument, if true, would mean that large publishers would dominate information. We know that is not the case.
Therefore, grousing about the unfairness of big versus small does not work for me. However, if one cannot buy a ticket, one cannot get on the boat.
Envy or technology? I go with envy.
Stephen E Arnold, May 4, 2018
Critique of IBM Watson: Complain, Complain, Complain
May 4, 2018
I read “The Fraudulent Claims Made by IBM about Watson and AI.” Harsh. As my late grandfather said when my grandmother told him to take off his boots, “Complain, complain, complain. That’s all you do, Maud.”
The write up takes issue with IBM’s claim that Watson does “cognitive computing.” I am not sure what cognitive computing means because most of the Fancy Dan artificial intelligence infused systems I have seen are works in progress. Sure, if one does not know about the hassles of defining a domain, assembling a corpus, figuring out which of the AI building blocks to use from one’s computer science classes, and then fixing up the system so it generates 80 percent accuracy most of the time, then AI systems look pretty slick.
The problem for companies in the software game is that generating revenues is usually somewhat easier with the right selection of buzzwords, some marketing magic, and a trend that strikes fear into the hearts and minds of the potential customer.
I learned from the write up:
I [Roger Schank] invented a field called Case Based Reasoning in the 80’s which was meant to enable computers to compare new situations to old ones and then modify what the computer knew as a result. We were able to build some useful systems. And we learned a lot about human learning. Did I think we had created computers that were now going to outthink people or soon become conscious? Of course not. I thought we had begun to create computers that would be more useful to people. It would be nice if IBM would tone down the hype and let people know what Watson can actually do and stop making up nonsense about love fading and out thinking cancer. IBM is simply lying now and they need to stop. AI winter is coming soon.
I like the AI winter part.
Is artificial intelligence a field which deserves the “Complain, complain, complain” refrain?
What is interesting to me is the number of companies in the search and retrieval game now pitching their smart software. The idea is that the search system “knows” what the user wants.
Why not wear one of these cilices under your shirt when you attend an artificial intelligence conference?
Frankly, I don’t want a search and retrieval system to be smart. I want a system which returns relevant results for my keyword centric, Boolean query. No software “knows” what I need for my research.
Example: I recalled that 17th century clerics in Spain often starved themselves in order to experience religious visions. I could not recall the word one uses to describe the “vest” of sharp wire some of these individuals wore to enhance their suffering. I tried Bing. I tried Google. I tried Yandex. Finally I changed my angle of attack and poked around for redemptive suffering. Bingo. I saw a reference to cilice, and I remembered the word.
Conclusion: I will be long gone before smart software can anticipate what I need and refine my search so that I can locate the information I only vaguely remember.
Can Watson help me? Not yet. I know one thing. The craziness of the AI marketers is the 2018 equivalent of a cilice. Those gizmos are painful and make it easier to perceive the reality of software.
Stephen E Arnold, May 4, 2018
Web Archives
May 4, 2018
Short honk: Here’s a list of Web archives. These services allow one to find pages from a defunct or unavailable Web site or page:
- Archive Today at http://archive.is/
- Internet Archive at http://archive.org/web/
- Perma.cc at https://perma.cc/ (which is a collection of permalinks)
The Beyond Search goose has learned to generate a PDF of information. Quite a bit of content “disappearing” is taking place. To cite one example: Try to locate the list of MIC, RAC, and ZPIC vendors once engaged in locating health care billing fraud and similar misunderstandings. Enjoy your hunt for these items of information.
The source article was “Force Archive Websites to Pick up Webpages with This Handy Tool.”
Stephen E Arnold, May 4, 2018
Tech Giants Playing Hardball or Shadow Boxing?
May 3, 2018
I don’t have a dog in this very confused kennel. “Tech Giants Hit by NSA Spying Slam Encryption Backdoors.” I must admit that I had to read the headline twice. I think the “real news” outfit ZDNet is stating that the US government is spying. Therefore, the “tech giants” want to make it more difficult for the US government to access messages of “tech giants’” customers.
I may be wrong, but “hit,” “slam”, and “backdoors” are words that suggest the US government is a pretty bad outfit.
Okay, what does the “real news” outfit assert? I noted this passage:
A coalition of Silicon Valley tech giants has doubled down on its criticism of encryption backdoors following a proposal that would give law enforcement access to locked and encrypted devices.
I interpret this statement as “tech giants” refusing to help the US government access encrypted, obfuscated, or otherwise secret content generated, housed, or stored on the giants’ systems.
The problem is that I noted these two developments in the last week or so:
- First, Amazon and Google are taking steps to prevent Signal from using these tech giants’ systems as a way to sidestep certain blocking actions. The spoof is up, if Amazon and Google follow through with their anti-Signal message.
- Second, Facebook witnessed the departure of an advocate of strong encryption. The individual wanted to beef up encryption, and someone in charge of WhatsApp wanted looser encryption.
These two examples suggest that not all tech giants are hitting back at the US government. On the contrary, I could easily interpret these actions this way:
- Amazon wants to become a player in policeware. The Signal move could be similar to one’s high school dreamboat fluttering her / his eyes at a potential prom date.
- The Facebook move could be interpreted as the equivalent of Mark Zuckerberg donning a barbed wire or hair shirt to demonstrate his willingness to wear a digital cilice to atone for his alleged data sins.
Could there be cooperation among tech giants and the US government when certain issues such as national security come into play?
What do you think? Hard ball or shadow boxing? Getting hit by a 90 mile per hour pitch can hurt. Getting nailed by a shadow is comparatively tame.
Net net: I am not sure I buy into the “hit back” argument.
Stephen E Arnold, May 3, 2018
Google: Innovation Desperation?
May 3, 2018
I have lost track of the ways Google tries to spark innovation. Years ago there was something called Google Ventures and before that “20 percent free time.” Today Google has demonstrated its hunger, need, and thirst for innovation by creating an investment mechanism for the Alexa killer, Google Assistant. “Google Starts Throwing Cash at Google Assistant Startups” explains:
Google is launching a new investment program for early-stage startups working to broaden Google Assistant hardware or features. The new program provides financial resources, early access to Google features and tools, access to the Google Cloud Platform, and promotional support in efforts to bolster young companies. Google says its investment program will also support startups focusing on Google Assistant‘s use in travel, hospitality, or games industries.
Like Apple, Google is watching the Alexa McLaren eat up the miles. I know it is silly to compare Amazon, Apple, and Google. Amazon sells books and plans to become a policeware hub. Apple sells hardware and wants to be a services vendor as iPhone X devices provide evidence that peak mobile phone day has arrived. And Google? After 20 years of trying to be different, it still sells online ads.
The fix is to pay “entrepreneurs,” high school students, MBAs, and homeless FORTRAN programmers to build and expand the Google Assistant ecosystem.
Will the play work? My thought is that Google looks a bit wild eyed with its innovation efforts.
Perhaps it is true that I am worn out by Silicon Valley gyrations. Google, according to the write up, has “passion for the digital assistant ecosystem.”
That’s a plus.
But after 20 years of innovation, Google remains, as Steve Ballmer observed, a one trick pony. Throwing money at the pony is a long shot to change the beast into something different.
Worth watching the transformation attempt, however.
Stephen E Arnold, May 3, 2018
Emerdata: A Rose by Any Other Name Would Smell As Sweet
May 3, 2018
In August 2017, Emerdata came into being. CarolineO tweeted:
Say bye to Cambridge Analytica and hello to Emerdata, the data firm established in August by SCL/Cambridge Analytica executives. Rebekah Mercer and her sister are directors of Emerdata. Emerdata shares an address with Cambridge Analytica’s UK office.
Tweets are “real” news when messages originate with @RVAwonk.
The point is that Cambridge Analytica seems to have begun the process of disappearing: Bankruptcy in the US and shuttered offices.
The Register, a UK online publication, reported:
Cambridge Analytica dismantled for good? Nope: It just changed its name to Emerdata.
The death of a brand triggered a recollection from a novelist who was okay with a different identity. Here’s the quote:
“It is never too late to be what you might have been.” —George Eliot
George Eliot was not a cricket playing, boater wearing gadfly. Nope, George was Mary Anne Evans. She / he wrote some darned exciting novels; for example, that page turner Mill on the Floss and my fave Daniel Deronda.
The idea was that anyone who knew Mary Anne would never realize she was really a he.
Is there a lesson for Cambridge Analytica? Sorry, I meant Emerdata? No one will ever know that Facebook fan boy and customer of a certain Cambridge lecturer were one and the same.
I am completely fooled. I assume you are too.
MBA speak labels the move “repositioning” or “rebranding.” I go with the moniker a George Eliot move.
As George / Mary observed:
Our deeds are like children that are born to us; they live and act apart from our own will. Nay, children may be strangled, but deeds never: they have an indestructible life both in and out of our consciousness.
Nothing like deeds.
Stephen E Arnold, May 3, 2018
Cambridge Analytica: Those Greek Tragedians Understood Bad Decisions
May 2, 2018
Here in Harrod’s Creek, the echoes of the khoros are sometimes audible. For example, we heard that Cambridge Analytica, the zippy data outfit, has folded its tents. The firm’s offices in the US and elsewhere are shuttered or in the process of turning out the lights and unplugging from the interwebs.
We noted this statement in Gizmodo:
The news was announced during a conference call led by Julian Wheatland, the current chairman of the SCL Group who was reportedly tapped to take over as Cambridge Analytica’s next CEO. Both companies will now close their doors. During the call, Wheatland said that the board determined that rebranding the company’s current offerings in the current environment is “futile.”
In the online information the Beyond Search team has been reviewing, there was no reference to this statement, allegedly crafted by Sophocles:
I would prefer even to fail with honor than win by cheating.
CNBC reported:
The firm is shuttering in part due to mounting legal fees associated with its investigation into whether there had been any wrongdoing with regard to Facebook data, according to the Wall Street Journal who first reported the shuttering.
That “real” news report did not include this statement, allegedly penned by Euripides:
Cleverness is not wisdom.
Will other clever individuals bump into Greek truisms?
The odds in Harrod’s Creek that trouble will be coming to some social media outfits are the same as those on Justify, the favorite in the Kentucky Derby.
Race tracks do need workers to clean their Augean stables. The work is less sporty than crunching data of mysterious origins, but it pays because horses are often more reliable than some humans. Plus there is a party for workers after the race on Saturday.
Stephen E Arnold, May 2, 2018
Amazon Dumps Domain Fronting
May 2, 2018
Short honk: Beyond Search noted this article: “Amazon Closes Anti-Censorship Loophole on Its Servers.” The main idea is that URLs can be obfuscated. The purposes of domain fronting range from simplifying traffic flow for users or systems to permitting a VPN type function without using a user installed VPN. The write up points out:
Amazon Web Services (AWS) is cracking down on domain fronting, a practice that some folks use to get round state-level internet censorship of the likes seen in China and Russia (among other countries).
A couple of points:
- Facebook has taken some steps to make secret communications less secret. The founder of WhatsApp (which Facebook acquired) has apparently quit over this privacy affecting change.
- Google stopped supporting domain fronting.
- A number of countries have taken steps to crack down on messaging which cannot be decrypted.
But the Amazon change is more interesting for the Beyond Search team. Is it possible that Amazon is streamlining its systems in order to create a new service platform?
Our colleagues who work on the DarkCyber news program have raised this possibility.
Is Amazon ready to reveal its next big thing? The success of that next big thing may pivot on becoming more government centric. Could that be happening to everyone’s favorite digital Wal-Mart?
Worth monitoring or attending my lecture about the possible Amazon play at the Telestrategies ISS conference in Prague in about three weeks. I will be taking a look at what’s called cross correlation. More information about that is located at this Wolfram Mathworld link.
Stephen E Arnold, May 2, 2018
Houston, We May Want to Do Fake News
May 2, 2018
The fake news phenomenon might be in the public eye more, thanks to endless warnings and news stories; however, that has not dulled its impact. In fact, this shadowy form of propaganda seems to flourish under the spotlight, according to a recent ScienceNews story, “On Twitter, The Lure of Fake News is Stronger than Truth.”
According to the research:
“Discussions of false stories tended to start from fewer original tweets, but some of those retweet chains then reached tens of thousands of users, while true news stories never spread to more than about 1,600 people. True news stories also took about six times as long as false ones to reach 1,500 people. Overall, fake news was about 70 percent more likely to be retweeted than real news.”
That’s an interesting set of data. However, anyone quick to blame spambots for this amazing proliferation of fake news needs to give it a second look. According to the research, bots are less to blame for this trend than humans. This is actually good news. Ideally, changes can be made on the personal level and we can eventually stamp out this misleading trend of fake news.
But if fake news “works”, why not use it? Not even humans can figure out what’s accurate, allegedly accurate, and sort of correct but not really. Smart software plus humans makes curation complex, slow, and costly.
That sounds about right, or does it?
Patrick Roland, May 2, 2018