Bannon Threatens Antitrust on Google and Facebook

August 15, 2017

During a time when the left and right seem further apart than ever before, an odd, unexpected leak from within the White House has emerged. According to The Atlantic,

Steve Bannon, the chief strategist to President Donald Trump, believes Facebook and Google should be regulated as public utilities, according to an anonymously sourced report in The Intercept. This means they would get treated less like a book publisher and more like a telephone company. The government would shorten their leash, treating them as privately owned firms that provide an important public service.

Previously, only the far left had voiced such opinions, which makes this one questionable. Are the motives altruistic or monetary in nature? If such a move actually were to happen, the way business is done at Google and Facebook would change drastically.

The article goes on to point out why and how Bannon’s musings on tech giants will never happen under the current administration, but regardless of one’s political leanings, the fact that antitrust and online giants are being discussed together might signal the end of an era.

Catherine Lamsfuss, August 15, 2017

Smartlogic: A Buzzword Blizzard

August 2, 2017

I read “Semantic Enhancement Server.” Interesting stuff. The technology struck me as a cross between indexing, good old enterprise search, and assorted technologies. Individuals who are shopping for an automatic indexing system (either with expensive, time consuming hand coded rules or a more Autonomy-like automatic approach) will want to kick the tires of the Smartlogic system. In addition to the echoes of the SchemaLogic approach, I noted a Thompson submachine gun firing buzzwords; for example:

best bets (I’m feeling lucky?)
dynamic summaries (like Island Software’s approach in the 1990s)
faceted search (hello, Endeca?)
navigator (like the Siderean “navigator”?)
real time
related topics (clustering like Vivisimo’s)
semantic (of course)
topic maps
topic pages (a Google report as described in US29970198481)
topic path browser (aka breadcrumbs?)

What struck me after I compiled this list about a system that “drives exceptional user search experiences” was that Smartlogic is repeating the marketing approach of traditional vendors of enterprise search. The marketing lingo and “one size fits all” triggered thoughts of Convera, Delphes, Entopia, Fast Search & Transfer, and Siderean Software, among others.

I asked myself:

Is it possible for one company’s software to perform such a remarkable array of functions in a way that is easy to implement, affordable, and scalable? There are industrial strength systems which perform many of these functions. Examples range from BAE’s intelligence system to the Palantir Gotham platform.

My hypothesis is that Smartlogic might struggle to process a real time flow of WhatsApp messages, YouTube content, and mobile phone intercept voice calls. Toss in the multi language content which is becoming increasingly important to enterprises, and the notional balloon I am floating says, “Generating buzzwords and associated over inflated expectations is really easy. Delivering high accuracy, affordable, and scalable content processing is a bit more difficult.”

Perhaps Smartlogic has cracked the content processing equivalent of the Voynich manuscript.


Will buzzwords crack the Voynich manuscript’s inscrutable text? What if Voynich is a fake? How will modern content processing systems deal with this type of content? Running some content processing tests might provide some insight into systems which possess Watson-esque capabilities.

What happened to those vendors like Convera, Delphes, Entopia, Fast Search & Transfer, and Siderean Software, among others? (Free profiles of these companies are available at …) Oh, that’s right. The reality of the marketplace did not match the companies’ assertions about technology. Investors and licensees of some of these systems were able to survive the buzzword blizzard. Some became the digital equivalent of Ötzi, the 5,300 year old iceman.

Stephen E Arnold, August 2, 2017

Google: Making Search Better. But What Does Better Mean?

July 17, 2017

I read a darned interesting (no, not remarkable, just interesting) article called “The Google Exec in Charge of Designing Search: ‘There’s Always This Internal Debate about How Much Functionality Should We Add’”. At first, I thought this was an Onion write up, but I was wrong. The article is a serious expression of the “real” Google. Now the “old” and now “unreal” Google is not applicable. That’s why I thought the write up was like the content I present in HonkinNews.

Here are the points I noted:

First, the write up points out that Google’s core business is its search engine. This surprised me because I thought the firm’s core business was selling ads. I know the “search” system is the honey which attracts the bees (95 percent or so in Europe, for example), but the “search” system is not about finding relevant and objective information. Sure, that happens for some queries, but for most queries, the searches are easy to cache and deliver with matching ads. Examples range from the weather to the latest in the dust ups and make ups between pop stars and starlets.

Second, the source of the write up is an “expert” in “design for search.” I am not sure what “design” means. I am old fashioned and prefer the trusty calculations of precision and recall, the stale bread of Boolean queries, and unfiltered content.
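
For readers who have not met precision and recall, here is a minimal sketch of how the two measures are usually computed for a single query. The document identifiers and the relevance judgments are made up for illustration; nothing here describes how Google evaluates its own results.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: the system returns four documents,
# and a human judge says three documents in the collection are relevant.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d2", "d4", "d5"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Two of the four returned documents are relevant, and two of the three relevant documents were found. A system tuned for ads or "experience" can look fine to a user and still score poorly on both numbers.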


I prefer to do my own censoring, thank you. I noted this statement:

The whole goal is to try to organize information and deliver it to you. That’s the problem we’re trying to solve. The design has to accommodate multiple people, multiple expectations, and multiple situations. When you’re looking for whatever answer you want, how do we give you the right answer in a way that you’re like ‘oh yeah, that thing’?

No, the “whole goal” consists of sub goals designed to deliver the following, based on my research for the three books in my Google Trilogy (alas, no longer in print but I can provide pre publication copies for those who want to buy a set):

  1. Minimize computational demands on the query matching system via caching frequent queries, partitioning indexes to get around the federation of disparate content like Google Scholar, videos indexed in Google Video, and the gusher of stuff emanating from Google Blogs
  2. With clicks on traditional desktops falling and small screen video queries from smart software or humans (imagine!) rising, Google has to find a way to make ads out of everything. Thus, the need to keep revenue ticking upwards while driving costs down becomes a fairly significant sub goal. Some, like myself, say, “Hey, that’s the actual goal.” Others who enjoy watching billions flood into solving death, keeping Glass alive, and building a new puffy office park would disagree. That’s okay. I think I am right.
  3. Maintain the PR and marketing offensive that makes Google the innovation leader in finding information. The methods involve generating mumbo jumbo that disconnects precision and recall from what Google generates: Results that are often off point or some type of content marketing. (I know content marketing works because the Wall Street Journal told me it does. I assume that’s why Google pays some people to write really rah rah articles about Google. As I said in this week’s HonkinNews, “One must be able to tell the difference between a saint who helps people and a billionaire who rides flying car things.”)

The write up identifies the experience “things” which Google is incorporating into its search results. Some of these are content objects like tweets. Others are pages which look like mini reports which cobble together “facts” to make it easy for a person to “know” the answer to the question he, she, or a software module had not yet asked. (Predictive results are part of the pervasive search movement in which Google wants to be a player who gets the biggest payday and the most media love.)

I noted this statement which is worthy of one of the New Age types I bumped into when I lived in Berkeley:

When asked if there are any similarities between the design for Search and the design for Google’s new offices in Mountain View and London, Ouilhet pointed to the fact that both are becoming “more open and more flexible.” He said they were also both becoming more “inclusive between people that belong to Google and people that don’t belong to Google.”

Net net: Google has yet to find Act 2 to its Yahoo/Overture/GoTo inspired business model. Setting up more VC operations, incubators, and buying companies in easy to reach places like Bengaluru, Karnataka, and smart software offices in cheery Edmonton, Alberta are not yet delivering on Act 2. If the European Union has anything to say about Google’s search business, we will have to wait for more action from that Google watcher Margrethe Vestager.

Stephen E Arnold, July 17, 2017

PS. For information about the Google Trilogy, write benkent2020 at yahoo dot com and put Google Trilogy in the Subject field.

Women in Tech Want Your Opinion on Feminism and Other Falsehoods Programmers Believe

July 14, 2017

The collection of articles on Github titled Awesome Falsehood dives into some of the strange myths and errors believed by tech gnomes and the issues that they can create. For starters, falsehoods about names. Perhaps you have encountered the tragic story of Mr. Null, who encounters a dilemma whenever inputting his last name in a web form because it often will be rejected or even crash the system.

The article explains,

This has all gotten to the point where I’ve developed a number of workarounds for times when this happens. Turning my last name into a combination of my middle name and last name, or middle initial and last name, sometimes works, but only if the website doesn’t choke on multi-word last names. My usual trick is to simply add a period to my name: “Null.” This not only gets around many “null” error blocks, it also adds a sense of finality to my birthright.

Another list expands on the falsehoods about names that programmers seem to buy into. These include cultural cluelessness about people having first names and last names that never change and are all different. Along those lines, one awesome female programmer wrote a list of falsehoods about women in tech, such as their existence revolving around a desire for a boyfriend or to complete web design tasks. (Also, mansplaining is their absolute favorite, did you know?) Another article explores falsehoods about geography, such as the mistaken notion that all places only have one official name, or even one official name per language, or one official address. While the lists may reinforce some negative stereotypes we have about programmers, they also expose the core issues that programmers must resolve to be successful and effective in their jobs.
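
Mr. Null's problem is easy to reproduce. The sketch below is hypothetical and is not taken from any of the lists; the function names are invented for illustration. It contrasts a "missing value" check that inspects the text of a name with one that only treats a truly absent or blank value as missing.

```python
def is_missing_buggy(last_name):
    """Flawed check: treats the literal text 'null' as if no value was supplied."""
    return last_name is None or last_name.strip().lower() in {"", "null", "none"}


def is_missing(last_name):
    """Safer check: only an absent value or a blank string counts as missing."""
    return last_name is None or last_name.strip() == ""


# A real surname trips the flawed check but passes the safer one.
print(is_missing_buggy("Null"), is_missing("Null"))
```

The safer version makes no assumption about what a name may contain, which is the lesson of the falsehood lists.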

Chelsea Kerwin, July 14, 2017

Mistakes to Avoid to Implement Hadoop Successfully

July 7, 2017

Hadoop has been at the forefront of Big Data implementation methodologies. The journey so far has been filled with more failures than successes. An expert thus has put up a list of common mistakes to avoid while implementing Hadoop.

Wael Elrifai in a post titled How to Avoid Seven Common Hadoop Mistakes and posted on IT Pro Portal says:

Business needs, specialized skills, data integration, and budget all need to factor into planning and implementation. Even when this happens, a large percentage of Hadoop implementations fail.

For instance, the author says that one of the most common mistakes consultants commit is treating Hadoop like any other database management system. The trick is to treat the data lake like a box of Legos and build the model one brick at a time. Other common mistakes include not migrating the data before implementation, not thinking about security issues at the outset, and so on. Read the entire article here.

Vishol Ingole, July 7, 2017

IBM Bans Remote Work

June 22, 2017

The tech blog SiliconBeat reveals a startling development in tech-related employment in, “IBM: So Much for Working from Home.” Thousands of professionals who have built their lives around their remote-work arrangements are now being required to come into the office. For many, the shift would mean packing up and moving closer to one of the company’s locations. As writer Rex Crum puts it:

That’s right. Find your way to an office cubicle, or hit the bricks. The Wall Street Journal reported that IBM began instituting the new you-can’t-work-from-home policy this week, and that the company is ‘quietly dismantling’ the program that has been in place for decades. The Journal said the retrenchment on its employees working remotely was being done so that IBM could ‘improve collaboration and accelerate the pace of work.’ It also happens to be taking place not long after IBM reported its 20th-straight quarter of declining year-over-year revenue. Legendary all-time investor Warren Buffett also said this month that Berkshire Hathaway has cut its holdings in IBM by one-third from the 81 million shares the company owned earlier this year.

But will herding all their talent into their buildings really solve IBM’s financial woes? Not according to this Forbes article. Crum recalls that Yahoo made the same move in 2013, when Marissa Mayer put a stop to remote work at that company. (How has that been going?) Will more organizations follow?

Cynthia Murrell, June 22, 2017

Google: The Full Back Up

June 21, 2017

I read “Google Drive Will Soon Back Up Your Entire Computer.” Sounds good, right? The GOOG will make a bit for bit copy of the data and programs on one’s computer. In the event of a crash, Mother Google will be there. One can search for a file and restore it. That email archive from Thunderbird circa 2011, no problem.

I learned from the write up:

There have been requests for Dropbox to add something like this for ages, and it’s yet to get around to it. Instead, like Drive, people have always had to store files directly in the app’s local folder. For anyone looking for a bit more flexibility in their syncing apps, Google seems like it’s about to become the winning option.

I like the “winning option” for a service about which some details are fuzzy.

My question is, “Will Google scan the backed up data in order to place ads in the service? What about the availability of these data to governments when appropriate documentation is provided to the Google? What happens if the data are part of a legal matter between a person and a corporation?”

Yep, convenient.

Stephen E Arnold, June 21, 2017

Alphabet Google: Just Jobs? Not Likely

June 21, 2017

The write up is “Connecting More Americans with Jobs.” Sounds good. People want to work, right? Sounds like the right idea even though the notion of universal basic income is floating around like a Loon balloon. With smart software poised to displace MBAs in some of the IPO process steps, jobs are a big deal. Here in Harrod’s Creek, there are quite a few people out of work. There are even some families in which there are two or more generations of people who have never held a full time job. But that’s not a problem.

Google states:

We’re taking the next step in the Google for Jobs initiative by putting the convenience and power of Search into the hands of job seekers. With this new experience, we aim to connect Americans to job opportunities across the U.S., so no matter who you are or what kind of job you’re looking for, you can find job postings that match your needs.

When I read about job aggregation, I thought about the numerous online job services which I have observed over the years. Does anyone remember the BNA’s love affair with job hunting services? And Monster? Love that Monster thing!

From my vantage point, there are several angles to this Google service:

First, aggregating jobs is a useful source of data about people, competitors, and hiring trends. Quick example: Decades ago I was involved in a database called Pharmaceutical News Index. The hot feature of this database was that a person in the pharmaceutical industry could look up a company and see what jobs big wheels and wizards were taking. The information had high value because hires provide direct information about certain types of research initiatives. Now imagine the value of the data if Google can scrape and crunch the job data its announcement references. Valuable information? Yep, definitely above average in my book.

Second, job aggregation is a foundation stone. The service makes it possible to take another step: Matching candidates to jobs. Hey, if you are in the Google system and you want a job, why not let Google’s smart software process your profile and generate a list of potential opportunities? Google has a mostly overlooked dossier function and the nifty analytic tools to make this a walk in the park. Employers might be interested in getting information from Google about hiring trends, salaries, and Glassdoor-type insights into what a company is “really like.”

Third, Google’s smart software can knit together a number of items of information about a person or a company. This “federation” of data provides an opportunity for Google to use the Recorded Future technology or a similar home brew technology to predict what is likely to happen for sectors, companies, and even product innovations.

Should Microsoft / LinkedIn be worried?


Stephen E Arnold, June 21, 2017

Editorial Controls and Data Governance: A Rose by Any Other Name?

June 16, 2017

I read “Why Interest In “Data Governance” Is Increasing.” The write up uses a number of terms to describe what I view as editorial controls. The idea in my experience is that an organization decides what is okay and not okay with regard to the information it wants to process. The object is to know what content will be processed before the organization kick starts indexing, metadata tagging, or text analysis.

The organization then has to figure out and implement the rules of the game. Questions like “What do we do when entities are not recognized?” and “Who goes through the exceptions file?” must be answered. Rules, procedures, processes, and corrective actions have to be implemented in the work flow. One cannot calculate costs, headcount, or software expenses unless one knows what’s going to happen.
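
One way to picture an exceptions file is a tagging step that never silently drops what it cannot recognize. The sketch below is a hypothetical illustration: the `KNOWN_ENTITIES` table, the labels, and the `route_entities` function are assumptions made for the example, not part of any product mentioned in the write up.

```python
# Hypothetical controlled vocabulary agreed on before indexing starts.
KNOWN_ENTITIES = {
    "ibm": "ORG",
    "google": "ORG",
    "darpa": "ORG",
}


def route_entities(candidates):
    """Split candidate names into tagged entities and an exceptions list.

    Anything not in the controlled vocabulary goes to the exceptions list
    so that a person, not the software, decides what to do with it.
    """
    tagged, exceptions = [], []
    for name in candidates:
        label = KNOWN_ENTITIES.get(name.lower())
        if label is not None:
            tagged.append((name, label))
        else:
            exceptions.append(name)
    return tagged, exceptions


tagged, exceptions = route_entities(["IBM", "Smartlogic", "DARPA", "Harrod's Creek"])
print(tagged)
print(exceptions)
```

The code is the easy part. Who reviews the exceptions list, how often, and who updates the vocabulary are the editorial decisions that drive cost and headcount.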

The write up explains that data governance is important. I agree. The write up hooks the notion of editorial controls and editorial process to a number of buzzwords. I don’t think this type of jargon catalog is particularly helpful. Jargon distracts some people from focusing on Job One; that is, putting appropriate controls in place before nuking the budget or creating the type of editorial craziness which Facebook and Google are now trying to contain and manage.

The notion that an organization has to perform “data program management” is fine. But this is nothing more than hooking the editorial rules of the road to the responsibilities of the people who have to set up, oversee, and change the work flow.

Jargon does not help implement editorial controls. Clear thinking and speaking do.

Stephen E Arnold, June 16, 2017

DARPA Progresses on Refining Data Analysis

June 12, 2017

The ideal data analysis platform for global intelligence would take all the data in the world and rapidly make connections, alerting law enforcement or the military about potential events before they happen. It would also make it downright impossible for bad actors to hide their tracks. Our government seems to be moving toward that goal with AIDA, or Active Interpretation of Disparate Alternatives. DARPA discusses the project in its post, “DARPA Wades into Murky Multimedia Information Streams to Catch Big Meaning.” The agency states:

The goal of AIDA is to develop a multi-hypothesis ‘semantic engine’ that generates explicit alternative interpretations or meaning of real-world events, situations, and trends based on data obtained from an expansive range of outlets. The program aims to create technology capable of aggregating and mapping pieces of information automatically derived from multiple media sources into a common representation or storyline, and then generating and exploring multiple hypotheses about the true nature and implications of events, situations, and trends of interest.

‘It is a challenge for those who strive to achieve and maintain an understanding of world affairs that information from each medium is often analyzed independently, without the context provided by information from other media,’ said Boyan Onyshkevych, program manager in DARPA’s Information Innovation Office (I2O). ‘Often, each independent analysis results in only one interpretation, with alternate interpretations eliminated due to lack of evidence even in the absence of evidence that would contradict those alternatives. When these independent, impoverished analyses are combined, generally late in the analysis process, the result can be a single apparent consensus view that does not reflect a true consensus.’

AIDA’s goal of presenting an accurate picture of overall context early on will help avoid that problem. The platform is to assign a confidence level to each piece of information it processes and each hypothesis it generates. It will also, they hope, be able to correct for journalistic spin by examining variables and probabilities. Is the intelligence community about to gain an analysis platform capable of chilling accuracy?

Cynthia Murrell, June 12, 2017
