Search Pinterest Pictures Without Pinterest

April 25, 2017

Pinterest is the beloved social media network where users can post pictures, make comments, and find decorating ideas and recipes.  However, Recode tells us about a new implausible Google Chrome extension: “Pinterest Will Now Let You Search For Products Using Any Image You Find Online-Without Visiting Pinterest.”  Pinterest just launched a new Google Chrome extension that allows users to save images seen online as they browse.  The extension will work like this:

The new tool lets you select an item in any photograph online, and ask Pinterest to surface similar items using its image recognition software.  For example: If you see an image of sunglasses you like on Nordstrom.com, you could use the extension to browse similar glasses from Pinterest without ever leaving Nordstrom’s website.  If you click on one of the search results, you’ll then be taken to Pinterest.

Pinterest wants to leverage itself as an image search engine for all images, in real life and on the Internet.  Evan Sharp, Pinterest co-founder, said that users should not “...have to put their thoughts into words to find great ideas.”  Visual search technology already exists, but only on Pinterest’s Web site.

Whitney Grace, April 25, 2017

Mother Google to Fix the Ad Problem

April 24, 2017

I read “Google Faces Competition Scrutiny over Plans to Build Ad Blocker into Chrome.” My recollection is that there has been some chatter about how Google displays ads in videos. I have also heard that some folks are wondering why certain ads appear in certain result sets. I enjoy no display videos which contain links to Web sites selling teen fashion; for example:

image

I am not sure what’s an ad and what’s a misfire.

The write up raises a different issue; namely,

Google introducing ad blocking, however, would have massive implications. The browser has a 58.6pc worldwide market share, according to NetMarketShare, against 19pc for Internet Explorer, the second-most popular. It could well attract interest from regulators given Google’s huge online advertising business. Google made $22.4bn (£17.5bn) in advertising revenue in the final quarter of last year, up 17pc annually, and undermining other adverts may be seen as an attempt to boost its own business.

What an unusual idea: Google possibly trying to “boost its own business.”

How could Google or any other Silicon Valley company be viewed as acting in a way that is not fair, objective, and beneficial for customers?

We love Google and find the suggestion that Google would behave in an untoward manner unseemly.

Stephen E Arnold, April 24, 2017

The Big Dud

April 24, 2017

Marketers periodically need a fancy term to sell technologies to large companies. Big Data, along with Hadoop, was one such term. After years of marketing, adopters have yet to see any results, let alone any ROI.

Datanami recently published an article titled “Hadoop Has Failed Us, Tech Experts Say” in which the author says:

Many companies still run mainframe applications that were originally developed half a century ago. But thanks to better mousetraps like S3 (for storage) and Spark (for processing), Hadoop will be relegated to niche and legacy statuses going forward.

One of the primary concerns with Hadoop is that only a handful of people know how to use it. For data scientists to make heads or tails of data, precise data queries and mining need to be done. The dearth of experts, however, is hampering the efforts of companies that want to make Big Data work for them. Other frameworks are trying to overcome the problems posed by Hadoop, but many companies have already adopted it and are stuck with it. And just like many fads, Big Data might fade into oblivion.

Vishal Ingole, April 24, 2017

Context: Are You Confused? What Is Your Context?

April 21, 2017

Human utterances can be difficult to figure out. When I departed the sunny climes of Washington, DC, to take a job with the Courier Journal & Louisville Times Company, I found myself in a new “context.” Working on money losing database products was a different context for me.

Obviously, Louisville was not the zip zip Right Coast. The shift from consulting to doing was a different context. And there were others. Each context shaped my talking and writing.

To most native speakers of English, in the database unit of the Courier Journal, the word “terminal” referred to one of the ever reliable gizmos that connected to the super user friendly DEC 20, TIPS typesetting, and, of course, to the home brew content management system used for the company’s money losing databases.

The context of the work unit made clear to someone working with the DEC 20 that the word “terminal” did not mean the airport terminal, the relative who was dying of a rare blood disorder, or the weird little wire holding thingy on my model train’s Lionel transformer.

Language and understanding do depend on context.

I read “Big Data Context: Targeting Relevant Data That’s Fit for Purpose.” Let me tell you that I was excited to find that context is getting some Big Data love. I learned:

Context is critical.

Well, I agree. It is 2017, and the context idea has been around for many years.

The write up includes a graphic to explain the challenge of context:

image

The idea is that an entity named John Doe appears in different databases and apparently uses a number of social media services. How does a human or smart software figure out what data goes with each John Doe?

Yep, this is a problem law enforcement and intelligence professionals have been considering for many years. Other people want to match up people with data pertinent to a specific entity; for example, financial institutions, online matchmakers, and government immigration officials.

Unfortunately putting a person in a context with pertinent data is a bit of a sticky wicket.

How does one solve this apparently tough problem? I learned from the write up:

There needs to be a focus on relevant data.

No disagreement from me. But focus is not solving the context problem.

The article meanders through a number of ideas which do not strike me as directly related to the problem of figuring out context and then the meaning of utterances of a particular person. My thought is that the write up is not really about context. The article wants to use buzzwords and jargon to give the impression that context is going to be less of a problem if someone implements many processes and procedures. These range from figuring out how trustworthy a source of data is to matching “representational effectiveness” with a model of context.

I learned that data lakes must not become “data graveyards.”

Okay, good idea. But I thought the article was tackling the problem of context, figuring out the meaning from its particular location among key signals like geography, behavior, and the nitty gritty of language itself.

How confused was I? Pretty confused. Here’s the last paragraph of the context write up:

There are a lot of starting points, a lot of pathways, in managing information in this rapidly changing data landscape. As McKnight said, “beyond the mountain is another mountain,” and Patricio reflected that this is a “continuous cycle of processing and evaluation.” Our data lakes will not be static; cannot afford to become data graveyards. But keeping them from becoming so requires us to continually reflect on the business problems we are trying to solve, to ask questions of the data, to understand the context of the data, and to measure and evaluate the fitness of the data for our purposes. With Big Data context in mind, we can mature our organizations and make more effective data-driven business decisions.

No wonder context remains a challenge. What is easy is writing headlines for what is:

  1. Cooking up an earthworm of quotes as a post conference rah rah
  2. Making the write up fit the title
  3. Moving beyond the obvious.

Wow.

Stephen E Arnold, April 21, 2017

AI Might Not Be the Most Intelligent Business Solution

April 21, 2017

Big data was the buzzword a few years ago, but now artificial intelligence is the tech jargon of the moment.  While big data was a more plausible solution for companies trying to mine information from their digital data, AI is proving difficult to implement.  Forbes discusses AI difficulties in the article, “Artificial Intelligence Is Powerful Stuff, But Difficult To Scale To Real-Life Business.”

There is a lot of excitement brewing around machine learning and AI business possibilities.  While the technology is ready for use, workers are not.  People need to be prepped and taught how to use AI and machine learning technology; without the proper lessons, it will hurt a company’s bottom line.  The problem comes from companies rolling out digital solutions without changing the way they conduct business.  Workers cannot just adapt to changes instantly.  They need to feel like they are part of the solution, instead of being shifted to the side in the latest technological trend.

CIO for the Federal Communications Commission Dr. David Bray said that:

The growth of AI may shift thinking in organizations. ‘At the end of the day, we are changing what people are doing,’ Bray says. ‘You are changing how they work, and they’re going to feel threatened if they’re not bought into the change. It’s almost imperative for CIOs to really work closely with their chief executive officers, and serve as an internal venture capitalist, for how we bring data, to bring process improvements and organizational performance improvements – and work it across the entire organization as a whole.’

Artificial intelligence and machine learning are an upgrade to not only a company’s technology but also how a company conducts business.  Business processes will need to be updated to integrate the new technology, as will the way workers use and interface with it.  Businesses will continue facing problems if they think that changing technology, but not their procedures, is the final solution.

Whitney Grace, April 21, 2017

Voice Search and Big Data: Defining Technologies for 2017

April 20, 2017

I read “Voice Search and Data: The Two Trends That Will Shape Online Marketing in 2017.” If the story is accurate, figuring out what people say and making sense of data (lots of data) will create new opportunities for innovators.

The article states:

Advancements in voice search and artificial intelligence (AI) will drive rich answers that will help marketers understand the customer intent behind I-want-to-go, I-want-to-know, I-want-to-buy and I-want-to-do micro-moments. Google has developed algorithms to cater directly to the search intent of the customers behind these queries, enabling customers to find the right answers quickly.

My view is that the article is correct in its assessment.

Where the article and I differ boils down to search engine optimization. The idea that voice search and Big Data will make fooling the relevance algorithms of Bing, Google, and Yandex a windfall for search engine optimization experts is partially true. Marketing whiz kids will do and say many things to deliver results that do not answer my query or meet my expectation of a “correct” answer.

My view is that the vendors of the proliferating systems which purport to understand human utterances in text and voice-to-text conversions will discover that error rates of 60 to 75 percent are not good enough. Errors can be buried in long lists of results. They can be sidestepped if a voice enabled system works from a set of rules confined to a narrow topic domain.

Open the door to natural language parsing, and the error rates which once were okay become a liability. In my opinion, this will set off a scramble among companies struggling to get their smart software to provide information that customers accept and use repeatedly. Fail and customer turnover can be a fatal knife wound to the heart of an organization. The cost of replacing a paying customer is high. Companies need to keep the customers they have with technology that helps keep paying customers smiling.

What companies are able to provide higher accuracy linguistic functions? There are dozens of companies which assert that their systems can extract entities, figure out semantic relationships, and manipulate content in a handful of languages.

The problem with most of these systems is that certain, very widely used methods collapse when high accuracy is required for large volumes of text. The short cut is to use numerical tricks, and some of those tricks create disconnects between the information the user requests or queries and the results the system displays. Examples range from the difficulties of tuning the Autonomy Digital Reasoning Engine to figuring out how in the heck Google Home arrived at a particular song when the user wanted something else entirely.

Our suggestion is that instead of emailing IBM to sign a deal for that company’s language technology, you might have a more productive result if you contact Bitext. This is a company which has been on my mind. I interviewed the founder and CEO (an IBM alum as I learned) and met with some of the remarkable Bitext team.

I am unable to disclose Bitext’s clients. I can suggest that if you fancy a certain German sports car or use one of the world’s most popular online services, you will be bumping into Bitext’s Digital Linguistic Analysis platform. For more information, navigate to Bitext.com.

The data I reviewed suggested that Bitext’s linguistic platform delivers accuracy significantly better than some of the other systems’ outputs I have reviewed. How accurate? Good enough to get an A in my high school math class.

Stephen E Arnold, April 20, 2017

How to Use a Quantum Computer

April 20, 2017

It is a dream come true that quantum computers are finally here!  But how are we going to use them?  PC World discusses the possibilities in, “Quantum Computers Are Here—But What Are They Good For?”  D-Wave and IBM both developed quantum computers and are trying to make a profit from them by commercializing their uses.  Both companies agree, however, that quantum computers are not meant for everyday computer applications.

What should they be used for?

Instead, quantum systems will do things not possible on today’s computers, like discovering new drugs and building molecular structures. Today’s computers are good at finding answers by analyzing information within existing data sets, but quantum computers can get a wider range of answers by calculating and assuming new data sets.  Quantum computers can be significantly faster and could eventually replace today’s PCs and servers. Quantum computing is one way to advance computing as today’s systems reach their physical and structural limits.

What is astounding about quantum computers is their storage capabilities.  IBM has a 5-qubit system and D-Wave’s 2000Q has 2,000 qubits.  IBM’s system is more advanced in technology, but D-Wave’s computer is more practical.  NASA has deployed the D-Wave 2000Q for robotic space missions; Google will use it for search, image labeling, and voice recognition; and Volkswagen installed it to study China’s traffic patterns.

D-Wave also has plans to deploy its quantum system to the cloud.  IBM’s 5-qubit computer, on the other hand, is being used for more scientific applications such as material sciences and quantum dynamics.  Researchers can upload sample applications to IBM’s Quantum Experience to test them out.  IBM recently launched the Q program to build a 50-qubit machine.  IBM also wants to push its quantum capabilities in the financial and economic sector.

Quantum computers will be a standard tool in the future, just as the desktop PC was in the 1990s.  By then, quantum computers will respond more to vocal commands than keyboard inputs.

Whitney Grace, April 20, 2017

Image Search: Biased by Language. The Fix? Use Humans!

April 19, 2017

Houston, we (male, female, uncertain) have a problem. Bias is baked into some image analysis and just about every other type of smart software.

The culprit?

Numerical recipes.

The first step in solving a problem is to acknowledge that a problem exists. The second step is more difficult.

I read “The Reason Why Most of the Images That Show Up When You Search for Doctor Are White Men.” The headline identifies the problem. However, what does one do about biases rooted in human utterance?

My initial thought was to eliminate human utterances. No fancy dancing required. Just let algorithms do what algorithms do. I realized that although this approach has a certain logical completeness, implementation may meet with a bit of resistance.

What does the write up have to say about the problem? (Remember. The fix is going to be tricky.)

I learned:

Research from Princeton University suggests that these biases, like associating men with doctors and women with nurses, come from the language taught to the algorithm. As some data scientists say, “garbage in, garbage out”: Without good data, the algorithm isn’t going to make good decisions.

Okay, right coast thinking. I feel more comfortable.

What does the write up present as wizard Aylin Caliskan’s view of the problem? A post doctoral researcher seems to be a solid choice for a source. I assume the wizard is a human, so perhaps he, she, it is biased? Hmmm.

I highlighted in true blue several passages from the write up / interview with he, she, it. Let’s look at three statements, shall we?

Regarding genderless languages like Turkish:

when you directly translate, and “nurse” is “she,” that’s not accurate. It should be “he or she or it” is a nurse. We see that it’s making a biased decision—it’s a very simple example of machine translation, but given that these models are incorporated on the web or any application that makes use of textual data, it’s the foundation of most of these applications. If you search for “doctor” and look at the images, you’ll see that most of them are male. You won’t see an equal male and female distribution.

If accurate, this observation means that the “fix” is going to be difficult. Moving from a language without gender identification to a language with gender identification requires changing the target language. Easy for software. Tougher for a human. If the language and its associations are anchored in the brain of a target language speaker, change may be, how shall I say it, a trifle difficult. My fix looks pretty good at this point.

And what about images and videos? I learned:

Yes, anything that text touches. Images and videos are labeled so they can be used on the web. The labels are in text, and it has been shown that those labels have been biased.

And the fix is a human doing the content selection, indexing, and dictionary tweaking. Not so fast. Indexing with humans is very expensive. Don’t believe me? Download 10,000 Wikipedia articles and hire some folks to index them from the controlled term list humans set up. Let me know if you can hit $17 per indexed article. My hunch is that you will exceed this target by several orders of magnitude. (Want to know where the number comes from? Contact me and we can discuss a for fee deal for this high value information.)

How does the write up solve the problem? Here’s the capper:

…you cannot directly remove the bias from the dataset or model because it’s giving a very accurate representation of the world, and that’s why we need a specialist to deal with this at the application level.

Notice that my solution is to eliminate humans entirely. Why? The pipe dream of humans doing indexing won’t fly due to [a] time, [b] cost, [c] the massive flows of data to index. Forget the mother of all bombs.

Think about the mother of all indexing backlogs. The gap would make the Modern Language Association’s “gaps” look like a weekend catch up party. Is this a job for the operating system for machine intelligence?

Stephen E Arnold, April 17, 2017

Watson and Block: Tax Preparation and Watson

April 19, 2017

Author’s Note:

Tax season is over. I am now releasing a write up I did in the high pressure run up to tax filing day, April 18, 2017. I want to comment on one marketing play IBM used in 2016 and 2017 to make Watson its Amazon Echo or its Google Pixel. IBM has been working overtime to come up with clever, innovative, effective ways to sell Watson, a search-and-retrieval system spiced with home brew code, algorithms which make the system “smart,” acquired technology from outfits like Vivisimo, and some free and open source search software.

IBM Watson is being sold to Wall Street and stakeholders as IBM’s next, really big thing. With years of declining revenue under its belt, the marketing of Watson as “cognitive software” is different from the marketing of most other companies pitching artificial intelligence.

One unintended consequence of IBM’s saturation advertising of its Watson system is making the word “cognitive” shorthand for software magic. The primary beneficiaries of IBM’s relentless use of the word “cognitive” have been its competitors. IBM’s fuzziness and lack of concrete products have allowed companies with modest marketing budgets to pick up the IBM jargon and apply it to their products. Examples include the reworked Polyspot (now doing business as CustomerMatrix) and dozens of enterprise search vendors; for example, LucidWorks (Really?), Attivio, Microsoft, Sinequa, and Squirro (yep, Squirro). IBM makes it possible for competitors to slap the word cognitive on their products and compete against IBM’s Watson. I am tempted to describe IBM Watson as a “straw man,” but it is a collection of components, not a product.

Big outfits like Amazon have taken a short cut to the money machine. The Echo and Dot sell millions of units and drive sales of Amazon’s music and hard goods. IBM bets on a future hint of payoff; for example, Watson may deliver a “maximum refund” for an H&R Block customer. That sounds pretty enticing. My accountant, beady eyed devil if there ever were one, never talks about refunds. He sticks to questions about where I got my money and what I did with it. If anything, he is a cloud of darkness, preferring to follow the IRS rules and avoid any suggestion of my getting a deal, a refund, or a free ride.

Below is the story I wrote a month ago shortly after I spent 45 minutes chatting with three folks who worked at the H&R Block office near my home in rural Kentucky. Have fun reading.

Stephen E Arnold, April 18, 2017

IBM Watson is one of Big Blue’s strategic imperatives. I have enjoyed writing about Watson, mixing up my posts with the phrase “Watson weakly” instead of “Watson weekly.” Strategic imperatives are supposed to generate new revenue to replace the loss of old revenues. The problem IBM has to figure out how to solve is pace. Will IBM Watson and other strategic imperatives generate sustainable, substantial revenue quickly enough to keep the company’s revenue healthy?

The answer seems to be, “Maybe, but not very quickly.” According to IBM’s most recent quarterly report, Big Blue has now reported declining revenues for 20 consecutive quarters. Yep, that’s five years. Some stakeholders are patient, but IBM’s competitors are thrilled with IBM’s strategic imperatives. For the details of the most recent IBM financials, navigate to “IBM Sticks to Its Forecast Despite Underwhelming Results.” Kicking the can down the road is fun for a short time.

The revenue problem is masked by promises about the future. Watson, the smart software, is supposed to be a billion dollar baby who will end up with a $10 billion revenue stream any day now. IBM’s stock buybacks and massive PR campaigns have helped the company sell its vision of a bright new Big Blue. But selling software and consulting is different from selling hardware. In today’s markets, services and consulting are tough businesses. Companies struggling to gain traction must compete against outfits like Gerson Lehrman, unemployed senior executives hungry for work, and new graduates willing to do MBA chores for a pittance, as well as outfits like Elastic, a search vendor which sells add ons to open source software and consulting for those who need it. IBM is trying almost everything. Still those declining revenues tell a somewhat dismal tale.

I assume you have watched the Super Bowl ads if not the game. I just watched the ads. I was surprised to see a one minute, very expensive, and somewhat ill conceived commercial for IBM Watson and H&R Block, the walk in store front tax preparer.

The Watson-Block Super Bowl ad featured this interesting image: A sled going downhill. Was this a Freudian slip about declining revenues?

image

Does it look to you as if the sled is speeding downhill? Is this a metaphor for IBM Watson’s prospects in the tax advisory business?

One of IBM’s most visible promotions of its company-saving, revenue-gushing dreams is IBM Watson. You may have seen the Super Bowl ad about Watson providing H&R Block with a sure-fire way to kill off pesky competitors. How has that worked out for H&R Block?


Yahoo Pay Inequity

April 19, 2017

Former Yahoo CEO Marissa Mayer made a considerable salary, especially considering she came to power during an economic downturn.  Her replacement Thomas McInerney, however, will be making double her salary.  Fortune reports on the income differences in: “Yahoo’s New Male CEO Will Make Double Marissa Mayer’s Salary.”  Pay inequity remains a big topic in today’s job market and this rises to the top as another example of a professional male receiving more money than a woman who held the same position.

Since Yahoo has sold its technology and advertising business to Verizon, it only consists of Alibaba stock, Yahoo Japan, and other miscellaneous investments.  One can assume that McInerney will have a much easier job than Mayer did.  McInerney is the former IAC CEO and his base salary will be $2 million, double Mayer’s $1 million.  He will also be getting more income from Yahoo:

What’s more, Yahoo actually expects to pay McInerney $4 million in his first year working at the company, assuming he earns his target bonus, which is equal to his base salary, according to the new disclosures. That’s 25% more than the $3 million the company is paying Mayer for a salary and cash bonus this year. On top of that, McInerney will also be eligible for grants of long-term incentive rewards of up to $24 million, depending on achievement of performance goals. If he were to receive the maximum amount, it would also be twice as much as Mayer’s long-term incentive grant in 2015, the last full year before the Verizon deal was announced.

McInerney will be paid to run the Yahoo equivalent of a mutual fund.  Yahoo will also not be buying new stock; instead, it will focus on managing its Alibaba stock and Yahoo Japan.  Those two investments basically run themselves.

If you ask me, it sounds like once again a woman cleans up a mess, makes it manageable, and a man comes in to take the credit and more pay.

Whitney Grace, April 19, 2017
