Yahoo: Me Too Becomes You Too in Fantasy Matter

November 18, 2015

I read “Yahoo a New Target in NY Daily Fantasy Sports Probe: Source.” The main idea is that Yahoo rolled out a fantasy sports service. Yahoo needed revenues and someone at Yahoo thought this me too play was a no brainer.

In the write up, I learned:

A probe by New York State’s attorney general into the fast-growing, multibillion-dollar daily fantasy sports industry has been expanded to include online media giant Yahoo Inc, a person familiar with the matter said on Tuesday. The move coincides with a court filing by Attorney General Eric Schneiderman on Tuesday seeking a temporary injunction that would shut down DraftKings and FanDuel, leaders among online companies offering paid-for daily fantasy sports contests.

Poor Yahoo. At least the issue generated an easy rhyme: Yahoo, you, too. More woes for the Xoogler’s outfit. Quite a gamble.

Stephen E Arnold, November 18, 2015

A Modest Dust Up between Big Data and Text Analytics

November 18, 2015

I wonder if you will become involved in this modest dust up between the Big Data folks and the text analytics adherents. I know that I will sit on the sidelines and watch the battle unfold. I may be mostly alone on that fence for three reasons:

  • Some text analytics outfits are Big Data oriented. I would point modestly to Terbium Labs and Recorded Future. Both do the analytics thing and both use “text” in their processing. (I know that learning about these companies is not as much fun as reading about Facebook friends, but it is useful to keep up with cutting edge outfits in my opinion.)
  • Text analytics can produce Big Data. I know that sounds like a fish turned inside out. Trust me. It happens. Think about some wan government worker in the UK grinding through Twitter and Facebook posts. The text analytics output lots of data.
  • A faux dust up is mostly a marketing play. I enjoyed search and content processing vendor presentations which pitted features of one system versus another. This approach is not too popular because every system says it can do what every other system can do. The reality of the systems is, in most cases, not discernible to the casual failed webmaster now working as a “real” wizard.

Navigate to “Text Analytics Gurus Debunk 4 Big Data Myths.” You will learn that there are four myths which are debunked. Here are the myths:

  1. Big Data survey scores reign supreme. Hey, surveys are okay because outfits like SurveyMonkey and the crazy pop up technology from that outfit in Michigan are easy to implement. Correct? Not important. Usable data for marketing? Important.
  2. Bigger social media data analysis is better. The outfits able to process the real time streams from Facebook and Twitter have lots of resources. Most companies do not have these resources. Ergo: Statistics 101 reigns no matter what the marketers say.
  3. New data sources are the most valuable. The idea is that data which are valid, normalized, and available for processing trump bigness. No argument from me.
  4. Keep your eye on the ball by focusing on how customers view you. Right. The customer is king in marketing land. In reality, the customer is a code word for generating revenue. Neither Big Data nor text analytics produce enough revenue in my world view. Sounds great though.

Will Big Data respond to this slap down? Will text analytic gurus mount their steeds and take another run down Marketing Lane to the windmill set up as a tourist attraction in an Amsterdam suburb?

Nope. The real battle involves organic, sustainable revenues. Talk is easy. Closing deals is hard. This dust up is not a mixed martial arts pay per view show.

Stephen E Arnold, November 18, 2015

Google Plus or Is It +: Try and Trying Again

November 18, 2015

I read a pride of write ups about Google Plus or is it Google +. Searching for odd ball characters like “+” or “^” adds some spice to the researcher’s life.

A representative article is “Google Isn’t Giving Up on Its Social Networking Ghost Town Google +.” That’s an important idea. Google has been struggling with the Facebook type service since the days of Orkut.

Google, unlike Facebook, comes at social from the search and retrieval angle spiced with a healthy dependency on online advertising juice. Facebook originated with an idea appealing to lonely folks in a dorm.

According to the write up:

the web giant has just given the service a complete overhaul on iOS, Android and the web. The new design focuses on “collections” and “communities”, positioning Google+ as a network dedicated to interests, rather than a personal service. Its layout has also been simplified and better optimized for mobile.

Some of the comments on Hacker News were quite interesting. Here are three:

Dredmorbius: Google have been tremendously coy about what their success metrics for G+ are, though they’ve played highly disingenuous all-but-utterly-fake numbers games in playing up “engagement” since the very beginning. I’d argue that the issue isn’t numbers, but relevance. G+ is lousy in many ways but has a few small areas of success, notably its Notifications mechanic, a community which, for me, works fairly well, and a search which while pathetically under-featured is comprehensive and fast. Winning the numbers game for social vs. Facebook in its current incarnation is a fool’s errand. Numerous people have pointed this out, including ex-Googlers pointing at the “Interest Graph” (though suggestions for following / pursuing this date to the first few months of G+). If Google does grab the Cosmo crowd, that’s fine, so long as it doesn’t also chase off the Nature/PLOS crowd in the process. Unfortunately, Google’s proven more than happy to sling absolute snot (as in the G+ “What Snot” feature … oh, no, that’s “What’s Hot”). Power users learn how to disable that instantly.

A second comment I noted:

Nilkn: This [Google’s design approach] is actually part of why my recent switch from Android to iOS was so refreshing. While material design looks great on some level, it seems to be so remarkably wasteful of space. Google+ actually feels claustrophobic to me in a way: there’s so much content, and yet you can see so little of it at a time. It creates a feeling of being constantly lost.

And a final one:

Pbreit: Seems like it’s still drastically missing the mark on having a reason for being. Why would I use this? What would I put on there? Why there and not elsewhere?

The Alphabet Google thing wants to be social. It wants to generate ad revenue. It wants to be more than search. Noble goals.

Stephen E Arnold, November 18, 2015

IBM Watson: A Consumer App for Toy and Gift Buyers

November 18, 2015

When I read “IBM Watson’s New App Predicts the Must-Have Toys and Gifts,” I wondered if dead tree catalog makers would embrace Big Blue’s approach. The idea is intriguing. IBM Watson crunches data and generates outputs that guide the person looking for a toy for a grandchild or a gift for an office secret Santa party.

I learned:

The iOS app surfaces suggestions in a number of categories, like tech, health and toys and shows products that it thinks will be the most popular in each section. It also aims to predict trends so you can see whether an item will be popular throughout the holidays or if it will be a passing fad. Behind the scenes, Watson analyzes the conversations around products from social media, blogs, reader comments, product reviews and ratings to determine what will be the most on-demand items. It also takes sentiment (remember Watson’s personality identifier tool?) into account to detect exactly why something may be popular.

I thought about this. Amazon tries to suggest products, and it works once in a while. What confuses Amazon is that my wife purchases books and products which interest her. I use the same account. You can imagine the recommendations that combine math and ancient history books with mysteries written by depressed Scandinavian writers. The personal products recommendations are sometimes downright bizarre.

Microsoft also does the prediction thing with Bing. I wonder if Microsoft used Bing to determine how users of Windows 10 would respond to the “you have no choice” updates that are shoved down the Internet tubes?

For IBM, my reactions went in three directions.

First, IBM marketing and PR professionals are trying to make Lucene, acquired technology, and home brew scripts do something useful. In my own gift buying experience, I keep my eyes open, listen, and then use that information to purchase a personalized gift. For other gifts, I write a check. I am not going to fiddle with a mobile phone app. IBM is obviously aiming at a niche which, for IBM’s sake, I hope finds this approach to buying useful.

Second, IBM Watson is not yet on my radar as a viable solution for search and content processing. IBM may be tallying huge sales, but I don’t hear about them. Even more telling is that my system for monitoring news about search and content processing snags wild and crazy assertions about how wonderful Watson is in curing cancer, making recipes, and, of course, selecting gifts. The sheer range of applications and capabilities attributed to Watson is difficult for me to believe.

Third, IBM has to find a way to generate substantial organic revenue. Reducing full time equivalents, buying back stock, and losing money for the Buffett machine are not inspiring confidence.

Perhaps IBM Watson will select the perfect gift? Amazon uses its recommendations to generate revenue for Amazon. IBM uses its Watson to generate public relations. Which is the better approach?

Answer: the one which makes money. I do not include the revenue IBM generates for its marketers and PR advisors.

Stephen E Arnold, November 18, 2015

All You Can View Patents

November 18, 2015

Patent information is available to peruse via the USPTO Web site and Google has an accurate patent search (that is significantly easier to use than USPTO’s search), but this does not tell the complete story of US patents.  GCN announced that the USPTO plans to remedy missing patent information in the article, “USPTO Opens The Door To Four Decades Of Patent Data.”

With the help of the Center for the Science of Science and Innovation Policy (CSSIP), the USPTO launched the new tool PatentsView:

“The new tool allows individuals to explore data on patenting activity in the United States dating back to 1976. Users can search patent titles, types, inventors, assignees, patent classes, locations and dates. The data also displays visualizations on trends and patent activity. In addition, searches include graphic illustrations and charts.”

People will be able to conduct the equivalent of an “advanced search” in Google or an academic database. PatentsView allows people to identify trends, see which technologies are on the rise or in decline, search a specific company’s patents, and query patent information through a flexible application programming interface.

The USPTO wants people to access and use important patent and trademark data. Like many organizations, it has the data available, and in digital form to boot, but its user interface is not user-friendly and no one knows the data are there. Borrowing a page from marketing, the USPTO is using PatentsView to rebrand itself and advertise its offerings. Shiny graphics are one way to reach people.

Whitney Grace, November 18, 2015

Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Career Advice from Successful Googlers

November 18, 2015

A few words of wisdom from a Google veteran went from Quora query to Huffington Post article in “What It Takes to Rise the Ranks at Google: Advice from a Senior Staff Engineer.” The original question was, “How hard is it to make Senior Engineer at Google?” HuffPo senior editor Nico Pitney reproduces the most popular response, that of senior engineer Carlos Pizano. Pizano lists some of his education and pre-Google experience, and gives some credit to plain luck, but here’s the part that makes this good guidance for approaching many jobs:

“I happen to be a believer of specialization, so becoming ‘the person’ on a given subject helped me a lot. Huge swaths of core technology key to Google’s success I know nothing about, of some things I know all there is to know … or at least my answers on the particular subject were the best to be found at Google. Finally, I never focused on my career. I tried to help everybody that needed advice, even fixing their code when they let me and was always ready to spread the knowledge. Coming up with projects but giving them to eager, younger people. Shine the light on other’s accomplishments. All that comes back to you when performance review season comes.”

Knowing your stuff and helping others—yes, that will go a long way indeed. For more engineers’ advice, some of which is more Google-specific, navigate to the list of responses here.

Cynthia Murrell, November 18, 2015

Indexing: A Cautionary Example

November 17, 2015

I read “Half of World’s Museum Specimens Are Wrongly Labeled, Oxford University Finds.” Anyone involved in indexing knows the perils of assigning labels, tags, or what the whiz kids call metadata to an object.

Humans make mistakes. According to the write up:

As many as half of all natural history specimens held in some of the world’s greatest institutions are probably wrongly labeled, according to experts at Oxford University and the Royal Botanic Garden in Edinburgh. The confusion has arisen because even accomplished naturalists struggle to tell the difference between similar plants and insects. And with hundreds or thousands of specimens arriving at once, it can be too time-consuming to meticulously research each and guesses have to be made.

Yikes. Only half. I know that human indexers get tired. Now there is just too much work to do. The reaction is typical of busy subject matter experts. Just guess. Close enough for horseshoes.

What about machine indexing? Anyone who has retrained an HP Autonomy system knows that humans get involved as well. If humans make mistakes with bugs and weeds, imagine what happens when a human has to figure out a blog post in a dialect of Korean.

The brutal reality is that indexing is a problem. When dealing with humans, the problems do not go away. When humans interact with automated systems, the automated systems make mistakes, often more rapidly than the sorry human indexing professionals do.

What’s the point?

I would sum up the implication as:

Do not believe a human (indexing species or marketer of automated indexing species).

Acceptable indexing with accuracy above 85 percent is very difficult to achieve. Unfortunately the graduates of a taxonomy boot camp or the entrepreneur flogging an automatic indexing system which is powered by artificial intelligence may not be reliable sources of information.

I know that this notion of high error rates is disappointing to those who believe their whizzy new system works like a champ.

Reality is often painful, particularly when indexing is involved.

What are the consequences? Here are three:

  1. Results of queries are incomplete or just wrong
  2. Users are unaware of missing information
  3. Failure to maintain either human, human assisted, or automated systems results in indexing drift. Eventually the indexing is just misleading if not incorrect.

How accurate is your firm’s indexing? How accurate is your own indexing?
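One way to start answering those questions is a spot check. The snippet below is a minimal sketch, assuming you have a small sample of records whose labels a subject matter expert has verified by hand. The function name, the record identifiers, and the labels are all made up for illustration; the 0.85 target is the 85 percent figure mentioned above.

```python
from typing import Dict


def indexing_accuracy(assigned: Dict[str, str], verified: Dict[str, str]) -> float:
    """Share of spot-checked records whose assigned label matches the verified one."""
    # Only records present in both sets can be scored.
    checked = [rec_id for rec_id in verified if rec_id in assigned]
    if not checked:
        raise ValueError("no overlap between assigned and verified labels")
    correct = sum(1 for rec_id in checked if assigned[rec_id] == verified[rec_id])
    return correct / len(checked)


# Hypothetical sample: labels in the index versus labels an expert confirmed.
assigned = {"rec-1": "beetle", "rec-2": "weevil", "rec-3": "moth", "rec-4": "fern"}
verified = {"rec-1": "beetle", "rec-2": "beetle", "rec-3": "moth", "rec-4": "moss"}

TARGET = 0.85  # the 85 percent bar discussed in the post
accuracy = indexing_accuracy(assigned, verified)
print(f"accuracy={accuracy:.2f}, meets target: {accuracy >= TARGET}")
```

A sample of four records proves nothing, of course. The point is that the check has to be repeated on a schedule, or indexing drift goes unnoticed.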

Stephen E Arnold, November 17, 2015

Watson Weekly: How Much Is That Hardware?

November 17, 2015

I read “IBM Watson Gains Processing Boost with Nvidia and Power8 Hardware.” The write up called to my attention how much computing horsepower is needed to make Watson usable.

IBM has its nifty Power8 hardware. To buttress, shore up, or help out, IBM is also using Nvidia chips.

The write up says:

IBM has now integrated Nvidia’s Tesla accelerated graphics platform into Watson’s hardware foundations in the form of Tesla K80 GPUs coupled with IBM’s Power8 processors commonly used to power servers that support enterprise and scalable cloud infrastructure.

Well and good.

The problem seems to be that as the volume of content to be processed goes up, the hardware demands will rise as well. In a world of Big Data, which IBM is embracing, what customer will be able to foot the bill for continuous hardware expansion?

What about the cloud?

IBM will have to expand its data centers. As it gets more customers for Watson, the expansion will be required. Without fast content processing, the customers may go elsewhere.

In short, Watson makes use of traditional computing methods. Those methods impose what Google and others have called the credit card debt of high technology.

The upshot: Watson may be too costly for IBM to scale if the customers balk at the IBM fees.

Stephen E Arnold, November 17, 2015

Deleting Data: Are They Really Gone?

November 17, 2015

I read “Gawker Media’s Data Guru Presents the Case for Deleting Data.” The main idea is that hoarding permits a reality TV program. Hoarding data may not be good TV.

The write up points out that data cleaning is not cheap. Storage also costs money.

A Gawker wizard is quoted as saying:

We effectively are setting traps in our data sets for our future selves and our colleagues… Increasingly, I find that eliminating this data from our databases is the best solution. Gawker’s traffic data is maintained for just a few months. In our own logs and databases, we only have traffic data since February, and even that’s of limited use: We’ll toss some of it before the end of the year.

Seems reasonable. However, there may be instances when dumping or just carelessly overwriting log files might not be expedient or legal. For example, in one government agency, the secretary’s “bonus” depends on showing how Internet site usage relates to paperwork reduction. The idea is that when a “customer” of the government uses a Web site and does not show up in person at an office to fill out a request, the “customer” allegedly gets better service and costs, in theory, should drop. Also, some deals require that data be retained. You can use your imagination if you are an ISP in a country recently attacked by terrorists and your usage logs are “disappeared.” SEC and IRS retention guidelines? Worth noting in some cases.

The question is, “Are data really gone once deleted?” The fact of automatic backups, services in the middle routinely copying data, and other ways of creating unobserved backups may mean that deleted data can come back to life.

Pragmatism and legal constraints as well as the “men in the middle” issue can create zombie data, which, unlike the fictional zombies, can bite.
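For those who do decide to delete, the pragmatic and legal constraints can be written down as a rule. What follows is a rough sketch, not anyone’s actual practice: the `LogRecord` type, the `legal_hold` flag, and the 270 day window are assumptions invented for the example. The idea is simply that age alone should not trigger a purge when a retention requirement applies.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class LogRecord:
    record_id: str
    created: date
    legal_hold: bool = False  # hypothetical flag: set when a retention rule applies


def split_for_purge(
    records: Iterable[LogRecord], today: date, retention_days: int
) -> Tuple[List[LogRecord], List[LogRecord]]:
    """Return (keep, purge). Records on legal hold are kept regardless of age."""
    cutoff = today - timedelta(days=retention_days)
    keep: List[LogRecord] = []
    purge: List[LogRecord] = []
    for record in records:
        if record.legal_hold or record.created >= cutoff:
            keep.append(record)
        else:
            purge.append(record)
    return keep, purge


# Illustrative data only.
records = [
    LogRecord("a", date(2015, 1, 10)),
    LogRecord("b", date(2015, 1, 10), legal_hold=True),
    LogRecord("c", date(2015, 10, 1)),
]
keep, purge = split_for_purge(records, today=date(2015, 11, 17), retention_days=270)
print("keep:", [r.record_id for r in keep])
print("purge:", [r.record_id for r in purge])
```

Note what the sketch does not do: it only decides what to remove from the primary store. Copies sitting in automatic backups or with services in the middle are untouched, which is exactly how zombie data survive.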

Stephen E Arnold, November 17, 2015

More Bad News for Traditional TV

November 17, 2015

Traditional TV is in a slow decline toward obsolescence. With streaming services offering more enticing viewing options, lower out-of-pocket costs, and no contracts, why would a person sign on for cable or dish packages that have notoriously bad customer service, commercials, and insane prices? Digital Trends has the most recent information from Nielsen about TV viewing habits in “New Nielsen Study On Streaming Points To More Bad News For Traditional TV.”

Pay-for-TV services have been on the decline for years, but the numbers are huge for the latest Nielsen Total Audience report:

“According to the data, broadband-only homes are up by 52 percent to 3.3 million from 2.2 million year over year. Meanwhile, pay-TV subscriptions are down 1.2 percent to 100.4 million, from 101.6 million at this time last year. And while 1.2 percent may not seem like much, that million plus decline has caused all sorts of havoc on the stock market, with big media companies like Viacom, Nickelodeon, Disney, and many others seeing tumbling stock prices in recent weeks.”

While one might suggest that pay-for-TV services should start the bankruptcy paperwork, there has been a 45% rise in video-on-demand services. Nielsen does not tabulate streaming services or viewership on mobile devices, nor does it say whether people are watching more TV because of all the options.

While Nielsen is a trusted organization for TV data, information is still collected via paper submission forms. Like traditional TV, Nielsen needs to update its offerings to maintain relevancy.

Whitney Grace, November 17, 2015
