Paying for Online: How Would This Work?
August 17, 2014
I read “The Internet’s Original Sin.” Talk about an interesting idea. Quite an insight: Pay for online access. So original. The write up is certainly confident about this radical concept.
Here is a passage I noted. The author recounts his experience at Tripod.com. He recalls:
At the end of the day, the business model that got us funded was advertising. The model that got us acquired was analyzing users’ personal homepages so we could better target ads to them. Along the way, we ended up creating one of the most hated tools in the advertiser’s toolkit: the pop-up ad. It was a way to associate an ad with a user’s page without putting it directly on the page, which advertisers worried would imply an association between their brand and the page’s content. Specifically, we came up with it when a major car company freaked out that they’d bought a banner ad on a page that celebrated anal sex. I wrote the code to launch the window and run an ad in it. I’m sorry. Our intentions were good.
Intentions that were good. Hmmm. Flash forward a lifetime in the zippy world of the Internet. I learn:
I have come to believe that advertising is the original sin of the web. The fallen state of our Internet is a direct, if unintentional, consequence of choosing advertising as the default model to support online content and services. Through successive rounds of innovation and investor story time, we’ve trained Internet users to expect that everything they say and do online will be aggregated into profiles (which they cannot review, challenge, or change) that shape both what ads and what content they see.
So what’s the fix?
One simple way forward is to charge for services and protect users’ privacy…Users will pay for services that they love.
Okay.
I recall that the for-fee online services charged their users for information. This worked reasonably well, but the number of customers was modest. Dialog Information Services was the Big Dog. LexisNexis had the law firms whose employees would spend when clients paid the bill. SDC Orbit survived with some must-have specialty files. A few other commercial shops enjoyed similar success.
But these services reached only those who met certain criteria:
- Money to spend
- Interest/motivation to learn the ins and outs of the systems
- Expertise to figure out what the systems were outputting.
Consumer services did come along, but these did not capture the markets which the innovators sought. Remember CompuServe? The Source? Prodigy? Dialcom?
Charging for information, in my experience, trims the number of people using a service significantly. My rule of thumb is that only three to five percent of a free service’s users will pay for the service. Those who have to use the for-fee service look for ways of reducing the cost of online access.
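To see what that rule of thumb implies, here is a minimal back-of-the-envelope sketch. The user count and subscription price are hypothetical numbers chosen only for illustration, not figures from the Atlantic write up.

```python
# Back-of-the-envelope estimate: what a 3 to 5 percent conversion rule of thumb
# implies for a free service that switches to a paid model.
# All inputs are hypothetical, chosen only to illustrate the arithmetic.

free_users = 1_000_000          # hypothetical audience of the free service
monthly_price = 5.00            # hypothetical subscription price in dollars

for conversion_rate in (0.03, 0.05):
    paying_users = int(free_users * conversion_rate)
    monthly_revenue = paying_users * monthly_price
    print(f"{conversion_rate:.0%} conversion -> {paying_users:,} subscribers, "
          f"${monthly_revenue:,.0f} per month")
```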
I am confident that the whiz kids at the Atlantic have better data. Their approach might be able to show the old, panting dogs like Cambridge Scientific (Dialog), Reed Elsevier (LexisNexis), Dow Jones (Factiva), and Ebsco (bunches of confusingly named services) how to make online information generate substantial dough. Thomson Reuters and Bloomberg have a formula, but the general population is not too keen on these services.
“Good enough” is the cultural hook today. If one has to pay for “better,” I think there will be quite a few innovators who go back to business models that produce substantial revenue.
Like it or not, advertising is the go-to solution. Oh, don’t forget to subscribe to the Atlantic in hard copy. You don’t get the good stuff for free. What is ad supported are the analyses that call for Google to walk away from $60-$65 billion in revenue this year.
I bet that is an idea that Messrs Brin and Page will embrace.
Stephen E Arnold, August 17, 2014
Who Wrote What? Will an Algorithm Catch Name Surfers?
August 17, 2014
I read “New Algorithm Gives Credit Where Credit Is Due.” The write up sparked a number of thoughts. Let me highlight a couple of passages that made it into my research file.
The focus of the paper, in my opinion, is documents intended for peer-reviewed publications and conferences. The write up did not include a sample of the type of “authorship” labeling that takes place. I dug through my files and located a representative example:
This is a paper about stuffing electronics on a contact lens. Microsoft was in this game. Google hired Babak Parviz (aka Babak Amir Parviz, Babak Amirparviz, and Babak Parvis). The paper has four authors:
- H. Yao
- A. Afanasiev
- I. Lahdesmaki
- B. A. Parviz
The idea is that the numerical recipe devised at the Center for Complex Network Research will figure out who did most of the work. I think this is a good idea because my research suggests that the guys doing the heavy lifting in the lab, with Excel, and writing were Yao, Afanasiev, and Lahdesmaki. The guru for the work was Parviz. I could be wrong, so an algorithm to help me out is of interest.
One of the points I highlighted in the write up was:
Using the algorithm, which Shen [math whiz] developed, the team revealed a new credit allocation system based on how often the paper is co-cited with the other papers published by the paper’s co-authors, capturing the authors’ additional contributions to the field.
Okay, my take is that this is a variation of Eugene Garfield’s citation analysis work. That is useful, but it does not dig very deeply into the context for the paper, the patent applications afoot, or the controls placed on the writers by their employers or their conscience. In short, I need some concrete examples or, better yet, access to the software so I can run some tests. Yep, just the sort of tests that mid tier consulting firms (what I call azure chip consultants) do not run. (For reference, see the Netscout legal document or my saucisson write up.)
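To make the idea concrete, here is a toy sketch of the co-citation notion described in the quoted passage. It is not the algorithm developed at the Center for Complex Network Research, which I have not seen; it simply splits credit in proportion to how often a target paper is co-cited with each co-author’s other papers, using made-up paper identifiers.

```python
# Toy illustration of the co-citation idea: credit for a target paper is split
# among its co-authors in proportion to how often the target is co-cited with
# each author's other papers. This is a simplified sketch, not the published
# algorithm, and every paper identifier below is made up.

from collections import defaultdict

target = "P0"
coauthors = {
    "Yao":        ["P1", "P2"],   # hypothetical earlier papers by each author
    "Afanasiev":  ["P3"],
    "Lahdesmaki": ["P4"],
    "Parviz":     ["P5", "P6", "P7"],
}

# Reference lists of papers that cite the target (hypothetical).
citing_papers = [
    {"P0", "P1", "P5"},
    {"P0", "P2"},
    {"P0", "P5", "P6"},
    {"P0", "P4", "P7"},
]

cocitations = defaultdict(int)
for refs in citing_papers:
    if target not in refs:
        continue
    for author, papers in coauthors.items():
        # count the author's other papers cited alongside the target paper
        cocitations[author] += len(refs.intersection(papers))

total = sum(cocitations.values()) or 1
for author in coauthors:
    print(f"{author}: {cocitations[author] / total:.0%} of the credit")
```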
The second point is that the sample strikes me as small. I know that one well regarded researcher used a rule of thumb of 50 items per sample, but there are hundreds of thousands of technical papers. Many are available as open access from services like PLOS One. Here’s the point I noted:
the team looked at 63 prize-winning papers using the algorithm. In another finding, the algorithm showed physicist Tom Kibble, who in 1964 wrote a research paper on the Higgs boson theory, should receive the same amount of credit as Nobel prize winners Peter Higgs and François Englert.
I think the work is interesting, but it is in my opinion not ready for prime time.
I know that one content processing firm almost totally dependent on the US Army for funding has been working to identify misinformation, disinformation, and reformation. So far, the effort has yielded no commercial product. Other companies purport to have the ability to “understand” content. Presumably this includes the entities identified in the content object. Progress has stalled. Smart software is easier to write about in a marketing slide deck or a proposal than to deliver.
That’s why authorship remains something a human has to chase down. Let me give you an example. In 2012 I provided research to IDC, a mid tier consulting firm. From August 2012 to July 17, 2014, IDC marketed reports that carried my name, two of my research assistants’ names, and an IDC “expert’s” name. Dave Schubmehl, the IDC “expert” in search, is listed as the “author.”
Now is he?
I am confident that in his mind and in IDC’s corporate wisdom he is the man. The person who justifies surfing on another’s name illustrates a core problem in authorship. You can see examples of Dave Schubmehl’s name surfing at this link. The sale of one of these documents on Amazon was an interesting attempt to gain traction for Dave Schubmehl in the high traffic eBook store. See “Amazon May Be Disintermediating Publishers: Maybe Good News for Authors.” I include a screen shot of the Amazon “hit.” My legal eagle successfully got the document removed from Amazon. I am not an Amazon author and don’t want to be.
Hopefully the algorithm to identify the “real” author of a series of $3,500 reports will become a commercial reality. I am interested to learn if there are any other mid tier consulting firms that have used others’ content without getting appropriate permissions. How many “experts” follow the IDC path of expediency?
For now, name surfers have to be tracked one by one. Schubmehl and Arnold are now linked. Arnold is the surfboard; Schubmehl is the surfer. “Catch a wave” is the motto of many surfers.
Stephen E Arnold, August 17, 2014
The Guardian Explores HP Autonomy
August 16, 2014
I read “Hewlett-Packard Allegations: Autonomy Founder Mike Lynch Tries to Clear Name.” The British “real” newspaper focuses on Mike Lynch, the founder of Autonomy. I am convinced that Autonomy pitched the value of its company to a number of firms. I know that Hewlett Packard bought Autonomy. I assume that spending $11 billion was not a K Mart blue light special impulse purchase. I know that HP has had what the MBAs call “governance challenges.” These range from allegations of getting frisky with folks to management churn. I know that for me, the HP of electronic devices yielded to the HP of the ink cartridges.
Here’s a point I highlighted in the Guardian’s write up:
Meanwhile, lawyers on all sides are using legal privilege to sling mud. Lynch says it is not only his name that has been stained, but that of the British technology industry. Autonomy’s accounting and marketing methods had attracted criticism before the HP acquisition, but Lynch was also a poster child for the achievements of Cambridge’s Silicon Fen. The Autonomy affair casts a shadow, and a conclusion from the SFO is overdue.
I have a slightly different view of the dust up. Folks want to believe that information retrieval will generate another Google. Because of those expectations, executives whose expertise in search extends to running a Google search on a mobile device assume they know about content processing.
When buyers get excited about a purchase, some people buy Bugatti Veyrons and spring for gold iPhones. Others snap up search companies and expect the money to roll in like the oohs and aahs at the golf club when the Veyron rolls up.
Wrong. The dust up between HP and Autonomy is an illustration of what happens when folks without too much understanding of content processing’s complexities covet a home run. The impact does affect Mike Lynch, a Cambridge PhD and real live inventor.
The collateral damage is on the buyers of search companies who toss millions at a sector without understanding how difficult it is to create a search company that is not selling ads or living exclusively on Department of Defense largesse.
HP bought a company with a strong brand, customers, and technology that, when properly resourced, works. HP did not buy a Google-scale money stream, a Palantir clinging to the US government, or a break-even metasearch system.
The impact on the reputation of Autonomy professionals is significant. What does this dispute do to other search and content processing companies? Search is tough enough without having a megaton dispute played out in the datasphere.
HP did not have to buy Autonomy. Microsoft passed. Oracle passed. HP bought. HP had time and resources to dig through Autonomy. If it did not, then HP created its own problem. If it did, HP created its own problem. Autonomy, with 15 years of history, was looking for a buyer. My hunch is that HP was looking for a Google and bought a different business because HP convinced itself it could generate more money than Autonomy could. HP found out that it could not match Autonomy’s revenues. Whom does any self respecting MBA or lawyer blame? The other guy.
This hassle says much about HP. Sadly it affects other search and content processing companies as well.
Stephen E Arnold, August 16, 2014
Google Glass Creator Shifts to Amazon
August 15, 2014
As he heads out the Google X door, Google Glass developer Babak Parviz notes that pervasive use of that device (or ones like it) is far from inevitable. In CNet’s “Google Glass Creator: Glass Not Only Answer to Life After Smartphones,” writer Richard Nieva reports on Parviz’s wider viewpoint:
“People are increasingly moving away from desktop computers and latching on to smartphones and tablets, and Glass was born from trying to figure out where the next great platform shift would take us. ‘Google Glass is one answer to that question,’ said Babak Parviz, a director at Google X, at the Wearable Technologies Conference here. ‘It’s not necessarily the definitive answer.’”
Alas the article does not share examples of alternatives Parviz has in mind, leaving the curious hanging. It does, though, illustrate that Parviz understands why some criticize the face-mounted computer. The article continues:
“But for all the possible benefits, Parviz is still aware of the danger of making these next-gen devices alienating experiences. ‘As these technologies set in, some of the humanity comes out,’ he said. ‘There’s a balance between what technology allows and what technology takes away.’”
Now we know at least one person behind Google Glass seems to understand the challenge of placing such technology in the larger culture. Interestingly, Parviz is now heading to Amazon, a company that has become nearly as keen on product diversity as Google is. Can Parviz help Amazon stay (get) on the right side of culture commentary as he helps it pursue innovative tech? After all, Amazon can’t expect to stay “bulletproof” forever.
Cynthia Murrell, August 15, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Read Non-fiction Books In A Blink
August 15, 2014
While it is easier than ever to access and read books, the act of reading still takes time. An even bigger issue is selecting what to read in the onslaught of digital media. User reviews help, especially for fiction books. What do you do, however, if you want unbiased summaries of non-fiction books? Those are still in short supply: the first half of a typical summary is taken up by accolades, and the remaining meat is too thin to gauge the book.
What if you could get useful summaries in the blink of an eye? Blinkist is the solution, then! Blinkist is a service that helps people digest non-fiction books without relying on academic databases or subjective reviews:
“Blinks are powerful bites of insight from outstanding nonfiction. You can read a blink in less than two minutes, so with each book made up of about eight blinks, you can cover the work’s key insights in 15 minutes.”
Blinkist’s book blinks are 100 percent original, the company claims, and are not meant to replace books. The service is supposed to help people determine which books they want to read in full and to save them time. It encourages learning and satisfies a person’s curiosity about a book’s content. Fulfilling the requirements of most book applications these days, Blinkist is available on all mobile devices.
Blinkist makes the academic world more consumer friendly. Academic databases, such as Ebsco and Elsevier, will not like the new competition. Those databases are already expensive, so a cheaper alternative takes a stab at their profits. Blinkist needs to venture into academic journals next.
Whitney Grace, August 15, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Netscout Gartner Document
August 14, 2014
I learned that some folks were not able to locate the Netscout Gartner document referenced in this Diginomica article. You may want to try and get the 27 megabyte court filing at http://slidesha.re/1pPsY21.
This is definitely worth some face time. Parts evoked in me a “stop repeating yourself” reaction, but other bits were juicy indeed, if true. Plus, there are some allegedly accurate factoids in the document and an illustration purporting to show the Gartner products and services. Keep in mind that this document presents only Netscout’s point of view. I find the information in the document compelling and thought provoking. For me, Netscout’s array of data seems close to reality.
If I come across the Gartner response, I will try to remember to post an item in Beyond Search. But as a former nuclear consultant who was lured into a top tier consulting company, who knows? I have my attention riveted by an IDC swizzle which allowed my content to be sold on Amazon without my permission and with another person’s name on it. Clever stuff these “experts” find to do.
I highly recommend the slide on page 27 of the Netscout legal document. I would like to include it in this short write up, but I don’t have a dog in this Netscout Gartner squabble.
Stephen E Arnold, August 14, 2014
Short Honk: Buzzword Mania and the Internet of Things
August 14, 2014
Short honk: I don’t have too much to say about “Gartner: Internet of Things Has Reached Hype Peak.” Wow will have to suffice. The diagram in the article is amazing as well. A listicle is pretty darned limited when compared to a plotting of buzzwords from a consulting firm that vies with McKinsey, Bain, Boston Consulting, and Booz for respect. Another angle on this article is that it is published by a company that has taken a frisky approach to other folks’ information. For some background, check out “Are HP, Google, and IDC Out of Square.” I wanted to assemble a list of the buzzwords in the Network World article, but even for my tireless goslings, the task was too much. I could not figure out what the legends on the x and y axes meant. Do you know what a “plateau of productivity” is? I am not sure what “productivity” means unless I understand the definition in use by the writer.
One fact jumps out for me:
“As enterprises embark on the journey to becoming digital businesses, they will leverage technologies that today are considered to be ‘emerging’,” said Hung LeHong, vice president and Gartner fellow. “Understanding where your enterprise is on this journey and where you need to go will not only determine the amount of change expected for your enterprise, but also map out which combination of technologies support your progression.”
The person making this statement probably has a good handle on the unpleasantness of a legal dispute. For some color, please see “Gartner Magic Quadrant in the News: Netscout Matter.”
Stephen E Arnold, August 14, 2014
Venture Outcome: The Search and Content Processing Angle
August 14, 2014
I suggest you read “Venture Outcomes Are Even More Skewed Than You Think.” The write up contains several factoids. I highlighted one and added a couple of exclamation points. I suggest you print out the article, grab a writing instrument, and do your own filtering.
The main point of the write up is buried in the paragraph that begins “This really underscores the challenge of creating a venture portfolio that produces reasonable returns.” The factoid I honored with exclamation points is:
In my hypothetical $100M fund with 20 investments, the total number of financings producing a return above 5x was 0.8 – producing almost $100M of proceeds. My theoretical fund actually didn’t find their purple unicorn, they found 4/5ths of that company. If they had missed it, they would have failed to return capital after fees. Even if we doubled the number of portfolio companies in the hypothetical portfolio, a full quarter of the fund’s return comes from the roughly ½ of a company they invested in that generated 10x or above. Had they missed it, they would have produced a return that roughly approximated investing in bonds – not the kind of risk adjusted return they or their investors were looking for.
I know this is a hypothetical. Assume that the analysis is off by plus or minus 10 percent. What do we get? Lousy returns; that is, returns comparable to dumping cash into bonds. I think about the banking and venture firm meetings in which I have participated. I cannot recall any of the smiling MBAs considering that their best ideas could perform on a par with bonds. My hunch is that the people who pushed money into venture funds and bank VP-inspired investments are not thinking about bond-type yields.
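Here is a rough sketch of that fund arithmetic. The outcome multiples are hypothetical, loosely patterned on the quoted example of a $100 million fund with 20 investments, and are meant only to show how much the result hinges on the one outsized winner.

```python
# Rough sanity check of the quoted fund arithmetic: a hypothetical $100M fund
# with 20 equal investments, where almost all of the return comes from a single
# big winner. All multiples below are made up for illustration.

fund_size = 100.0                      # $M committed
check_size = fund_size / 20            # 20 equal investments of $5M

# Hypothetical outcome multiples: mostly losses and small exits, one 20x winner.
multiples = [0.0] * 10 + [0.5] * 5 + [1.0] * 3 + [3.0] + [20.0]

proceeds = sum(m * check_size for m in multiples)
proceeds_without_winner = sum(m * check_size for m in multiples if m < 20.0)

print(f"Gross proceeds with the winner:    ${proceeds:.0f}M "
      f"({proceeds / fund_size:.1f}x the fund)")
print(f"Gross proceeds without the winner: ${proceeds_without_winner:.0f}M "
      f"({proceeds_without_winner / fund_size:.1f}x the fund)")
```

With these made-up numbers the fund returns about 1.4x its capital with the winner and roughly 0.4x without it, which is the “miss the unicorn and you fail to return capital” point the quoted passage makes.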
If the number is accurate, I wonder if those folks who have pumped tens of millions of dollars into outfits promising a money ball from search and content processing will get their money back. Forget an upside. Break even may be tough. Search and content processing make unflattering headlines every day.
To see examples, navigate to Google News and enter the query Autonomy HP or Autonomy CFO.
The second item I circled with my pink marker was a diagram.
The important part is the small number of “winners” graphically embodied in the minuscule 0.4% column. This is across a broad swath of investments. For search and content processing, the payoffs have to be measured by the money that flows from revenues or from a sell off like Fast Search to Microsoft, Exalead to Dassault, or Autonomy to HP. The number of folks who made big bucks and are really happy may be modest. In fact, judging from the legal hassles with regard to Fast Search and the recent HP Autonomy headlines, even the MBA winners may have headaches. Information retrieval seems to deliver a number of them for stakeholders.
The third item is the factoid that makes clear the failure rate of start ups. Search and content processing poses similar challenges. There is a twist. Once a search and content processing company sells to a larger firm, how many become major money pumps for the acquiring companies? The question is very difficult to answer. The absence of information tells me that there are not too many feel good stories to tell. The pleas on LinkedIn enterprise search discussion threads for positive case studies about search are easy to ignore. Good news with regard to search and content processing is not sloshing around the Big Data bucket in which we exist.
How long will companies that have been in business for many years promising a money ball from search be able to survive? How long will the old soft shoe about search and content processing open checkbooks? How many years will it take some information retrieval companies to replace red ink with the black ink of hefty after tax profits? How long will it take those seeking answers to information retrieval problems to wake up to the fact that consultant saucisson, Star Trek fantasies, and marketing hyperbole are unlikely to deliver a Disneyland-like “win”?
The data set for the Seth Levine write up is large enough to warrant a tentative answer, “Probably never.” Search and content processing are different. The algorithms and methods are decades old. Talk does not change what can be accomplished with affordable computational resources. Pumping money into search, therefore, may be painful when the actual financial data are reviewed by investors and stakeholders.
Why aren’t there abundant “good news” cases for search and content processing? There just aren’t that many. Think of a power curve of implementation successes. There are more examples of search going off the rails than of home runs. This is surprising when so many profess to be experts in search and so much money has been injected into information retrieval start ups. The business strategy of search and content processing companies may be raising money. Any other work may be of little interest.
Stephen E Arnold, August 14, 2014
Watson Is In the Army Now
August 14, 2014
First he trained to become a world-class chef, then he went to medical school, and now Watson is joining the army. Gigaom reports that “IBM And USAA Put Watson To Work In The Military.” While Watson will not go through boot camp or face deployment, the supercomputer will be used to help military personnel transition to civilian life. The IBM Engagement Advisor is a tool that service members will be able to query with questions related to the transition experience, including health benefits and finances. The Engagement Advisor scans more than 3,000 military documents and can even answer questions about their content.
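IBM has not explained how the Engagement Advisor answers questions, so the sketch below is only a toy keyword-overlap ranker over a few made-up snippets. It illustrates the general shape of querying a document collection, nothing more.

```python
import re

# Toy illustration of querying a document collection, loosely analogous to the
# "ask a question, get an answer from the documents" capability described
# above. This is NOT IBM's technology: it is a simple keyword-overlap ranker
# over made-up snippets, included only to show the general shape of the task.

def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rank_documents(question, documents):
    """Rank documents by how many of the question's words each one contains."""
    terms = tokenize(question)
    scores = {name: len(terms & tokenize(text)) for name, text in documents.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical transition-related snippets; a real corpus would be far larger.
docs = {
    "benefits.txt": "Veterans may enroll in health benefits within 90 days of separation.",
    "finance.txt": "Transition assistance covers budgeting, savings, and finances after service.",
    "relocation.txt": "Relocation support is available for members changing duty stations.",
}

for name, score in rank_documents("How do I enroll in health benefits?", docs):
    print(f"{name}: {score} matching terms")
```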
This is another push by IBM to make Watson a feasible product.
“In a statement announcing the USAA application, which is in pilot, IBM SVP Mike Rhodin said: ‘Putting Watson into the hands of consumers is a critical milestone toward improving how we work and live.’ And make no mistake; IBM needs to get Watson out there and in use or risk squandering this lead.”
IBM’s product revenue dropped in the past two years. The company invested a huge amount of funds and man-hours developing Watson. Now IBM is focused on seeing a return on the investment. Watson is going beyond winning game shows to more practical applications.
Whitney Grace, August 14, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Data Burping
August 14, 2014
Data integration from an old system to a new system is coded for trouble. The system swallowing the data is guaranteed to have indigestion, and the only way to relieve the problem is by burping. Chez Brochez has dealt with his own share of data integration issues, and in his article “Tips and Tricks For Optimizing Oracle Endeca Data Ingestion” he details some of the best ways to burp.
Here he explains why he wrote the blog post:
“More than once I’ve been on a client site to try to deal with a data build that was either taking too long, or was no longer completing successfully. The handraulic analysis to figure out what was causing the issues can take a long time. The rewards however are tremendous. Not simply fixing a build that was failing, but in some cases cutting the time demand in half meant a job could be run overnight rather than scheduled for weekends. In some cases verifying with the business users what attributes are loaded and how they are interacted with can make their lives easier.”
While the post focuses on Oracle Endeca, skimming through the tips will benefit anyone working with data. Many of them are common sense, such as having data integrations do the heavy lifting in off-hours and shutting down competing programs. Others require more in-depth knowledge. It boils down to this: getting content into an old school system requires a couple of simple steps and lots of quite complex ones.
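One of the common sense tips, pushing the heavy lifting into off-hours, can be expressed as a small scheduling guard. The sketch below is generic Python, not Oracle Endeca tooling, and the quiet-window boundaries are arbitrary examples.

```python
# Minimal sketch of the "do the heavy lifting in off-hours" tip: only kick off
# a large data load when the clock falls inside a quiet window. This is generic
# illustration code, not Endeca-specific, and the window times are arbitrary.

from datetime import datetime, time

QUIET_START = time(22, 0)   # 10 PM, hypothetical start of the off-hours window
QUIET_END = time(5, 0)      # 5 AM, hypothetical end of the off-hours window

def in_quiet_window(now):
    """True if the current time falls in the overnight off-hours window."""
    t = now.time()
    return t >= QUIET_START or t <= QUIET_END

def maybe_run_ingestion(load_job):
    if in_quiet_window(datetime.now()):
        load_job()                       # heavy lifting happens off-hours
    else:
        print("Outside the quiet window; deferring the data load.")

# Example: a stand-in for the real ingestion job.
maybe_run_ingestion(lambda: print("Running the full data build..."))
```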
Whitney Grace, August 14, 2014
Sponsored by ArnoldIT.com, developer of Augmentext