Predictive APIs: Will Search Vendors Play in This Sandbox?

October 31, 2014

I received a notice about a new conference called “The First International Conference on Predictive APIs and Apps.” According to the write up I saw:

Several companies who are building predictive APIs and tools to make predictive app development easier will be at PAPIs (BigML, Datagami, Dataiku, Indico, Intuitics, GraphLab, Openscoring, PredictionIO, RapidMiner, Yhat). We’re expecting to see both actual and potential users who will share and learn how to use these products. Newcomers will learn and get inspiration from the keynotes, showcases and practical “predictive for all” user stories. Experts will also be interested in the sessions on technical challenges and in the panel discussion on the future of predictive APIs.

A number of search and content processing vendors suggest they deliver advanced analytics. Text analytics vendors are either feeding data into predictive engines or delivering outputs that are predictive.

Are predictive analytics one of the next big things? If so, traditional information retrieval and content processing companies are likely to be attending this conference on November 17 and 18, 2014.

At this time, IBM and Microsoft are on the program.

IBM will be addressing “intelligent APIs.” In the abstract for the talk, I did not see a reference to Watson. Microsoft’s talk abstract is not on the program page as of October 30, 2014.

Worth attending if you are in the Barcelona area.

Stephen E Arnold, October 31, 2014

FirstRain Escapes “Death Spiral” Through Work of Penny Herscher

October 31, 2014

The article on Fortune titled “The Company Was In a Death Spiral. She Brought It Back From the Brink” lauds the work of Penny Herscher at data analytics firm FirstRain. Herscher took over the company in 2004 after successful work at Cadence Design Systems, Simplex and Texas Instruments. FirstRain was a bankrupt company with a great prototype but no product. Herscher embraced the challenges posed by FirstRain and began her overhaul with a move from New York to California. The article goes on,

“She raised $20 million from new investors and hired a trusted team, including chief operating officer Y.Y. Lee, a mathematician and software engineer… Today, more than 50% of FirstRain’s senior leadership is women. The fledgling company had barely started developing a product when storms began brewing on the horizon. It was 2008. The global economy was beginning to collapse. “The wheels came off the bus,” Herscher says with lament. To survive, the company had to completely change course again…It pulled through.”

But only after major lay-offs and changes in the structure. Today FirstRain customers include IBM and Cisco, and it is only continuing to grow, with new offices in San Mateo. Herscher’s story of success is one of commitment and creative problem-solving.

Chelsea Kerwin, October 31, 2014

Sponsored by ArnoldIT.com, developer of Augmentext

Visualizing Data: Often Baffling

October 29, 2014

I read “72 Hours of #Gamergate.” I don’t follow the high buck world of video games. The write up contains oodles of data. Some of the information is in the form of bar charts. Other information is presented in words, spreadsheets, and graphics. I am okay with the bar charts. These have labels and numbers on the x and y axes. The visualization shown below baffles me:

image

The image adds graphic impact. I have been in briefings in which senior executives and military brass have presented similar visualizations.

I suppose clarity is less important than sizzle. Analytics vendors, are you listening? I think not if this graphic is any indication of the way data are presented.

Stephen E Arnold, October 29, 2014

Attensity Ups Its Presence in Hackathons

October 28, 2014

I found the Attensity blog post “Attensity Takes Utah Tech Week” quite interesting. I cannot recall when mainstream content processing companies embraced hackathons so fiercely.

The blog post explains:

A hackathon, for the uninitiated, is exactly what it sounds like: a hybrid of computer hacking and a marathon in a grueling, caffeine-fueled, 12-hour time period. Groups comprised of mostly engineers and IT whizzes compete against the clock and other teams to create a project to present at the end of the day to a panel of judges.

What did Attensity’s engineers build to showcase the company’s sentiment analysis and analytics technologies? Here’s the Attensity description:

With the Twitter API up and running, Team Attensity used Raspberry Pi to process tweets using #obama and #utahtechweek. Simultaneously, the team used Arduino to code sentiments from the tweets using a red light for negative sentiments, blue for positive sentiments, and yellow for neutral sentiments.
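The color coding in the Attensity description boils down to a three-way mapping. Here is a minimal sketch of that idea, not Attensity's code. The score range of -1.0 to 1.0, the 0.2 neutral band, and the function names `led_color` and `tally_colors` are my assumptions for illustration; the blog post gives no such details, and no Twitter or Arduino calls are shown.

```python
# Hypothetical sketch: map a tweet's sentiment score to an LED color.
# Assumptions (not from the Attensity post): scores fall in [-1.0, 1.0]
# and anything within +/- 0.2 of zero counts as neutral.

def led_color(score: float, neutral_band: float = 0.2) -> str:
    """Return the LED color for one sentiment score."""
    if score < -neutral_band:
        return "red"      # negative sentiment
    if score > neutral_band:
        return "blue"     # positive sentiment
    return "yellow"       # neutral sentiment


def tally_colors(scores):
    """Count how many tweets would light each LED."""
    counts = {"red": 0, "blue": 0, "yellow": 0}
    for score in scores:
        counts[led_color(score)] += 1
    return counts


if __name__ == "__main__":
    # Made-up scores standing in for a sentiment engine's output.
    sample_scores = [-0.8, 0.05, 0.6, -0.1, 0.9]
    print(tally_colors(sample_scores))
```

The hard part, of course, is producing a trustworthy score in the first place. The lights are the easy bit.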

Attensity was pleased with the outcome in Utah. More hackathons are in the firm’s future. I wonder if one can deploy IBM Watson using a Raspberry Pi or showcase HP Autonomy with an Arduino.

How will hackathons generate revenue? I am not sure. The effort seems like a cost hole to me.

Stephen E Arnold, October 28, 2014

Predictive Analytics: Trouble Ahead?

October 28, 2014

I learned about a new book that will be available in early 2015. Its title is The Black Box Society: The Secret Algorithms That Control Money and Information. The author is Frank Pasquale, a professor of law at the University of Maryland.

The Harvard promotional Web site for the book asserts:

Hidden algorithms can make (or ruin) reputations, decide the destiny of entrepreneurs, or even devastate an entire economy. Shrouded in secrecy and complexity, decisions at major Silicon Valley and Wall Street firms were long assumed to be neutral and technical. But leaks, whistleblowers, and legal disputes have shed new light on automated judgment. Self-serving and reckless behavior is surprisingly common, and easy to hide in code protected by legal and real secrecy. Even after billions of dollars of fines have been levied, underfunded regulators may have only scratched the surface of this troubling behavior.

The Institute for Ethics and Emerging Technologies mentioned the forthcoming book here. One of the comments about that post was interesting to me. TooManyJoes wrote:

The control of the results by the decision makers is what makes this future menacing. Right now, Google is under attack being too good at search prediction and making money on targeted advertisements whose brilliantly written algorithms allow such a sophisticated variety of information to be indexed. As a result search bubbles have formed, and a lack of statistics comprehension prevents the awareness of control over this new medium. Snake oil salesmen turned into Mad Men and psychiatrists, it’s the medium of internet based controlled by one snake oil salesman that frightens us all. I believe it’s not possible without a formal computational human algorithm to have enough of an impact to have widespread influence. I bring up these mediums because to engage in them is to participate, participation can be tracked, then imagine the expense of the things we have access to because free participation drives those products and services by up selling those products. Without education, which most people won’t be open to, and time for the common man to analyze the data…those in control of the data will be people delegated by others. Welcome to the age of transparency.

The Google reference may presage some discussion of the company’s predictive wizardry.

Stephen E Arnold, October 28, 2014

Looking for an AI Silver Bullet to Make Software Smart? Keep Looking

October 24, 2014

Here in Harrod’s Creek, Kentucky there is not too much chatter about machine learning. It is hunting season. Time to get out the Barrett Automatic Rifle and go hunting for varmints.

At sundown yesterday, when calm returned to the hollow, I read “Machine-Learning Maestro Michael Jordan on the Delusions of Big Data and Other Huge Engineering Efforts.”

My thought after reading the IEEE article was that I was really tired of the artificial intelligence yap yap. Now a whiz at UCal Berkeley is pointing out that some of the methods are a “cartoon.”

The Dr. Michael Jordan says:

I think data analysis can deliver inferences at certain levels of quality. But we have to be clear about what levels of quality. We have to have error bars around all our predictions. That is something that’s missing in much of the current machine learning literature. … if people use data and inferences they can make with the data without any concern about error bars, about heterogeneity, about noisy data, about the sampling pattern, about all the kinds of things that you have to be serious about if you’re an engineer and a statistician—then you will make lots of predictions, and there’s a good chance that you will occasionally solve some real interesting problems. But you will occasionally have some disastrously bad decisions. And you won’t know the difference a priori. You will just produce these outputs and hope for the best. And so that’s where we are currently.

In short, marketing hyperbole takes precedence over the plodding realities of the steps a person aspiring to a PhD in statistics is supposed to follow.

With regard to the applications that deliver predictive outputs, Dr. Jordan says:

But unless you’re actually doing the full-scale engineering statistical analysis to provide some error bars and quantify the errors, it’s gambling. It’s better than just gambling without data. That’s pure roulette. This is kind of partial roulette.
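For readers who wonder what “error bars” look like in practice, here is a minimal sketch using a percentile bootstrap around a sample mean. This is one common textbook technique, not a method Dr. Jordan prescribes in the interview. The function name `bootstrap_interval`, the 2,000 resamples, the 95 percent level, and the sample numbers are all my assumptions.

```python
import random
import statistics

# Hypothetical sketch: percentile-bootstrap error bars around a mean.
# The resample count, confidence level, and data are illustrative only.

def bootstrap_interval(values, n_resamples=2000, confidence=0.95, seed=0):
    """Return (estimate, lower, upper) for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n))
        for _ in range(n_resamples)
    )
    tail = (1.0 - confidence) / 2.0
    lower = means[int(tail * n_resamples)]
    upper = means[min(n_resamples - 1, int((1.0 - tail) * n_resamples))]
    return statistics.fmean(values), lower, upper


if __name__ == "__main__":
    # Made-up observations standing in for a model's past errors.
    observed = [2.1, 2.9, 3.4, 1.8, 2.6, 3.9, 2.2, 3.1, 2.7, 3.3]
    estimate, low, high = bootstrap_interval(observed)
    print(f"estimate {estimate:.2f}, interval [{low:.2f}, {high:.2f}]")
```

A single number with no interval is the “partial roulette” case. The interval at least tells the reader how far off the number might be, assuming the sample is representative, which is its own can of worms.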

I strongly recommend you read the interview. I would not involve a search or content processing marketer in the exercise, however.

Stephen E Arnold, October 24, 2014

Watson Analytic Example

October 21, 2014

Navigate to Thinglink. At this location is an example of the type of graphic that can be generated with output from Watson, IBM’s next big thing. A graphic artist has taken the data and created an eye snapping infographic. How many other systems can generate this type of output? Quite a few if the information in my analytics files is representative. Is it necessary to use IBM Watson when Microsoft Excel and a visualization tool like Tableau are available? IBM Watson analyzed 135 million tweets from 10 countries in Central and South America. Brazil was excluded.

Twitter said in 2013:

Brazil is one of our largest markets with a strong user base. Twitter has already become an important part of our lives in Brazil and, by strengthening our local presence, we plan to continue delighting our users as well as creating new opportunities for marketers who want to connect with them.

Perhaps I overlooked Brazil. No big deal.

Stephen E Arnold, October 22, 2014

Autonomy: 33 APIs

October 21, 2014

Curious about Hewlett Packard’s Autonomy APIs? You can see the list of 33 at IdolOnDemand.com. If you are curious about Autonomy’s Big Data capabilities, you may be puzzled about the lack of explicit analytics application programming interfaces. Don’t be. The savvy developer selects operations, takes outputs, and pumps the data into a search based application, third party number crunching system, a data management system, or plain old Excel. What’s interesting is that the naming of the APIs makes clear the search-centric nature of Autonomy. The marketing of IDOL as a service or a cloud solution shifts attention away from search in my view.
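The “take outputs and pump them into Excel” step can be sketched in a few lines. This is an illustration under assumptions, not the IDOL OnDemand API: the JSON shape, the field names `title` and `weight`, and the function `response_to_csv` are hypothetical, and no real Autonomy call is shown.

```python
import csv
import io
import json

# Hypothetical response payload. The structure and field names are
# invented for illustration and do not come from HP's documentation.
SAMPLE_RESPONSE = """
{"documents": [
  {"title": "Quarterly report", "weight": 87.5},
  {"title": "Press release", "weight": 64.0}
]}
"""


def response_to_csv(raw_json: str) -> str:
    """Flatten a JSON result list into CSV text a spreadsheet can open."""
    records = json.loads(raw_json)["documents"]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["title", "weight"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(response_to_csv(SAMPLE_RESPONSE))
```

The analytics, in other words, happen downstream in whatever tool receives the rows.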

Stephen E Arnold, October 21, 2014

Big Data Failure: Teens and Music

October 13, 2014

How much data are available for teen demographics, popular music sales by genre and medium, downloads from iTunes and Amazon, the music trade associations, and myriad other sources? If there is one industry with data, lots of data, isn’t it the music business?

I read “No One Knows How Teens Listen to Music.” The information is surprising. I thought we lived in the world of Big Data. With flashy algorithms and lots of zeros and ones, the secrets of the universe are exposed. Business strategists and entrepreneurs would flourish. The world would be a better place. Isn’t that what Big Data marketers suggest?

Here’s a passage I noted:

Fast forward to 2014. Nielsen’s recent analysis of the music industry at large showed a six-percent decrease in digital music sales and a 32-percent increase in overall streaming. According to the company, these changes were largely… because of teens. As Martin Pyykkonnen, an analyst at Wedge Partners, told Yahoo last year, “Young people today don’t buy music anymore.” Except maybe they do, according to the Piper Jaffray report. Or maybe they don’t buy MP3s but do download them. Or maybe they don’t download them but do listen to them.

So lots of data about music and teens. We learn, “All the major surveys disagree. Maybe it’s a secret.”

Yep, Big Data delivers. Oh, how about those Ebola predictions?

Stephen E Arnold, October 14, 2014

Chiliad Offline: A Precursor for Other BI Outfits

October 13, 2014

According to PacerMonitor, Chiliad, Inc. filed for bankruptcy on August 6, 2014. As you may recall, the company was a Washington, DC area analytics firm founded by Christine Maxwell of McKinley Group and Magellan fame. (Magellan became part of Excite, which also faded away.)

About two years ago, Beyond Search wrote about Chiliad and its big rocks. Also, in 2012, the company named Craig Norris as chief executive officer. Mr. Norris (an industry leader according to Reuters) had been the CEO of Attensity, a sentiment analysis outfit, which has experienced its share of strong headwinds. In the news release about his appointment, he said:

“I am excited to be joining Chiliad at an important stage in its growth. What makes or breaks an analytics company is the quality and usability of its core technology. Chiliad’s offering has proven its ability to extract critical findings from data at massive scale for both Government and Commercial customers. I am eager to see us gain recognition for our technology leadership.”

The news release included assertions by Patrick Gross (Chairman of the Chiliad board of directors) that I have encountered many times in the last five years; to wit:

“Chiliad has already solved two very challenging problems. The first is the ability to rapidly search data collections at greater scale than any other offering in the market. The second is to allow search formulation and analysis in natural language. This means that no longer is an elite class of analysts required in order to generate meaningful results, thus reducing the personnel training and skills shortages that plague alternative solutions and put timely discovery at risk. The explosion of ‘Big Data’ is real and valuable findings are buried in vast collections for both enterprises and governments. Chiliad has the opportunity to integrate its innovative, massively scalable solutions with emerging open source software to build customized solutions for the largest-scale clients.”

Businessweek described the company in this way:

Chiliad, Inc. provides data analysis solutions for various clouds, agencies, departments, and other stovepipes. The company offers Discovery/Alert, a platform that enables investigators, business analysts, and knowledge workers to securely reach, find, analyze, and continuously stay on top of big data—whether structured or unstructured, and classified or unclassified. Its software solutions include Iterative Discovery cycle that allows analysts and researchers to reach various content silos, find what matters, analyze it to find meaning from the information relationships presented and continuously monitor changes; and Architecture, a virtual consolidated data center that enables multidimensional analysis and ranking. It serves government/intelligence, law enforcement, healthcare, pharmaceutical, insurance, and other markets. Chiliad, Inc. was founded in 1998 and is headquartered in Herndon, Virginia.

I have highlighted the buzzwords that were designed to generate sales leads and revenue. I can only assume that the verbiage and the Attensity management touch fell short of the mark. How many of the “analytics” and “business intelligence” companies will follow Chiliad’s path? Good question but I keep asking it.

Stephen E Arnold, October 12, 2014
