Oracle Endeca Business Intelligence Rules

August 21, 2014

Rules are good. The problem is getting people to do what the rule maker wants. Oracle wants Endeca to be a business intelligence system at the same time Oracle wants Endeca to be an ecommerce system. You can find the five rules in the white paper “The Five Rules of the Road for Enterprise Data Discovery.”

What are these rules?

I don’t want to spoil your fun. I want to encourage you to dig into Endeca’s rules and to work through the white paper to see if you are doing enterprise data discovery the Oracle way. What is “enterprise data discovery”? Beats me. I think it is 1998 style search based on Endeca’s 1998 technology disclosed in those early Endeca patents.

First, you want to get results without risk. That sounds great. How does one discover information when one does not know exactly what information will be presented? If that information is out of date or statistically flawed, how does Endeca ameliorate risk? Big job.

Second, Endeca wants you to blend data so you get deeper insights. What if the data are not normalized, consistent, or accurate? Those insights may not be deeper; they may be misleading.

Third, Endeca wants everything integrated. How does one figure out what is important in a system that gives the user a search box, links to follow, and analytics? Is this discovery or just plain old 1998 style Endeca search? Where’s the discovery thing? Blind clicking?

Fourth, Endeca wants you to “have a dialog with your data”. I find this interesting but fuzzy. Does Endeca now support voice input to its ageing technology?

Finally, Endeca wants those data indexed and updated. The goal is “keep on discovering.” I wonder what the latency in Endeca’s system is for most users? I suppose the cure for latency and Endeca’s indexing method can be resolved with Oracle servers. How much does the properly configured fully resourced Endeca system cost? My hunch. More than a couple of Pebble Beach show winners.

The white paper is interesting because it contains an example of the Endeca interface and the most amazing leap from five rules to customer support. Oracle also owns RightNow and InQuira. Where do these systems fit into the five rules?

Confused? I am.

Stephen E Arnold, August 21, 2014

Government Web Site Reliability

August 21, 2014

I read “IT Outages Are an Ongoing Problem for the US Government.” If the information is accurate, I am surprised. The article reports:

When outages occur, 48% of the workers said they do what they can via telephone, while 33% use personal devices and another 24% try to find a workaround, such as Google Apps. When asked to grade their IT department, only 15% of the field workers gave it an “A”; 49% gave it a “B”; and 27% gave it a “C.” When asked what caused the most recent outages, the IT professionals said 45% were due to a network or server outage; 20% cited Internet connectivity loss; 13% blamed natural disaster; 7% said a specific application stopped working, and 6% pointed to human error.

With the new push to improve government Web sites, perhaps the core infrastructure needs attention as well? Is it possible that good enough is comparable to the US broadband capability, the educational system, or airline on time performance? And search results? Nah, USA.gov’s search results are good enough for some.

Stephen E Arnold, August 21, 2014

Launching and Scaling Elasticsearch

August 21, 2014

Elasticsearch is widely hailed as an alternative to SharePoint or to many of the open source options, but it is not without its problems. Ben Hundley from StackSearch offers his input on the software in his Qbox article, “Thoughts on Launching and Scaling Elasticsearch.”

Hundley begins:

“Qbox is a dedicated hosting service for Elasticsearch.  The project began internally to find a more economical solution to Amazon’s Cloudsearch, but it evolved as we became enamored by the flexibility and power of Elasticsearch.  Nearly a year later, we’ve adopted the product as our main priority.  Admittedly, our initial attempt took the wrong approach to scale.  Our assumption was that scaling clusters for all customers could be handled in a generalized manner, and behind the scenes.”

Hundley walks the reader through several considerations that affect an implementation: knowing your application’s needs, deciding on hardware, monitoring, tuning, and knowing when to scale. These are all decisions that must be made up front, allowing for more effective customization. The upside of an open source solution like Elasticsearch is greater customization, more control, and less rigidity. Of course, for a small organization, that could also be the downside: time and staffing are more limited, and an out-of-the-box solution like SharePoint is more likely to be chosen.

Emily Rae Aldridge, August 21, 2014

Sponsored by ArnoldIT.com, developer of Augmentext

Prediction Takes a Step Forward

August 21, 2014

Prediction is hard; some may say it borders on the impossible. However, prediction has taken a step forward with the work of a Web site, correlated.org. Its goal is to find correlations between seemingly unrelated things. The site takes the aggregated results and draws broader conclusions about the wider population. Read more in the Business Insider article, “Correlation Expert Explains How 5 Questions Allow Him To Predict A Bunch Of Traits About People.”

The article begins:

“Gallagher, a former newspaper editor, runs correlated.org, a site that polls registered users on a wide variety of questions to identify strange correlations, ranging from the tendency of pot smokers to prefer sweet snacks to the tendency of Twitter users to remember their dreams. He also recently released a book. ‘Our answers to five basic questions are enough to predict our preferences and opinions about a whole lot of other things,’ Gallagher wrote.”

This could be good news for the world of predictive analytics. Sure, predictive analytics are pretty tried and true in the world of insurance, but in terms of consumer behavior, and other more casual needs, it is harder to draw straight lines. Exploring these smaller, less linear relationships through correlated.org may produce big dividends for other areas. Quite frankly, it is impressive that they are successfully predicting anything, and it bodes well for the future.

Emily Rae Aldridge, August 21, 2014

Microsoft Focuses on SharePoint User Experience

August 21, 2014

Microsoft is turning its attention to the user experience of SharePoint in its roadmap for Office 365. SharePoint receives a lot of attention for its increased functionality, but it receives a lot of negative attention for its complexity and general difficulty of use. CMS Wire covers the issue in its latest article, “Where User Experience Should Fit in SharePoint’s Roadmap.”

The article begins:

“One only need to take a look at the Microsoft roadmap for Office 365 to see that the company is making huge investments in the UX for SharePoint, from new social and search capabilities (such as Office Graph, inline social and Groups) to deeper integrations with other Microsoft platforms, like Dynamics CRM. Unlike previous platform updates, the focus of each incremental release is clearly meant to improve the end user (and administrator) experience within the platform.”

And while it is comforting to see that Microsoft is taking user experience seriously, many users and managers will still need help along the way. One source of help may be ArnoldIT.com. The Web site is managed by Stephen E. Arnold – a longtime leader in all things search. His SharePoint feed is especially insightful, offering tips and tricks for all levels of user.

Emily Rae Aldridge, August 21, 2014

Google Search Has Been Improved. A Lot.

August 20, 2014

I do a lecture for the police and intelligence community. The focus is on the techniques helpful in finding information that answers a query. If a person types a query into Google, the results are ads, popular hits that others found useful, and search engine optimized content.

Consider looking for a “shotgun suppressor”. Ignore the quotes. Here are the results from Google.com on August 20, 2014:

image

Pictures. Not too many ads. A video.

Where does one buy a shotgun suppressor? Run the query “purchase shotgun suppressor”.

The results are:

image

More pictures. Ads. And a couple of companies mentioned several times.

So it is easy to get information about a shotgun suppressor and buy one. Now, do some clicking and you will find that the links include auto mufflers from 2WheelPartsSupply.com and some other results that are off point.

In order to nail the real deal, military grade suppressor, some additional work is required.

When I read “Google Made 890 Improvements To Search Over The Past Year”, I just sighed. The write up is a rah rah for Google. Here’s a passage that I highlighted:

In a Google+ post from Google head of search Amit Singhal, Google shares they have made “more than 890 improvements to Google Search last year alone.” In 2009, Google told us they made between 350 to 400 changes to search and in 2010, they said they made 550 improvements to search in the past year. Google’s Matt Cutts said in a video in 2010 they make one change per day to their core search algorithm. We also know Google tests hundreds of changes in a day but only some of them make the light of day.

Okay, run some queries. Has Google improved search, or has Google improved its methods for diffusing ads into results? My experience is that Google is great for information about Dr Dre and pizza. For other types of information, considerable effort is required to unearth useful, on point information.

By the way, the key to finding the shotgun suppressor is to use synonyms like moderator and to approach the problem using another Google service. The content is findable but I am not feeling lucky anymore.

Since everyone is now an “expert” in search, which of the top 10 changes to Google in the last decade rings your bell? How about “universal search”? Ever wonder why books, blogs, and non US content are not included in a universal search? Think about it, please.

Stephen E Arnold, August 20, 2014

HP IDOL: A Battery? Who Provides the Jumper Cables?

August 20, 2014

I read “Can HP IDOL Jumpstart the Big Data App Economy?” My first reaction was, “A Big Data app?” and then “What’s the Big Data app economy?” I ploughed into the write up and learned that:

Hewlett-Packard Co. is looking to take the driver’s seat in bringing about the era of pre-packaged analytic applications with the IDOL platform from Autonomy, and according to the head of product marketing for the subsidiary, it already has results to show for the effort.

Okay. And the evidence:

Standing out among the case studies that were being demonstrated at the conference was a clinical data management system serving as a foundation for services that each implemented the underlying functionality in a different way. Veis [HP professional] pointed at the solution as a prime example of developer ingenuity that would not be facilitated had HP not made the capabilities of IDOL available for consumption from the cloud last December.

Well, there is some work required:

Despite the tremendous amount of progress that has been made on simplifying data processing in recent years, Veiss said that operationalizing information remains a widespread painpoint.

Okay, already. Solve the problem.

Apparently there is another hurdle:

Another major challenge is mobility, which Veiss sees as the “great equalizer” for user experience, especially as it pertains to delivering data insights.

Frankly I don’t know what this means.

I suppose this type of content marketing and jargonizing will sell some folks. For me, it’s confusing. IDOL is now about 15 years old. The DRE (dynamic reasoning engine) requires training and that means one has to know what type of information will be processed. In order to get useful results, content known to be like the content to be processed has to be assembled as a training set. Skip this set and the results are likely to be off point.

Has HP figured out how to crack this aspect of Big Data? I thought that IDOL and DRE required the licensee to train the system so that IDOL and DRE can deliver results that are on point for the content set.

My hunch is that by shifting the focus to apps, HP may be ignoring some of the time consuming intellectual work needed to allow IDOL and DRE to show their stuff.

HP has to find a way to generate billions to pay off the Autonomy buy and then make those lines of business return high margin, sustainable revenue. Apps may make sense to an MBA. Will apps deliver the truck loads of cash HP seeks from 15 year old technology?

Let me check the Apple App Store. Nope, no app for that.

Stephen E Arnold, August 20, 2014

Google Research Shares Some Key Findings of 2013

August 20, 2014

Google is famous for its very curious research arm, and now the company has published its favorite findings of 2013. We learn of the generous gesture from eWeek’s “Google Shares Research Findings with Scientific World,” where writer Todd R. Weiss reports on the roundup originally posted in a Google Research blog post. It is a very interesting list, and worth checking out in full. What caught my eye were the reports on machine learning and natural language processing. Weiss writes:

“Machine learning is a continuing topic, as seen in papers including … the paper ‘Efficient Estimation of Word Representations in Vector Space,’ which looks at a ‘simple and speedy method for training vector representations of words,’ according to the post.

“’The resulting vectors naturally capture the semantics and syntax of word use, such that simple analogies can be solved with vector arithmetic. For example, the vector difference between “man” and “woman” is approximately equal to the difference between “king” and “queen,” and vector displacements between any given country’s name and its capital are aligned,’ the post read.”
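The analogy-by-arithmetic idea in that quote can be sketched with a toy example. The three-dimensional vectors below are hand-made, hypothetical values chosen for illustration; they are not taken from the paper or from any trained model, and the helper names are my own. The sketch only shows the mechanics: compute vec(b) - vec(a) + vec(c) and pick the nearest remaining word by cosine similarity.

```python
import math

# Hand-made toy vectors (hypothetical values, NOT from a trained model).
# The dimensions loosely stand for: royalty, maleness, femaleness.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.8, 0.1],
    "woman": [0.1, 0.1, 0.8],
    "apple": [0.0, 0.2, 0.2],
}

def subtract(a, b):
    """Element-wise difference of two vectors."""
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' using vec(b) - vec(a) + vec(c).

    The three input words are excluded from the candidates.
    """
    target = add(subtract(vectors[b], vectors[a]), vectors[c])
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

result = analogy("man", "woman", "king")
print(result)
```

With real embeddings the match is only approximate and the vocabulary is far larger, so the nearest-neighbor search does the heavy lifting. The toy values here were picked so the differences line up exactly.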

Weiss next turns to natural language processing with the report, “Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging.” He quotes the paper:

“Constructing part-of-speech taggers typically requires large amounts of manually annotated data, which is missing in many languages and domains. In this paper, we introduce a method that instead relies on a combination of incomplete annotations projected from English with incomplete crowd-sourced dictionaries in each target language. The result is a 25 percent error reduction compared to the previous state of the art.”

The article concludes by noting that Google is no stranger to supporting the research community, pointing to its App Engine for Research Awards program. It also notes that the company grants access to the Google infrastructure to academics for research purposes. Will all this generosity help Google in the PR arena?

Cynthia Murrell, August 20, 2014

Salesforce Snaps Up RelateIQ

August 20, 2014

Bubble? What bubble? ZDNet informs us that “Salesforce Acquired Big Data Startup RelateIQ” for a sum approaching $400 million. The deal will be Salesforce’s second-largest acquisition, following their purchase of “marketing cloud” outfit ExactTarget last year for $2.5 billion. Reporter Natalie Gagliordi writes:

“According to a document filed Friday with the Securities and Exchange Commission, Salesforce will pay up to $390 million for the Palo Alto, California-based startup, which provides relationship intelligence via data science and machine learning. RelateIQ will become a Salesforce subsidiary, the filing says.

“On its website, RelateIQ says it’s built ‘the world’s first Relationship Intelligence platform’ that redefines the world of CRM. In a nutshell, the platform captures sales data from email, calendars and smartphone calls and social media to provide insights in real time.”

Relationship intelligence, eh? That’s indeed a new one (outside the discipline of sociology, anyway). RelateIQ launched in 2011, based out of Palo Alto. In nearby San Francisco, Salesforce was launched in 1999 by a former Oracle exec. Now, its success in cloud-based customer-relationship-management solutions has it operating offices around the world. Will the spending spree pay off?

Cynthia Murrell, August 20, 2014

HP Autonomy: A Mysterious Action

August 19, 2014

I just read “The Mysterious Case of Hewlett-Packard’s Autonomy Deal.” The HP and Autonomy PR professionals have some work to do. Heck, search and content processing vendors have some work to do. The unflagging interest in the purchase of the largest enterprise search and content processing vendor (Autonomy) by one of the largest sources of printer ink (Hewlett Packard) is drawing attention to the risks associated with information retrieval.

The write up from Therese Poletti’s Tech Tales is an example of how a utility function like search is sporting a black eye, a chipped tooth, and a broken nose. Ugly.

The mystery, as I understand the article, concerns writing down “almost $9 billion of its $11.1 billion acquisition of the British software company, Autonomy Corp.” The article reports:

one of the law firms that represented the shareholders in their case against H-P directors, Cotchett, Pitre & McCarthy LLP, now working with H-P, is being accused of a conflict of interest. Cotchett was previously the lead counsel in another class action against H-P. That suit, which also recently settled, alleged that the company’s inkjet printers falsely warned consumers when they were out of printer ink.

I savored the “falsely warned” phrase.

The article reports:

“The inkjet litigation has no bearing on the Autonomy settlement,” an H-P spokeswoman said in an email. “We believe the motion to intervene in the derivative case is just a lawyer-driven attempt to seek attorneys’ fees. It is meritless, as will be shown in court filings.”

And the mystery of the write down? The article asserts:

H-P has said that $5 billion of the write-down was due to accounting improprieties at Autonomy. But so far, the accounting problems found at Autonomy are said to be around $200 million in either hardware sales at a loss or fraudulent transactions, out of just over $1 billion in annual revenue. How this became a multi-billion-dollar write-down is a big question among investors. Perhaps these legal maneuvers will shine some light on the mystery. But it probably will be a long time before investors know what really happened.

The mystery is not yet solved. Life, it seems, does not work out like a US television crime drama. I await the next installment of “The Write-down Mystery.”

Stephen E Arnold, August 19, 2014
