Recommind Moves into Healthcare
April 14, 2013
Recommind is embracing the healthcare market. Marketwire shares, “Recommind Will Be First Time Speaker and Sponsor at World Health Care Congress Conference.” With legal conquered, it looks like the company is on to new adventures. We learn from the press release:
“Recommind, a leader in unstructured data management, analysis and governance technology, today announced it will be sponsoring and speaking for the first time at the World Health Care Congress (WHCC) event on April 8-10 at the Gaylord National Harbor in Maryland. Recommind will join the global health care community of business, political, and academic leaders to actively share information and collaborate to improve the overall quality and cost of health delivery in the US and throughout the world.”
The company hosted a speaking session, at which they advised attendees on key analytics issues, like implementing an efficient infrastructure, communicating information back to providers, analytics-informed preventative programs, and sharing improved outcomes. It is good to see the company branching into the spirited medical arena.
Experts at handling unstructured data, Recommind provides search-powered analysis and governance solutions to customers around the world. These tools are built around its CORE information management platform. Headquartered in San Francisco, the company was formed in 2000.
Cynthia Murrell, April 14, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Oracle Upgrades Discovery and BI Tools
April 14, 2013
Oracle has released upgrades aimed at improving business outcomes and simplifying IT requirements, we learn from a press release posted at MarketWatch, “Oracle Extends Business Analytics Portfolio Empowering Organizations to Transform Data Into Insights.” Both Endeca Information Discovery and Oracle Business Intelligence Foundation Suite have been enhanced. The company points out that both solutions perform best on their tailor-made Exalytics In-Memory Machine. The write-up informs us:
“Oracle Endeca Information Discovery 3.0 delivers a completely redesigned user interface that offers new drag and drop visualizations to provide users with a superior discovery experience, new personal data load for business users to add their own Excel data files to IT provided data, and new Oracle BI Server connectivity, to leverage trusted data from existing analytic applications, along with other features.
“Oracle Business Intelligence Foundation Suite Release 11.1.1.7 delivers significant enhancements to usability, mobility, user experience and Big Data integration, enabling organizations to analyze critical information and get the intelligence they need to optimize their business.
“Endeca Information Discovery and Oracle BI Foundation Suite run better on Oracle Exalytics In-Memory Machine, the industry’s first engineered system for Business Analytics. Oracle Exalytics takes best-in class analytics and in-memory software engineered on high-performance hardware to reduce the cost and complexity of IT infrastructures while increasing productivity and performance for data discovery, business intelligence, modeling and planning applications.”
This Exalytics machine has the potential to make the entire BI undertaking much, much simpler. Endeca, acquired by Oracle in 2011, has long been a strong player in the enterprise discovery field. Oracle’s BI suite integrates several key features in one platform: enterprise reporting, dashboards, ad-hoc analysis, scenario analysis, scorecards, and predictive analytics. The company’s commitment to supplying cutting-edge technology while maintaining easy-to-use interfaces is apparent in these latest improvements.
Cynthia Murrell, April 13, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Chiliad and Virtual Consolidation
April 13, 2013
Yep, fragmentation is an issue even in big data analytics. I read “Chiliad Takes Virtual Consolidation Route to Big Data Analytics.” Chiliad is a company which provides software and services to a number of US government agencies and to other customers as well.
The main point of the write up is:
Chiliad says it can eliminate some of those barriers to adoption of big data analytics with Discovery/Alert 7.0, an iterative information retrieval system that lets analysts search any data warehouse or data set across clouds, agencies, departments and organizations.
The once popular term “federation” has given way to a host of synonyms, including unified information access and consolidation. The new twist is the use of the word “virtual” which implies that leaving data where it resides is something new. The alternative is creating a data warehouse like the old iPhrase system and other repository centric approaches to data management.
The article contains this statement: “Discovery/Alert provides global ranking of results to help analysts find relevant information in massive collections of data.” Then I read, “We [Chiliad] actually give you a unified, holistic global ranking of all of your results… As organizations scale out to billions of records and petabytes of data, relevancy becomes more important.”
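To make the idea of a “global ranking” over data left where it resides more concrete, here is a minimal Python sketch. It is not Chiliad’s method, which the article does not describe. It assumes each source returns (document id, raw score) pairs on its own scale, and it uses min-max normalization, a common but crude way to make scores comparable. The source names, document ids, and scores are hypothetical.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, float]  # (document id, raw relevance score from one source)


def normalize(results: List[Result]) -> List[Result]:
    """Rescale one source's scores to the 0..1 range (min-max normalization)."""
    if not results:
        return []
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: treat every hit from this source as equally strong
        return [(doc, 1.0) for doc, _ in results]
    return [(doc, (score - lo) / (hi - lo)) for doc, score in results]


def global_ranking(per_source: Dict[str, List[Result]]) -> List[Tuple[str, str, float]]:
    """Merge hits from every source into one list sorted by normalized score."""
    merged = []
    for source, results in per_source.items():
        for doc, score in normalize(results):
            merged.append((source, doc, score))
    # Highest score first; ties broken by source and document id for stable output
    return sorted(merged, key=lambda row: (-row[2], row[0], row[1]))


# Hypothetical repositories that score on very different scales
sources = {
    "agency_a": [("a1", 12.0), ("a2", 3.0), ("a3", 7.5)],
    "agency_b": [("b1", 0.9), ("b2", 0.1)],
}
ranked = global_ranking(sources)
for source, doc, score in ranked:
    print(f"{score:.2f}  {source}:{doc}")
```

The sketch also shows why relevancy gets harder at scale: the merged order depends entirely on how scores from unlike systems are made comparable, and min-max scaling is only one of several defensible choices.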
The story highlights features which remind me of Fast Search & Transfer’s descriptions of its system. In fact, a number of search and content processing vendors have made similar points for many years.
What’s new? The emphasis is on “all” and “virtual.” Will these concepts be enough to move search and content processing, analytics, and business intelligence forward? Will these assertions cause some firms to dog paddle instead of speeding to the finish line?
I don’t know. The marketing points seem remarkably consistent in my opinion. The problems seem to be unsolved despite the efforts of many, many vendors to deliver actionable information.
Stephen E Arnold, April 13, 2013
Sponsored by Xenky.com, the new ArnoldIT portal
Information: Dark Sides and Bright Sides
April 9, 2013
I find the information revolution semi-bright or semi-dark. I read “Are We Paying Enough Attention to Information Technology’s Dark Side?” My first reaction was, “Nah.” Most outfits are worrying about revenues. Google has to deal with the shift from the money Gold Rush of the desktop era to the lower revenue per click of the mobile world. Microsoft has to worry about the economic impact of its initiatives to nowhere. Smaller outfits in search have either been crushed like Convera or squished like Dieselpoint, mired in controversy like Autonomy and Fast Search, or just unable to make ends meet, deliver a product which works, or get their act together long enough to close a deal.
Paradise Lost may help illuminate the dark sides and bright sides of information. A happy quack to Lapidary Apothegms for reminding me of this phrase.
The concern of the “Dark Side” write up is broader. The big issue is Big Ideas, with references to high profile information luminaries like James Clapper, the director of national intelligence, and governmental issues. Here’s the quote I find interesting:
While the idea of lumbering bureaucracies adapting quickly may seem unlikely; it’s entirely possible they’ll adapt just fast enough to remain in place for awhile yet. And instead of quick change, the classic definition of the state will twist and wither. Whether its successor proves good or ill remains to be seen—but if history (and Marc Goodman) is any guide, it’ll be some of each.
The future is the semi-bright and semi-dark situation.
With regard to information, flows of information, data, and knowledge can erode certain structures. In an organization, as information moves more freely, the old chokepoints are bypassed. The notion which has gripped managers and bureaucrats is that letting information flow sheds more light than cutting off that flow and thus casting shadows.
In my experience, information is not neutral. Digitization has its own motive power. In one talk I gave years ago, I pointed out that information breeds more of itself. The image I used in my lecture was a sci-fi decision maker surrounded by Tribbles. Tribbles just kept on making more Tribbles. Tribbles were bad news in the confines of a starship.
Even though I have worked in information centric businesses and government agencies for decades, I am not sure I understand information. I do not have a clear grasp of its behaviors. Over the years, I have formulated some “laws”, which I describe in some of my writings and talks. A recent example is Arnold’s Law of Vulnerability. In a nutshell, the “law” reports data from our research that says, “As the volume of information increases attack surfaces expand.”
The implication of this “law” is that digital information disconnects from the factual and becomes the propaganda described by Jacques Ellul. A software program which crashes a system or more importantly modifies it in a manner unknown to the system developers is a growing problem.
Conflating political movements, digital data, and next generation systems increases complexity. In short, as informationizing operates, clear thinking becomes more and more difficult. Thus, we now have to navigate in a datascape in which:
- Facts are not facts; even the results of a scientific experiment can be falsified or, more troubling, placed in an “objective journal” as an advertorial
- Systems have minimal ability to detect falsified data from sensors, SMS messages, or data streams which contain signals to which the smart software responds in a Pavlovian way
- Humans accept outputs of systems as though those outputs were a reality which corresponds to the actuality of a single individual.
Work needs to be done in the space between the bright and dark of information. Much remains to be done and not by failed webmasters, azure chip consultants, search engine optimization experts, and unemployed journalists. Perhaps Google’s smart software can just take on the job.
The Heat in Text Radar: March 29 to April 04
April 9, 2013
The Text Radar big data analytics and content intelligence blog continually provides readers with informative resources on how big data is impacting modern workplaces. This week, I will highlight several articles that were particularly informative.
We all know the impact that big data has on marketers. But what about other industries? The article “Big Data Analytics Reveals Vision Giving Major Disaster Responders Advance Notice” provides an example of how big data is helping the development of American bridges.
The article lays out a frightening scenario:
“The American Society of Civil Engineers says that one quarter of all American bridges is ‘deficient’. 17,000 bridges didn’t meet inspection criteria, including 3% of all freeway bridges.
Want a scary statistic? The average age of America’s bridges is 43 years. The average lifespan of America’s bridges: 50 years. This means, unless something changes, we should all avoid pretty much all river crossings after the year 2020.”
Another story, “Growing Big Data and Information Access Bring New IT Challenges” explains how big data is transforming the new world of computing.
When explaining some new challenges, the article states:
“The big change now is not that everyone is an I.T. manager – there are still plenty of ways companies will control devices, access to computers, and data – but that everyone is a consumer of a lot of data. Making that easy on them will most likely be a winning strategy.
‘There has been a revolution in design theory,’ says Phil Libin, chief executive of Evernote, a storage site for consumers and businesses. ‘We’ve all had to learn how to have taste.’ He credits the change toward a design focus, in both consumer electronics and enterprise software, to Apple.”
Another innovative way that big data is being utilized is in major league baseball. According to “MLB Uses Big Data for Uncovering Player Insight”, this data allows the performance of players to be predicted.
The article explains:
“‘We’re trying to predict the future performance of human beings, oftentimes in situations that those people themselves haven’t even encountered,’ he said. ‘One of the things we really need to do is [separate] the skill from the luck.’
DePodesta cited ‘The Success Equation: Untangling Skill and Luck in Business, Sports, and Investing,’ a book by Michael Mauboussin of Credit Suisse in the idea that ‘skill is more repeatable than the luck.’”
This is just a small sampling of the creative ways that big data can be utilized to make the biggest industry impact. Smartlogic offers a suite of solutions that will help any organization transition into analytics.
Jasmine Ashton, April 09, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
Social Media Strategies
April 9, 2013
Social media is not just for personal use anymore; it has expanded into the business world. The Expert System Cogito Blog piece “Understanding the Strategic Value of Social Media Analysis” talks about how many companies are selling themselves short when it comes to using social media.
“I have often said that companies are missing out on the real value of social media analysis. More often than not, even the big players don’t have the processes or models in place to really make use of the data gained from the analysis. As a result, social media analysis has a limited impact on the business, not to mention the budgets assigned to such projects.”
However, despite the usual oversights, the author talks about a recent encounter with the head of customer experience at a well-known bank. They were going to discuss the tools they would need to support social media analysis, but instead of going through the usual song and dance, the manager was actually prepared to discuss exactly what the bank needed from them. Even more surprising, the customer was able to provide specific examples of quantitative as well as qualitative data that she wanted to be able to extract from the streams of data. This made it easier to talk about semantics and how it can bring value to the company. Strategies such as focusing on extracting relationships between monitored entities and relieving some of the social media noise through deep analysis and contextualization can help to improve product visibility as well as insight into market trends. The author ends by noting that they are sure that they haven’t seen the last of their “usual pitch” because many organizations do not have a clear and concise strategy when it comes to social media projects. However, as the trend changes and more and more companies realize the importance of social media, semantic technology vendors had better strike fast and learn how to “grab the bull by its horns.”
April Holmes, April 09, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
Pay to Play Content: Now Even the New York Times Knows
April 8, 2013
Mondays usually start in a predictable way. I walk the dogs. I eat a cardiologist-approved breakfast. I find out what my wife has on her list for me to do. But this morning I flipped through the New York Times, environmentally unfriendly version, and burst out laughing.
My wife asked, “What’s so funny?”
I replied, “The New York Times describes pay to play with more crazy synonyms than I thought possible.”
She asked, “And that’s humorous?”
To me it was. Navigate to these two articles. The first is on the front page of the Harrod’s Creek edition and Google-crafted this way: “Scientific Articles Accepted (Personal Checks, Too).” The story appears in the April 8, 2013, edition which you will find in the dead tree version. My link points to a short lived version of the file on another newspaper. After a rousing quote “the dark side of open access” the story jumps to section B, page 8.
The second story appears in the business section of the same issue. Its title is “Sponsoring Articles, Not Just Ads. Branded Content on the Web Mingles with Regular Coverage.” The story features a creative graphic showing pencils held in a roll of money. (You remember. Printed money just like the early newspaper moguls collected by the horse drawn cart in the good old days of publishing.)
The point of both articles is that there are people who will pay to get their content published in a form which has some respectability. Academics pay to play in the academic journals. Companies pay to get their ideas published in a wide range of channels. The New York Times mentions Mashable, but there are many other outfits who charge money to run content. My Augmentext operation is in this business too. I suppose I could trot out the names of big publishers who offer college guides with inflated “inclusions” describing the wonders of certain college campuses. The write ups are compelling and once produced money for those who operated these quasi-reference services.
What words does the New York Times use to describe these pay to play operations? Here’s a list of some of the terms from the write up:
- Advertising
- Advertorials
- Branded content
- Campaign
- Content
- Corporate propaganda
- Native advertising
- Pure editorial
- Sponsored content
Here in Harrod’s Creek, we call content someone wants published for money:
- An inclusion
- A “pay to play” story
- POP or Plain old propaganda as defined by Jacques Ellul. If the name does not ring a bell, you can find the information in his decades old study in Propaganda: The Formation of Men’s Attitudes.
The professional publishing sector has been charging academics for page proofs and other services for many years. Now the practice has diffused to conferences. In my view, the use of “pay to play” methods is now part of the atmosphere and has been for decades.
I find it fascinating that the topics are now front page news from the New York Times. Perhaps “real” journalists are learning more about how the information world works.
What troubles me is that none of these questions is addressed:
- Do modern systems identify pay to play content?
- Are automated content processing systems giving equal weight to shaped content and objective content?
- Are the outputs from analytics systems manipulable?
In my proprietary report on this subject, the surprising answer is, “We just process data.”
In short, despite the huff and puff of next generation content processing system cheerleaders, the systems have what William James called “a certain blindness.” In the quest for revenues, many organizations are unwittingly conspiring to deliver information which at best is semantically swizzled and at worst weaponized. Oh, the phrase “weaponized information” does not appear in the New York Times’ stories nor in the gigabytes of words explaining the wonders of next generation analytics. Like the New York Times, the present is too much with us.
Stephen E Arnold, April 8, 2013
Elasticsearch Joins Fog Creek
April 8, 2013
Elasticsearch is trying to expand its reach by partnering with other trendy tech services. It is definitely getting some headlines. The most recent headline is detailed by MarketWatch in its article, “Fog Creek Selects Elasticsearch to Search and Analyze Terabytes of Data.”
“Elasticsearch, the company behind the popular real-time search and analytics open source project, today announced that Fog Creek has selected Elasticsearch to provide instant search capabilities within Kiln, its software development product. Kiln is designed to support and simplify development workflow for users searching more than 100,000 source code repositories. Elasticsearch is now a critical ingredient of Kiln, providing instant search for 300,000,000 requests across 40 billion lines of code to improve overall performance, reliability and user experience.”
Elasticsearch is known for collaboration with leading edge products, but it is not without its controversies as well. GitHub recently reached out to Elasticsearch to develop its new search infrastructure, but the service quickly exposed security concerns and then crashed. So when it comes to a search infrastructure that goes beyond trends, trust an industry standard. Do not assume that every search application will be safe enough for the enterprise. For instance, consider LucidWorks. They are built on open source Lucene/Solr, employ one quarter of the Core Committers on that project, and are optimized for the enterprise. Choose industry confidence, not trends.
Emily Rae Aldridge, April 8, 2013
Sponsored by ArnoldIT.com, developer of Beyond Search
InTrade: A Harbinger of Prediction Woes to Come?
April 7, 2013
Key word search is not too useful when there are trillions of content objects. Clustering trillions of objects is not economically feasible, so the sets are trimmed. Who’s to know? Predictive analytics sounds so darned promising because “real time processing” is cheap, plentiful, and trivial to boot.
What can go wrong with text processing, text analytics, social crowdsourcing data, and the other Lone Ranger silver bullets? How can predictive systems come back and bite a user, an investor, or an employee who loses her job?
I suppose that the article “InTrade Announces $700,000 Cash Shortfall And Risk Of Imminent Liquidation” describes an anomaly. Here’s the key point in my opinion:
..the company has posted the following message on its site, which says that it has discovered a $700,000 cash shortfall that must be rectified immediately in order to avoid liquidation.
InTrade is, or maybe was, a prediction market. The company says:
It’s a market that allows you to make predictions on the outcome of hundreds of real-world events. Stock exchanges find the price of stocks, and futures markets find the price of commodities. Prediction markets find the probability of something happening – a predefined, uncertain future event.
InTrade is more than voting. The company uses a range of methods to answer yes or no. Life should be so simple. The company even posted some Golden Rules to make the system almost foolproof; for example, “If you sell shares you profit if the market value of the shares goes down. Your profit is maximized if the market settled at $0.00.”
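The arithmetic behind that Golden Rule, and behind reading a price as a probability, can be sketched in a few lines of Python. This is an illustration only, not InTrade’s software. It assumes a yes/no contract that settles at a fixed payout if the event happens and at $0.00 if it does not; the $10.00 payout, the prices, and the function names are hypothetical placeholders.

```python
def implied_probability(price: float, payout: float = 10.0) -> float:
    """Read a share price as the market's probability that the event happens.

    Assumes the share pays `payout` if the event occurs and 0 otherwise.
    """
    return price / payout


def short_profit(sell_price: float, settle_price: float, shares: int) -> float:
    """Profit for a seller: what was received minus what the shares settle at."""
    return (sell_price - settle_price) * shares


# Hypothetical trade: sell 5 shares at $4.00 each
print(implied_probability(4.0))   # the market puts the event at 40 percent
print(short_profit(4.0, 0.0, 5))  # event does not happen: seller keeps the full $20.00
print(short_profit(4.0, 10.0, 5)) # event happens: seller loses $30.00
```

The seller’s best case is settlement at $0.00, which matches the rule quoted above. The sketch says nothing about whether the exchange itself stays solvent.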

Eel bites can be painful due to “alien style jaws.” Investors in some predictive outfits may experience similar bites.
There are many meanings for the word “prediction.” I don’t want to get into a squabble that InTrade is one type of prediction and an outfit like Digital Reasoning or Agilex is another type of prediction. I want to capture several thoughts so I can include them in my text analytics lecture later this month, chance willing, of course:
First, predictions are slippery eels. I once offered predictions to my clients. Now I offer clients. I learned that regardless of methods predictions jump into a murky pool and get lost. Stick your hand in the pool and one can come up with nothing or an eel clamping on the extremity. Ouch.
Second, predictions and various methods and the companies built upon them can simply fail. Why not predict that? I think that getting hoisted by one’s petard is part of life.
Third, InTrade may be one example of what can happen when hyperbole outraces the capabilities of the numerical recipe crowd. Will other companies in the fancy math business suffer similar fates? I don’t know. I won’t predict.
If you are into fancy math, why not plug your retirement nest egg into one of the analytics outfits and let me know how that works out for you. Azure chip consultants, feel free to weigh in and explain to me and my two to three readers how such a clever idea could end up in a pickle of reality.
Stephen E Arnold, April 7, 2013
Business Structures Revealed through New Analysis Technique
April 7, 2013
Now here is an interesting implication of social-graph analysis in business. The MIT Technology Review reports, “Social Networks Reveal Structure (And Weaknesses) of Business.” We’ve known for some time that, through the analysis of connections, social networks can reveal even more about us than is obvious to most users. Now, researchers at Israel’s Ben Gurion University used this concept to derive an impressive amount of information about businesses. The article reveals that the team begins:
“. . . by using a search engine to find the Facebook pages of a number of individuals who work for a specific company.
“Using these individuals as seeds, they then begin crawling the social networks, sometimes jumping from one network to another, looking for other individuals at the same company. These in turn become seeds to find more employees and so on.
“They end up with a basic network of links between employees within the company. It’s then that the fun begins.
“Using standard measures of connectedness, Fire and co then identified people in positions of leadership and by adding in details such as location, mined from the Facebook pages, they reconstructed the international structure of these organisations. They also used community detection algorithms to reconstruct the organisational structure of the company.”
Wow. The researchers used their method on several “well known hi-tech companies” and found startling details. For example, they found a cluster of comparatively disconnected folks at a large organization, and discerned they belonged to an acquired startup that had yet to be well-integrated into the company. This sort of information can be used by companies to monitor themselves, but it could also be used by potential investors (for good or ill for the business, I suppose, depending on what turned up).
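The seed-and-crawl approach described in the article can be sketched in a few lines of Python. This is a toy illustration, not the researchers’ code. It assumes the social graph and each person’s employer are already available as in-memory dictionaries, where a real crawl would have to fetch pages. All names and data are hypothetical. It covers only the seed expansion and a simple connectedness measure (degree), not the community detection step.

```python
from collections import deque
from typing import Dict, Iterable, List, Set

# Hypothetical toy data: who is linked to whom, and who works where
FRIENDS: Dict[str, List[str]] = {
    "ann": ["bob", "cat", "dan", "zed"],
    "bob": ["ann", "cat"],
    "cat": ["ann", "bob", "dan"],
    "dan": ["ann", "cat", "eve"],
    "eve": ["dan"],
    "zed": ["ann"],
}
EMPLOYER: Dict[str, str] = {
    "ann": "acme", "bob": "acme", "cat": "acme",
    "dan": "acme", "eve": "acme", "zed": "other",
}


def crawl_company(seeds: Iterable[str], company: str) -> Set[str]:
    """Breadth-first expansion: each employee found becomes a new seed."""
    found: Set[str] = set()
    queue = deque(s for s in seeds if EMPLOYER.get(s) == company)
    while queue:
        person = queue.popleft()
        if person in found:
            continue
        found.add(person)
        for friend in FRIENDS.get(person, []):
            if EMPLOYER.get(friend) == company and friend not in found:
                queue.append(friend)
    return found


def degree_within(members: Set[str]) -> Dict[str, int]:
    """Count each member's links to other members of the same company."""
    return {p: sum(1 for f in FRIENDS.get(p, []) if f in members) for p in members}


employees = crawl_company(["bob"], "acme")
degrees = degree_within(employees)
# Most connected first; ties broken alphabetically
ranked = sorted(degrees, key=lambda p: (-degrees[p], p))
print(ranked)
```

In this toy graph a single seed is enough to surface every employee, and the person with only one internal link ends up at the bottom of the list, which is the kind of weakly connected node the researchers noticed at a much larger scale.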
More ominously, competitors could use the information to their advantage. Now that this technology is in the news, many companies will want to prevent such details from emerging, but how? Researcher Michael Fire advises them to “enforce strict policies which control the use of social media by their employees.” Immediately, I might add. And, I suspect that whatever was previously considered a “strict policy” must become even more strict in order to avoid exposure from this technique.
Won’t employees be thrilled?
Cynthia Murrell, April 07, 2013
Sponsored by ArnoldIT.com, developer of Augmentext




