Text Analytics Summit Freight Train Arrives in November 2011

October 5, 2011

We wanted to remind you that the Text Analytics Summit West is November 10th and 11th in San Jose, California. The conference venue is the Convention Plaza Hotel. Among the speakers are:

  • Tom H. C. Anderson, Managing Partner of Anderson Analytics
  • Cliff Figallo, Senior Site Curator and Editor at Smart Data Collective and Social Media Today
  • Vincent Granville, Chief Architect, Executive Director of AnalyticBridge

and many other analytics visionaries and practitioners.

You can read an informative transcript of a discussion among these three experts at http://www.textanalyticsnews.com/text-mining-conference-west/summit-news.shtml.

The conference program is available on the Text Analytics News Web site. The conference offers special student discounts.

Stephen E Arnold, October 5, 2011

Sponsored by Text Analytics News

Digital Reasoning and Entity Based Analytics

October 5, 2011

As the entity-based analytics discipline becomes more prominent in the business sector, private company Digital Reasoning has already made great strides in setting the standard for achieving actionable intelligence.

Dr. Ric Upton will be leading Digital Reasoning’s Washington, DC area office and team in this exciting time for the company. Their product Synthesys is exactly what analysts require in this era of ever-amassing data.

While many other firms offering intelligence software focus on one aspect of entity extraction, Synthesys provides analysts with a comprehensive package for automating the interpretation of big data after the work of search and content processing systems has been done.

In an exclusive ArnoldIT.com interview, Upton revealed how Digital Reasoning deals with such high volumes of real-time information. He said:

[O]ur processing and analytics often have to complement these high volume data flows. We do this in part through judicious use of cloud-based processing augmented by intelligent methods of processing and storing data as it becomes available so that we can avoid the need to perform batch processing or redundant processing of previously-captured data.

The real value is their focus on content centric analytics instead of using statistical algorithms to analyze structured data. Essentially, they decipher the subtext and implicit meanings of content that doesn’t have to be well-structured. The real feat in this is that Digital Reasoning can automate this analysis without any data preparation.

Without Digital Reasoning’s systematic interpretation of data, analysts and clients would actually have to spend hours upon hours of time reading and comprehending content.

Upton shared the reasons why clients have typically used their software:

Our ability to automate understanding is critical to customers with concerns about time, accuracy, completeness, or even the ability to leverage the massive amount of data they have generated.

Serving as an intermediary between the raw data and analysts in the business process, this software has the capabilities to understand the subtleties of the human language. Synthesys can understand the underlying messages in the context of the content’s medium—whether it is a blog, a tweet, or an SMS.

In the interview, Upton sheds light on how this rich entity extraction manifests itself:

We don’t just extract a name, we can develop and create a persona – the sum of what a person is called, where they have been and when, their relationships with other persona, their behaviors over time, etc.
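To make the persona idea concrete, here is a minimal sketch of how extracted mentions might be rolled up into one record per person. It is not Digital Reasoning's method. The `Persona` class, the `build_personas` function, the mention fields, and the alias table are all hypothetical names invented for illustration, and the sketch assumes the hard part (deciding which names refer to the same person) has already been done.

```python
from dataclasses import dataclass, field


@dataclass
class Persona:
    """Hypothetical record: everything known about one resolved person."""
    names: set = field(default_factory=set)        # what the person is called
    sightings: list = field(default_factory=list)  # (date, place) pairs
    relations: set = field(default_factory=set)    # ids of co-mentioned people


def build_personas(mentions, aliases):
    """Group raw mentions into personas.

    mentions: list of dicts with keys "name", "date", "place" and an
              optional "with" list of co-mentioned names (assumed format).
    aliases:  dict mapping a surface name to a canonical id (assumed given).
    """
    personas = {}
    for mention in mentions:
        pid = aliases.get(mention["name"], mention["name"])
        persona = personas.setdefault(pid, Persona())
        persona.names.add(mention["name"])
        persona.sightings.append((mention["date"], mention["place"]))
        for other in mention.get("with", []):
            persona.relations.add(aliases.get(other, other))
    # ISO dates sort correctly as strings, giving a simple timeline.
    for persona in personas.values():
        persona.sightings.sort()
    return personas
```

Feeding it two mentions of "J. Smith" and "John Smith" that share an alias yields a single persona with both names, a date-ordered list of places, and the set of people mentioned alongside them.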

Digital Reasoning is already looking toward a future in which other media, such as video and audio sources, hold clout as data. As the company develops methods to analyze these formats, competitors’ opportunities to dominate this field dwindle.

Megan Feil, October 5, 2011

Sponsored by Pandia.com

Supercomputer Predicts Political Revolutions and Maybe More

October 4, 2011

It sounds like science fiction, but it appears that technology has evolved to the point where we can now use a supercomputer to predict revolutions. Shocking, I know. The software retrospectively scans more than 100 million news articles from the past 30 years and uses sentiment analysis, text geocoding, and predictive analytics to determine what direction political upheaval will take.

According to the Read Write Web article “Can the World’s Next Political Revolution be Predicted by Computers?”, this technology has greater implications than just predicting revolutions. The author states:

This is Culturomics at work. One of the more well-known applications of it would be the Google Books Ngram Viewer, a Google Labs project that scans 15 million digitized books to reveal the frequency of certain words and phrases over time. By applying a similar methodology to news articles, researchers can gain insight into human society on an even bigger scale and in a more real-time fashion. A growing body of work has shown that measuring the ‘tone’ of this real-time consciousness can accurately forecast many broad social behaviors, ranging from box office sales to the stock market itself.
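For readers wondering what measuring “tone” might look like in practice, here is a toy sketch. It is not the researchers' system: the two word lists are tiny, made-up lexicons, the function names are hypothetical, and the geocoding step is skipped by assuming each article already carries a country label. It only shows the basic shape of the idea, scoring each article and averaging by country and month.

```python
from collections import defaultdict

# Hypothetical toy lexicons; a real study would use a large, validated word list.
POSITIVE = {"peace", "growth", "agreement", "stable"}
NEGATIVE = {"protest", "crisis", "violence", "unrest"}


def tone(text):
    """Return (positive hits - negative hits) / total words, or 0.0 if empty."""
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    positive = sum(w in POSITIVE for w in words)
    negative = sum(w in NEGATIVE for w in words)
    return (positive - negative) / len(words)


def average_tone_by_month(articles):
    """Average article tone per (country, 'YYYY-MM').

    articles: iterable of (date 'YYYY-MM-DD', country, text) tuples.
    The country label is assumed to come from an upstream geocoding step.
    """
    buckets = defaultdict(list)
    for date, country, text in articles:
        buckets[(country, date[:7])].append(tone(text))
    return {key: sum(scores) / len(scores) for key, scores in buckets.items()}
```

A sustained slide in a country's monthly average is the kind of signal the article describes; whether such a slide actually predicts anything is the researchers' claim, not something this sketch demonstrates.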

While this is still a relatively new area of study, this could have major implications for the flow of unfettered information and it is very exciting to see what can happen when brilliant minds from different fields work together. However, this sounds like the PR usually output by IBM and its Watson business unit.

Jasmine Ashton, October 4, 2011

Sponsored by Pandia.com

Inteltrax: Top Stories, September 26 to September 30

October 3, 2011

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, how some of the biggest names in the business are underwhelming us lately and need to do better.

One such story was “Microstrategy Not the King of Cloud BI,” http://inteltrax.com/?p=2471 which discussed how one very fine business intelligence operation is failing at becoming something it is not—a cloud BI hotspot.

Similarly, the story “Teradata Doesn’t Have the Power to be Dominant” http://inteltrax.com/?p=2435 showcases how another smart analytics firm is stretching itself too thin by trying to become everything to everyone instead of focusing on the core things it does exceedingly well.

Our feature story, “Pentaho and SizeUp Lean Toward Free Analytics,” http://inteltrax.com/?p=2616 was rich with successes for analytic software providers, but also cataloged how customers working exclusively with freeware programs from top BI names will regret the choice.

We’re keeping our eye on the top names in business intelligence and data analytics and playing watchdog in the process. Even excellent companies make mistakes and we’ll be here to warn consumers and slap said companies on the wrists with cutting commentary and insight every day.

Follow the Inteltrax news stream by visiting www.inteltrax.com

Patrick Roland, Editor, Inteltrax, October 3, 2011

Sponsored by Pandia.com

TrendSpottr and DataSift Team Up to Provide Predictive Insights to Social Media Consumers

September 28, 2011

DataSift and Endeca said their vows and now DataSift has another partner.

TrendSpottr, the popular web service for real-time trend analysis, and DataSift, a powerful real-time social media data filtering platform, announced this week that they will be teaming up to provide customers with early and predictive insights from their personalized, filtered streams. In other words, they will be telling you what you don’t know before you need to know it… sort of.

The September 23 PR Web release, “TrendSpottr and DataSift Integrate Services to Provide Customers with Early and Predictive Trending Insights From Their Real-Time Data Streams,” states:

The DataSift platform filters 100 percent of the Twitter feed and other sources in real-time. This is delivered as enriched data with added augmentations, including sentiment analysis, geo-location and social influence amongst others, providing data of exceptional fidelity that is rich with information to mine for specific and valuable trends.

The release also quotes the founders and CEOs of both companies, who appear very excited to be working together to improve how enterprises, media companies, and social media users access social data to gain real-time insights and market intelligence.

Three observations:

  1. Twitter content is a must-have input for certain companies.
  2. DataSift is one of the gateway vendors for the Twitter content, so we expect more pursuit of DataSift.
  3. The availability of large flows of data from the Twitter community requires significant investments in value-adding software which can make sense of short, often cryptic and context-free, messages.

We see a mini land rush building.

Jasmine Ashton, September 28, 2011

Sponsored by Pandia.com

IBM Totes Up Its Analytics Properties

September 23, 2011

IBM owns more analytics functions than an Escalade filled with math PhDs on their way to an American Mathematical Association shindig.

IBM recently acquired Algorithmics, a Toronto-based company which makes risk-analysis software for the financial industry. IBM paid $387 million for the company and will gain all of Algorithmics’ 900 international employees.

The tools provided by Algorithmics help businesses automate much of their financial risk management and also help customers meet regulators’ data oversight demands. We learn more about the deal in ZDNet’s article, “IBM Snaps Up Algorithmics for Risk-Management Tools.” The article reports:

‘What Algorithmics brings is risk-analytic capabilities for credit risk, market risk and liquidity risk that is incredibly timely,’ Laurence Trigwell, an executive in IBM’s European business analytics unit, told ZDNet UK. ‘Regulators are asking for more frequent analysis and more frequent exposure… also the regulators are doing a lot more analytics of those disclosures.’

With such a specific financial niche being so boldly approached by IBM, I’m left with a few questions. Can IBM successfully integrate this wide collection of math centric properties? Or will the licensee pay IBM to make a seamless system of the many moving parts? Time will tell if the software purchase helps the company improve its ability to assist customers in managing financial risk.

My question: “Is IBM angling for services revenue from its collection of numerical recipes?” And another, “Will this math goodness enhance IBM’s search offerings so that the company can win another round on Jeopardy?”

Andrea Hayden, September 23, 2011

Sponsored by Pandia.com

Decide: Another Predictive Play

September 21, 2011

Purchasing big ticket items, specifically technology, is always a challenge. The price, trends, and efficiency of products vary greatly depending on the timing of your purchase.

New startup Decide is focused on helping consumers decide when to purchase items based on intelligent predictions drawn from price trends, rumors, news, and technical specifications. Technology Review’s article, “Algorithms Tell Consumers When to Buy Tech Products,” discusses the process. We learn:

Etzioni (chief technology officer and cofounder of Decide) says the long-term impact of price prediction could be huge. It’s not just a question of when to buy a flashy new toy, he says. As companies become better at predicting prices and features for all types of devices, buying at the right time could help consumers own better-quality products across the board.

At what point do I end my search and let vendors predict what I need and when I need it? The company seems to have good intentions about saving me money, but I enjoy the independence I have while shopping and deciding when to purchase items. I like to wait when new products come on the market to research reviews and see the price drop. I also like to think I am intelligent enough to complete this process on my own without waiting so long that the product becomes outdated.

We have noticed a flurry of publicity about Dr. Etzioni, and we are forming the hypothesis that he may be in Decide marketing mode.

Andrea Hayden, September 21, 2011

Sponsored by Pandia.com

Oracle Text Information Seems Thin

September 16, 2011

Here at ArnoldIT, we were looking for information about Oracle Text Search, an aspect of Oracle Text, which offers a complete text search solution. Oracle Text is included with both the Oracle 11g Standard and Enterprise Editions. After much floundering and a bad link from the Google, we navigated to, and found, a 2007 gem, which seems a bit dated.

In the Oracle technical white paper we found one reference to text search which stated:

[Through Oracle Text’s integrated text search capability] Oracle 11g provides an extensibility framework that enables developers to extend the data types understood by the database kernel. Oracle Text uses this framework to fully integrate the text indexes with the standard Oracle query engine.

We had hoped to find more up-to-date information at the Oracle forum. Unfortunately, it did not have a direct entry for Oracle Text Search. If a reader has links to more current information, please post them in the comments section of this blog. In the meantime, Oracle, please make information about your search and content processing systems findable.

Jasmine Ashton, September 16, 2011

Sponsored by Pandia.com

When Search Knows Best, Do You Get What You Want?

September 15, 2011

Bing’s term is adaptive search. The idea is that Bing, like the Google, “knows” what I want when I run a query. You can read “Adapting Search to You” to get the details with some spin, of course. How well do these adaptive systems work? If one is a member of a herd looking for sports scores or Lady Gaga news, adaptive search makes life easier. However, if you run some real world queries, adaptive search is maddening.

I was trying to locate flight information from San Francisco to Paris with a return to Washington, DC. One of the adaptive search services concluded that I was in Spain. I was in Austria. Then, when the information displayed, the language was German with a link that said, “To visit our main site, click here.” Guess where the “adaptive system” sent me. Give up? I was shown a page in Italian. Sure, I am an outlier, but the “smart” systems get confused with real world situations.

When one jumps to a mission critical search, adaptive systems and smart software can return information that may not be what is required. I can work around most problems, resorting to for-fee services when the retrieved information is off point. Other online searchers may suck up what’s offered and make a decision on incorrect or distorted information.

Do you know what you want when you search? Do you know if the information is not on the mark?

Stephen E Arnold, September 15, 2011

Sponsored by Pandia.com

Smartlogic Buys SchemaLogic: Consolidation Underway

September 15, 2011

Mergers have captured the attention of the media and for good reason. Deals which fuse two companies create new opportunities and can disrupt certain market sectors. For example, Hewlett Packard’s purchase of Autonomy has bulldozed the search landscape. Now Smartlogic has acquired SchemaLogic and is poised to have the same effect on the world of taxonomies, controlled vocabularies, and the hot business sector described as “tagging” or “metadata.”

As you know, Smartlogic has emerged as one of the leaders in content tagging, metadata, indexing, ontologies, and associated services. The company’s tag line is that its systems and methods deliver content intelligence solutions. Smartlogic supports the Google search technology, open source search solutions such as Solr, and Microsoft SharePoint and Microsoft Fast Search. Smartlogic’s customers include UBS, Yell.com, Autodesk, the McClatchy Company, and many others.

With the acquisition of SchemaLogic, Smartlogic aims to become one of the leading, if not the leading, company in the white hot semantic content processing market. The addition of SchemaServer to the platform adds incremental functionality and extends solutions for customers. The merger adds more clients to Smartlogic’s current list of Fortune 1000 and global enterprise customers and confirms the company as the leading provider of Content Intelligence Software. Jeremy Bentley told Beyond Search:

Smartlogic has a reputation for providing innovative Content Intelligence solutions alongside an impeccable delivery record. We look forward to providing Grade A support to our new clients, and to broadening the appeal of Semaphore.

SchemaLogic was founded in 2003 by Breanna Anderson (CTO) and Andrei Ovchinnikov (a Russian martial arts expert with a love of taxonomy and advisory board member) and Trevor Traina (chairman and entrepreneur; he sold Compare.Net comparison shopping company to Microsoft in 1999). SchemaLogic launched its first product in November 2003. The company’s flagship product is SchemaServer. The executive lineup has changed since the company’s founding, but the focus on indexing and management of controlled term lists has remained.

A company can use the SchemaLogic products to undertake master metadata management for content destined for a search and retrieval system or a text analytics / business intelligence system. However, unlike fully automated tagging systems, SchemaLogic products can make use of available controlled term lists, knowledge bases, and dictionaries. The system includes an administrative interface and index management tools which permit the licensee to edit or link certain concepts. The idea is that SchemaServer (and MetaPoint which is the SharePoint variant) provides a centralized repository which other enterprise applications can use as a source of key words and phrases. When properly resourced and configured, the SchemaLogic approach eliminates the Balkanization and inconsistency of indexing which is a characteristic of many organizations’ content processing systems.
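The centralized term list idea is easy to picture with a small sketch. This is not SchemaServer code or its API. The vocabulary, the `build_lookup` function, and the `tag` function are hypothetical, and the sketch assumes a simple model in which each preferred term has a list of variants. The point is only that every application tagging against the same list produces the same index terms.

```python
import re

# Hypothetical controlled term list: preferred term -> accepted variants.
VOCABULARY = {
    "electronic discovery": ["ediscovery", "e-discovery"],
    "metadata": ["meta data", "meta-data"],
}


def build_lookup(vocabulary):
    """Map every variant (and each preferred term itself) to its preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


def tag(text, lookup):
    """Return the sorted preferred terms whose variants occur in the text.

    Matching is case-insensitive and limited to whole words or phrases.
    """
    lowered = text.lower()
    found = set()
    for variant, preferred in lookup.items():
        pattern = r"(?<!\w)" + re.escape(variant) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.add(preferred)
    return sorted(found)
```

Two documents that say “eDiscovery” and “e-discovery” both end up tagged “electronic discovery,” which is the consistency the paragraph above describes.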

Early in the company’s history, SchemaLogic focused on SharePoint. The firm added support for Linux and Unix. Today, when I think of SchemaLogic, I associate the company with Microsoft SharePoint. The MetaPoint system works when one wants to improve the quality of SharePoint metadata. But can the system be used for eDiscovery and applications where compliance guidelines require consistent application of terminology? Time will tell, particularly as the market for taxonomy systems continues to soften.

Three observations are warranted:

First, not since Business Objects’ acquisition of Inxight has a content processing deal had the potential to disrupt an essential and increasingly important market sector.

Second, with the combined client list and the complementary approach to semantic technology, Smartlogic is poised to move forward rapidly with value added content processing services. Work flow is one area where I expect to see significant market interest.

Third, smaller firms will now find that size does matter, particularly when offering products and services to Fortune 1000 firms.

Our view is that there will be further content centric mergers and investments in the run up to 2012. Attrition is becoming a feature of the search and content processing sector.

Stephen E Arnold, September 15, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
