Constellio Profile Now Available

May 1, 2012

If you are tracking open source search, you can download a free profile of Doculibre’s Constellio system. Each week, ArnoldIT will make available a profile of an open source search vendor. You can request a copy of this week’s profile from our TheSeed2020 site. We leave each new profile “live” for one week. If you want the complete set, you will need to request each profile when the file becomes available.

The full collection will comprise 12 profiles. Once each profile has been available without charge, the full collection plus the market analysis and outlook sections will be available in a single PDF file for a service charge.

Stephen E Arnold, May 1, 2012

Sponsored by ArnoldIT

Inteltrax: Top Stories, April 23 to April 27

April 30, 2012

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, problems in the data analytics world and how they are overcome.

“The Trouble with Big Data and Social Media” took a look at the overwhelming glut of info brought on by social media and how analytics looks to wrangle it.

“Big Data Law Could Smooth Bad Government PR” actually looks to smooth over a prior problem. The government appears to be making nice with big data companies after threatening its reputation in a data mining suit.

“Big Data Downfall Not Believable” zeroes in on the naysayers of big data and proves them wrong at every turn.

As with any burgeoning industry, there are lows that go along with the highs. Often, as with the above stories, you can learn a lot from how people handle these rough patches. You can bet we’ll be studying these moments along with the highs every day.

Follow the Inteltrax news stream by visiting www.inteltrax.com

 

Patrick Roland, Editor, Inteltrax.

April 30, 2012

The Open Source Search Ostriches

April 30, 2012

ArnoldIT, located in Harrod’s Creek, Kentucky, has spotted a new species of search, content processing, and text mining vendor: The Scrutans Struthioniformes. Believed to be related to the ratites, this new subspecies is known to be indifferent to or ignorant of the predator from the open source jungle.

The proprietary search vendor, Scrutans Struthioniformes, ignores the impact of open source search and information retrieval systems.

ArnoldIT has completed a couple of exploratory expeditions through the wilds of open source search, clustering, and related disciplines. Sparked by the bimonthly feature on open source search which is currently appearing in Information Today’s Online Magazine, the discovery of the Scrutans Struthioniformes was unexpected.

For almost 50 years, information retrieval meant proprietary systems built upon innovations by academic researchers. Whether the influence was from the number crunching of the Cornell school or the semantic shenanigans from Stanford, search and retrieval translated to:

  1. Expensive to license, install, optimize, and maintain systems
  2. Licensing restrictions which prevented client-specific tailoring and fast cycle problem remediation or feature addition
  3. High levels of user dissatisfaction from the CFO’s office (the lady who pays the bills) to the user in the sales department (the person who has to find out what happened to a particular customer’s order).

What’s changed, according to ArnoldIT, is that open source options are readily available. Smart outfits like IBM killed off in house, brute force search efforts and embraced the open source Lucene/Solr technology. IBM is a proprietary outfit, but the use of Lucene/Solr allowed more effort to be put into value-adding projects such as the “wrappers” which make Watson a game show winner. IBM has also used its billions to purchase proprietary vendors to deliver “additional value.” The purchase of Vivisimo is a good example of a quick way to get clustering, deduping, and federating functions to bolt onto the open source plumbing. IBM may disagree, but we have our views.

Other vendors have built businesses on open source search. One example is the emergence of Lucid Imagination and its LucidWorks Enterprise 2.0 solution. Licensees get speedy search and retrieval, a staff able to answer questions, and the rapid cycle innovation of the open source Lucene/Solr software.

Clever Amazon is a “sort of” open outfit. On one hand, the company uses open source software to make the Amazon cloud work. On the other, the CloudSearch solution is based on A9, although Amazon provides “sort of” open application programming interfaces. Open source as a business angle is part of the CloudSearch play, along with making life easy for developers to deliver “good enough” search.

The Basho Riak Search angle is a variation. Riak Search is proprietary but Basho has made it open source. (A free profile of Basho is available by registering at TheSeed2020, an ArnoldIT content delivery Web site.) Good citizens and good marketing. For a company with a problem which requires Basho data management, the Riak Search solution is available, and it is open source.

There are other variations as well, and these are explained in the ArnoldIT briefing about open source search, its opportunities, and its challenges. Unlike the technology payloads delivered by blogs, the ArnoldIT briefing focuses on the business angle of open source search, and the research has delivered some shockers; for example:

  • In a sample of 35 proprietary search vendors, 25 assert that their systems are in some way open source. Good marketing, better technology, or great hyperbole?
  • In a sample of 100 search vendors, two thirds of those pinged by ArnoldIT know about or are on top of open source search. Quite an assertion as the Lucid Imagination Lucene Revolution approaches with dozens of case studies that reveal large companies’ willingness to shift from proprietary solutions to open source search. Are most vendors of proprietary search systems ignoring reality? Sure looks like some are confident the search world tomorrow will look the way it did in 2003.
  • Hosted search is gaining traction in some specific niches. Two of these niches have long been dominated by proprietary systems. More surprising is the fact that the greatest inroads are being made among the Fortune 1000. That’s the market where money often is for enterprise software vendors.

Will vendors of proprietary search and retrieval systems be able to keep their investors and stakeholders happy as open source becomes a greater force in 2013? The briefing considers the scenario when firms pour more funds into open source search and content processing start ups. If this happens, life becomes more difficult for “on the bubble” vendors of taxonomy, clustering, search, and basic information retrieval systems.

Net net: Another search revolution is brewing. Is your proprietary search vendor a Scrutans Struthioniformes? A better question: Are you? For more information about the ArnoldIT open source search briefing, write seaky2000 at yahoo dot com for options and fees. ArnoldIT may create an open source search ostrich T shirt. Stay tuned. Max and Tess are working on this project now.

Stephen E Arnold, April 30, 2012

Sponsored by Ikanow

Inteltrax: Top Stories, April 16 to April 20

April 23, 2012

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, how three of the biggest supporters of analytics are faring.

Surprisingly, transportation has taken a shine to analytics, as we discovered in “Transportation Analytics Grows Crucial to Success”.

Not so surprisingly, government spending is leaning heavily on analytics. “Intelligence Community Leads Public Sector Analytics” showed how spy agencies love analytics.

Unfortunately, the one-time titan of analytic love, the medical field, is falling behind, as we learned in “Healthcare Analytics Needs a Boost”.

While there are thousands of industries that utilize big data analytics, these three are probably the most visible. Their successes and failures are important elements of the analytic story and ones we’ll be monitoring daily.

Follow the Inteltrax news stream by visiting www.inteltrax.com

 

Patrick Roland, Editor, Inteltrax.

April 23, 2012

Algorithms Can Deliver Skewed Results

April 18, 2012

After two days of lectures about the power of social media analytics, Stephen E Arnold raised doubts about the reliability of certain analytics outputs. He opined: “Faith in analytics may be misplaced.”

Arnold’s lecture focused on four gaps in social media analytics. He pointed out that many users were unaware of the trade offs in algorithm selection made by vendors’ programmers. Speaking at the Social Media Analytics Summit, he said:

Many companies purchase social media analytics reports without understanding that the questions answered by algorithms may not answer the customer’s actual question.

He continued:

The talk about big data leaves the impression that every item is analyzed and processed. The reality is that sampling methods, like the selection of numerical recipes, can have a significant impact on what results become available.

The third gap, he added, “is that smart algorithms display persistence. With smart software, some methods predict a behavior and then look for that behavior because the brute force approach is computationally expensive and adds latency to a system.” He said:

Users assume results are near real time and comprehensive. The reality is that results are unlikely to be real time and are built around mathematical methods which value efficiency and cleverness at the expense of more robust analytic methods. The characteristic is more pronounced in user friendly, click here type of systems than in those which require users to specify a method using SAS or SPSS syntax.

The final gap is the distortion that affects outputs from “near term, throw forward biases.” Arnold said:

Modern systems are overly sensitive to certain short term content events. This bias is most pronounced when looking for emerging trend data. In these types of outputs the “now” data respond to spikes and users act on identified trends often without appropriate context.

The implication of these gaps is that outputs from some quite sophisticated systems can be misleading or present information as fact when that information has been shaped to a marketer’s purpose.
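The near term bias described above can be made concrete with a small sketch. The daily mention counts and the `window_mean` helper below are hypothetical, invented for illustration; they do not come from the lecture or from any vendor's system. The idea is simply that a "now" view built on one day of data reports a far larger jump than a view that keeps the week's context.

```python
from statistics import mean


def window_mean(counts, window):
    """Average of the most recent `window` observations (hypothetical helper)."""
    return mean(counts[-window:])


# Hypothetical daily mention counts for one topic: steady, then a one-day spike.
daily_mentions = [100, 98, 103, 101, 99, 102, 400]

baseline = mean(daily_mentions[:-1])        # the six days before the spike
now_view = window_mean(daily_mentions, 1)   # "now" data only
week_view = window_mean(daily_mentions, 7)  # the spike seen in context

print(f"now vs baseline:  {now_view / baseline:.2f}x")
print(f"week vs baseline: {week_view / baseline:.2f}x")
```

A user who sees only the first figure may act on an "emerging trend" that the second figure suggests is a single spike. Which window is appropriate depends on the question being asked, which is the point of the first gap.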

The Social Media Analytics conference was held in San Francisco, April 17 and 18, 2012. More information about the implications of these gaps may be found at the Augmentext.com Web site.

Donald C Anderson, April 18, 2012

Sponsored by Pandia.com

Inteltrax: Top Stories, April 9 to April 13

April 16, 2012

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, the ways in which money is dealt with in analytic terms.

Saving money is the focus of “Knowing Needs and Wants Save Tons with Big Data” which aims to help buyers decide what they want in an analytic package before buying.

Making the right investment for you is covered in “Speed is the Analytic Key” which says, above all other factors, spend extra money on speed because that’s the quickest to get outdated.

Finally, “Series-B Investments Expand Analytic Growth” shows how smaller firms and startups depend on private investors to compete with the big names in big data.

Money makes the world go around and the big data planet is no different. But the ways in which it is saved and spent and acquired could fill a book. We are writing a new chapter every day and hope you’ll join us.

Follow the Inteltrax news stream by visiting www.inteltrax.com

 

Patrick Roland, Editor, Inteltrax.

April 16, 2012

Inteltrax: Top Stories, April 2 to April 6

April 9, 2012

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, lesser known industries falling in love with analytics.

“Safety Analytics Fits Every Industry” showed us how big data is adding major advances in public and private security.

“Small Biz Gaining in Big Data” told more about what we already know: data analytics helps level the playing field for small businesses.

“Customer Service Propels Many BI Companies” delves into the ways in which supporting users is helping vendors succeed.

Analytics is invading our world, often in the most unexpected places. This is just a small sampling of the deep research we provide every day.

Follow the Inteltrax news stream by visiting www.inteltrax.com

Patrick Roland, Editor, Inteltrax.

April 9, 2012

Inteltrax: Top Stories, March 26 to March 30

April 2, 2012

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, specifically, the ways in which unstructured data is impacting the big data industry.

Our feature story this week, “Digital Reasoning Makes Major Move in Military,” shows how the leader in unstructured data wrangling is helping the military increase its reach.

“Unstructured Data Demands Right Tools” proves that not all unstructured data software is created equal. That’s not a bad thing; it’s just a shell game for users to find the right one for their needs.

“Governments Get Self Conscious with Analytics” showed how clever government agencies are clearing up inaccuracies and becoming more efficient by utilizing the massive collections of unstructured data lingering in their systems.

If you aren’t familiar with the term “unstructured data” you will be. It’s the big horizon in the analytics world. We, fortunately, are well versed in the ephemeral stuff. It’s going to change the way the entire industry works and we’ll be following it every day.

Follow the Inteltrax news stream by visiting www.inteltrax.com

Patrick Roland, Editor, Inteltrax.

April 2, 2012

SAS Gets More Visual

March 31, 2012

Inxight (now owned by BusinessObjects, part of the SAP empire) is history at SAS or almost history. Now the company is moving in a different direction.

Jaikumar Vijayan writes about a new visual analytics application recently unveiled by SAS in his article “SAS Promises Pervasive BI with New Tool.” Einstein is believed to have once said “computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.” We noted this passage from Mr. Vijayan’s write up:

Unlike many purely server-based enterprise analytics technologies, Visual Analytics gives business users a full range of data discovery, data visualization and querying capabilities from desktop and mobile client devices, the company said.

The initial version of the new tool allows iPad users to view reports and download information to their devices. Future versions will support other mobile devices as well, SAS added. The Einstein quote is actually a good description of the concept that underlies Visual Analysis. The process uses analytic reasoning to detect specific information in massive amounts of data. For example, a clothing manufacturer might use it to determine current trends in ladies’ fashions. The results are presented in charts and graphs to the users, who can fine-tune the parameters until their specific queries are answered.

SAS is known for its statistical functionality, its programming language, and its need for SAS-savvy cow pokes to ride herd on the bits and bytes. Will SAS be able to react to the trend toward the consumerization of business intelligence?

While the technology is impressive, SAS may be a little late to the game. Palantir and Digital Reasoning have already introduced applications that offer clients powerful Visual Analysis capabilities. Time will tell if SAS is able to catch up to some competitors’ approach. We are interested in Digital Reasoning, Ikanow and Quid.

Stephen E Arnold, March 31, 2012

Sponsored by Pandia.com

Protected: You Do Not Need Hot Water to Shrink Your SharePoint Crawl Database

March 27, 2012

This content is password protected. To view it please enter your password below:
