Hadoop Gaining Ground on RDBMS Like a Smart Car Climbing Pike’s Peak
September 6, 2011
Open-source Apache Hadoop software is co-existing in the market with the more established relational database management system (RDBMS). Computer World reports on this in “Hadoop Growing, Not Replacing RDBMS in Enterprises.” We learned:
Hadoop is designed to help companies manage and process petabytes of data. Much of the technology’s appeal lies in its ability to break up very large data sets into smaller data blocks that are then distributed across a cluster of commodity hardware for faster processing. Early adopters of the technology, including Facebook, Amazon, eBay and Yahoo, have been using Hadoop to store and analyze petabytes of unstructured data that conventional RDBMS setups couldn’t handle easily.
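The block-splitting idea in the passage above can be sketched in a few lines of Python. This is only an illustration of the split, process, and merge pattern, not Hadoop's actual API: the function names, the block size, and the use of local threads as a stand-in for cluster nodes are all assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Break a large list of records into smaller, fixed-size blocks."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def count_words(block):
    """Map step: count words in one block, independently of all other blocks."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """Reduce step: combine the per-block counts into a single total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def distributed_word_count(records, block_size=2, workers=4):
    """Split the records, count each block in parallel, then merge the results."""
    blocks = split_into_blocks(records, block_size)
    # Assumption: local threads stand in for the machines of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, blocks))
    return merge_counts(partials)


if __name__ == "__main__":
    sample = ["big data big", "data cluster", "big cluster cluster"]
    print(distributed_word_count(sample))
```

Because each block is counted without reference to any other block, the work can be spread over as many workers as there are blocks, which is the property the quoted passage credits for faster processing.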
Computer World’s review is not completely negative, but rather restrictive in our view. RDBMS has organizational inertia on its side, an obstacle any newcomer has to conquer. RDBMS is entrenched in the rigid world of transaction data, customer information, and call records. However, Hadoop is adept at handling creative sectors such as event data, search engine results, and text and multimedia content from social media sites. Security concerns are also cited, although as adoption becomes more widespread those concerns are sure to lessen.
Our view is that in the present financial environment, open source is likely to suffer severe pressures. Giant, for-profit companies will want to capitalize on open source goodness and then implement a fiercely commercial pricing model for services, training, consulting, engineering, and proprietary extensions. Big money will lure key developers, and the “community” may be subject to London, UK-style dissension. Yikes!
Emily Rae Aldridge, September 6, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Linguamatics Scores Big with Text Mining
September 6, 2011
Wouldn’t it be great if there were a way to sift through all the chatter on Twitter and other social media sites to get to the real meat and potatoes? What if companies could find the proverbial needle in the Twitter haystack? Cambridge-based Linguamatics is doing just that, as reported in the Business Weekly article “Tweet Smell of Success.”
The small company (only 50 employees after expanding) caught the world’s attention with its text-mining skills. Last year, using its search expertise, the company very accurately predicted the outcome of an election based on the tweets posted during a live, televised debate.
Their core technology was developed by the four original founding members, three of whom remain at the company. They have expanded rapidly in their ten years of business and rely solely on income. They believe their success is due to their unique search approach.
David Milward, CTO and co-founder said: ‘We knew that language processing could get people relevant information much faster than traditional search methods. However, previous systems needed reprogramming for different questions: we wanted to give users the flexibility to extract any information they wanted.’
Linguamatics is just one of many emerging search management companies, each with its own niche. With business and technology constantly shifting to newer and faster methods of getting information, it is no surprise that businesses demand better search methods. More and more information is popping up within the internet, intranets, file-sharing and other data storage entities. Traditional brute force search looks less and less useful to the professionals in some of these hot new market sectors.
Catherine Lamsfuss, September 6, 2011
Protected: SharePoint Social Tools Gets a Major Endorsement
September 6, 2011
Inteltrax: Top Stories, Aug 29 to Sept 2, 2011
September 5, 2011
Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, pulling these stories from across a wide spectrum of analytic topics.
Our feature this week, “Definition of Big Data Evolving” took an inside look at how customers, not designers, are sculpting what we will come to call “big data” in the future.
Another story, “JP Morgan Shows No Sign of Analytic Slowdown” explains how JP Morgan cut its costs by investing in faster analytic tools.
Another interesting story, “Digital Reasoning Beefs up its Front Office,” showed how one of the business intelligence/data analytics world’s fastest risers is strengthening its leadership with an expert in healthcare. (Beyond Search will run an interview with Dr. Ric Upton in a future issue.)
These stories and more made up our week as we follow the ever-evolving landscape of big data. Whether it’s executives changing titles or the changing terminology of the field, we’ve got our eyes on it all and will bring the latest scoop to readers.
Follow the Inteltrax news stream by visiting www.inteltrax.com
Patrick Roland, Editor, Inteltrax, September 5, 2011
Oracle Data Mining Update
September 5, 2011
The new Oracle Data Mining Update is generating buzz, including a piece by James Taylor entitled “First Look – Oracle Data Mining Update.” Oracle Data Mining (ODM) is an in-database data mining and predictive analytics engine that allows for the building of predictive models. Taylor highlights the features added in the latest version:
The fundamental architecture has not changed, of course. ODM remains a “database-out” solution surfaced through SQL and PL-SQL APIs and executing in the database. It has the 12 algorithms and 50+ statistical functions I discussed before and model building and scoring are both done in-database. Oracle Text functions are integrated to allow text mining algorithms to take advantage of them. Additionally, because ODM mines star schema data it can handle an unlimited number of input attributes, transactional data and unstructured data such as CLOBs, tables or views.
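To make the idea of in-database scoring concrete, here is a rough sketch using Python's built-in sqlite3 module. It does not use ODM or any Oracle API. The table, the column names, and the linear model coefficients are all hypothetical, chosen only to show a model being applied by a SQL statement so that the rows never leave the database.

```python
import sqlite3

# Hypothetical coefficients of an already-trained linear model (assumed values).
INTERCEPT = -1.0
W_CALLS = 0.5
W_SPEND = 0.25


def build_demo_db():
    """Create an in-memory database with a small, made-up customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "id INTEGER PRIMARY KEY, support_calls INTEGER, "
        "monthly_spend REAL, score REAL)"
    )
    conn.executemany(
        "INSERT INTO customers (id, support_calls, monthly_spend) VALUES (?, ?, ?)",
        [(1, 4, 100.0), (2, 0, 50.0)],
    )
    conn.commit()
    return conn


def score_in_database(conn):
    """Apply the model with one SQL UPDATE; no rows are pulled into Python to score."""
    conn.execute(
        "UPDATE customers "
        "SET score = ? + ? * support_calls + ? * monthly_spend",
        (INTERCEPT, W_CALLS, W_SPEND),
    )
    conn.commit()
    # Read the results back only to display them.
    return conn.execute("SELECT id, score FROM customers ORDER BY id").fetchall()


if __name__ == "__main__":
    connection = build_demo_db()
    for row in score_in_database(connection):
        print(row)
```

The design point is that the scoring arithmetic runs where the data lives, so only the results need to travel.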
The ability of ODM to build and execute analytic models completely in-database is a real plus in the market. The software would be a good candidate for anyone interested in using predictive analytics to take advantage of their operational data.
Emily Rae Aldridge, September 5, 2011
Statisticians Weigh In on Big Data
September 5, 2011
The Joint Statistical Meetings, the largest assembly of data scientists in North America, provided fertile ground this summer for a survey by Revolution Analytics on the state of Big Data technologies. Revolution Analytics presents the results in “97 Percent of Data Scientists Say ‘Big Data’ Technology Solutions Need Improvement.”
As the headline suggests, the vast majority of these experts crave improvement in the field:
The survey revealed nearly 97 percent of data scientists believe big data technology solutions need improvement and the top three obstacles data scientists foresee when running analytics on Big Data are: complexity of big data solutions; difficulty of applying valid statistical models to the data; and having limited insight into the meaning of the data.
Results also show a lack of consensus on the definition of “Big Data.” Is the threshold a terabyte? Petabyte? Or does it vary by the job? No accepted standard exists.
Survey-takers were asked about their future use of existing analytics platforms: SPSS, SAS, R, S+, and MATLAB. Most respondents expected to increase use of only one of these, the open source R project (a.k.a. GNU S).
Revolution Analytics bases its data management software and services on the R project. The company also sponsors Inside-R.org, a resource for the R project community. I’d have to see the survey to know whether the emphasis they found on R was skewed, but let’s give them the benefit of the doubt for now.
Cynthia Murrell, September 5, 2011
Is the Open Source Community Getting More Fractious?
September 5, 2011
Do we sense some edginess in the “community” for open source search? TheServerSide.com declares, “Lucene Should Just Shut Up about Java 7.” This rude headline is a response to those who have written on Lucene’s side, such as The H Open’s “Java 7 Paralyses Lucene and Solr.”
ServerSide writer Richard Mayhew defends Oracle and its release of the open source Java 7. He admits there were problems, as there usually are with revised software, but says that they are being addressed. He feels Lucene should have tested Java 7 early on and is overreacting to the problems:
It’s not like Java 7 was sneaking up on anyone, Oracle’s been doing webinars and presentations and press releases a lot lately to get the word out to whoever’s living under a rock and didn’t know. So Lucene should have said hey this thing’s coming, maybe we should try it. And when these geniuses did, it was too late to turn back without giving java 7 a huge black eye, which nobody needs.
We decline to weigh in on this particular debate. However, we do see it as a sign of a deeper, more critical problem in the open source fellowship. We think it warrants close observation.
Cynthia Murrell, September 5, 2011
Web Search Industry Challenged to Innovate
September 5, 2011
Google is the ultimate search solution, right? But have you noticed a curious lack of new ideas in the world of Web search? If so, you’re not alone. The criticism of Google won’t die.
Network World reports, “Computer scientist calls for Web search shake-up.” It seems that Oren Etzioni, who teaches computer science at the University of Washington, feels creative juices are in short supply in the Web search field. His commentary in Nature is only available to subscribers or those willing to pay per article, but writer Bob Brown provides a glimpse:
The main obstacle to progress ‘seems to be a curious lack of ambition and imagination,’ Etzioni writes in the piece.
Brown identifies the search critic: “[Dr. Oren] Etzioni, who directs the University of Washington’s multidisciplinary Turing Center, calls on search engineers and others to ‘think outside the keyword search box.’” He also works in the field of search and retrieval.
We learned:
[Dr. Etzioni] envisions more voice-based search that relies on increasingly intelligent computers like IBM’s Jeopardy-winning Watson and technologies like the Turing Center’s ReVerb software that can figure out how online information relates to each other.
Dr. Etzioni asserts that some changes will continue to be prompted by the move to smaller screens. Despite his criticism, he sees a bright future for the industry. He points to intelligent search developments in shopping search, like Decide.com, as an example of where we’re headed.
Eventually.
Cynthia Murrell, September 5, 2011
Protected: Simplifying Email within SharePoint: An ABB Case Study
September 5, 2011
CNN Opines about Alleged Gates-Page Parallels
September 4, 2011
Quote to note: I am not doing too many of the trade show carnivals these days. I am a bit tired, and 20-somethings give me a headache. However, I do keep a quotes file, and every once in a while I find a quote that looks like a keeper. Here’s a candidate from the CNN story “The New Bill Gates: Google’s Larry Page.” The passage:
Like Gates, Page is often described in otherworldly terms, a near-genius with autistic tendencies like counting the seconds out loud while you’re explaining something too slowly to him. Like Gates, he has run his own company for his entire adult life and has had uninterrupted success. Like Gates, he has an engineer’s soul and is obsessive about cutting waste — one of his first acts after taking over as CEO in April was to send an all-hands e-mail describing how to run meetings more efficiently. Like Gates, he is hugely ambitious — he once suggested that Google hire a million engineers and told early investors that he saw Google as a $100 billion company. That’s $100 billion in annual revenue, not just stock value. (It’s about one-third of the way there.) And like Gates, Page may have a blind spot about the intersection of business and the Beltway.
Whether one agrees or not, I find the public position regarding a powerful, feisty company like Google interesting. The use of the word “autistic” is fascinating. I wonder if the author knows Mr. Page or any person afflicted with autism? What will CNN do if its referral traffic slows? Come up with more snappy quotes or just buy AdWords? CNN may find itself doing some fresh thinking after this story with the “autistic” word and the clumsy parallels drawn between Bill Gates and Larry Page. Honk.
Stephen E Arnold, September 4, 2011