Calais Web Service
April 22, 2012
Entity extraction and other value-added tagging have been moving from center stage to the supporting cast of Analytics: The Next Big Thing. If you want to get a sense of how entity extraction and other semantic functions provide raw inputs to analytics programs, navigate to the Open Calais Viewer. Copy some text and paste it into the input box. I used the contents of “Judge Alsup Decides He, Not the Jury, Will Decide the Issue of API Copyrightability.” Here’s the output from Open Calais:
Worth a look. Open Calais is open source.
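For readers who want to see how tagged entities become raw input for analytics, here is a minimal sketch. It does not call Open Calais and does not reflect how that service works internally. It assumes a tiny hand-built gazetteer (the `GAZETTEER` entries, the function names, and the sample sentence are all hypothetical) and simply counts the entity types it finds.

```python
import re
from collections import Counter

# Hypothetical gazetteer: surface form -> entity type. A real service
# uses far richer models; this is only a dictionary lookup.
GAZETTEER = {
    "Google": "Company",
    "Oracle": "Company",
    "Judge Alsup": "Person",
    "API": "Technology",
}

def extract_entities(text):
    """Return (surface form, type) pairs for every gazetteer hit in text."""
    hits = []
    for name, kind in GAZETTEER.items():
        # Word boundaries keep "API" from matching inside "APIs".
        for _ in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            hits.append((name, kind))
    return hits

def entity_counts(text):
    """Tally entity types, the sort of raw input an analytics tool consumes."""
    return Counter(kind for _, kind in extract_entities(text))

sample = "Judge Alsup will decide whether Oracle can copyright an API used by Google."
print(extract_entities(sample))
print(entity_counts(sample))
```

The counts, not the prose, are what a downstream analytics program would chart or correlate.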
Stephen E Arnold, April 22, 2012
Sponsored by PolySpot
Blekko Sees Drastic Spike in Search Traffic
April 22, 2012
Search Engine Land recently reported on Blekko, the spam free search engine, in the article “Blekko’s Traffic is Up Almost 400 Percent; Here are the CEO’s Five Reasons Why.”
According to the article, Blekko has seen a 337 percent gain in unique visitors since January. What are the reasons for this drastic increase, you ask? Blekko CEO Rich Skrenta attributes the spike to: improved index quality, dissatisfaction with Google, distribution partnerships, trade show and convention appearances, and the demise of Yahoo Site Explorer.
The article states:
“Google continues to have a near death-grip on search engine market share in the U.S., but the growth that Blekko — and DuckDuckGo, too — is seeing in recent months means that at least some searchers like to keep their options open. (And, much like the rumors of a Facebook search engine, the growth of Blekko and other search engines is also a Good Thing for Google as it deals with antitrust/monopoly concerns.)”
These numbers are small compared to Google’s billions of queries, but the trend is certainly interesting. With Yandex, the Russia-based search system, rattling its weapons in Europe, is a clash with Google coming?
Jasmine Ashton, April 23, 2012
Sponsored by PolySpot
SharePoint and Open Source: A Collision Ahead?
April 22, 2012
The latest release of SharePoint contains the features needed in order to function as a document, record, and content management system all rolled into one. In addition, it easily integrates into the MS Office Suite, particularly Office 2010 and 2007.
However, SharePoint 2010 lacks features that many users want, such as high-volume processing and document capture. In addition, because web technologies have come so far, users expect the same intuitive, interactive, and rich UI that modern websites offer. To reach that level, SharePoint has to be enhanced. Enter third-party solutions. Many enterprises use SharePoint (as of 2011, 78% of Fortune 500 firms did) and want more than its in-the-box features, yet lack the resources to build the enhancements themselves, so SharePoint enhancement solutions have become a booming business.
A good example is Best Bets. The SurfRay white paper “How to improve SharePoint Search with Best Bets in SharePoint 2010” discusses how this enhancement can improve the search experience of users:
Users who search in an enterprise context usually expect relevant document to populate the top of results every time, which can result in a poor user experience and abandonment of search when those expectations are not met. Best Bets, in effect, bypass the normal search engine algorithm’s assignment of relevance and value to a particular piece of content in order to make it rank highest for the selected keywords.
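The mechanism in the passage above can be sketched in a few lines. This is an illustration only, not SurfRay's or SharePoint's implementation: the `BEST_BETS` mapping, the `search` function, and the document paths are hypothetical. The idea is that an editor-curated entry for a keyword is pinned ahead of whatever the relevance algorithm returned.

```python
# Hypothetical editor-curated mapping: keyword -> document promoted to the top.
BEST_BETS = {
    "vacation policy": "hr/vacation-policy-2012.docx",
}

def search(query, ranked_results):
    """Return results with any Best Bet for the query pinned first.

    ranked_results is the list produced by the engine's normal relevance ranking.
    """
    bet = BEST_BETS.get(query.strip().lower())
    if bet is None:
        return list(ranked_results)
    # Pin the Best Bet at position one and drop its duplicate, if any,
    # from the algorithmic results.
    return [bet] + [doc for doc in ranked_results if doc != bet]

organic = ["blog/holiday-party.html", "hr/vacation-policy-2012.docx", "hr/benefits.pdf"]
print(search("Vacation Policy", organic))
```

Queries without a curated entry fall through to the normal ranking untouched, which is why Best Bets are usually reserved for a short list of high-traffic keywords.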
But SharePoint has many competitors, both licensed and open source. Open source systems such as Alfresco, Drupal, and MindTouch have seen their user and developer bases grow. Why? Probably mainly because of the zero licensing cost, and then because of their flexibility and wider browser compatibility.
But rumor has it that Microsoft will release SharePoint 15 this year. Will SharePoint remain a leader, or will it feel the type of pressure that open source search solutions bring? You can track news about open source search in our new publication, OpenSearchNews. There may be some fender benders, but the intersection of open source and proprietary software is getting more heavily used.
Lauren Llamanzares, April 22, 2012
Sponsored by PolySpot
Google Trims Its Sails
April 21, 2012
I live in rural Kentucky on a pond filled with mine runoff. I know zero about sailing on the open seas. However, I do know that the phrase “trim the sail” means a series of steps taken to deal with what the Dockside wearing crowd calls “heavy weather.”
Sailing ships with canvas sails can be tough to handle when the wind blows with gusto. The idea is that the sails should be rolled up in order to minimize the likelihood that a sailing vessel will turn over. The Ahabs call this capsizing. Old geese in Kentucky call this losing control.
The USS Google, the largest and most unsinkable search system based on advertising, is taking prudent measures to streamline itself. I would describe the actions as “trimming its sails.” The reason? My hunch is that the MBA-speak word would be “efficiency.” My word would be “control.”
“Spring Cleaning: Google Shuts Down Patent Search Homepage, One Pass, Google Related & More” informed me that Google is presenting a lower profile to the economic winds. The write up reports:
Ever since Larry Page took over as Google’s CEO, the company has shut down more and more of its products that were only being used by a limited number of users. Today, the company announced another round of “spring cleaning.”
But here’s the comment which caught my attention, a verbal fog horn perhaps:
As part of this process, Google is also retiring a number of APIs, but most importantly, it is moving to a one-year API deprecation policy across its products (that’s down from three years for some of the company’s APIs).
APIs matter little to the garden variety Google user. APIs do matter to the enterprise, and I think APIs may have a contribution to make to the legal process underway between Google and Oracle.
My view is that most people are blissfully unaware of many Google services. Seven years ago, I counted about 80 Google products and services in The Google Legacy. I no longer keep track of Google products and services because many of them seem anchored in Google’s brute force approach to content processing.
For me, the shift in Google’s approach to APIs signals that the company may be moving toward a more proprietary approach for developer interaction with Google services. I also think the shutdowns and direction changes may give some enterprises additional variables to consider before embracing a “total” Google approach to storage, email, and hosted applications.
A final thought: Perhaps Google knows a major storm is coming. Precautions may be designed to keep the USS Google safe until it reaches a safe harbor.
Stephen E Arnold, April 21, 2012
Sponsored by PolySpot
Big Data: Implications for Open Source and Proprietary Tools
April 21, 2012
During a webcast at OpenWorld Tokyo this month, Oracle President Mark Hurd zeroed in on the developments his company has made in the area of analytics. The overall theme of the presentation appears in “Oracle’s Mark Hurd Spells Out Analytics Vision”.
Hurd framed his remarks around the perils and promise held in ever-increasing amounts of digital information. “The amount of data on the planet is just huge,” he said. “I have bad news. It’s going to get worse.” He added:
The true question is how to get the right information to the right person at the right time to make the right decision. This is hard.
Come to think of it, all of the other major players in analytics – Microsoft, IBM, and SAP – talk about it in a similar light. The gist is that they’re making Big Data analytics technology available to businesses so that they can delve into both structured and unstructured data to unearth actionable knowledge. That is, minus the risks traditionally associated with it.
Among the updates Hurd announced was version 11.1.2.2 of Hyperion Enterprise Performance Management (EPM). The new version has modules for account reconciliation and financial planning, support for Exalytics, and an enhanced user experience, among others. Oracle also announced the release of Endeca Information Discovery, a system capable of combining unstructured and structured data sans modeling.
However, Oracle isn’t the only analytics player that is continuously expanding its feature set. SAP recently launched ActiveEmbedded. Several open source analytics players, such as Ikanow and Revolution Analytics, are going strong as well.
So what does this mean for proprietary solutions?
Enterprises continue to struggle with the amount of data that they have to manage as that amount skyrockets into the petabyte range. They therefore also have to upgrade their infrastructure, which means bigger costs on top of the license fees for proprietary tools. Open source analytics, aside from being free, allows businesses to create their own custom-fit analytics solution.
However, I believe that while open source analytics will eventually be more widely used, proprietary technologies will remain viable, and over time we’ll see companies use a blend of both to handle big data.
Lauren Llamanzares, April 24, 2012
Sponsored by PolySpot
SharePoint Development Tutorial Within 85 Pages
April 21, 2012
Although the SharePoint Fast search option is under assault from many quarters, many organizations want to “run what Microsoft brung.”
No longer do you have to scour the Web for basic tutorials on how to start SharePoint development. You can waste a lot of working hours researching sources, or you can save yourself time and money by heading over to SharePoint Tutorial and reading about their book, “Learn SharePoint Development.”
Pulling from six years of experience, the author pours his knowledge into a short, eighty-five page guidebook. Written with the absolute beginner in mind, it includes step-by-step instructions, a focused viewpoint with pictures for explanation, and concepts and practices for training, with source code to start.
“This document shows you basic concepts of SharePoint regarding development and deployment of solutions as well as customizations like Web Parts. It helps you to understand the basic development and deployment process and what elements are involved since the process differs from the ASP.NET process although SharePoint is based on ASP.NET.”
You are also treated to the tools for data organization, SharePoint environmental development, deployment, and Visual Studio 2010 basics. You can purchase the book for $24, and for an additional $10 you can get the source code as well. One of the problems I have with these SharePoint start-up books is that they hardly ever address SharePoint search. If you do not understand enterprise search concepts, then it is good to rely on SurfRay Ontolica, an enterprise search platform that requires little to no extra programming for adoption.
Whitney Grace, April 21, 2012
Sponsored by OpenSearchNews.com
Gartner Pushes Enterprise Social Network Integration
April 21, 2012
The regulators and project managers of certain government work will enjoy figuring out how the next big Gartner thing affects their day-to-day work.
Gartner has been quiet in the search sector. It is tough to make money in a space where open source and low cost or baked in solutions are widely available. But quite a few azure chip consultants have discovered social networks, which might be viewed as “search under a different name.”
Gartner’s blog expounds on “Search and the Enterprise Social Network.” There are some new buzzwords like “enterprise frictionless sharing” and some spice like “notifications,” but the message is that corporations have to figure out how to make social networks work for them.
Notifications do feature prominently in blogger Larry Cannell’s concerns. If an opportunity or problem arises regarding a client or project, interested parties can be automatically notified so they can take action quickly. Also, Cannell stresses the value of building a company knowledge base that integrates information from multiple applications. I think most will agree that these are meaningful issues which may support some fresh consulting revenues for the firm.
The write up follows the consultant’s middle path by stating:
“As multiple business applications become integrated with a social network site, a significant challenge will be the normalization of business entities across applications. For example, aligning a customer record in a CRM system with the same customer record in a warranty claims system. Without this alignment, cross-business application relationships cannot be captured within the site’s social graph and workers participating in the network will see duplicate customer profiles (one from the CRM system, the other from the warranty claims system).”
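The alignment problem the passage describes can be sketched briefly. This is a simplified illustration under stated assumptions, not anything Gartner proposes: the record layouts, the `normalize_key` rules, and the sample customer are hypothetical, and real entity resolution needs far more than name cleanup.

```python
import re

def normalize_key(name):
    """Reduce a company name to a comparison key (lowercase, no punctuation or legal suffix)."""
    key = re.sub(r"[^a-z0-9 ]", "", name.lower())
    key = re.sub(r"\b(inc|corp|corporation|llc|ltd)\b", "", key)
    return " ".join(key.split())

def merge_profiles(crm_records, warranty_records):
    """Group records from both systems under one profile per normalized key."""
    profiles = {}
    for source, records in (("crm", crm_records), ("warranty", warranty_records)):
        for record in records:
            key = normalize_key(record["name"])
            profiles.setdefault(key, {})[source] = record
    return profiles

crm = [{"id": "C-17", "name": "Acme Corp."}]
warranty = [{"id": "W-902", "name": "ACME Corporation"}]
profiles = merge_profiles(crm, warranty)
print(profiles)
```

With the two records folded into a single profile, the social graph would show one customer rather than the duplicates the passage warns about.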
Yes, I can see how that could be a valuable tool. The article also points out some areas where for-fee expertise may be useful:
“Another important question regarding these types of integrations will be the role of security semantics managed by the originating business applications. Should the social network site enforce the same access control specified in the business application or should a different set of controls apply to this information? In other words, should revoking a worker’s view permission to a customer’s record automatically revoke this access to the entity’s shadowed social network profile? Replicating a single business application’s security semantics with the social network site may be difficult. A more daunting task will be normalizing security semantics across multiple business applications feeding the social network site.”
Yes, such security reconciliation could be a challenge, but it seems like a fundamental function. Again, this all comes down to the message that organizations must figure out how to make social networks work for them. The assumption is that social networks are here to stay.
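One possible answer to the access control question, sketched under assumptions: the social profile is visible only when every source application feeding it grants the user view access. The "most restrictive wins" policy, the `can_view_profile` function, and the sample permissions are hypothetical. The article leaves the policy choice open.

```python
def can_view_profile(user, customer_id, app_acls):
    """Allow the shadowed social profile only if every source application
    that feeds it grants the user view access (most restrictive wins)."""
    return all(user in acl.get(customer_id, set()) for acl in app_acls.values())

# Hypothetical per-application view permissions: app -> customer -> users.
app_acls = {
    "crm": {"acme": {"alice", "bob"}},
    "warranty": {"acme": {"alice"}},
}

print(can_view_profile("alice", "acme", app_acls))
print(can_view_profile("bob", "acme", app_acls))
```

Under this policy, revoking a worker's view permission in any one business application automatically hides the shadowed profile, at the cost of hiding information the worker may still be entitled to see from the other system.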
Look, most “companies” are small, fewer than 50 employees. The big outfits can afford to experiment, though, if they desire. What we have here is solid azure chip output aimed at generating revenue.
We keep thinking about the interesting intersection of social networks and government contract requirements, regulatory requirements for compliance, and plain old trade secret common sense.
When it comes down to it, aren’t social networks little more than amplifications of traditional communication methods? Oh well, lashing the social hobby horse to search may be one way to keep senior management sufficiently nervous to seek outside wisdom. Gartner caters to large organizations in 85 countries and has been in business since 1979, which is longer than most of the companies offering expertise in social media, networks, and analytics.
Cynthia Murrell, April 21, 2012
Sponsored by OpenSearchNews.com
Sponsored by Pandia.com
Lucid Announces LucidWorks Enterprise 2.1
April 20, 2012
Lucid Imagination has released LWE 2.1. (LWE is the acronym for Lucid’s robust enterprise search solution, LucidWorks Enterprise.) “Lucid Imagination announces General Availability of LucidWorks Enterprise 2.1” emphasized that LWE offers support for both structured and unstructured data.
The write up said:
LucidWorks Enterprise 2.1 adds enterprise-class features and benefits that meet the demanding needs of organizations of any size. Out of the 30 Core Committers to the Apache Lucene/Solr project, eight individuals work for Lucid Imagination, making the company the largest supporter of open source search.
Among the enhancements in this release are:
- Addition of crawler scheduling
- Support for Drools Business Rules Management System and a framework which supports other business rules management systems
- Dynamic fields which permit schema free configuration
- Direct control of buffer and cache settings.
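The dynamic fields item can be illustrated with a short sketch. It is written in the style of Solr's wildcard field-name patterns, but it is not LWE's code: the `DYNAMIC_FIELDS` rules, the first-match policy, and the sample document are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical dynamic-field rules: name pattern -> field type.
DYNAMIC_FIELDS = [
    ("*_s", "string"),
    ("*_i", "int"),
    ("*_dt", "date"),
]

def field_type(name):
    """Return the type of the first pattern that matches, or None."""
    for pattern, kind in DYNAMIC_FIELDS:
        if fnmatchcase(name, pattern):
            return kind
    return None

doc = {"author_s": "Arnold", "page_count_i": 85, "published_dt": "2012-04-20"}
print({name: field_type(name) for name in doc})
```

Because the type is inferred from the field name, new fields can be indexed without declaring each one in a schema first.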
The system is available for on premises or cloud deployment. More information about the company is available at www.lucidimagination.com. The exclusive interview with Paul Doscher is available in the Search Wizards Speak series.
Stephen E Arnold, April 20, 2012
Sponsored by TheTrendPoint.com
Google Pursues Big Data
April 20, 2012
It was only a matter of time before Google wanted a piece of the big data pie, and the New York Times Bits column reports on the move in “Google Ventures’ Big Data Bet.” Did you know that Google funds an independent venture capital entity called Google Ventures? Currently, Google Ventures is building an internal data sciences team, and Hazem Adam Ghobarah is its most recent hire. Ghobarah formerly worked at Google for six years. He will spend his new career searching for investment opportunities in the data analysis business. Ghobarah will also advise the companies under Google’s umbrella on how they can gather and make use of information.
“It should not be too surprising that a Google-created entity should have this bent. Google, along with Web pioneers like Yahoo and Amazon, was crucial to the creation of the emerging Big Data industry. By tracking things like consumer clicks and the behavior of thousands of computer servers working together, they amassed large volumes of data at a time when collapsing prices for data storage made it attractive to analyze. They also captured information from nontraditional sources, like e-mail, leading them to create so-called “unstructured” database software like Hadoop and MapReduce.”
The methods Google uses to analyze web traffic and predict patterns can be applied to other fields as more data moves online. Google Ventures is one of many companies that are venturing into big data. Everyone is trying to make a buck from the next big trend. We will get a lot of companies chasing the big data client, but the question is whether their products and services will be top quality.
Whitney Grace, April 20, 2012
Sponsored by OpenSearchNews.com
Datameer Explains Its Services
April 20, 2012
SmartData Collective’s Bob Gourley gives his take on a fairly young company in “Datameer Provides End-User Focused BI Solutions for Big Data Analytics.” The company, founded in 2009, supplies a business intelligence platform that runs on top of the open source Hadoop engine from Apache. Gourley writes:
“Datameer provides a big data solution that focuses on perhaps the most important niche in this growing domain, the end-user. . . . I’ve met with the CEO (Stefan Groschupf) and other Datameer executives. I’ve also interacted with them in events like our Government Big Data Forum. Through these events plus demonstrations by some of their greatest engineers has led me to a few conclusions about Datameer. In general, I believe enterprise technologists should take note of this firm for several reasons.”
Those reasons include: familiar and easy-to-use interfaces; the availability of free trials; scalability; software wizards that guide non-techies through accessing and integrating data; the ability to deploy either on premises or in the cloud; and the ease with which capabilities can be expanded through plug-ins and open APIs.
These are all good features, it is true. But we still have one important question: what differentiates this outfit from such fast movers as Ikanow, Quid, and Digital Reasoning?
Cynthia Murrell, April 20, 2012
Sponsored by Pandia.com