Webmasters Vocalize about the Google Panda Update
August 2, 2011
The race to the top is tough – especially in the race to Google’s top search rankings. “Google: Ask Yourself These 23 Questions if Panda Impacted Your Website” reports on the woes of some webmasters who’ve lost revenue due to Google’s rollout of the Panda Update, the algorithm change aimed at identifying low-quality pages and sites.
The article reports on a recent blog post by Google’s Amit Singhal outlining 23 questions webmasters should ask themselves about the quality of their own sites. But blog respondents also had something to say:
Singhal also notes that since Panda launched, Google has rolled out more than a dozen additional tweaks. But that doesn’t matter to a few people who have already commented on Singhal’s post, noting a very obvious flaw that Google still hasn’t conquered: scrapers are outranking the original content in many cases.
So what’s Google aiming at here? Could they be trying to give AdWords a boost? AdWords, the main advertising product and source of revenue for the information giant, is the reason you see sponsored links with your Google search. Advertisers pay for trigger words.
When a user searches on Google, ads tied to the relevant words appear as the sponsored links. So what does this mean for the little guy with high-quality Web content?
We’ll have to wait and see as Google continues to tweak and webmasters continue to vocalize through blog posts. The question we keep discussing at lunch is, “Are these changes intended to boost ad revenues or help the average Google searcher?”
Philip West, August 2, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Social Content Feed Tool from Know About It
August 2, 2011
When your Facebook, Twitter, and other social streams become convoluted enough, you can miss the link, photo, or music video you would have loved. You’ll never know – until now…maybe. Marshall Kirkpatrick looks at the new start-up, Know About It, in “New Service Sniffs out Secret Gems from across Your News Feeds.”
The service brings in all your subscribed content from major social networks, then offers a number of different ways to sort what it finds. My favorite is the filter called “Potentially Missed – links from people who don’t share a lot of links.”
Know About It explains on its Web site that it collects all the links passing through your social streams and performs a “bunch of analysis on each one to determine which are most likely to be of interest to you.”
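We can only guess at what that analysis entails. As a thought experiment, here is a minimal Python sketch of how the “Potentially Missed” filter might work; the field names and threshold below are our assumptions, not anything Know About It has published.

```python
from collections import Counter

def potentially_missed(links, max_shared=3):
    """Keep links from people who rarely share links, on the theory that
    infrequent sharers are more selective. The threshold is arbitrary."""
    share_counts = Counter(link["sharer"] for link in links)
    return [link for link in links if share_counts[link["sharer"]] <= max_shared]

# Example: only bob's link survives, because alice shares constantly.
stream = [{"sharer": "alice", "url": "http://example.com/%d" % n} for n in range(10)]
stream.append({"sharer": "bob", "url": "http://example.com/gem"})
print(potentially_missed(stream))
```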
Sounds helpful. The idea of sorting all your inbound information in a variety of ways is appealing. You can also look at the service’s recommendations based on your expressed interest or get a personalized email digest.
Mr. Kirkpatrick has not yet tested the service but likes the idea. What isn’t mentioned? Privacy. What exactly is the “bunch of analysis,” and where do all those links end up? With advertisers? Time will tell whether the start-up succeeds. But with the social Web growing at a relentless pace, social media users who want to sort their feeds likely won’t mind too much. We think these types of tools are likely to grow in importance as free real-time search becomes a difficult service to monetize.
Philip West, August 2, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Inteltrax: Top Stories, July 25 to July 29
August 1, 2011
Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, each dealing with surprising negative news from the analytics industry, and each offering a lesson for others. Inteltrax is powered by the Augmentext system.
Perhaps the most shocking tale of self-destruction we’ve seen in a while, “US Army is Not an Analytic Superpower,” detailed how this defense branch spent over $2 billion in taxpayer dollars on an analytic tool that never worked, when private companies could have been contracted for pennies on the dollar.
Another sort-of David and Goliath story, “SAS Falling Behind in the Cloud,” detailed how one-time business intelligence superpower SAS has rested on its laurels and, in the process, become a joke in the competitive and lucrative world of cloud-based analytics.
Finally, we served up a cautionary tale to those believing everything they read with “Parallel NFS Barely on the Radar.” This was a story of warning, as the company in question got some great press for its software, but has almost no history to back it up, which made us incredibly suspicious.
These three stories are, thankfully, the exception and not the rule. Every day we are wowed by news of analytics and business intelligence helping practically every business imaginable. However, there are always rotten eggs, even during an impressive time of growth. That’s why we’re here, to help readers sort out the good and the bad and make more informed decisions.
Follow the Inteltrax news stream by visiting www.inteltrax.com
Patrick Roland, Editor, Inteltrax, August 1, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Oracle Updates SES11g
August 1, 2011
We wanted to mention the update to Secure Enterprise Search (SES) to our Oracle fans. Users will want to upgrade to 11g Release 1 (11.1.2.2), which can be downloaded from Oracle’s Web site.
First, the token bullet list of “what’s new” straight from the Web site:
- All platforms available for download, including Windows 64-bit
- Oracle Access Manager integration for crawler and search application
- Oracle AutoVue CAD file support
- Custom lexers and stop words lists, on per-data source granularity
It’s nice to see that SES is ready to cooperate with Oracle’s own AutoVue; including drawings in an index is a must for several industries. Provided it proves functional, the added flexibility of per-source stop lists should increase accuracy and weed out those pesky repetitive user-specific terms.
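To make the stop list point concrete, here is a toy Python illustration of per-data-source stop word filtering. SES manages lexers and stop lists through its own administration tools, so the source names and terms below are hypothetical, not Oracle’s API.

```python
# Hypothetical per-data-source stop lists; SES configures these through
# its administration tools, not through code like this.
STOPLISTS = {
    "engineering_wiki": {"draft", "todo", "internal"},
    "public_site": {"click", "here", "homepage"},
}

def filter_terms(source, tokens):
    """Drop repetitive, source-specific terms before they reach the index."""
    stoplist = STOPLISTS.get(source, set())
    return [t for t in tokens if t.lower() not in stoplist]

print(filter_terms("engineering_wiki", ["Draft", "pump", "schematic", "TODO"]))
# -> ['pump', 'schematic']
```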
I scanned the release notes; no surprises here (a patch is a patch is a patch). There are several known issues, but, save a few exceptions, the workarounds are adequately documented. Watch out for a possible compatibility issue with IPv6. Keep in mind that Oracle acquired a natural language search engine with its InQuira purchase; NLP seems to be an area of interest for Oracle.
Sarah Rogers, August 1, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Vivisimo Granted Patent for Clustering
August 1, 2011
Great news for Vivisimo! A news release titled “Vivisimo Receives U.S. Patent for Clustering System and Method” reported:
“Before remix clustering, Vivisimo was the first tech firm to introduce on-the-fly clustering, which allowed users to see their search query results arranged in topic folders. This unique feature gave users the ability to review a list of similar results associated with their search. The invention of remix clustering took enterprise search to a whole new experience, allowing both consumers and employees to see what other subtler topics are connected to their search.”
If you haven’t had the pleasure of receiving neatly categorized search results courtesy of remix clustering, check out Yippy (formerly Clusty).
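For the curious, here is a toy Python sketch of the user-facing idea: sorting results into labeled topic folders on the fly. Vivisimo’s patented method is far more sophisticated; the labeling heuristic below is our own invention, offered only as an illustration.

```python
from collections import defaultdict

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "is"}

def cluster_results(results):
    """Toy on-the-fly clustering: file each result under the most frequent
    non-stopword in its snippet. Real systems build folder labels from
    phrases shared across many results."""
    folders = defaultdict(list)
    for result in results:
        words = [w.strip(".,;:").lower() for w in result["snippet"].split()]
        words = [w for w in words if w and w not in STOPWORDS]
        label = max(set(words), key=words.count) if words else "other"
        folders[label].append(result["title"])
    return dict(folders)

hits = [
    {"title": "Jaguar facts", "snippet": "The jaguar is a big cat; the jaguar lives in the Americas."},
    {"title": "Jaguar XK review", "snippet": "A review of the Jaguar XK sports car and car pricing."},
]
print(cluster_results(hits))
# -> {'jaguar': ['Jaguar facts'], 'car': ['Jaguar XK review']}
```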
Rattling off a client list including Cisco Systems, NASA, the German intelligence service, the Institute of Physics, the National Library of Medicine, the American Association for the Advancement of Science, et al. lends serious credibility to Vivisimo; needless to say, these aren’t your everyday Google users or trend surfers. If remix clustering is preferred by those in the business of information, that’s as good an endorsement as any.
So well done and congratulations, Vivisimo. I expect the clearing of this hurdle to spawn more innovation in the future, or perhaps litigation, which seems to be important to many organizations.
Never hurts the ol’ pockets, either.
Sarah Rogers, August 1, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
Solr Deep Paging Fix
August 1, 2011
Let’s face it: we have been spoiled by modern technology, and who has three seconds to spare? This ultimately is the question posed in the “Deep Paging Problem” post on the Solr Enterprise Search blog, which presents an interesting performance tweak for the open source system.
Querying data buried deep in the information banks can be a bit hairy. Even the search giant Google stands at arm’s length from the problem, returning only 90 pages or so of results. If Solr were asked to retrieve the 500th document from an index, it would have to cycle through each of the first 499 documents to grab it. What can be done to save valuable time as well as ease the strain on the system, you ask?
Here enters the power of filters, handy from cigarettes to spreadsheets and nearly everything in between. The author asserts:
“The idea is to limit the number of documents Lucene must put in the queue. How to do it? We will use filters to help us, so Solr we will use the fq parameter. Using a filter will limit the number of search results. The ideal size of the queue would be the one that is passed in the rows parameter of query. … The solution … is making two queries instead of just one – the first one to see how limiting is our filter thus using rows=0 and start=0, and the second is already adequately calculated.”
So use the two saved seconds in searching to write that down. One query to recover the first page of results and a second, two-part query to check the number of results and then return the desired elements. For a useful example of the code in action, check the original post linked above.
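In the meantime, here is a minimal Python sketch of the two-query approach, assuming a local Solr core at a placeholder URL; the query, filter, and field names are our stand-ins, not the original author’s.

```python
import requests

SOLR = "http://localhost:8983/solr/select"  # placeholder core URL

def fetch_page(query, filter_query, page, rows=10):
    """Two queries: the first (rows=0) only checks how limiting the fq
    filter is; the second pages within the much smaller filtered set,
    so the queue Lucene builds stays small."""
    base = {"q": query, "fq": filter_query, "wt": "json"}

    # Query 1: count the documents that survive the filter.
    probe = requests.get(SOLR, params={**base, "rows": 0, "start": 0}).json()
    num_found = probe["response"]["numFound"]

    # Query 2: an adequately calculated fetch within the filtered window.
    start = page * rows
    if start >= num_found:
        return []  # the filter window does not reach this page
    result = requests.get(SOLR, params={**base, "rows": rows, "start": start}).json()
    return result["response"]["docs"]

# Example: page 50 of books, pre-narrowed by a hypothetical price filter.
docs = fetch_page("category:books", "price:[10 TO 20]", page=50)
```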
Sarah Rogers, August 1, 2011
Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search