
Three Metasearch Vulnerabilities and DuckDuckGo

May 25, 2012

I read “The Digital Skeptic: DuckDuckGo Cooks Google’s Goose.” I am okay with online cheerleading. I like to use metasearch systems like DuckDuckGo, but my favorite Ez2Ask.com went away. Ixquick is okay, but each of these systems has three vulnerabilities. I want to highlight them before my addled goose brain forgets them. It is possible that those experts writing about metasearch or federating systems will want to consider these points. One or two might make the analysis a little tastier, sort of like pâté from a force-fed goose.

First, metasearch engines take a query and send it to a third-party index. The results come back and the results are ideally deduped, relevance ranked, and displayed for the user. Some metasearch systems perform a number of value adding functions. These include putting the hits in folders, which was Vivisimo’s claim to fame. Others parse the results by source type and display them in groups, a function which EZ2Ask.com offered while it was going full throttle from its redoubt in southern France. But when the third party indexes charge money to pull results or just block the metasearch engine, the party is over. Vivisimo built a crawler in order to have an original index for some applications. Most metasearch systems just hope that the third party index won’t change the rules. Anyone remember the original BOSS service and its flexibility? So, vulnerability one is losing a source of hits. No hits, reduced utility. Less utility means less traffic.

Second, when queries are sent to third party indexes, there is latency. There are tricks to mask the latency, but the fact is that in certain situations, the metasearch engine is either presenting a partial result set or one that is just slow to render. So vulnerability two is a performance headache for the metasearch crowd.
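To make the latency point concrete, here is a minimal Python sketch of the fan-out step under a deadline. Everything in it is an assumption for illustration: the `make_backend` stand-ins simulate third-party indexes with `time.sleep`, and the names `metasearch` and `deadline` are hypothetical, not anything DuckDuckGo or Ixquick is known to use. It shows only the trade-off: wait for the slowest source, or render a partial result set.

```python
import concurrent.futures as cf
import time


def make_backend(name, delay, hits):
    """Build a stand-in for a third-party index that answers after `delay` seconds."""
    def search(query):
        time.sleep(delay)  # simulated network and index latency; the query text is ignored
        return [f"{name}: {hit}" for hit in hits]
    return search


def metasearch(query, backends, deadline):
    """Query every backend in parallel and keep only the answers that beat the deadline."""
    pool = cf.ThreadPoolExecutor(max_workers=len(backends))
    futures = {pool.submit(search, query): name for name, search in backends.items()}
    done, not_done = cf.wait(futures, timeout=deadline)
    results = []
    # Sort by backend name so the merged list has a stable order.
    for future in sorted(done, key=lambda f: futures[f]):
        results.extend(future.result())
    late = sorted(futures[f] for f in not_done)  # sources that missed the deadline
    pool.shutdown(wait=False)  # do not block the response on slow sources
    return results, late


if __name__ == "__main__":
    backends = {
        "fast": make_backend("fast", 0.01, ["hit one", "hit two"]),
        "slow": make_backend("slow", 0.6, ["hit three"]),
    }
    print(metasearch("hewlett packard lynch", backends, deadline=0.2))
```

A short deadline keeps the page snappy but drops the slow source's hits; a long deadline does the reverse. That is the performance headache in one parameter.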

Third, deduplication. For some queries, the Web indexes will bang the same drum and loudly. A query for Hewlett Packard Lynch will generate many duplicate and near duplicate hits. The metasearch system must have a way to winnow the most egregious duplicates from the results list and quickly. Slow deduping or no deduping is bad. Partial deduping may be acceptable, but there is a trade off. So, vulnerability three is a results list which contains many identical or similar stories.
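One common way to winnow near duplicates is to compare overlapping word sequences ("shingles") and drop any hit that is too similar to one already kept. The Python sketch below is an illustration under stated assumptions, not a description of how any particular metasearch engine works: the function names, the shingle size of three words, and the 0.6 similarity threshold are all hypothetical choices, and the sample headlines are made up.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping `size`-word sequences in `text`, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dedupe(hits, threshold=0.6):
    """Keep a hit only if it is less than `threshold` similar to every hit already kept."""
    kept, kept_shingles = [], []
    for hit in hits:
        current = shingles(hit)
        if all(jaccard(current, other) < threshold for other in kept_shingles):
            kept.append(hit)
            kept_shingles.append(current)
    return kept


if __name__ == "__main__":
    headlines = [
        "HP says Autonomy founder Mike Lynch will leave the company",
        "HP says Autonomy founder Mike Lynch will leave the company.",
        "HP says Autonomy founder Mike Lynch will leave the company soon",
        "Oracle buys Endeca",
    ]
    print(dedupe(headlines))
```

Every new hit is compared against every kept hit, so the cost grows quickly with the size of the result list. That is where the trade-off between fast, partial deduping and slow, thorough deduping comes from.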

Why do a metasearch engine if there are vulnerabilities cheerfully overlooked in the “Cooks Google’s Goose” write up?

  1. Metasearch is a heck of a lot cheaper to pull off than brute force search.
  2. Users often prefer the convenience of having one system “pull together” what the user perceives as the most relevant content.
  3. Metasearch allows a marketer to engage in the type of promotion that produces the “Cooks Google’s Goose” article.

As an addled goose, I try not to be too confused about metasearch. Are you?

Stephen E Arnold, May 25, 2012

Sponsored by Polyspot

Big Outfits Buy Search Vendors: Does Chaos Commence?

May 25, 2012

I don’t want to mention any specifics in this write up. I have a for-fee Overflight on the subject. I do want to highlight some of the preliminary thoughts the goslings and I collected before creating our client-focused analysis. This write up was sparked by the recent news that the founder of Autonomy, which HP acquired for $10 billion, is seeking new opportunities after eight months immersed in the HP way. See “Hewlett-Packard Can’t Say It Wasn’t Warned about Autonomy.” This write up contained a remarkable statement, even when measured against the work of other “real” journalists:

Some will say this is a classic case of an entrepreneurial business being bought by a hulking, bureaucratic institution which failed to integrate it and failed to understand its culture. Others will say HP, desperate to do a deal, simply overpaid for a company that was going to struggle to maintain its sales and earnings momentum and was deluded about its abilities. Certainly warnings about the latter were there for HP to see before it handed over all that cash. Here’s what Marc Geall, a Deutsche Bank analyst who used to work at Autonomy, said in October 2010 about the business model: “…investment in the business has lagged revenues… [which] could affect customer satisfaction towards the product and the value it delivers.” He went on to warn that Autonomy’s service business was “too lean” and that it “risks falling short of standards demanded by customers”. All of which prompted Geall to question whether the company needed to change its business model – “traditionally, software companies have needed to change their business models at around $1bn in revenues”.

Yep, now the issues are easy to identify: the brutal cost of customer support, the yawning maw of research and development, the time and cost of customizing a system. The problem is that these issues have been identified. However, senior managers looking for the next big thing are extremely confident of their business and technical acumen. Search is a slam dunk. Heck, I can find what I want in Google. How tough can it be to find that purchase order? That confidence may work in business school, but it has not worked in the wild-and-crazy world of enterprise search and content processing.

Think back to the notable search acquisitions over the last few years. Here are some to jump start your memory:

  • IBM purchased iPhrase (a MarkLogic precursor with semantic components) and Language Analysis Systems (a next generation content processing vendor) in 2005 and 2006
  • Microsoft acquired Powerset and Fast Search & Transfer in the 2008 to 2009 period. Both vendors had next-generation systems with semantic, natural language processing, and other near-magical capabilities
  • Oracle acquired TripleHop in 2005, focused on its less-and-less visible Secure Enterprise Search line up (SES10g and SES11g), then went on a buying spree to snap up InQuira (actually the company formed when two weaker players, Answerfriend Inc. and Electric Knowledge Inc., merged in 2002 or 2003), RightNow (which uses the Q-Go natural language processing system purchased in 2010 or 2011), and Endeca, an established search vendor with technology dating from the late 1990s
  • SAP snagged some search functions with its NetWeaver buy in 2004, which coexisted in a truce of sorts with the SAP TREX system. When SAP bought Business Objects in 2007, the company inherited Inxight Software, a text analytics vendor with assorted wizardry explained in buzzwords by marketing mavens.

So what have we learned from these buy outs by big companies? Here are the observations:

First, search and content processing do not learn to sit, come, and roll over the way other types of software do. The MBAs, lawyers, and accountants issue commands like good organizational team players. The enterprise search and content processing crowd listens to the management edicts with bemusement. Everyone thinks search is a slam dunk. How tough can a utility function be? Well, let me remind you, gentle reader, search is pretty darned difficult. Unlike a cloud service for managing contacts, search is not one thing. Furthermore, those who have to use search are generally annoyed because systems have since 1970 failed to generate answers. Search outputs create more work. Usually the outputs are mostly wide of the mark. Big companies want to sell a software product or service that solves a problem like what is the backlog for the Midwestern region or when did I last call Mr. Jones? The big companies don’t get this type of system when they buy, often for a premium, companies which purport to make content findable, smart, and accessible. So we have a situation in which a sales presentation whets the appetite of the big company executive who perceives himself or herself as an expert in search. Then when anticipation is at its peak, the sales person closes the deal. In the aftermath, the executives realize that search just does not follow the groove of an accounting system, a videoconferencing system, or a security system. Panic sets in, and you get crazy actions. IBM pretty much jettisoned its search systems and fell in love with open source Lucene / Solr. Good enough was a lot better than trying to figure out the mysteries of proprietary search and how to pay for the brutal research and development costs search requires.

Second, search is a moving target. I find that as recently as my meetings with sleek MBAs from six major financial firms, search was assumed to be a no brainer. Google has figured out search. Move on. When I asked the group how many considered themselves experts in search, everyone replied, “Yes.” I submit that none of these well-paid movers-and-shakers are very good at search and retrieval. Few of them have the time or patience for old fashioned research. Most get information from colleagues, via phone calls which include “I have a hard stop in five minutes”, and emails sent to people whom they have met at social functions or at conferences. Search is not looking up a phone number. Search is not slamming the name of a company into Google. Search is not wandering around midtown Manhattan with an iPhone displaying the location of a pizza joint. Search is whatever the user wishes to find, access, know, or learn at any point in time and in any context. Google is okay at some search functions. Other vendors are okay at others. The problem is that virtually all search and retrieval solutions are okay. People have been trying for about 50 years to deliver responses to queries that are what the user requires. Most systems dissatisfy more than half their users and have for 50 years. A big company buying a next generation search system wants these problems solved. The big company wants to close deals, get client access licenses, or cloud transactions for queries. But the big companies don’t get these things, so the MBAs, lawyers, and accountants are really confused. Confused people make crazy decisions. You get the idea.

Third, search does not mean search. Search technology includes figuring out which words to index in a document. Search does a miserable job of indexing videos unless the video audio track is converted to ASCII and then that ASCII is indexed. Even with this type of content processing system, search does not deliver a usable output. What a user gets is garbled snippets and maybe the opportunity to look at a video to figure out if the information is relevant. Search includes figuring out what a user wants before the user asks the question or even knows what the question is. One company is collecting millions in venture money to achieve this goal. Good luck on that. Search includes providing outputs that answer an employee’s specific question. Most systems provide a horseshoe type of result; that is, the search vendor wants points for getting close to the answer. Employees who have to click, scan, close, and repeat the process are not amused. The employee wants the Smith invoice from April, not increased risk of carpal tunnel problems. The poobahs who acquire search companies want none of these excuses. The poobahs want sales. What search acquisitions generate are increased costs, long sales cycles, and much friction. Marketers overstate and search systems routinely under deliver.

Who cares?

Another enterprise search train wreck. The engineer was either an MBA, an accountant, or a lawyer. No big deal. Just get another search train. How tough can it be to run a search system? Thanks to http://www.eccchistory.org/CCRailroads.htm

Well, the executives selling big companies a search and content processing company just want the money. After years of backbreaking effort to generate revenues, the founders usually figure out that there are easier ways to earn a living. If the founders don’t bail out, they get a new job or become a guru at a venture capital firm.

Learning SharePoint and PowerShell is Worth It

May 25, 2012

In “Getting to Grips with PowerShell,” Robert Schifreen continues his SharePoint 2010 odyssey series by taking a closer look at the importance of PowerShell. The powerful scripting language may leave many users baffled, but Schifreen explains the benefits of being well-versed in PowerShell:

Although you can do all your SharePoint admin through the web interface via something called Central Administration, getting to grips with doing things in PowerShell is well worth the investment in time. For example, instead of sitting in Central Admin creating dozens of departmental site areas, you can take a CSV file of your department names and quickly turn it into a PowerShell script that does the job for you. If the resulting site structure then doesn’t look quite right, just alter the script, delete all the sites you created (using PowerShell, naturally), and run the script again. So being a good SharePoint admin means learning SharePoint and PowerShell. That’s just the start.
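Schifreen's own script is not reproduced in the excerpt, so the following is only a sketch of the CSV-to-script idea. To keep the examples on this page in one language, it uses Python to generate the PowerShell text. The `Department` column name, the base URL, the slug rule, and the `build_site_script` helper are assumptions made for illustration; `New-SPWeb` with `-Url`, `-Name`, and `-Template` is the SharePoint 2010 cmdlet for creating a subsite, and `STS#0` is the team site template, but check both against your own farm before running anything.

```python
import csv
import io
import re


def build_site_script(csv_text, base_url, template="STS#0"):
    """Turn CSV text with a `Department` column into PowerShell, one New-SPWeb line per row."""
    lines = ["Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue"]
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["Department"].strip()
        if not name:
            continue  # skip blank rows
        # Hypothetical slug rule: lowercase, runs of non-alphanumerics become a single hyphen.
        slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
        # PowerShell escapes a single quote inside a single-quoted string by doubling it.
        safe_name = name.replace("'", "''")
        lines.append(
            f"New-SPWeb -Url '{base_url}/{slug}' -Name '{safe_name}' -Template '{template}'"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    departments = "Department\nHuman Resources\nR&D\n"
    print(build_site_script(departments, "http://intranet.example.com/sites/dept"))
```

If the resulting structure looks wrong, the same CSV can drive a matching clean-up script, which is the alter, delete, and re-run loop the excerpt describes.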

Schifreen also suggests investing in a decent book on the topic, and if you want enterprise-level search, he suggests adding books on FAST, InfoPath, Business Intelligence connectivity services, Visual Studio, SharePoint Designer, and more. While enterprise search is an investment, there are out-of-the-box solutions that can save you valuable training and setup resources.

To bypass the need for some expensive or time-consuming training, consider a third party solution like Fabasoft Mindbreeze, which extends the capabilities of your SharePoint system. Their Web Parts based information pairing capabilities give you powerful searches and a complete picture of your business information, allowing you to get the most out of your enterprise search investments. And your end users will benefit from the fast and intuitive search with clearly displayed results and simple navigation.

Fabasoft Mindbreeze Enterprise gains each employee two weeks per year through focused finding of data (IDC Studies). An invaluable competitive advantage in business as well as providing employee satisfaction.

Mindbreeze’s intuitiveness means less training required. They also have tutorials and wikis that are easy to use and more efficient. Here you can browse Mindbreeze’s support tools for users, including videos, FAQs, wikis, and other training options. Check out the full suite of solutions at Fabasoft Mindbreeze.

Philip West, May 25, 2012

Sponsored by Pandia.com

Jetbox Introduces Enticing Solution to Old Problems

May 25, 2012

Finding the right balance between easy to use and robust enough to handle a company’s data is a problem with which many traditional product lifecycle management providers struggle.  A new company, Jetbox, has just recently entered the PLM field and promises that their solution is unlike any before.  The article, “Introducing Jetbox(TM), Inc., a Company that Has Reinvented the Way PLM Is Sold, Deployed, Used, and Maintained”, on Market Watch, announces all the ways Jetbox is not the same old, same old when it comes to PLM solutions.

The article describes the company:

“Included in its ground-breaking software is iC5(TM) Turbo, a comprehensive sales and implementation toolset that dramatically slashes the time and eliminates the expertise required to sell, plan, install, configure, document, test, deploy, train, use, support, and upgrade a PLM system. This powerful software toolset helps companies eliminate the need for a customized solution and a team of consultants, keeping within the constraints of an out-of-the-box approach.”

Jetbox’s approach to making PLM accessible to more people within an enterprise, while keeping costs affordable and making it easy to integrate with existing platforms, is exactly the direction PLM needs to be moving.  Other providers, like Inforbix, are also realizing that a PLM solution that is too difficult to operate and requires a lot of money to install and maintain does more harm than good.  We look forward to what Jetbox and similar providers come up with next.

Catherine Lamsfuss, May 25, 2012

Attivio Signs TCPlus as a Partner

May 25, 2012

The folks at Attivio must be pleased with their most recent success. MMD Newswire reveals, “TCPlus and Attivio Sign Partner Agreement for Australia and Switzerland.” Network component vendor TCplus Datennetz is adding Attivio’s Active Intelligence Engine (AIE) to its wares. AIE is a unified information access platform that goes beyond traditional data warehousing and enterprise search solutions. The press release emphasizes:

“Attivio AIE freely integrates and presents structured data (databases) together with unstructured content (documents, SharePoint, web content, email, etc.) enabling customers to know not only ‘what’ is happening, but also gain context to analyze ‘why’ it is happening. Organizations that implement AIE empower their business users to easily access and analyze all relevant enterprise information to identify new business solutions and opportunities that might otherwise go undiscovered.

“Attivio AIE offers the most accessible and standards-based approach to analytics of any UIA platform.”

AIE is compatible with SQL as well as with leading business intelligence and analytic platforms. The product has garnered several awards.

Headquartered in Newton, MA, Attivio also has offices in the UK and Germany. The company has made it its mission to unravel the unstructured data conundrum. It has partnered with prominent OEM/service providers and solution providers as well as technology vendors like TCplus Datennetz.

Founded in 2004, TCplus Datennetz sells and installs active and passive network components. They pride themselves on their expertise and their close relationships with their customers.

Cynthia Murrell, May 25, 2012

Sponsored by PolySpot

ZyLAB Embraces Predictive and Concept Searching

May 25, 2012

The CodeZed blog recently reported on the automated classification of legal documents in the article “Technology Assisted Review, Concept Search and Predictive Coding: The Limitations & Risks.”

According to the article, artificial intelligence and machine learning have been around since the 1980s, but a recent US ruling regarding the use of machine learning technology in legal review has stirred up trouble in the eDiscovery community. As a result of this ruling, one can expect a dramatic increase in eDiscovery software buyers requiring Predictive Coding, Concept Search, or other TAR capabilities.

When discussing some of the detriments of machine learning and artificial intelligence, the article states:

“Machine-learning requires significant set-up involving training and testing the quality of the classification model (aka the classifier), which is a time consuming and demanding task that requires at least the manual tagging and evaluation of both the training and the test set by more than one party (in order to prevent biased opinions). Testing has to be done according to best practice standards used in the information retrieval community (e.g. see the proceedings of the TREC conferences organized by the NIST). Deviation from such standards will be challenged in courts. This is time consuming and expensive and should be factored into the cost-benefit analysis for the approach.”
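The excerpt does not say which measures the testing should report. Precision and recall are the standard ones in the information retrieval community it points to, so the Python sketch below assumes those. The document ids, the reviewer tags, and the `precision_recall` helper are all hypothetical.

```python
def precision_recall(predicted, actual):
    """Score classifier output against reviewer tags on a held-out test set.

    Both arguments map a document id to True when the document is tagged responsive.
    Documents missing from `predicted` count as predicted non-responsive.
    """
    true_pos = sum(1 for doc in actual if predicted.get(doc) and actual[doc])
    false_pos = sum(1 for doc in actual if predicted.get(doc) and not actual[doc])
    false_neg = sum(1 for doc in actual if not predicted.get(doc) and actual[doc])
    # Precision: share of documents flagged by the classifier that reviewers also flagged.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: share of reviewer-flagged documents that the classifier found.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


if __name__ == "__main__":
    reviewer_tags = {1: True, 2: True, 3: True, 4: True, 5: False, 6: False}
    classifier_output = {1: True, 2: True, 3: False, 4: False, 5: True, 6: False}
    print(precision_recall(classifier_output, reviewer_tags))
```

The manual tagging of the test set is the expensive part the excerpt warns about; the arithmetic itself is trivial once those tags exist.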

So the short of it is, before using Technology Assisted Review make sure that you do your research and figure out what is best for your business.

Jasmine Ashton, May 25, 2012

Sponsored by PolySpot

SharePoint Options for iPhone and iPad

May 24, 2012

Mobile devices are moving from a business necessity to a formal mandate.  While apps exist for most major functions, tackling mobile access to SharePoint is a bigger creature to tame.  We may have a breaking development in, “Use SharePoint on your iPad or iPhone,” by Dave Johnson.

Collaborating with an iPad in a corporate environment is sometimes challenging, because the ubiquitous tablet hasn’t had the same access to SharePoint as laptops and desktops. Sharing a document, then, has meant sending attachments, and iPads were blocked from accessing the vast stores of corporate documents already archived in SharePoint. Now harmon.ie delivers substantially the same SharePoint experience as your desktop PC provides.

The free harmon.ie app offers an upgrade to a premium edition for a slim $20.  And while accessing an existing SharePoint infrastructure via an iPad or iPhone is now possible, the usability and efficiency are still in question.  How much work can actually be accomplished by forcing SharePoint to fit the mobile mold?

For heavy mobile users, we would recommend a smart third-party solution that integrates a mobile offering into its primary solution.  Fabasoft Mindbreeze Mobile is one such option.

Smartphones and tablets are constant companions, indispensable in the business world. Information needs to be able to be exchanged at all times and wherever you are. Easily. Quickly. Securely.  Fabasoft Mindbreeze Mobile makes company data available on all mobile devices, regardless of whether you have a BlackBerry®, iPhone®, Windows Phone or Android™ Smartphone or a tablet such as the Apple iPad, Samsung Chromebook/GalaxyTab or Blackberry Playbook. You can act independently and freely – yet always securely. Irrespective of what format the data is in.

Perhaps most importantly, security is never compromised with Mindbreeze’s compliant, industry-vetted solution.   So while some apps will let you access your SharePoint installation, Fabasoft Mindbreeze is built to make your mobile access fully functional.

Emily Rae Aldridge, May 24, 2012

Sponsored by Pandia.com

Tips for Implementing PLM for SMEs

May 24, 2012

Now that cloud-based technology has lowered the cost of implementing product lifecycle management (PLM) platforms to a level that most small and midsized enterprises (SMEs) can afford, many are still gun-shy when it comes to actually pursuing a suitable PLM solution.  A recent article, “Ten Tips for PLM Success”, on CADalyst, helps direct such companies in choosing a PLM solution.

One of the best tips is to hire a professional.  The analogy is made that PLM is like a do-it-yourself home repair project. Most of the time, people get in over their heads and are forced to call a professional to clean up the mistake, making the project more expensive and time consuming than necessary.  As for PLM, the article explains,

“PLM is no different. You need very specific expertise for these types of deployments, and many companies — out of a desire to save money and reduce the initial investment — think they can manage it internally with their IT team. Without the right level of knowledge on hand, however, deployments can take a lot longer and have a lower chance of success.”

All ten tips are great for small and midsized businesses intimidated by PLM solutions and the entire culture surrounding the once impossible notion of having one’s own PLM platform.  Choosing the right professional for the job can be tricky, which is why we recommend that companies look for providers with a proven record in customer service.  As the article also explains, implementing a PLM solution that no one understands or can use is worthless.

Catherine Lamsfuss, May 24, 2012

 

Autonomy Bomb Shell: Revenue Miss and Lynch to Exit HP

May 24, 2012

Navigate to “HP Cuts 27,000 Workers.” Here’s the passage I noted:

One person who will be leaving HP is Mike Lynch, the founder of big data software company Autonomy, who came to HP through its $10.24bn acquisition of Autonomy in August 2011 in the short era when Leo Apotheker was CEO at HP. Whitman said that Autonomy had a bad quarter and was disappointing, and added that Bill Veghte, the executive vice president in charge of HP Software as well as the company’s chief strategy officer, will take over Autonomy and that Lynch will leave HP after a transition period. Lynch’s exit follows that of Chris Lynch, the boss of HP’s other big data acquisition, Vertica, who left HP back in March. Whitman said that Autonomy was a “smart acquisition” and a “great product” and that the revenue miss was more a matter of execution than anything else and that it would take a few quarters to fix whatever the problem is.

Interesting.

Stephen E Arnold, May 24, 2012

Sponsored by Polyspot

Working with SharePoint Active Directory

May 23, 2012

Continuing to follow the excellent SharePoint series by Robert Schifreen, our current focus leads us to a discussion of Active Directory.  Read Schifreen’s full piece at “Active Directory: Tips from the front line.”

SharePoint can’t look up users’ information in Active Directory (AD) on the fly. Instead, it has a User Profiles database which, alongside each person’s username, holds their full name, job title, location, photograph, boss’s name, and dozens more attributes besides. You need to set up AD Sync to regularly and automatically copy everyone’s up-to-date details from AD into the User Profile database.  Unfortunately, despite this being one of the key facets of running a successful corporate SharePoint installation, it’s also buggy as hell.

All of these quirky pieces of SharePoint are what make it what it is: quirky.  However, they are also necessary.  That does not mean that they are effective or intuitive.  Smart third party solutions such as those offered by Fabasoft Mindbreeze may prove a less painful option that allows the user to achieve the same result.

Consider the federated search options of Fabasoft Mindbreeze Enterprise:

You can use the Account settings to activate data sources from a source directory.  If a federated data source needs authentication a dialog will appear. Each user can activate and deactivate the data source he/she needs.  Your administrator defines a source directory and everything else is up to your users. No domain level trusts have to be established.

News pieces such as this one by Schifreen are important for keeping up a running dialogue about the progression of SharePoint and how to make it more usable for a wider audience.  However, for many organizations, a smaller, more intuitive solution is a more suitable approach.  Such organizations should consider Fabasoft Mindbreeze and its suite of solutions.

Emily Rae Aldridge, May 23, 2012

Sponsored by Pandia.com
