WAND and Layer2 Team for SharePoint Taxonomy Functions

March 19, 2010

A happy quack to the reader who sent me a link to “Jump-Start Microsoft SharePoint 2010 Knowledge Management Using Pre-Defined Taxonomy Metadata”. The Microsoft Fast road show is wending its way among the Redmond faithful. In its wake, a number of companies see opportunity in the Microsoft demos. But with Microsoft making some tasty offers to incentivize those looking for search systems, Microsoft may be doing third-party add-on vendors and Fast ESP consultants a big favor.

The Earth Times’ article said:

In cooperation with WAND, Inc – one of the leading providers of enterprise taxonomies – Layer2 now offers pre-defined Taxonomy Metadata for Microsoft SharePoint Server 2010, a robust and expanding library of taxonomies covering a wide variety of domains to help jumpstart classification projects. Taxonomy Metadata for Microsoft SharePoint 2010 is currently available in 13 languages, e.g. English, French, German, Spanish, Italian, Portuguese, Japanese, Simplified Chinese, Traditional Chinese, Korean, and Vietnamese.

WAND has developed structured multi-lingual vocabularies with related tools and services to power precision search and classification applications. The company asserts that WAND makes search work better. WAND Taxonomies are used in online yellow pages and local search, ad-matching engines, business to business directories, product search, and within enterprise search engines. The firm’s library contains more than 40 domain specific taxonomies. WAND’s taxonomies are available in 13 languages.

Layer2 GmbH specializes in creating custom components and solutions for Microsoft SharePoint Products and Technologies. Based in Germany, Layer2 offers products and solutions that add features to portals based on Microsoft SharePoint technology.

My view is that Microsoft may be creating opportunities at the same time it leaves some SharePoint customers wondering why their systems do not work as expected. If taxonomy management was a priority, Microsoft should have included a system to perform this type of work within the SharePoint package. Third party vendors now have an opportunity to sell a “solution,” but customers may have to go through a learning process and then spend additional money to get the functionality required to make SharePoint more useful.

Perhaps another mixed result from SharePoint? Just my opinion.

Stephen E Arnold, March 19, 2010

Freebie. No one paid me to point out that talking about “taxonomies” is much easier than implementing a high value taxonomy and then enforcing consistent tagging across the processed corpus. I know that the IRS is good at indexing by social security number, so I will report non payment to that agency.

Google, Profile Capturing, and User Intent

March 18, 2010

On March 16, 2010, Google nailed a patent for its invention “Profile Based Capture Component,” US7,680,809. On the surface, maybe not so exciting. Google wizard Steve Lawrence had a hand in this invention filed in early 2004. Don’t you admire the turnaround time? Here’s the abstract:

An indexing system in a computer system may include applications, a capture processor, a queue, a search engine, and a display processor. The indexing system captures events of user interactions with the applications. Events are queued and if indexable, indexed and stored for user access through the search engine. Capture components in the capture processor can include a keyboard capture component that processes user keystrokes to determine events. A display capture component captures event data from windows associated with the applications. Display event data can be captured on a polling schedule or based on state changes of window elements. To determine target applications and window applications of interest application profiles and window profiles can be used.
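The flow in the abstract (capture events, queue them, index the indexable ones, serve them through search) can be pictured with a minimal Python sketch. This is only an illustration of the general idea, not the patented method. The `Event` class, the `APP_PROFILES` set, and the function names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    app: str    # application that produced the event
    kind: str   # e.g. "keystroke" or "display"
    text: str   # captured content


# Hypothetical application profile: the applications of interest.
APP_PROFILES = {"mail", "browser"}


def is_indexable(event):
    """An event is indexable if its app is profiled and it carries text."""
    return event.app in APP_PROFILES and bool(event.text.strip())


def run_pipeline(events):
    """Queue captured events, then index the indexable ones word by word."""
    queue = deque(events)
    index = {}
    while queue:
        event = queue.popleft()
        if not is_indexable(event):
            continue
        for word in event.text.lower().split():
            index.setdefault(word, []).append(event)
    return index


def search(index, term):
    """Return the captured events whose text contained the term."""
    return index.get(term.lower(), [])
```

In this sketch the profile check plays the role the abstract assigns to application and window profiles: it decides which captured events are worth indexing at all.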

Google’s predictive methods revealed.

Stephen E Arnold, March 18, 2010

A freebie. I will report non payment to the USPTO which recently rejected a fax because it was “backwards”. The other pages, as I understand the event, were not backwards.

Fabasoft Mindbreeze and Its Lotus Connector

March 18, 2010

I was able to read a white paper prepared by Fabasoft Mindbreeze about its updated Mindbreeze IBM Lotus Connector. The document is “Configuration of Mindbreeze Enterprise Search for IBM Lotus” and is available from the company. When I worked at Ziff Communications in New York City, I had an early exposure to the product. Since that time 20 years ago, Lotus Notes has found its way into many commercial and governmental entities. Those who love the product cannot live without it. People like me tolerate some of the system’s peculiarities exemplified by this question, “Why can’t you restore my email?”

Fabasoft is the successful Austria-based enterprise software and integration company. Mindbreeze is its search, content analytics and content processing subsidiary. The Mindbreeze engineers have developed a solution for organizations with Lotus Domino/Notes as well as a lot of other types of content systems. You can get Mindbreeze and its Lotus Domino/Notes support, snap it into your environment, and search for Notes content, even in mobile environments, including the RIM Blackberry, Apple iPhone, and Google Android devices.

In January 2010, I got a preview of the system and I received a copy of the white paper. I followed up with Daniel Fallmann, founder and managing director of Mindbreeze. Here’s what I learned in an email exchange on March 14 and 15, 2010:

What is the main focus of the IBM Lotus Domino/Notes support you offer?

What is very important for us is that the Mindbreeze Connectors run with a minimum of required configuration, even to very large scale. So Notes items and even complex Lotus Domino object models are very easy to adapt to fulfill the need of the customer/users. So Fabasoft Mindbreeze Enterprise makes it easy to search-enable any line-of-business application based on IBM Lotus Domino within a minimum amount of time with great results for the knowledge workers, even with their mobile information needs. Of course our customers get all the needed social search and federated search features built-in. We have a lot of Lotus partners that love the ease you can now search-enable IBM Lotus line-of-business applications.

As we offer an appliance as well you can buy the Fabasoft Mindbreeze Appliance or you can install Fabasoft Mindbreeze Enterprise and run the IBM Lotus Connector on a Linux environment which totally saves you the money of the operating system and enables you to even support our customer’s users with IBM Lotus line-of-business-application in the cloud with a modest and easy to calculate investment.

What is the method for indexing Lotus Mail which has been moved to an archive?

Fabasoft Mindbreeze Enterprise follows the “link-information” (for example, a link to an archived mail in a Fabasoft iArchive for IBM Lotus Notes) left in the remaining item stub and indexes the archived information by applying the rights based on the stub object that’s left in the IBM Lotus installation.

How are emails across Lotus Notes installations indexed so that only the authorized person can see a single email or a group of emails?

Fabasoft Mindbreeze Enterprise uses the rights based on an IBM Lotus object information to evaluate if a user has the right to read information based on the document level or even extensible to the field level. Things like inherited rights and user name fields are as well taken into account. As Fabasoft Mindbreeze Enterprise is based on a modern distributed service architecture, it is easy to spread queries against several instances and respond to a user’s query. Thanks to our innovative technology and architecture we typically are up and running at customers in between 30min and 2 days, of course this highly varies on the customer’s needs.
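For readers who want a concrete picture of document-level rights filtering, here is a minimal Python sketch. It is not Mindbreeze's implementation. The `Doc` class, the `readers` field, and the function names are assumptions made for illustration, and real Lotus rights (inherited rights, user name fields, field-level access) are far richer than a flat reader set.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    doc_id: str
    text: str
    # Hypothetical reader list: user names or group names allowed to read.
    readers: set = field(default_factory=set)


def visible_to(doc, user, groups):
    """True if the user, or one of the user's groups, may read the document."""
    return user in doc.readers or bool(doc.readers & groups)


def secure_search(docs, term, user, groups):
    """Match on text first, then drop every hit the user may not read."""
    term = term.lower()
    return [
        d.doc_id
        for d in docs
        if term in d.text.lower() and visible_to(d, user, groups)
    ]
```

The point of the sketch is the ordering: the rights check runs on every hit before anything is shown, so a user never sees a title or snippet from an email that belongs to someone else.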

When Lotus Notes is used with an IBM collaboration tool like Lotus Notes Traveler, how are the indexes federated so a single query retrieves the content across the Notes’s components?

First: It typically makes sense that Fabasoft Mindbreeze Enterprise crawler, filter, index services are located where the information is, so the best practice is to use the distributed architecture of Mindbreeze Enterprise Search to bring together information from several IBM Lotus databases. Second: As Fabasoft Mindbreeze Enterprise connects against the IBM Lotus web services it is even trivial to get information from different locations and index it in a central location.

How does your system’s pricing work?

We have a per named user pricing model as well as a concurrent use model that is very easy to calculate and use. Moreover the Mindbreeze IBM Lotus Connector supports custom object models as well and you can host the whole product on a Linux platform. As far as I know there is no other IBM Lotus Connector and search product that can so easily adapt to IBM Lotus Domino object models for your specific line-of-business application. This of course has to be taken into account.

Do you support the Notes – Cisco Unified Meeting Place?

Fabasoft Mindbreeze Enterprise allows you to index all the calendar information for all meetings that you are invited to by Cisco’s Unified Meeting Place. This information can be updated in the index during the meeting place notification mechanisms. You could even index the audio content streamed via a Cisco MeetingPlace Audio Server by using speech to text functionality.

If you want to index Lotus content, we think the Mindbreeze solution warrants a test drive. Contact the company at http://www.mindbreeze.com.

Stephen E Arnold, March 16, 2010

When I am next in Linz, Mindbreeze promised me a pastry. Until then, this is an uncompensated post.

Newsosaurs

March 15, 2010

I read “It’s Hard To Watch The Newsosaurs Turn A Blind Eye To Their Own Extinction” right after I flipped through the New York Times’s Sunday magazine clone from the Wall Street Journal outfit. Let me comment on each information MIRV and offer a couple of observations from my search vantage point.

First, TechCrunch’s write up has a killer comment:

Everyone wants to wall off the Web and keep grazing on declining ad revenues.

I agree. This is a combination of fear, anger, and ostracism. I enjoy pointing out that in the information economy, the traditional giants no longer own the country club. Each day, the former owners find their future will be as caddies to the new information elite. This is, I suppose, a bitter pill to swallow. The TechCrunch article includes the much quoted “burn the boats” admonition from one of the early superstars of the zippy-doo Web that is not the cat’s pajamas. Like Google’s advice to struggling industry, the listeners think that their outfits have already burned the boats, embraced technology, and reinvented themselves. This mismatch between advice and its perception is characteristic of the domain collision that is now taking place. The passage that caught my attention in the TechCrunch write up was:

The longer media companies wait, the bigger disadvantage they will have when they cross over to the other side and find a whole new host of competitors who never had any print legacy businesses to protect. Those competitors right now are blogs and online news hubs who are still furry little rodents in the underbrush, but who won’t stay little forever. The sooner print media companies cross over, the sooner they can be on pure offense. Their online strategies and business models won’t be crippled by any allegiance, or need to protect, to the old print business. If they wait until their online revenues become 25 or 50 percent before they fully commit, it will be too late.

I don’t disagree with the thought. I disagree with the “will be too late.” It is too late.

The example to which I refer is the oversized, glossy, 80 plus page WSJ Magazine filled with “reading.” Well, that’s interesting. I just counted about 32 pages of ads plus a number of features that are tough for me to determine if these are placed for consideration or are actual editorial. The stories focused on cars and fashion with a profile tossed in for good measure.

I remember being told by my Financial Times’s delivery agent before I dropped my print subscription that he tossed the magazine insert because it was too much of a hassle. I wonder if my delivery person for my Saturday WSJ will follow the same path.

Did I read any of the stories? The answer is, “No.” None of them appealed to me. I have a person who works for me who drives a Mini Cooper and it seems to have constant tire problems. I am tired of with-it executives who overcame hardship. Who hasn’t? Fashion? Not interested. I wear black Travel Smith jackets, black never wrinkle pants, and black shoes that do not set off any alarms anywhere I travel. Spare me the trendy. Was there any financial info, business intelligence, or juicy insights into making money grow? Nope. The WSJ added sports and now it is adding a New York Times’s magazine type publication every couple of months.

What’s my take?

  1. WSJ is going after the NYT advertisers. That’s okay but the effectiveness of print ads has to be demonstrable. That might be tough unless the editorial product provides some content consideration. The boundary between an auto story and an advertiser might be getting a few molecules narrower, might it not?
  2. The problem with traditional media is not content; the problem is finance and business models. Offering me 30 pages of ads in 80 pages of paper is somewhat 17th century in today’s world.
  3. The Financial Times’s last home delivery offer to me was $50 a year. Will the Wall Street Journal face the same subscription challenge as readers discover that blending sports, Details magazine editorial, and business profiles might be out of step with what subscribers like me do on a Saturday?

Now search? How will I be able to locate the Gucci suit on the WSJ Web site? Answer: Not until the WSJ figures out image indexing and some other search tricks. I bet that when the iPad version of the WSJ Magazine comes out I will be able to click on a suit and see a map of locations where I can buy a suit that will fit most 20 year old soccer players. Maybe for some folks. Not for me.

Stephen E Arnold, March 14, 2010

No one paid me to write this article. I will report a failure to charge for my writing to the editor of the Army Times, an outfit focused on information in the modern world.

BA-Insight: New Angle on Lead Generation

March 13, 2010

The Microsoft Fast search road show was in New York this week. I stayed in rural Kentucky watching the acid run off trickle into my goose pond. I took time out from this strenuous activity to read “BA-Insight Announces New Direct Access to Free Information and Resources for SharePoint Search and Fast.”

BA-Insight develops software, including Longitude which “helps people find and analyze relevant information across the entire enterprise independently of format or location.” The firm’s Web site has been revamped and features “an enhanced support portal and new free resource library specially designed for enterprise evaluating SharePoint or Fast Search or engaging in SharePoint or Fast Search deployments, including Fast ESP.”

I took a look at the site. The splash page is below, but you will see different graphics because the rectangular area features a slide show of information.

[Image: BA-Insight home page]

Source: http://www.ba-insight.net/Pages/Home.aspx

You can download white papers, get links to videos, and access the company’s Web logs. One of the documents is the Microsoft Enterprise Search 2010 Roadmap. When I clicked on that link, I saw another link and the icon labeled premium shown below.

[Image: BA-Insight premium document icon]

In order to access that document, I was given an option to fill in a form with my name, title, organization, phone, email, and interests. The angle seems to be that to get this document, one must go through a vendor like BA-Insight.

One of the goslings filled in the form and the road map is a single page that explains Microsoft’s five search technologies and lists the capabilities, repository indexing, and manageability features of each product. Interesting stuff.

Here’s one snippet of the roadmap, which is more of a table than a map in my opinion:

[Image: excerpt from the Microsoft Enterprise Search 2010 Roadmap]

Interesting stuff. Particularly with regard to scaling, I wonder if organizations will have the appetite for this type of hardware footprint on site. Will enterprise Fast ESP work from the cloud? © Microsoft 2010.

Several questions:

  • Will more search vendors shift into education or missionary marketing mode to move their systems?
  • In today’s financial climate, will the portal approach supplant the more traditional features-benefit type of marketing that characterizes some search vendors’ Web sites?
  • Has the complexity of the product offering broken the back of the adage “KISS” for business oriented communications?

I will watch to see if other vendors embrace the educational portal approach to sales and lead generation. The addled goose just makes information available via a blog, assuming that content with an edge will generate inquiries. Perhaps once again I am wrong?

Stephen E Arnold, March 13, 2010

No one paid me to write this short article. Because of the references to Microsoft and its five search options, I will report non payment to the Department of Defense, an organization with an interest in Microsoft’s technology.

Bitrix in the Enterprise Search Game

March 12, 2010

Short honk: A happy quack to the reader who sent me a link to “Bitrix Introduces the D.I.G.™ Engine: the Ultimate in Enterprise 2.0 and Web 2.0 Search Technology.” Bitrix was a company not familiar to me and there were no data in my Overflight service.

Bitrix, founded in 1998 and based in the Washington, DC area, asserts that it is a “technology trendsetter.” The company says:

Bitrix, Inc. specializes in the development of content management systems and intranet portal solutions for managing web projects and multifunctional information systems on the Internet. Deployed at more than 30,000 customers worldwide, Bitrix products are fast, reliable, easy to use and highly scalable…Bitrix takes pride in serving clients ranging from Fortune 500 companies to funded startups, including enterprises like Xerox, Toshiba, Epson, Samsung, Panasonic, Volkswagen, Hyundai, KIA, Gazprom, VTB, Zurich Insurance, DPD, PriceWaterHouseCoopers, Cosmopolitan, Vogue, PC Magazine, and many more.

The search system makes use of the firm’s D.I.G. Engine. D.I.G. is “an advanced search engine developed specifically for enterprise Intranets and Web sites that enables high-performance data search in texts, media content and documents with smart ranking, sorting and display. The engine is available in the company’s flagship products – Bitrix Intranet Portal and Bitrix Site Manager.”

The system “enumerates texts, media content and documents while looking for morphological stems and considering their density.” The search results are “filtered with respect to the user access rights before being displayed.” The company adds:

D.I.G. offers manual or immediate automatic data indexing, making content searchable right after its submission. Users may create complex search queries using query language, inclusion/exclusion masks and logic operators, as well as choose specific site sections for a highly targeted search. The technology supports AJAX-powered interactive pages, provides advanced taxonomy service with automatic tag cloud generation, allows making Google Sitemap, as well as a user-specific search form design. It covers English, German and Russian and enables fast and painless connecting of other languages with third-party stemming tables.
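The phrase “morphological stems and considering their density” can be made concrete with a toy Python sketch. This is not Bitrix's algorithm. The suffix-stripping stemmer below is deliberately crude, and every name in it (`SUFFIXES`, `crude_stem`, `stem_density`) is an assumption for illustration. A real system would rely on proper stemming tables, as the quoted passage notes.

```python
import re

# Crude English suffix list, checked in order. An illustrative assumption only.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_density(text, query):
    """Share of the words in text whose stem matches the query's stem."""
    stems = [crude_stem(w) for w in re.findall(r"[a-z]+", text.lower())]
    if not stems:
        return 0.0
    target = crude_stem(query.lower())
    return stems.count(target) / len(stems)
```

A document in which three of four words reduce to the query's stem would score 0.75 under this measure. A ranking function could then sort candidate documents by that score.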

There are screenshots of the company’s products on the firm’s Media Gallery page, but I did not see a search results example.

The company offers a “virtual appliance”. The idea is that multiple instances of Bitrix products can run on the same computer each in a virtual space.

Prices for the system are located at http://www.bitrixsoft.com/buy/intranet.php with the range in the $1,500 to $20,000 spectrum.

My impression is that search is an embedded feature, which exemplifies the trend of content management vendors trying to improve the utility of their systems.

Stephen E Arnold, March 12, 2010

No one paid me to write this. With the firm’s location near several interesting Federal entities, I will send an email to one of those Dot Mil addresses and report my status of free writer.

PageZephyr 2.0 Searches Popular Desktop Publishing File Types

March 12, 2010

Desktop search products typically support the popular formats of PDF, RTF, and text files, but it is the proprietary formats, such as Microsoft Publisher files, that companies must go to great lengths to index. Enter PageZephyr 2.0, Markzware’s new flagship product, completely bypassing the original content creation application while capturing, indexing, and delivering the results businesses need in real time. “Markzware’s Deep Desktop Search Finds Smoking Gun” highlights the top three file formats that PageZephyr now supports: QuarkXPress, Adobe InDesign, and Microsoft Publisher. In addition, PageZephyr 2.0 allows content merging, editing, and even Boolean search commands to round out its robust package. While not expensive (currently on sale at $149), PageZephyr 2.0 is aimed at a niche market: those interested in the searching, viewing, extraction, and distribution of these unique file formats. If getting inside those proprietary files is crucial, Markzware’s new product could be the ticket. If you use Framemaker, you are still out of luck for most search and query needs.

Sam Hartman, March 12, 2010

ArnoldIT.com paid Mr. Hartman for this write up.

The Taxonomy Torpedo

March 9, 2010

Quite an interesting phone call today (March 8, 2010). Apparently the article “A Guide to Developing Taxonomies for Effective Data Management” caught this person’s attention. The write up boils down the taxonomy job to a couple of pages of tips and observations. Baloney in my opinion.

The caller wanted to have me and a gosling provide the green light to a taxonomy project. The method was to use a couple of subject matter experts from marketing and an information technology intern. The idea was to take a word list and use it to index content with the organization’s enterprise search system.

The caller told me, “We let the staff add their own key words. There has been a lot of inconsistency. We will develop our controlled term list and that way we have date, time, and creator; the terms the users assign; and the words in our taxonomy. What do you think?”

What I think is that no one will be able to find some of the relevant data. I am surprised that so many vendors point out that their systems “discover” metadata and provide users with suggestions, lists of related content, and the ability to search by entities.

Doesn’t work.

Here’s why:

  1. Fancy interfaces (user experience in today’s lingo) require consistent, appropriate, and known tags. Most organizations, fresh from doing taxonomy push ups for a day, have wildly inconsistent term lists. A user may know how to locate a document in an idiosyncratic way. If that method involves a controlled word, the user may not get the results she was expecting.
  2. Automatic processes work well when the information objects have enough substantive content to make key word indexing work. I have examined a number of organizations’ content and found inconsistencies in the way in which the organization referred to itself. The controlled terms were rarely used. When a query included a controlled term, the user was puzzled why the result set was not complete.
  3. Most organizations lack the expertise and resources to create a well-formed controlled term list. Ad hoc lists are useful sometimes to those who cooked them up. A comprehensive controlled term list is a great deal of work.
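To see why an ad hoc term list leaves gaps, consider a minimal Python sketch that maps free-form user tags onto a controlled term list. The term list, the variants, and the function name are all hypothetical. The sketch shows only the mechanics of normalization, not a recommended taxonomy.

```python
# Hypothetical controlled term list: preferred term -> known variants.
CONTROLLED_TERMS = {
    "annual report": {"annual report", "yearly report", "10-k"},
    "invoice": {"invoice", "bill", "billing statement"},
}

# Reverse lookup: each variant points back to its preferred term.
VARIANT_TO_PREFERRED = {
    variant: preferred
    for preferred, variants in CONTROLLED_TERMS.items()
    for variant in variants
}


def normalize_tags(user_tags):
    """Split user tags into controlled terms and tags the list does not cover."""
    mapped, unmapped = set(), set()
    for tag in user_tags:
        key = tag.strip().lower()
        if key in VARIANT_TO_PREFERRED:
            mapped.add(VARIANT_TO_PREFERRED[key])
        else:
            unmapped.add(key)
    return mapped, unmapped
```

Every tag that lands in the unmapped set is content a query on a controlled term will miss. Shrinking that set means adding variants by hand, which is the "great deal of work" part.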

What’s this mean? The stampede to taxonomies will yield the same dissatisfaction that other partially implemented search features have. Talk is easy. Taxonomies and controlled term lists are tough to develop and even harder to keep current.

Stephen E Arnold, March 9, 2010

No one paid me to write this. I mention indexing, so I will report non payment to the Librarian of Congress, or maybe the librarian for the House library, or maybe the librarian for the Senate library. I wonder why there are three libraries for Congress.

Lexalytics Pushes toward PR Nirvana

March 6, 2010

I am easily confused when I read about “market intelligence”. I think this is different from “business intelligence” and “competitive intelligence.” I can’t put my finger on the exact meanings of these phrases. I read “Top Market Intelligence Companies Turn to Lexalytics for Powerful Sentiment Analysis” and I believe that “market intelligence” alludes to what customers and observers say in Web logs and other media. Figuring out if customers are happy or sad is important. I recall hearing a presentation by ClearForest about the importance of processing warranty information. In addition to cost savings, potentially dangerous problems with vehicles could, in certain circumstances, be identified from streams of information.

The Lexalytics spin on “market intelligence” hooks into text analysis and social media monitoring. The firm has two new projects. One is with a firm called Cymfony. I am not sure how Cymfony connects with people who have influence over others, but I accept the assertion “executive accolades” at face value. The second project is a license deal with Vocus. In both projects, Lexalytics will provide text analysis. Lexalytics is part of Infonics, a UK-based firm.

Search and content processing vendors are packaging their indexing and retrieval technology in many different ways. Sentiment analysis has been given a boost with the growing interest in monitoring real time streams of content.

Will niche plays generate enough cash to keep the many competitors in search and content processing afloat? With the high cost of sales, companies like Lexalytics will be among the first to provide real life case evidence about the strategy.

Public relations companies have been among the service sectors hit in the nose by the soft economy. The trajectory of these tie ups will be fascinating to watch.

Stephen E Arnold, March 6, 2010

No one paid me to write this short item. Because Lexalytics is part of a UK company as a result of a no-cash transaction, I am not sure which US authority requires me to report non compensation. Perhaps the Council on Foreign Affairs? I will give it a whirl.

Coveo Jumps to Version 6.1

March 4, 2010

As March begins, enterprise search giant Coveo has released its Enterprise Search Platform Version 6.1. As announced by Coveo, the updated version features new interfaces, better integration with your existing infrastructure, and complete indexing of desktop and e-mail (including attachments). Coveo’s “connectors” in this version are important, as they now allow searching of existing social network platforms like Jive and Confluence, and full search options across Enterprise 2.0 systems. In addition, the platform can be integrated into Outlook via the Coveo Outlook Sidebar, allowing users to search e-mails, attachments, and contacts all without leaving Outlook.

The Coveo Floating Desktop Searchbar hovers magically over any Windows application, allowing users to “search where they work.” Search results can now be mulled over in Coveo Search Analytics, giving users insight into the who-what-where of searches being performed within an organization. The most relevant information, if not showing up, can be promoted directly in search results. Information can also be grouped into dynamic mash-ups and dashboards allowing “composite views of key metrics” and a more efficient display of data.

Sam Hartman, March 4, 2010

ArnoldIT.com paid Sam Hartman to write this article.
