Coveo and GEICO Host Webinar on March 23, 2010
March 21, 2010
Fierce Media has asked Beyond Search to facilitate a discussion about “how GEICO thinks about leveraging its data-rich enterprise systems to generate real-time business value and intelligence.” The participants are GEICO and Coveo as well as Stephen E Arnold.
Topics include how the Coveo system can:
- Enable improved business intelligence and decision making through dynamic dashboards and information mashups that provide actionable business information
- Access structured and unstructured data from across enterprise systems and repositories without complex integration or data migration, improving efficiency and cost effectiveness through a unified indexing layer
- Lower the cost of legacy system integrations and upgrades, and reduce time-consuming data migration
- Optimize social networks and incorporate the value of collaboration and just-in-time information exchange into the knowledge ecosystem
The audio program will be on Tuesday, March 23, 2010 beginning at 11:00am Eastern/8:00am Pacific. More information about Coveo may be found at http://www.coveo.com. You can register here.
Ben Kent, March 21, 2010, Beyond Search
This is a sponsored post.
WAND and Layer2 Team for SharePoint Taxonomy Functions
March 19, 2010
A happy quack to the reader who sent me a link to “Jump-Start Microsoft SharePoint 2010 Knowledge Management Using Pre-Defined Taxonomy Metadata”. The Microsoft Fast road show is wending its way among the Redmond faithful. In its wake, a number of companies see opportunity in the Microsoft demos. But with Microsoft making some tasty offers to incentivize those looking for search systems, Microsoft may be doing third-party add-on vendors and Fast ESP consultants a big favor.
The Earth Times’ article said:
In cooperation with WAND, Inc – one of the leading providers of enterprise taxonomies – Layer2 now offers pre-defined Taxonomy Metadata for Microsoft SharePoint Server 2010, a robust and expanding library of taxonomies covering a wide variety of domains to help jumpstart classification projects. Taxonomy Metadata for Microsoft SharePoint 2010 is currently available in 13 languages, e.g. English, French, German, Spanish, Italian, Portuguese, Japanese, Simplified Chinese, Traditional Chinese, Korean, and Vietnamese.
WAND has developed structured multi-lingual vocabularies with related tools and services to power precision search and classification applications. The company asserts that WAND makes search work better. WAND Taxonomies are used in online yellow pages and local search, ad-matching engines, business to business directories, product search, and within enterprise search engines. The firm’s library contains more than 40 domain specific taxonomies. WAND’s taxonomies are available in 13 languages.
Layer 2 GmbH is a specialist for creating custom components and solutions for Microsoft SharePoint Products and Technologies. Based in Germany, Layer2 offers products and solutions that add additional features to portals based on Microsoft SharePoint technology.
My view is that Microsoft may be creating opportunities at the same time it leaves some SharePoint customers wondering why their systems do not work as expected. If taxonomy management was a priority, Microsoft should have included a system to perform this type of work within the SharePoint package. Third party vendors now have an opportunity to sell a “solution,” but customers may have to go through a learning process and then spend additional money to get the functionality required to make SharePoint more useful.
Perhaps another mixed result from SharePoint? Just my opinion.
Stephen E Arnold, March 19, 2010
Freebie. No one paid me to point out that talking about “taxonomies” is much easier than implementing a high value taxonomy and then enforcing consistent tagging across the processed corpus. I know that the IRS is good at indexing by social security number, so I will report non payment to that agency.
InQuira Embraces the Cloud
March 19, 2010
I read “InQuira Puts It Knowledge Solutions in the Cloud” and learned that the approach “is in no way a light weight version.” On premises search systems can be tough to install, tune, and maintain. Blossom has been, in my opinion, one of the trail blazers for hosted search, and it offers a robust, powerful, and customizable solution. InQuira is moving in that direction as well.
According to the write up which quotes an InQuira officer:
InQuira has existing partnerships with Oracle CRM On Demand, Oracle’s Siebel offering, and Genesys Telecommunications Laboratories. The newest on-demand offering will extend the company’s reach…“[InQuira] has a really established reputation as the best-of-breed intelligent search vendor that quickly and easily integrates with everyone,” says John Ragsdale, vice president of technology research for the Technology Services Industry Association (TSIA).
One feature of the approach is that storage is provided in an “on demand” model.
You can get more information from www.inquira.com.
Stephen E Arnold, March 19, 2010
Freebie. No one paid me to write this. I will report non payment to the Bureau of Labor Statistics, an outfit that tracks work for no compensation each day, every day.
Fabasoft Mindbreeze and Its Lotus Connector
March 18, 2010
I was able to read a white paper prepared by Fabasoft Mindbreeze about its updated Mindbreeze IBM Lotus Connector. The document is “Configuration of Mindbreeze Enterprise Search for IBM Lotus” and is available from the company. When I worked at Ziff Communications in New York City, I had an early exposure to the product. Since that time 20 years ago, Lotus Notes has found its way into many commercial and governmental entities. Those who love the product cannot live without it. People like me tolerate some of the system’s peculiarities exemplified by this question, “Why can’t you restore my email?”
Fabasoft is the successful Austria-based enterprise software and integration company. Mindbreeze is its search, content analytics and content processing subsidiary. The Mindbreeze engineers have developed a solution for organizations with Lotus Domino/Notes as well as a lot of other types of content systems. You can get Mindbreeze and its Lotus Domino/Notes support, snap it into your environment, and search for Notes content, even in mobile environments, including the RIM Blackberry, Apple iPhone, and Google Android devices.
In January 2010, I got a preview of the system and I received a copy of the white paper. I followed up with Daniel Fallmann, founder and managing director of Mindbreeze. Here’s what I learned in an email exchange on March 14 and 15, 2010:
What is the main focus of the IBM Lotus Domino/Notes support you offer?
What is very important for us is that the Mindbreeze Connectors run with a minimum of required configuration, even to very large scale. So Notes items and even complex Lotus Domino object models are very easy to adapt to fulfill the need of the customer/users. So Fabasoft Mindbreeze Enterprise makes it easy to search-enable any line-of-business application based on IBM Lotus Domino within a minimum amount of time with great results for the knowledge workers, even with their mobile information needs. Of course our customers get all the needed social search and federated search features built-in. We have a lot of Lotus partners that love the ease you can now search-enable IBM Lotus line-of-business applications.
As we offer an appliance as well you can buy the Fabasoft Mindbreeze Appliance or you can install Fabasoft Mindbreeze Enterprise and run the IBM Lotus Connector on a Linux environment which totally saves you the money of the operating system and enables you to even support our customer’s users with IBM Lotus line-of-business-application in the cloud with a modest and easy to calculate investment.
What is the method for indexing Lotus Mail which has been moved to an archive?
Fabasoft Mindbreeze Enterprise follows the “link-information” (for example, a link to an archived mail in a Fabasoft iArchive for IBM Lotus Notes) left in the remaining item stub and indexes the archived information by applying the rights based on the stub object that’s left in the IBM Lotus installation.
How are emails across Lotus Notes installations indexed so that only the authorized person can see a single email or a group of emails?
Fabasoft Mindbreeze Enterprise uses the rights based on an IBM Lotus object information to evaluate if a user has the right to read information based on the document level or even extensible to the field level. Things like inherited rights and user name fields are as well taken into account. As Fabasoft Mindbreeze Enterprise is based on a modern distributed service architecture, it is easy to spread queries against several instances and respond to a user’s query. Thanks to our innovative technology and architecture we typically are up and running at customers in between 30min and 2 days, of course this highly varies on the customer’s needs.
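The rights check described in this answer can be pictured with a small sketch. This is not Mindbreeze code or the Lotus Domino API; the `Document` class, the `readers` set, and the `parent` link used for inherited rights are hypothetical names chosen only to illustrate document-level filtering of search hits.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Document:
    doc_id: str
    text: str
    # Users or groups allowed to read this item (hypothetical field).
    readers: set = field(default_factory=set)
    # Hypothetical link to a container whose rights are inherited.
    parent: Optional["Document"] = None


def effective_readers(doc):
    """Return the item's own readers, or the nearest ancestor's if none are set."""
    current = doc
    while current is not None:
        if current.readers:
            return current.readers
        current = current.parent
    return set()


def can_read(user, groups, doc):
    """True when the user, or one of the user's groups, is an allowed reader."""
    allowed = effective_readers(doc)
    return user in allowed or bool(groups & allowed)


def search(query, docs, user, groups):
    """Match on text first, then drop every hit the user may not read."""
    hits = [d for d in docs if query.lower() in d.text.lower()]
    return [d.doc_id for d in hits if can_read(user, groups, d)]


folder = Document("folder", "", readers={"claims-team"})
memo = Document("memo", "Smith purchase order", parent=folder)
private = Document("private", "Smith purchase order draft", readers={"alice"})
corpus = [folder, memo, private]
```

In this sketch a member of the claims-team group sees only the memo, which inherits its rights from the folder, while alice sees only her private draft. A production connector would resolve rights from the source system rather than from an in-memory set.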
When Lotus Notes is used with an IBM collaboration tool like Lotus Notes Traveler, how are the indexes federated so a single query retrieves the content across the Notes’s components?
First: It typically makes sense that Fabasoft Mindbreeze Enterprise crawler, filter, index services are located where the information is, so the best practice is to use the distributed architecture of Mindbreeze Enterprise Search to bring together information from several IBM Lotus databases. Second: As Fabasoft Mindbreeze Enterprise connects against the IBM Lotus web services it is even trivial to get information from different locations and index it in a central location.
How does your system’s pricing work?
We have a per named user pricing model as well as a concurrent use model that is very easy to calculate and use. Moreover the Mindbreeze IBM Lotus Connector supports custom object models as well and you can host the whole product on a Linux platform. As far as I know there is no other IBM Lotus Connector and search product, that can so easy adapt to IBM Lotus Domino object models for your specific line-of-business application. This of course has to be taken into account.
Do you support the Notes – Cisco Unified Meeting Place?
Fabasoft Mindbreeze Enterprise allows you to index all the calendar information for all meetings that you are invited to by Cisco’s Unified Meeting Place. This information can be updated in the index during the meeting place notification mechanisms. You could even index the audio content streamed via a Cisco MeetingPlace Audio Server by using speech to text functionality.
If you want to index Lotus content, we think the Mindbreeze solution warrants a test drive. Contact the company at http://www.mindbreeze.com.
Stephen E Arnold, March 16, 2010
When I am next in Linz, Mindbreeze promised me a pastry. Until then, this is an uncompensated post.
Limitations of MSFT Exchange 2010
March 16, 2010
I am not sure how one of my goslings came across this spreadsheet tucked away on the Microsoft Exchange Web log. When I tried to access the file, the system did not recognize my “official” Microsoft MSDN user ID nor my Windows Live credentials. So you may have to register to access the blog. Once there, you need to look for the download section and visually inspect the file names for the one that points to the Exchange Performance Excel spreadsheet. Running a query in the blog’s search box produced zero hits for me. But with some persistence and patience I was able to get a copy of the spreadsheet. Latency was a problem when I was fiddling with this download. (Note: if the link is dead, write one of the goslings at benkent2020 at yahoo dot com, and maybe he will email you a copy of this document.)
Once you get the document “Scalability Limitations”, you will see some pretty interesting information. One quick example is that the spreadsheet includes three columns of specifics about scaling amidst the more marketing oriented data on the spreadsheet. These three juicy columns are:
- Limitation
- Issue
- Mitigation.
Here’s the information for the row Database Size:
- Limitation–Exchange 2007 – 200GB; Exchange 2010 – 2TB or 1 disk, whichever is less
- Issue–The DB size guidance changed from 200GB (if you are in CCR) to 2TB or 1 disk, whichever is greater (if you have 2+ copies of the DB in question)
- Mitigation—Blank. No information.
Okay.
I hope you are able to locate this document. For those of you eager to install Exchange 2010, SharePoint 2010, and Fast Search 2010, you will want to make sure you have these types of spreadsheets at your fingertips * before * you jump on the Microsoft Enterprise steam engine. The information in the spreadsheet makes clear why some types of email content processing may be expensive to implement.
Stephen E Arnold, March 16, 2010
This is the equivalent of the free newspaper Velocity in Louisville. Read it for nothing. I will report working for no dough to the Jefferson County agency that thinks I work in Louisville when I spend most of my time in the warm embrace of airlines.
Indexing Craziness
March 15, 2010
I read “Folksonomy and Taxonomy – do you have to choose?,” which takes the position that a SharePoint administrator can use a formal controlled term list or just let the users slap their own terms into an index field. The buzzword for allowing users to index documents is part of a larger 20 something invention—folksonomy. The key segment for me in the SharePoint centric Jopx blog was:
The way that SharePoint 2010 supports the notion of promoting free tags into a managed taxonomy demonstrates that a folksonomy can be used as a source to define a taxonomy as well.
Let me try and save you a lot of grief. Indexing must be normalized. The idea is to use certain terms to retrieve documents with reasonable reliability. Humans who are not trained indexers do a lousy job of applying terms. Even professional indexers working in production settings fall into some well known ruts. For example, unless care is exercised in management and making the term list available, humans will work from memory. The result is indexing that is wrong about 15 percent of the time. Machine indexing when properly tuned can hit that rate. The problem is that the person looking for information assumes that indexing is 100 percent accurate. It is not.
The idea behind controlled term lists is that these are logically consistent. When changes are made such as the addition of a term such as “webinar” as a related term to “seminar”, a method exists to keep the terms consistent and a system is in place to update the index terms for the corpus.
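A minimal sketch of what normalizing against a controlled term list can look like follows. The term list, the `USE_FOR` synonym map, and the function names are all made up for illustration; this is not how SharePoint 2010 or any particular taxonomy product works.

```python
# Hypothetical controlled vocabulary: only these terms may appear in the index.
CONTROLLED_TERMS = {"seminar", "webinar", "purchase order"}

# Hypothetical "use for" map from variant spellings to the preferred term.
USE_FOR = {
    "web seminar": "webinar",
    "online seminar": "webinar",
    "po": "purchase order",
}

# Related-term links, kept symmetric so both terms stay consistent.
RELATED = {"seminar": {"webinar"}, "webinar": {"seminar"}}


def normalize_tag(raw):
    """Map a free-text tag to its preferred term, or None if it is uncontrolled."""
    term = " ".join(raw.lower().split())  # fold case and collapse whitespace
    term = USE_FOR.get(term, term)
    return term if term in CONTROLLED_TERMS else None


def normalize_tags(raw_tags):
    """Split user tags into accepted preferred terms and rejected free text."""
    accepted, rejected = [], []
    for raw in raw_tags:
        term = normalize_tag(raw)
        if term is None:
            rejected.append(raw)
        elif term not in accepted:
            accepted.append(term)
    return accepted, rejected


def expand_query(term):
    """Return the preferred term plus its related terms for retrieval."""
    preferred = normalize_tag(term)
    if preferred is None:
        return set()
    return {preferred} | RELATED.get(preferred, set())
```

Here `normalize_tags(["Webinar", "web  seminar", "PO", "misc stuff"])` keeps two preferred terms and sets "misc stuff" aside for review. The rejected pile is where an editor would decide whether a user's tag deserves promotion into the list.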
When there is a mix of indexing methods, the likelihood of having a mess is pretty high. The way around this problem is to throw an array of “related” links in front of the user and invite the user to click around. This approach to discovery entertains the clueless but leads to the potential for rat holes and wasted time.
Most organizations don’t have the appetite to create a controlled term list and keep it current. The result is the approach that is something I encounter frequently. I see a mix of these methods:
- A controlled term list from someplace (old Oracle or Convera term list, a version of the ABI/INFORM or some other commercial database controlled vocabulary, or something from a specialty vendor)
- User assigned terms; that is, uncontrolled terms. (This approach works when you have big data like Google but it is not so good when there are little data, which is how I would characterize most SharePoint installations.)
- Indexes based on parsing the content.
A user may enter a term such as “Smith purchase order” and get a bunch of extra work. Users are not too good at searching, and this patchwork of indexing terms ensures that some users will have to do the Easter egg drill; that is, look for the specific information needed. When it is located, some users like me make a note card and keep it handy. No more Easter egg hunts for that item for me.
What about third party SharePoint metadata generators? These generate metadata but they don’t solve the problem of normalizing index terms.
SharePoint and its touting of metadata as the solution to search woes are interesting. In my opinion, the approach implemented within SharePoint will make it more difficult for some users to find data, not easier. And, in my opinion, the resulting index term list will be a mess. What happens when a search engine uses these flawed index terms? The search results force the user to look for information the old fashioned way.
Stephen E Arnold, March 15, 2010
A free write up. No one paid me to write this article. I will report non payment to the SharePoint fans at the Department of Defense. Metadata works first time every time at the DoD I assume.
Hewlett Packard Trim 7
March 12, 2010
Hewlett Packard, a company that I continue to associate with low cost printers and high cost ink, lit up my radar with its acquisition of Lexington, Kentucky-based Exstream Software two years ago. Exstream (now Enterprise Document Automation), like IBM Ricoh Infoprints and Streamserve, generates outputs like invoices with warranty reminders and auto payment bills with coupons for oil change discounts. I learned that in February 2010, HP stepped up its footprint in document management. One of the source documents I examined is “HPTrim 7… How We Got Here?”. The gray background and the dark blue highlights on text were a bit much for the addled goose’s eyes, however. For me, the most interesting segment in the history of Trim 7 was this passage:
Market consolidation meant that lots of little players were gobbled up, as the larger vendors strived to meet the ever challenging demands of the marketplace, picking up technology from these smaller companies and making them a part of their overall product line. Hewlett-Packard, one of the largest IT companies in the world, did the same, acquiring TOWER Software in 2008, but with one subtle difference. Rather than cannibalize the technology and abandon the product, they kept almost all of the staff from the TOWER acquisition and told them to build the next version of what is now known as HP TRIM. And – there were no other products that HP TRIM had to compete with internally unlike a lot of the other acquisitions: IBM/FileNet, OpenText/Hummingbird/Vignette, and Autonomy/Zantaz/Interwoven/Meridio. HP wanted to concentrate on the product that was HP TRIM, and add the backing that only a company like HP can bring to a product. And so, HP TRIM 7 was born.
Digging through the text, HP bought an outfit called Tower and is rolling in other software to create the “new” document management business. You can locate the main page here. Three points jumped out:
First, I did not see any indication that HP’s dynamic document system integrates or “touches” the Trim 7 product. That strikes me as an indication that HP is chasing revenues from silo sales, not integration.
Second, how does one find a document? I could not locate any information about the search and retrieval functions within Trim 7. I surmise that if I use Trim 7 for SharePoint, I in theory would be able to use the Microsoft Fast ESP system to search for content. That also seems to be quite a bit of work; that is, consulting revenue for HP or its partners. My query “search HP Trim” resulted in 10 hits but nothing on point. One result was this page, which was heavy on marketing and light on locating information within the Trim 7 system. After a legal eagle drops a gift on a company named as a party in a legal matter, job one is answering the question, “What’s this about?” Trim 7 may not be able to answer that question.
Third, HP seems to be grabbing enterprise software companies that address really big information problems. With HP’s push into printers and ink, I saw a success that may have caught the firm’s hardware mavens by surprise. The trajectory in enterprise software is being driven by big money acquisitions. I think that the surprise of printing consumables will be different from the surprise of acquisition-based growth. The former was emergent; the latter is closer to MBA spreadsheet fever.
Big bets. Big win or big loss? I am leaning toward the loss option. Outlook: worth monitoring.
Stephen E Arnold, March 12, 2010
No one paid me to write this. Because HP derives significant revenue from ink, I think I have to report non payment to the US government’s printer, GPO.
Bitrix in the Enterprise Search Game
March 12, 2010
Short honk: A happy quack to the reader who sent me a link to “Bitrix Introduces the D.I.G.™ Engine: the Ultimate in Enterprise 2.0 and Web 2.0 Search Technology.” Bitrix was a company not familiar to me and there were no data in my Overflight service.
Bitrix, founded in 1998 and based in the Washington, DC area, asserts that it is a “technology trendsetter.” The company says:
Bitrix, Inc. specializes in the development of content management systems and intranet portal solutions for managing web projects and multifunctional information systems on the Internet. Deployed at more than 30,000 customers worldwide, Bitrix products are fast, reliable, easy to use and highly scalable…Bitrix takes pride in serving clients ranging from Fortune 500 companies to funded startups, including enterprises like Xerox, Toshiba, Epson, Samsung, Panasonic, Volkswagen, Hyundai, KIA, Gazprom, VTB, Zurich Insurance, DPD, PriceWaterHouseCoopers, Cosmopolitan, Vogue, PC Magazine, and many more.
The search system makes use of the firm’s D.I.G. Engine. D.I.G. is “an advanced search engine developed specifically for enterprise Intranets and Web sites that enables high-performance data search in texts, media content and documents with smart ranking, sorting and display. The engine is available in the company’s flagship products – Bitrix Intranet Portal and Bitrix Site Manager.”
The system “enumerates texts, media content and documents while looking for morphological stems and considering their density.” The search results are “filtered with respect to the user access rights before being displayed.” The company adds:
D.I.G. offers manual or immediate automatic data indexing, making content searchable right after its submission. Users may create complex search queries using query language, inclusion/exclusion masks and logic operators, as well as choose specific site sections for a highly targeted search. The technology supports AJAX-powered interactive pages, provides advanced taxonomy service with automatic tag cloud generation, allows making Google Sitemap, as well as a user-specific search form design. It covers English, German and Russian and enables fast and painless connecting of other languages with third-party stemming tables.
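The company does not publish how D.I.G. works, so what follows is only a rough sketch of the two ideas in its description: scoring by the density of matching stems and filtering by access rights before display. The suffix-stripping `crude_stem` function is a stand-in for a real morphological stemmer, and the page dictionaries and group names are hypothetical.

```python
import re

# Crude suffix stripping; a stand-in for a real morphological stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip one common English suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stems(text):
    """Lowercase the text, split it into words, and stem each word."""
    return [crude_stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def density(query, text):
    """Share of the page's stems that match any stem in the query."""
    doc_stems = stems(text)
    if not doc_stems:
        return 0.0
    wanted = set(stems(query))
    matches = sum(1 for s in doc_stems if s in wanted)
    return matches / len(doc_stems)


def search(query, pages, user_groups):
    """Rank readable pages by stem density, highest first."""
    scored = []
    for page in pages:
        # Access filter: skip pages that share no group with the user.
        if not (page["groups"] & user_groups):
            continue
        score = density(query, page["text"])
        if score > 0:
            scored.append((score, page["id"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [page_id for _, page_id in scored]


pages = [
    {"id": "a", "text": "Search tips: searching the intranet", "groups": {"staff"}},
    {"id": "b", "text": "Searched documents are listed here with other items", "groups": {"staff"}},
    {"id": "c", "text": "search search search", "groups": {"board"}},
]
```

With these sample pages, a staff user who searches for "searching" gets page a ahead of page b and never sees page c. This sketch applies the access filter before scoring, which yields the same visible results as scoring first and filtering afterward.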
There are screenshots of the company’s products on the firm’s Media Gallery page, but I did not see a search results example.
The company offers a “virtual appliance”. The idea is that multiple instances of Bitrix products can run on the same computer each in a virtual space.
Prices for the system are located at http://www.bitrixsoft.com/buy/intranet.php with the range in the $1,500 to $20,000 spectrum.
My impression is that search is an embedded feature, which exemplifies the trend of content management vendors trying to improve the utility of their systems.
Stephen E Arnold, March 12, 2010
No one paid me to write this. With the firm’s location near several interesting Federal entities, I will send an email to one of those Dot Mil addresses and report my status of free writer.
JustSystems in Flux?
March 8, 2010
I received a call about JustSystems, the Japanese company that figured out how to enter complex characters using a three digit code from a mobile phone keypad. A deal with a large mobile device company was the firm’s go-to revenue stream. With changes in mobile technology, that revenue began to dwindle. JustSystems turned to software development and consulting, which are difficult businesses to scale. When I visited the company for my key fob I noted that the firm had more than 500 employees in several locations. The information I reviewed this morning suggested that JustSystems had about 900 employees at the end of 2009.
The firm was of interest to me. I received a Japanese dinner and a key fob after giving a briefing to the company’s owners four or five years ago. I also reported in my story “JustSystems ConceptBase” that the company rolled out a search appliance in some sort of tie up with IBM.
I dug through my files and noticed a data point that I wanted to surface. In April 2009, JustSystems became a subsidiary of the Keyence Corporation. (Asiajin reported this story in April 2009.) Keyence makes a wide range of electronic gizmos. JustSystems pushed into search and content processing, purchasing a US content processing company called Clairvoyance, founded by wizard Dr. David A Evans. The push did not work and the company turned to Keyence, which bought 43.96 percent of JustSystems, valued at about US$50 million. Six months later in 2009, the founders, Kazunori Ukigawa and Hatsuko Ukigawa, a husband and wife team, resigned as chairman and vice chairperson and quit the Board of Directors. Mrs. Ukigawa was one of the most visible female Japanese company heads in a very male Japanese technology sector.
On Friday, I was able to speak to a customer support representative on the firm’s North American hot line. I was not able to get much information about the status of the products, particularly the search appliance. I asked about the office in Pittsburgh, where Clairvoyance was located. I learned that the Pittsburgh office had been closed.
JustSystems is hosting Webinars and publicizing that it is one of the 100 firms identified as a “company that matters” by the prestigious, widely read KMWorld Magazine. The company lists as its customers, Amazon, Thomson, Symantec, Cisco, WIPO, Jaguar, and other high profile firms.
The company’s flagship product is XMetal and the firm offers a “maturity model” for an enterprise “semantic ecosystem.”
Several observations:
- JSERI–the original Claritech – Clairvoyance – JustSystems Evans Research–seems to have shut down. See drakesbaycompany.com/documents/JSERI_ExecSummary.pdf
- The USPTO published in November 2009 US, “Methods and Apparatus for Interactive Document Clustering.” The assignee is JustSystems Evans Research. The Wikipedia entry is here.
- There is no search function on the company’s English language Web site. I took a quick look at the Japanese site and was not able to locate a search function. When I searched Keyence’s Web site for ConceptBase, I got zero hits. Maybe the appliance is a goner?
To sum up, I don’t know if the ConceptBase appliance is currently for sale. I will keep poking around.
Stephen E Arnold, March 8, 2010
No one paid me to write this. I did get paid to go to Tokyo to get my JustSystems key fob. I suppose that counts for something.
Lexalytics Pushes toward PR Nirvana
March 6, 2010
I am easily confused when I read about “market intelligence”. I think this is different from “business intelligence” and “competitive intelligence.” I can’t put my finger on the exact meanings of these phrases. I read “Top Market Intelligence Companies Turn to Lexalytics for Powerful Sentiment Analysis” and I believe that “market intelligence” alludes to what customers and observers say in Web logs and other media. Figuring out if customers are happy or sad is important. I recall hearing a presentation by ClearForest about the importance of processing warranty information. In addition to cost savings, potentially dangerous problems with vehicles could, in certain circumstances, be identified from streams of information.
The Lexalytics’ spin on “market intelligence” hooks into text analysis and social media monitoring. The firm has two new projects. One is with a firm called Cymfony. I am not sure how Cymfony connects with people who have influence over others, but I accept the assertion “executive accolades” at face value. The second project is a license deal with Vocus. In both projects, Lexalytics will provide text analysis. Lexalytics is part of Infonics, a UK based firm.
Search and content processing vendors are packaging their indexing and retrieval technology in many different ways. Sentiment analysis has been given a boost with the growing interest in monitoring real time streams of content.
Will niche plays generate enough cash to keep the many competitors in search and content processing afloat? With the high cost of sales, companies like Lexalytics will be among the first to provide real life case evidence about the strategy.
Public relations companies have been among the service sectors hit in the nose by the soft economy. The trajectory of these tie ups will be fascinating to watch.
Stephen E Arnold, March 6, 2010
No one paid me to write this short item. Because Lexalytics is part of a UK company as a result of a no-cash transaction, I am not sure which US authority requires me to report non compensation. Perhaps the Council on Foreign Affairs? I will give it a whirl.


