TopQuadrant Earns ReportingHub Contract

January 21, 2012

TopQuadrant, a well-known semantic data integration company, has been awarded a major contract by the Exploration and Production Information Management Association (EPIM).

According to the document “TopQuadrant To Deliver New Data Reporting System For Oil and Gas Operators on the Norwegian Continental Shelf” on TopQuadrant.com, EPIM has awarded its ReportingHub contract to TopQuadrant. EPIM focuses on developing IT solutions that move information efficiently among all users. It has been working closely with the oil industry on a plan that will allow companies to collect, normalize, validate, analyze, and report data concerning the daily activities of the North Sea oil and gas drillers.

According to an Executive Director at EPIM,

TopQuadrant offers scalable data integration and reporting solutions that support end-to-end systems for the oil and gas industry. Working with EPIM, TopQuadrant will develop a new reporting system that will provide flexibility to meet current and future information sharing needs of the NCS operators and authorities.

TopQuadrant has worked to position itself as a developer of taxonomy tools. We find it interesting that content processing firms are finding new ways to leverage core information retrieval technology.

April Holmes, January 21, 2012

Sponsored by Pandia.com

Taxonomy Presentation from Project Performance Corporation

January 20, 2012

Talk about taxonomy. Synaptica Central announces, “Taxonomy More Complex than Five Years Ago.” While the title states the obvious, the write up points to a presentation that may be worth a look. We learn from the posting:

Zach Wahl of Project Performance Corporation (PPC) said that the average taxonomy application is deeper and more complex than five years ago, and so the need for more sophisticated taxonomy software tools is becoming widely recognized.  PPC is a leading management consultancy with a growing taxonomy practice.  Wahl’s comments drew upon observations of the evolution of RFP requirements over the last few years.

The Project Performance Corporation works to bring efficiency to its clients by divining their best management practices and most effective, up-to-date technology. The company strives to treat its employees well, to give back to communities, and to always continue improving.

There is some room for improvement in this example, I’m afraid. We found the presentation, “Taxonomy Tools Requirements and Capabilities,” to be a gathering of truisms and some tough-to-understand magic. Check it out, but your mileage may vary.

Cynthia Murrell, January 20, 2012


Crowdsourcing a Taxonomy: Useful or Useless?

January 18, 2012

We vote for useless.

However, the TopCoder blog recently shared an article that breaks crowdsourcing down into four categories and places real world examples within the proposed taxonomy. The post is called “Why the Taxonomy of Crowdsourcing Can Not Categorize Software Development.”

According to the article, there has been a push to categorize what crowdsourcing is, which can be a good thing. However, the blog found that for software developers like TopCoder this can be very difficult to do.

The article states:

As we read through the aforementioned crowdsourcing.org article, it struck us that a taxonomy such as this would have a very hard time categorizing what TopCoder accomplishes. You may or may not know what we do. Through our global competitive community of more than 321,000 professionals – we don’t often use the term crowd – we create innovative software, algorithms that optimize business and scientific solutions and graphical digital assets. The further we studied the 4 different categories presented by Crowdsourcing.org, the more we realized that TopCoder competitions fit into all four categories presented.

If TopCoder feels this way, we wonder if other companies will find crowdsourcing a taxonomy to be a flop as well. There are useless taxonomies which do little to assist findability. Then there are ANSI standard taxonomies which work just for folks who understand Boolean, take care to formulate search strategies, and enjoy “real” research. Most of the world prefers the “slap in a word” or “take what the service delivers” approach. Sigh.

Jasmine Ashton, January 18, 2012

Data Harmony: Sweet Tune for Knowledge Management Experts

January 10, 2012

Short honk: Here in Harrod’s Creek, we find meet ups, hoe downs, and webinars plentiful and out of tune with our needs. We want to put on your calendar an event that seems to offer a sweet tune about knowledge management.

The Eighth Annual Data Harmony Users Group (DHUG) meeting, scheduled February 7 to 9, 2012, in Albuquerque, New Mexico, will focus on helping users get the most from their investment in the knowledge management software suite, which helps users organize information resources based on a well-built and systematically applied taxonomy or thesaurus.

We learned:

“This meeting is an exciting opportunity to learn how to fully utilize the power of Data Harmony software to maximize the effectiveness and profitability of your organization for your members, customers and staff,” said Marjorie M.K. Hlava, president of Access Innovations.

You can get complete details from Access Innovations. The widely read Web log Taxodiary is encouraging anyone who wishes to share their story at the meeting to contact Data Harmony. Registrations are also now being accepted. For more information about the Eighth Annual Data Harmony Users Group meeting, call (505)998-0800 or 1-800-926-8328. We hope that Access Innovations captures their knowledge in a monograph. Too many amateur taxonomists and knowledge mavens are pumping out inaccurate or incomplete information. In our experience, the go-to experts gravitate to the performances by the Mozarts of mark up.

Sounds excellent to us.

Stephen E Arnold, January 10, 2012


60 Months, Minimal Search Progress

January 1, 2012

When I was writing the Enterprise Search Report, I was younger, less informed, and slightly more optimistic. In August 2005, I wrote in “Recent Trends in Enterprise Search”:

The truth is that nothing associated with locating information is cheap, easy or fast.

I omitted one item: accurate. About five years after writing this sentence, I have come to my senses. The volume of information flushing through the “tubes” continues to increase. To explain what petabytes mean to the average liberal arts major now working at a services firm, someone coined the phrase “big data.” Simple. Tidy. Inaccurate.

That’s why the notion of accurate information is on my mind. I am tough to motivate in general, and burro-like when I have to admit that something I wrote in one of my addled states is incomplete, stupid, or just plain wrong.

Let me start the New Year correctly. Here are four observations which will probably annoy the “real” experts, the self appointed search mavens, and the failed middle school teachers now consulting in the fields of ontology, massive parallelization in virtual environments, and “big data.” I don’t plan to alter my rhetorical approach, so too bad about giving some of these rescued Burger King workers some respite. Won’t happen.

First observation: Even a person as wild-and-wonderful as Jason Calacanis, the much admired innovator who makes a retreating Russian army’s scorched earth policy look green, wants to limit Internet content. “Jason Calacanis: Blogging Is Dead & Why Stupid People Shouldn’t Write” captures his take on accuracy. If one assumes stupid people should not write, then one reason may be that stupid people produce inaccurate information. Sounds okay to me, so let’s go with the stupid angle. In the era of “big data”, trimming out the stupid people should result in higher value information. Keep in mind I am addled. I am not sure where to stand on the “stupid” thing.

Image source: http://www.northernsun.com/Boldly-Going-Nowhere-T-Shirt-(8257).html

Second observation: Disinformation is becoming easier for me to spot. For you? I am not so sure. Let me give you a couple of examples. Navigate to the now out of date list of taxonomy systems prepared by Will Power. The page is available from Willpower Information in Middlesex. Now scan the description of the taxonomy system called MTM. Here’s a snippet:

MTM is the software for multilingual thesauri building and maintenance. It has been designed as a configurable system assisting a user in creating concepts, linking them by means of a set of predefined relations, and controlling the validity of the thesaurus structure…

The main features of the software are inter alia:

  • thesaurus maintenance and support system;
  • KWOC and full tree representation and navigation tools available on-line;
  • KWIC, KWOC and full tree printouts (in an alphabetic and systematic order);
  • defining and customization of up to 100 conceptual relationship types;
  • management of facets, codes (top classification), sources, regional variants, historical notes, etc.;
  • support of the various types of authority files;
  • computer assisted merging;
  • thesauri comparison by means of windows;
  • support of the various alphabets;
  • support of linguistic and orthographic variants;
  • sorting facilities consistent with national standards;
  • variable length data handling;
  • flexibility in defining input and output forms;
  • versatility in terms of relative ease of configuring the software for the various sets of languages;
  • flexibility in defining data structures needed for a given application;
  • a possibility to exchange data with other organizations and systems through exporting and importing terms and relations.


Taxonomy: More Marketing Craziness in Play?

December 12, 2011

For whatever reason, I have been picking up rumors, factoids, and complaints about the sales and marketing tactics of various search and content processing vendors. With holidays just around the corner, one would think that in the run up to Kwanzaa, Christmas, Hanukkah, and Boxing Day folks would chill.

Ah, Agility!

The first dust up concerns tag lines. At issue is the word “agile”, which is becoming one of the more popular terms. I was in a meeting at which a heated discussion broke out about whose search and content processing system is agile. Endeca claims agility. I am not going to dispute that a 13 or 14 year old system is agile, but in Internet years, there may be some flexibility lost. Run a query for “agile” and “search” and you get a hit to a recruitment firm, a marketing outfit, and something called the Tamilan Search Engine. I also spotted PolySpot, a French infrastructure, solutions, and applications company. The problem is that words are slippery. What are the synonyms for “agile”? I expect to see some of these turning up in 2012. How about gazelle search or spry search?

In tough economic times, financial pressures can distort business methods.

Circular Partnerships: Snakes Eating Their Tails

The second dust up concerns partnerships. I have been looking through the list of partners identified by such companies as Microsoft, WAND, and others. What I have discovered is that most of the partners are either household names like IBM or companies I have never heard of. Furthermore, when I dig into the partners’ names unfamiliar to me, I discover companies which are consulting firms or resellers who offer a roster of “stuff.” I understand the importance of amplifying a sales force. A partnership plan is little more than a way to reduce the cost of getting a lead and making a sales call. One of the experts in this game is the struggling giant Thomson Reuters. The company signs up partners when sales flag. In the taxonomy game, the partnerships have another twist. The linkages are circular. Antidot or Mondeca points to partners and partners point to other search and content processing vendors which point to the original company. I find this confusing because “partner plays” are gaining momentum among specialist firms. I think the “partner” card is an indication that a search and content processing firm may be beating the bushes to get revenue. Just my opinion, of course.

Today, everything is for sale. Be wary if a pitch sounds too good to be true. Image source: http://asksistermarymartha.blogspot.com/2009_10_01_archive.html

Pitching Automation No Matter the Consequences

The third dust up involves taxonomies and is related to the circular nature of partnerships and financial pressures. There is considerable contention in the market with regard to taxonomies. The word “taxonomy” itself is a shuttlecock with software badminton players swinging with abandon. The idea is simple: a hierarchical word list, but one with hot new spins like ontology (not to be confused with the branch of metaphysics that deals with the nature of being), metatagging, and categorization.

On one side of the dictionary are those who want the software to discover the concepts, terms, and bound phrases. Then these terms are automatically assigned to content processed by the system. If this sounds like the Bayesian magic associated with Hewlett Packard Autonomy or Recommind, you are on the money. There are alternative approaches which have considerable payoff. A good example is the work done by Tim Estes and his team at Digital Reasoning, a firm which received financial goodness from SilverLake Sumeru. The idea is that humans play either a modest role or no role at all. Because of the volume of data flowing through a system, human intermediated systems struggle to keep pace with the fluidity of human discourse. On one side, therefore, automation. For simplicity’s sake, let’s call this the Google approach.

On the other side of the dictionary are those who see humans with subject matter expertise playing an important role. The idea, which seems quaint to many of the self appointed experts and azure chip consultants, is that human beings can set up a conceptual scheme, populate it with words, terms, and bound phrases. Thus, armed with a controlled term list, a system can use those terms to index or tag content. The idea has merit because the American National Standards Institute has spelled out guidelines for controlled term lists.

Here’s how the battle shapes up. On one side is the “we don’t need any humans” crowd. In my opinion, some enthusiasts for this no-humans position are TEMIS, Google, and in some cases Autonomy. Many of the automated indexing and tagging systems work quite well when the corpus of content is tightly bounded. What do I mean by “tightly bounded?” Pick up a hard copy of a medical journal about cancer or about nuclear engineering. The vocabulary does not vary too much from article to article within each topic area. In fact, once you learn about 2,000 nuclear terms, you can figure out the basic idea of most nuclear power write ups.

Are some search and content processing vendors taking notice of sales methods associated with used car sales professionals? Even Google is advertising on the “vast wasteland”. Image source: http://www.townhillautosales.com/?24

What happens when you process unbounded content? Well, real life language use is trickier. Non experts simplify complex ideas, often substituting non specialist terms for arcane jargon. Do you know what an ECCS is? Probably not. A “real” journalist or consultant will convert the notion of an emergency core cooling system to something along the lines of a “spare radiator.” Not exactly on the money, but indicative of how precise language is softened. In these situations, it is useful to have a term list of the specialist words, terms, and bound phrases. Subcategories under Cooling Systems can contain the ECCS entry and others. The idea is that content can be assigned certain terms no matter what the words and phrases in the source document may be.
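Here is a minimal Python sketch of that idea, assuming a tiny hand-built term list. The dictionary layout, the variant phrases, and the function name `assign_terms` are hypothetical illustrations, not any vendor's product or an ANSI-prescribed format:

```python
import re

# Hypothetical controlled term list: preferred term -> broader category and variant phrases.
TERM_LIST = {
    "Emergency core cooling system": {
        "broader": "Cooling Systems",
        "variants": ["emergency core cooling system", "eccs", "spare radiator"],
    },
}


def assign_terms(text):
    """Return the sorted preferred terms whose variants appear in the text."""
    lowered = text.lower()
    assigned = set()
    for preferred, entry in TERM_LIST.items():
        for variant in entry["variants"]:
            # Word boundaries keep short variants such as "eccs" from firing inside longer words.
            if re.search(r"\b" + re.escape(variant) + r"\b", lowered):
                assigned.add(preferred)
                break
    return sorted(assigned)


print(assign_terms("The plant's spare radiator kicked in after the pump failed."))
```

The softened phrase in the source text still lands on the specialist entry, which is the point of keeping a human-built list of variants alongside each preferred term.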

Some companies like TEMIS, Google, and Yandex are not too keen on the human involvement. The reasons range from the cost of getting humans to do index and taxonomy development to an arrogance about how software performs. Wizards see the world in terms of their wizardry which is okay with me. I think it is silly to assume software can handle language with the facility of humans, but I have some experience with what happens when “good enough” is not.

Other companies like Access Innovations (a former client from days of yore) and (believe it or not) Dow Jones (a component of the exciting Murdoch organization) believe that humans are important. The humans can develop the lists, set up guidelines or rules for the indexing system to consult, and provide interfaces to allow subject matter experts to adjust the term list and tune the indexing system. The benefit is that the accuracy of the indexing, based on my real life experience, is much better. There is language drift, but there are methods to intervene and correct that drift.

Without a method to adjust to what software is too stupid to see, the indexing “drifts”. The impact of this is not too good. You run a query for a particular snake bite treatment, and you cannot locate the content. The term you use is not assigned by the system and it does not appear in the source document. So what? Well, how about your child dies. Maybe this is an unpleasant thought, but the consequences of lousy indexing and concept assignment are often more serious than not finding a pizza joint in San Jose.
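A rough Python sketch of that failure, under loud assumptions: the one-entry term list, the one-document corpus, and the function names are all hypothetical and exist only to contrast a literal lookup with a lookup through assigned terms:

```python
# Hypothetical one-entry term list and one-document corpus, for illustration only.
VARIANTS = {
    "Snake bite treatment": ["snake bite treatment", "antivenom", "antivenin"],
}
DOCS = {"doc1": "Administer antivenin promptly after envenomation."}


def preferred_terms(text):
    """Map free text to the preferred terms whose variants it contains."""
    lowered = text.lower()
    return {term for term, variants in VARIANTS.items()
            if any(v in lowered for v in variants)}


def literal_search(query):
    """Return ids of documents that contain the query string verbatim."""
    return [doc_id for doc_id, text in DOCS.items() if query.lower() in text.lower()]


def term_search(query):
    """Return ids of documents that share at least one preferred term with the query."""
    wanted = preferred_terms(query)
    return [doc_id for doc_id, text in DOCS.items() if wanted & preferred_terms(text)]


print(literal_search("snake bite treatment"))  # empty: the phrase is not in the text
print(term_search("snake bite treatment"))     # finds doc1 via the shared preferred term
```

If nobody maintains the variant list as usage shifts, the second lookup degrades into the first, which is the drift described above.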

Here’s what one indexing professional told me. I have to mask the name and company to avoid a hassle, but you will get the idea from this comment I captured:

Some companies such as a certain Paris-based company sell expensive software to clients and then leave. People don’t know what to do with it.  So they have an expensive difficult to implement natural language processing systems which could work but are left hanging.  The package from us is the whole thing we are big on total service, follow up training, and getting people implemented and using it without our help but we are there – just a phone call or email away to help and support them. The Paris based company says companies like Access Innovations are not a natural language processing system and although we do have the natural language processing  we don’t make people pay for it separately. With most systems, rules are often needed to achieve more than “good enough” tagging.  Access Innovations, a specialist able to generate ANSI compliant term lists, delivers 85 to 90 percent accuracy. The Paris-based outfit delivers far lower accuracy. Clients don’t understand the issues with low accuracy tagging, findability, and long term system usability.

So What?

What we have, gentle reader, is an example of the automation crowd glossing over the need for human-intermediation solutions. What disturbs me is that the chatter about taxonomy in boot camps, companies which are coming from left field, and self appointed experts is putting the spotlight on indexing and classifying content.

That’s a plus.

The downside is that when the indexing goes off the rails, the user may not be able to find the needed information. That’s why companies like Digital Reasoning and Access Innovations have the ability to deliver automation plus human-intermediated interactions. The licensee suffers when automation goes wrong. The users suffer. The search system vendor may be blamed. Beware the taxonomy vendor spouting glittering generalities about smart software. Usually the “spout” dispenses tainted outputs.

Bottom line: I avoid vendors who present to me the “one true way.” This approach may work when preparing foie gras. For some taxonomy vendors hungry for cash, the traditional, labor intensive methods get in the way of making a quick sale. Unfortunately when humans create language, more traditional methods are often completely appropriate for mission critical indexing tasks. Honk!

Stephen E Arnold, December 12, 2011


BA Insight Interview

December 11, 2011

Short honk: We overlooked a new interview with Guy Mounier, BA-Insight. If you track the vendors who provide components to extend and enhance Microsoft SharePoint, you may find the interview with BA Insight interesting.


You can find the interview at this link. The interview carries the date of September 27, 2011. Our error. At age 67, I lose my pen several times a day.

Stephen E Arnold, December 11, 2011

ConceptSearching Add-Ons

December 1, 2011

In “Useful Enterprise Search Hastens SharePoint User Adoption,” the author briefly discusses conceptSearching and its business case targeted at the heavily regulated industries – banking, finance, healthcare, energy, etc. – and government SharePoint customers. So what are the user benefits? We learned:

Very fast recognition of any/all compliance sensitive…Further benefits include automatic recognition of document themes, referred to as “concepts” in the conceptSearching promotional material on their website, which should be particularly useful as compliance officers review and assimilate new documents to ensure that published information meets the objectives of the enterprise and conforms to guidance.

And the caveat? The benefits are delivered via a suite of products and add-ons including conceptClassifier and conceptTaxonomy Manager. Excellent search results rely on the right structure and the right tools.

We want to spend less time configuring add-ons and more time developing business intelligence. While SharePoint adoption continues to grow, it still is not an out-of-the-box solution for enterprises. If you need a bit of help with improving SharePoint search, you might check into Mindbreeze; the company seems to have the installation and the benefits of a proper installation down pat.

Sara Wood, December 1, 2011


New Google App Puts Chrome on the iPad

November 30, 2011

Clever? Stealthy? Sneaky? Cute? Smart?

Don’t know.

In the battle between Apple and Google, it appears that the search giant has come up with a new way to come out on top, and search is not the primary focus. Google has created a search app that is superior to the experience of any Android tablet and puts the core Chrome elements onto an Apple product.

If you are wondering why Google has suddenly decided to stop innovating for its own products and has chosen to invade Apple’s, The Next Web article “Google Just Used Its Search App to Sneak Most of Chrome OS onto the iPad” states:

The reasons why it has shipped a pack of its most potent apps in one convenient dashboard are evident if you look at the tablet landscape as we know it. Google’s “official” version of Android is losing the tablet race, flat out. Products from manufacturers that have no access to an ecosystem beyond the Android Market have proven not to work. Now, Amazon has launched the Kindle Fire, which stands to quickly attain ’2nd place’ status behind the iPad, utilizing a tweaked version of Android that Google will gain nothing from.

iOS devices account for 2/3 of mobile searches on Google’s platform, making it the largest outlet for Google’s primary product, ads. Google recognizes this fact and has created an app for its fans who use Apple products.

With such a seamless integration, it appears that Apple may not be able to separate itself from Google, no matter how hard it tries. This is certainly a clever move on Google’s part but definitely not the most innovative. Is this the new Google?

Jasmine Ashton, November 30, 2011

