Content Transformation: A Challenge that Won’t Go Away

May 15, 2008

We live in a world of Web 2.0 and Web 3.0 goodness. At the Where 2.0 conference in Burlingame, California on May 14, 2008, I overheard this snippet of conversation:

We had everything working, but when we imported content, the system crashed. I reinstalled. I checked the config files. It still crashed. I have to open each file, resave it as an RTF, and import them one at a time. Grrrr.

Sound familiar?

I have heard this complaint many times before. In our content-savvy, XML-ized era, moving a source file into a content processing system should be trivial. The content processing system can extract entities. It can metatag. Some can slice, dice, and cook a chicken. But unless the system can intake content and transform it to something that the content processing subsystem understands, the system is dead in the water. Even worse, the text processing system only processes some of the source documents. In certain mission critical applications, kicking out documents is a no-no. Not only is the manual manipulation expensive, it’s time consuming. In those minutes or hours of fiddling, potentially significant data are not available to the analysts. What does missing information cost? Well, it depends on your work situation. In the Wall Street world, investment information can turn a win into a loss in a millisecond. In certain military applications, the information may mean the difference between health and harm.

Transforming a square into a circle or a circle into a square looks easy. With a triangle and a compass you can create two objects. It’s the intermediate steps that become tricky for an artist or a budding mathematician.

What is file or data transformation? In its simplest form, you have a file in Microsoft Word 2007 format, and you want to “transform” or change the file into a format recognized by another system’s import filter. So, one approach would be to open the file in Word 2007, click on File Save As, select RTF (Rich Text Format), and save the file. You can then allow your search or content processing system to suck the file into the conversion subsystem and turn the RTF into whatever target output format the filter generates. In a more sophisticated form, you take an unstructured document or a database table, and you transform it into some file type that your system can process. A more interesting task is to convert a file into a file with a comparable structure; for instance, take an SGML instance and convert it to HTML. Some search system vendors include filters and transformation tools with their systems. Others provide an application programming interface. The idea is that you will write a script to perform whatever conversion you require, handle entities in an appropriate manner, and preserve the information and metadata (if available) throughout the process.
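
To make the “write a script” idea concrete, here is a minimal Python sketch. It assumes the source is simple HTML-like markup and that the target is plain text plus a dictionary of metadata. The class name, the function name, and the sample document are invented for illustration; they are not part of any vendor’s filter or API.

```python
from html.parser import HTMLParser


class TextAndMetadataExtractor(HTMLParser):
    """Collect visible text and <meta name=... content=...> pairs."""

    def __init__(self):
        # convert_charrefs=True resolves entities such as &amp; and &eacute;
        super().__init__(convert_charrefs=True)
        self.metadata = {}
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Preserve metadata instead of discarding it during conversion.
        if tag == "meta":
            fields = dict(attrs)
            if "name" in fields and "content" in fields:
                self.metadata[fields["name"]] = fields["content"]

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def transform(markup):
    """Return (plain_text, metadata) for one source document."""
    parser = TextAndMetadataExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks), parser.metadata


# Hypothetical sample document, used only to exercise the sketch.
sample = (
    '<html><head><meta name="author" content="J. Smith"></head>'
    "<body><p>Caf&eacute; sales &amp; costs</p><p>Q4 2007</p></body></html>"
)
text, meta = transform(sample)
print(text)
print(meta)
```

A production filter would also have to cope with malformed markup, character encodings, and embedded objects, which is where most of the real cost hides.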

Let’s take a quick look at several transformation challenges and then step back to consider what steps you can follow to minimize these problems. Before jumping into the causes, keep in mind that as much as 30 percent of an information technology department’s budget is consumed by transformation costs. This astounding number surfaced in a presentation given by a Google engineer in 2007. If that number seems high, you can knock it down to a more acceptable 10 or 20 percent. The point is that fiddling with data when moving it from one system and format to another is a common task. Any transformation activity can go off the tracks.

Written by Stephen E. Arnold · Filed Under Database, Enterprise, Feature, Search, Text processing | Comments Off on Content Transformation: A Challenge that Won’t Go Away

Collective Intelligence Anthology Available

May 14, 2008

The Arnoldit.com mascot admires the new collection of essays by Mark Tovey. Collective Intelligence: Creating a Prosperous World at Peace, published by the Earth Intelligence Network in Oakton, Virginia (ISBN: 13: 978-0-97-15661-6-3) contains more than 50 essays by analysts, consultants, and intelligence practitioners. You can obtain a copy from the publisher, Amazon, or your bookseller.

The ArnoldIT mascot completed reading the 600-page book with remarkable alacrity for a duck.

The collection of essays is likely to find many readers among those interested in the social phenomena of networks. Many of the essays, including the one I contributed, talk about information retrieval in our increasingly interconnected world.

This essay will provide a synopsis of my contribution, “Search–Panacea or Play. Can Collective Intelligence Improve Findability”, which I wrote shortly before completing Beyond Search: What to Do When Your Search System Doesn’t Work. My essay begins on page 375.

Social Search

The dominance of Google forces other vendors to look for a way over, under, around, or through its grip on the Web search. The vendor landscape now offers search and content processing systems that arguably do a better job of manipulating XML (Extensible Markup Language) content, figuring out who knows whom (the social graph initiative), and the “real” meaning of content (semantic search). There are more than 100 vendors who have technology that offers, if one believes the marketing collateral and conference presentations, a way to squeeze more information from information.

Social search is the name given to an information retrieval system that incorporates one or more of these functions:

Users can suggest useful sites. Examples: Delicious.com and StumbleUpon.com
The system discovers relationships between and among processed documents and links. Examples: Powerset.com and Kartoo Visu
The system analyzes information, extracts entities, and identifies individuals and their relationships. Examples: i2 Ltd (now part of ChoicePoint) and Cluuz.com
The system monitors user behavior and uses the data to guide relevance, spidering, and other system functions. Examples: public Web indexing companies

There are other types of social functions, but these provide sufficient salt and pepper for this information side dish. The reason I say side dish is that social functions are not going to displace the traditional functions on which they are based. Social search has been in the mainstream from the moment i2 Ltd. introduced its workbench product to the intelligence community more than a decade ago. “Social” functions, then, are a recent add-on to the main diet in information retrieval.

Old Statistics and Cheap, Powerful Computers

What’s overlooked in the rush to find a Google “killer” is that the new companies are using some well-known technologies. For example, the inner workings of Autonomy’s “black box” are somewhat dependent on the work of a slightly unusual Englishman, Thomas Bayes. Mr. Bayes left the world a couple of centuries ago, but his math has been a staple in college statistics courses for many years. To deploy Bayesian techniques on a large scale is, therefore, not exactly a secret to the thousands of mathematicians who followed his proofs in pursuit of their baccalaureate.
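
For readers who have not seen Bayes’s rule applied to text, here is a minimal Python sketch of a naive Bayes classifier with add-one smoothing. It is a textbook illustration, not a description of Autonomy’s system, whose internals are not public. The function names and the toy training sentences are invented for this example.

```python
import math
from collections import Counter


def train(labeled_docs):
    """Count words per label and documents per label."""
    word_counts = {}
    doc_counts = Counter()
    vocabulary = set()
    for label, text in labeled_docs:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocabulary.update(words)
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Pick the label with the highest posterior, using log probabilities."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        # log P(label): the prior
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # log P(word | label) with add-one (Laplace) smoothing
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training data, invented for illustration only.
model = train([
    ("search", "query index relevance ranking"),
    ("search", "index crawler query"),
    ("finance", "revenue earnings stock"),
    ("finance", "stock price revenue"),
])
print(classify("query relevance", model))
print(classify("stock earnings", model))
```

The math fits in a few dozen lines. The hard part, as the vendors know, is running it over millions of documents on cheap hardware.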

Written by Stephen E. Arnold · Filed Under Database, Feature, Online (general), Search, Semantic, Social | Comments Off on Collective Intelligence Anthology Available

Groping the Enterprise Search Elephant

May 12, 2008

In the 2000 to 2003 period, ArnoldIT.com delivered a number of tutorials about search. Some of these presentations were held in conjunction with conferences such as the Boston Search Engine Meeting, Gilbane’s conferences, and the Information Today line up of professional programs. Others were delivered to small groups at various financial institutions, search vendors, and government entities.

This is the search elephant. In a meeting, you will hear many people talk about search. Each person will have a specific meaning and assume that the others in the room will know exactly what’s meant when she uses the word search. If you take all these individual meanings of search and put them together, you have a better idea of what a search system is supposed to deliver.

In each case, I had to take more time than budgeted to define the different types of search encountered in enterprise behind-the-firewall deployments. This issue surfaced this weekend when I spoke with a colleague grousing about the different perceptions of search in a consulting firm in Europe.

The purpose of this essay is to provide an abbreviated and hopefully useful look at the different meanings of search. You can learn more about this subject in Enterprise Search Report and the brand-new Beyond Search study that came out in April 2008. I wrote the first three editions of ESR and played a minor part in the current edition, and you will get more color on this topic in those for-fee analyses.

Everybody Knows about Search

The definition issue is skipped over because most people today believe they know about search. At dinner last night, people said, “I did a search for a cruise to Brazil”, “I looked up my health care benefits and found they were reduced”, “I’m not sure it’s worth seeing”, and “My boss had me find a proposal he thought he had lost when his laptop was stolen”. None of these people were information retrieval professionals or computer scientists. But each of them talked about search as if it were a routine activity like finding a parking space.

The need for a definition goes up when people assume others mean the same thing by search. Let’s look at the meanings for search in an enterprise.

Enterprise Search or Behind-the-Firewall Search

This is the buzzword of the moment. Companies know intuitively that if a worker can’t find information on the company’s own internal network, the worker is going to waste time looking for what’s needed. Even worse, the employee can’t find the accurate information and makes a boneheaded decision.

Enterprise search is a contradiction. No boss in the world wants “everything” indexed and searchable. Problems come from indexing “everything”. A few of the bombs in the enterprise search mine field are:

Email on topics that are or can be problematic
Information about company secrets like Coca Cola’s formula for the fizzy drink
Information about legal matters
Information an employee puts on a company server about non-company activities
Personal, salary, and medical information
Pricing information
Stolen software, information from a third-party provider without paying a license fee or obtaining a copyright permission, information about a competitor that was obtained via an email from a friend

Search works best when the domain of information to index is narrowly defined, reviewed, and subject to a formal approval and review policy. Ad hoc indexing of behind-the-firewall information can trigger big trouble fast.
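
A narrowly defined indexing domain can be expressed as a simple rule that runs before any document reaches the indexer. The Python sketch below assumes each document carries a source name and a set of tags; the source names, tag names, and the `may_index` function are hypothetical and stand in for whatever an organization’s approval policy specifies.

```python
from dataclasses import dataclass

# Hypothetical policy: sources approved by the review process,
# and tags that keep a document out of the index no matter where it lives.
APPROVED_SOURCES = {"product-docs", "support-kb"}
BLOCKED_TAGS = {"legal", "salary", "medical", "pricing", "personal"}


@dataclass
class Document:
    source: str
    tags: frozenset
    title: str


def may_index(doc):
    """Index only documents from approved sources that carry no blocked tag."""
    return doc.source in APPROVED_SOURCES and not (doc.tags & BLOCKED_TAGS)


docs = [
    Document("product-docs", frozenset({"manual"}), "Install guide"),
    Document("support-kb", frozenset({"pricing"}), "Discount schedule"),
    Document("mail-archive", frozenset(), "Re: lunch"),
]
indexable = [d.title for d in docs if may_index(d)]
print(indexable)
```

The default here is to exclude: a source that nobody has reviewed never gets indexed, which is the opposite of ad hoc indexing.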

Written by Stephen E. Arnold · Filed Under Enterprise, Feature, Search | 2 Comments

Kartoo’s Visu: Semantic Search Plus Themescape Visualization

May 11, 2008

In England in December 2007, I saw a brief demonstration of Kartoo.com’s “thematic map”, which was announced in 2005.

The company grew out of relationships with large publishing groups that date to 1997, when Mr. Baleydier was working to make CD-ROMs easily searchable. Kartoo was founded in 2001 by Laurent and Nicholas Baleydier to provide a more advanced search interface. You can find out more about the company at Kartoo.net. Kartoo S.A. offers a no-charge metasearch Web system at Kartoo.com.

The original Kartoo service was one of the first to use dynamic graphics for Web search. Over the last few years, the interface became more refined. But the system presented links in the form of dynamic maps. Important Web sites were spherical, and the spheres were connected by lines. Here’s an example of the basic Kartoo interface as it looked on May 11, 2008, for the query “semantic search” run against the default of English Web sites. (The company also offers Ujiko.com, which is worth a quick look. The interface is a bit too abstract for me. You can try it here.)

The dark blue “ink blots” connect related Web sites. The terms provide an indication of the type of relationship between or among Web sites. You can click on this interface and explore the result set and perform other functions. Hands-on exploration is the best way to learn the interface’s features. Describing the mouse actions is not as effective as playing with the system.

Another company–Datops SA–was among the first to use interesting graphic representations of results. I recall someone telling me that the spheres that once characterized Groxis.com’s results had been influenced by a French wizard. Whether justified or not, when I saw spheres and ink blots, I said to myself, “Ah, another vendor influenced by French interface design”. In talking with people who use visualizations to help their users understand a “results space”, I’ve had mixed feedback. Some people love impressionistic representations of results; others don’t. Decades ago I played a small role in the design of the F-15 interface or heads-up display. The one lesson I learned from that work was that under pressure, interfaces that offer too many options can paralyze reaction time. In combat, that means the pilot could be killed trying to figure out what a graphic means. In other situations where a computational chemist is trying to make sense of 100,000 possible structures, a fine-grained visualization of the results may be appropriate.

Written by Stephen E. Arnold · Filed Under Feature, Online (general), Search, Semantic | Comments Off on Kartoo’s Visu: Semantic Search Plus Themescape Visualization

The AP Analyzes Microsoft’s Live Search Options

May 9, 2008

Out of the aether, I received Jessica Mintz’s story “With Microsoft Mum, Analysts Mull Next Moves for Live Search”. You can read the story here or here. As I often say, snag it quickly. The wild and wonderful world of the Associated Press’s online system can baffle even a skilled researcher.

I scanned the story, intrigued that “analysts mull” much of anything related to search, text processing, or information retrieval. The sector has a glass ceiling that kicks in at the $350 to $400 million level with most companies in the market trying to make the losses and voracious appetite for investment look like a great business.

Her analysis, which I’m confident is Grade A for the AP, arrived as I read the PCWorld story “Microsoft’s Answer to Google Sky to Launch at End of May”.

Microsoft has to do more than play me-too if Microsoft is going to hobble Googzilla. The GOOG isn’t very good at PR, marketing, or sales. At least, Microsoft pays attention. That’s a good thing, I suppose.

Ms. Mintz’s interesting essay is about Microsoft after Yahoo. I think her point is that without Yahoo, Microsoft has no easy, fast, cheap way to increase its search traffic and, hence, its online ad revenue. She writes:

Some analysts say Microsoft must increase its search traffic to attract advertisers. Others believe Microsoft should concede that market to Google Inc. and find success elsewhere — leapfrogging rivals in areas such as display and mobile advertising. All that is clear is Microsoft must come up with a Plan C soon, after acknowledging that its Plan A of going solo was troubled, forcing it to turn to Plan B of acquiring Yahoo. Part of the problem analysts face predicting Microsoft’s next moves is that the company has already tried the obvious tactics. It built its own search-ad platform from scratch and spent $6 billion to buy a major online advertising company, aQuantive. Microsoft overhauled its search engine technology, and most analysts agree that its results are at least as good as Google’s. It tweaked the design of its Live Search service to become more like Google.

Whoa, Nellie!

The most interesting information for me was that Ms. Mintz presents a series of action items. I’m not sure if these are Mr. Ballmer’s or if these have been constructed from the search experts Ms. Mintz interviewed for this story. Set aside provenance for a moment. Let’s look at each action item. For ease of comparison, I put Ms. Mintz’s suggestions in the column “Microsoft Tasks” and my comment in the column labeled “Beyond Search”.

*Microsoft Tasks*	*Beyond Search Comment*
Do the basics	Google’s been at the basics since 1998. Time to start I guess
Innovate in “quick waves” to force Google to play catch up	“quick” and “Microsoft” are an oxymoron
Change the basic experiences of communication and search	Microsoft needs to deliver search that people actually use
Gain scale	Good idea. Google’s been building plumbing for a decade. Microsoft’s just started

I don’t think my research for Google Version 2.0 supports the idea that Microsoft can catch Google with these four actions, individually or collectively. Let me run through my reasoning based on the information available to me.

First, Google delivers a search experience that is increasing its market share. Google’s approach works. Microsoft’s approach hasn’t. What’s astounding to me is that even with Live.com as Internet Explorer’s default search service, the canyon in market share is almost unbelievable. IE users are ignoring the default search box and consciously selecting Google. That’s just amazing. One bit of bad news. The market share data are not accurate. Google’s market share is in the 80 percent range. In countries like Denmark, Google’s share is over 90 percent.

Written by Stephen E. Arnold · Filed Under Feature, Google, Microsoft, Search | Comments Off on The AP Analyzes Microsoft’s Live Search Options

Search Mountebanks

May 8, 2008

Author’s Note: This is an opinion piece, and it relates to the challenges that organizations face when trying to get the straight dope on an enterprise search, text mining, or any other complex enterprise software solution. If you are supremely confident of your knowledge, enjoy cutting corners, or perceive yourself as smarter than your customers when it comes to business–do not read this essay. Others may proceed at their own risk. I have masked the identity of the companies and individuals in the two “stories” so these folks can continue to pull skunks from their hats without their colleagues and customers seeing the reality behind the stage dressing.

Let’s look at this statement from a ZDNet Web log on May 7, 2008. The post is an interview conducted by Michael Krigsman, a good writer. The subject is US government information technology failures. Mr. Krigsman interviewed technical professionals working at CA (I think that’s the acronym for the “old” Computer Associates). One CA participant is Gil Digioia, a CA vice president, and the other is Jose (sic) Mora, a senior director. Both of these CA specialists are involved with “Federal Project Portfolio Management Sales for CA Clarity”. (You can read more about CA Clarity here.) I’m not sure whom Mr. Krigsman is quoting in the segment below, but I thought these comments were remarkable:

There is room for execution improvement regarding the triple constraints of scope, time, and budget. The big reason is lack of “critical corrective action” from high-level decision makers within the organization. This can result from either lack of decision-making or leaders who don’t have the proper information to make decisions that ultimately impact the project.

Requirements also tend to change after projects have been awarded, and are often different at project conclusion from what was specified in the original proposal. These changes tend to disrupt project work flow and collaboration. Such challenges pose particular difficulties for organizations that don’t have a repeatable governance process in place or lack the proper technology to react easily to those changes.

Problems can arise at the project management level, the executive decision-making level, and with technology. For example, problems are sometimes caused by legacy systems that can’t adapt to the rapid changes in information these organizations face.

My interpretation of these statements is that Federal managers can’t manage. The folks involved in requirements don’t know what they need, so technical requirements are built on Jello. Legacy systems screw up the newer systems. What this says to me is: “The vendor is NOT at fault when projects fail.”

Is this your IT or search consultant? Are you getting a skunk instead of a more tractable animal? Image source: http://cjonline.com/images/092906/41543_270.jpg

Powerful stuff. I relished how the interview subjects shifted the problems to the client. In fact, these comments have been made about enterprise search (what I call behind-the-firewall search or Intranet search). A search vendor groused to me at the Boston Search Engine Meeting that one of his largest clients doesn’t know what search is supposed to do. The client is the problem. When I pay to have my roof repaired, if the roof leaks, am I at fault? I expect the roofing guy to fix the roof. I guess my simple reasoning doesn’t apply to information technology projects.

Written by Stephen E. Arnold · Filed Under Enterprise, Feature, Search | Comments Off on Search Mountebanks

Cluuz.com: Military Intelligence-Like Functions for Web Metasearch

May 8, 2008

One of my business associates in Canada sent me a link to an interesting search engine named Cluuz.com. The system–unlike the shy Powerset, a media darling developing a semantic search engine–is available for anyone to use. Navigate to Cluuz.com. Make sure you add the extra “u”, or you will be looking at a plain text page from the graphically restrained Clue Computing operation in cow country.

Cluuz.com takes results and applies semantic processes to them. Some of the company’s display options are a bit too sophisticated for my 64-year-young eyes, but I found the system quite useful. Let’s run through a basic search and take a cursory look at some of the features that I found interesting. Then I want to comment on the semantic search boom or boomlet (depending on how jaded you are), and conclude with several observations. In the last few days, the shrinking violets in the Big Name search vendors’ public relations department have reduced their flow of 30-something insights. Perhaps my comments about semantic search will “goose” them into squawking. I certainly hope so. Life’s no fun in rural Kentucky without well-groomed Ivy League wizards asserting their intellectual superiority in email speak.

A Query for Cluuz.com

Navigate to the Cluuz.com splash screen. Make certain that you have checked the option under the search box for “Charts”. We’ll look at the other options in a moment. Now enter the test query as shown in italics: Google +”programmable search engine”. Here’s my result for this query on May 7, 2008:

The system retrieves results from MSN (search.live.com) and Yahoo, processes them, and displays this map. Note that the system identifies important people and companies. The system correctly identifies the Google Forms service as related to the “programmable search engine”.

The system offers other ways to view the results set. For example, you can look at hits from the search engines to which the query is passed as a traditional laundry list. Other choices include a cluster display and a Flash display which is, in my opinion, cluttered with sliders, controls, and options.

You can also enter a more complex query using the Cluuz.com advanced search page. In my tests, the system did a good job of dealing with specific Boolean queries. You can also set preferences, which may not be necessary for a metasearch-based approach to generating hits.

Written by Stephen E. Arnold · Filed Under Enterprise, Feature, Online (general), Search, Text processing | 6 Comments

EasyAsk: Business Intelligence for End Users

May 7, 2008

Progress Software purchased EasyAsk in May 2005. Prior to the change in ownership, EasyAsk offered natural language search to a range of government and commercial clients. After the buy out, Progress narrowed the focus of EasyAsk, as I understand the transition, from a broad search vendor to eCommerce.

The initial positioning, according to information in my files, was:

The Progress EasyAsk Division provides natural language ad-hoc query solutions that empower non-technical users to quickly find and retrieve critical business information from multiple enterprise data sources. In addition, EasyAsk provides an integrated search, navigation and merchandising platform that optimizes the shopping experience on many of the world’s most successful eCommerce sites.

The value add that EasyAsk offered customers was a higher conversion rate than the conversion rate achieved by competitive software. In 2006, some of the company’s licensees–for example, Redcats USA and Lillian Vernon–reported conversion rates 15 percent or more above the rates from competitors’ software. You can try out the EasyAsk system yourself by navigating to Lillian Vernon or Lands’ End. EasyAsk’s commercial customers don’t make their system accessible to outsiders. If you get a chance to access Ceridian’s Intranet, you can check out EasyAsk in a behind-the-firewall setting because EasyAsk is now pushing into the business intelligence market.

Now EasyAsk is expanding its scope and asserting that its system is a front end for the data mart or data warehouse. EasyAsk calls its approach “operational business intelligence”. EasyAsk describes its system as being “closer to the ground”; that is, it’s more accessible than traditional BI systems. Users require little or no training to create a custom report. Interaction is via a traditional search box or a point-and-click, assisted navigation interface. If a data warehouse is already built, EasyAsk can deploy its system in a matter of days.

In an interview on the Business Intelligence Network, EasyAsk’s Dr. Larry Harris, vice president and general manager of the EasyAsk division of Progress Software, said:

The inherent complexity of traditional BI tools prevents organizations from deploying these solutions company wide, and this inhibits individuals who might otherwise be able to act on the insight these tools provide from making better business decisions. EasyAsk for Operational BI provides employees at all levels of the organization with the ability to perform ad hoc business analysis as well as search for existing reports through the familiar search box interface, empowering them to make better business decisions more quickly.

A number of vendors are addressing the knowledge barrier that prevents industrial-strength business intelligence systems from broader use in an organization. If you know how to code and have a degree in statistics, the complexities of building queries and manipulating data cubes are trivial. For the average MBA, building a chopper from a stack of parts would be less difficult.

This graphic shows typical outputs from EasyAsk in response to a user’s natural language query. The user types a query; for example, “Crosstab sales by customer’s state and category” or “What account in the Bay area had the most orders in Q4, 2007?”

That’s the hurdle BI or business intelligence must leap over without tripping. EasyAsk’s trampoline is its NLP or natural language processing capability. The idea is that the user can type a “natural” question. The EasyAsk system “understands” the user’s query, converts it to a form understandable by the system, retrieves the needed information, and displays an answer.
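
The question-to-query step can be illustrated with a toy Python sketch. It assumes a single hypothetical `orders` table and recognizes only questions of the form “total sales by state”. The pattern, the vocabulary tables, and the `to_sql` function are invented for this example and say nothing about how EasyAsk’s NLP engine actually works.

```python
import re
import sqlite3

# Hypothetical schema and vocabulary; a real system would derive these
# from the data warehouse rather than hard-code them.
MEASURES = {"sales": "amount"}
DIMENSIONS = {"state": "state", "category": "category"}
AGGREGATES = {"total": "SUM", "average": "AVG"}

PATTERN = re.compile(r"^(total|average) (\w+) by (\w+)$")


def to_sql(question):
    """Translate questions like 'total sales by state' into a SQL string."""
    match = PATTERN.match(question.strip().lower().rstrip("?"))
    if not match:
        raise ValueError(f"Cannot interpret: {question!r}")
    aggregate, measure, dimension = match.groups()
    # Only whitelisted terms reach the SQL text, so user input is never
    # pasted into the query directly.
    if measure not in MEASURES or dimension not in DIMENSIONS:
        raise ValueError(f"Unknown term in: {question!r}")
    column = DIMENSIONS[dimension]
    return (
        f"SELECT {column}, {AGGREGATES[aggregate]}({MEASURES[measure]}) "
        f"FROM orders GROUP BY {column} ORDER BY {column}"
    )


# In-memory sample data, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (state TEXT, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("CA", "toys", 100.0), ("CA", "books", 50.0), ("KY", "toys", 25.0)],
)
rows = conn.execute(to_sql("Total sales by state?")).fetchall()
print(rows)
```

A pattern match like this breaks as soon as the user rephrases the question, which is exactly the gap a commercial natural language system has to close.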

Written by Stephen E. Arnold · Filed Under Database, Enterprise, Feature, Search | Comments Off on EasyAsk: Business Intelligence for End Users

Enterprise Search and Train Wrecks

May 7, 2008

After I completed my interview with the Intelligenx executives, I thought about one of their comments. Iqbal Talib said, “We have many clients who want a point solution, not an enterprise solution”. An executive at Avalon Consulting wrote me today and echoed the Intelligenx comment.

Enterprise search may be a train wreck for more than half of the people who use today’s most popular systems. The Big Name vendors can grouse, stomp, and sneer at this assertion. Reality: Most of these systems disappoint their licensees. When a search system “goes off the rails”, the consequences can be unexpected.

When an enterprise system goes off the rails, the damage is considerable. Even worse, moving the wreckage out of the way is real work. But even more difficult is earning back the confidence of the passengers.

A Case Example

A major European news organization licensed a Big Name system. The company ponied up a down payment and asked for a fast-cycle installation. After six months of dithering, the Big Name admitted that it did not have an engineer available who could perform the installation and customization the paying customer wanted.

The news organization pulled the plug. The company then licensed one of the up-and-coming systems profiled in Beyond Search. The revamped system was available in less than three weeks at a fraction of the cost for the Big Name system.

The new system works, and it has become a showcase for the news organization. For the Big Name, the loss of the account eroded already shaky finances and became the talk of cocktail parties at industry functions.

Ever wonder how much churn Big Name enterprise search vendors experience in a year? You can get a good idea by comparing the customer lists of the best-known enterprise search vendors. The overlap is remarkable because large companies work their way through the systems. Now more are turning to up-and-coming vendors’ systems. The Big Names are facing some sales push back. Take a look at the financials for publicly traded search vendors. Look for days-sales-outstanding data. Look at the cash reserves. Look at the footnotes about restating financials.

What you may find is that fancy dancing is endemic.

How Many Search Systems Does One Company Need?

What haunts me is the overlap among vendors. Early in 2003, I conducted a poll of Fortune 1000 companies. The methodology was simple: I sent an email with several basic questions to people whom I knew at 150 different large organizations. I received a response rate of about 70 percent, which was remarkable. One question I asked five years ago was, “How many enterprise search systems do you have?”

Written by Stephen E. Arnold · Filed Under Enterprise, Feature, Search | 3 Comments

Selecting an Enterprise Search System: The Mid-Sized Company Dilemma

May 6, 2008

Earlier today (May 5, 2008) I received a telephone call from a journalist seeking my thoughts about this question: How do mid-sized companies select an enterprise search system? As you know, I call this type of search “behind-the-firewall search”. There’s considerable confusion about Web search, search on a particular Web site, ecommerce search, and the other denizens of the search phylum in the kingdom of information within the domain of knowledge. (I feel biological at the moment after a day of considering how Google vapor sucked oxygen from the Microsoft-Yahoo deal.)

This morning (May 6, 2008), my RSS reader proudly displayed an Information Week article penned by George Dearing, “Why Is It So Hard to Be Found”. The article was interesting because Mr. Dearing used the phrase “within the firewall” to describe enterprise search. As you know, I prefer the phrase “behind the firewall”, but he’s close enough for horseshoes. You can read this essay here. Click quickly; articles on the pop-up besieged Information Week Web site can be, as Mr. Dearing notes, “so hard to be found”.

The point he makes that struck me was:

For something so critical to content as search, you’d think that companies would have more to show for it than misguided enterprise search implementations… I’ve always had a hard time getting my arms around the space, much less the application of specific search-oriented approaches…. Sam Mefford, a search consultant with Avalon Consulting, made me feel a little better recently when he told me, “I’m moving away from the terminology Enterprise Search wherever possible, and moving to just Search, because most organizations simply aren’t ready for Enterprise Search.” After talking to him, it seems the challenge for enterprise search is the same as for other enterprise software sectors: A lot of work was put into technology and software development but the needs of users have largely been ignored.

My bandwagon is no longer holding only me. Avalon Consulting is on board. I was encouraged by the journalist’s call as well. Some are looking at the search vendors’ assertions and seeing the handiwork of PR mavens, not programmers.

In Beyond Search, I provide quite a bit of tactical information for fixing a search system gone bad. But in the 3rd edition of the Enterprise Search Report, I slogged through the formal procurement process applicable to organizations of almost any substance. And what about the journalist’s questions? The young lass had done her homework. She wanted to know about user dissatisfaction (hovering around 60 percent), methods of selecting systems in mid-sized companies (an underserved sector), and the payoff from a good system (it is easier to explain the cost of not having information access).

The business end of a piranha.

Imagine a procurement team swimming in a calm Brazilian river. Above, the giant search vendors circle like hungry vultures. In the river, a swarm of ravenous up-and-coming search engine vendors wants to nibble on the procurement team. Mid-market companies find themselves in the middle when it comes to licensing a search system. Big, aggressive folks above and feisty smaller ones below make it tough for mid-sized firms to make a well-reasoned search system acquisition.

After the pleasant telephone talk with the reporter, I continued thinking about the characteristics of a mid-sized company. I define “mid sized” as a firm with revenues between $50 million and $300 million. This is a company size that is caught in the middle. Vulnerable to incursions by far larger companies in search of new revenue, the mid-sized company is a tempting target. IBM, Microsoft, and Oracle have signaled an interest in the companies in this sector. Nibbling away like tiny piranha on the toes of swimmers, start-ups and small companies with revenues below the magic $50 million threshold want to gobble the swimmers’ calves, maybe the entire swimmer.

In general, occupying the mid-market requires considerable attention to bigger companies and to an innumerable swarm of smaller outfits. With 15 or 18 million businesses in the US, most are smaller. The 2,000 or 3,000 giant-sized enterprises have the resources to prey on the mid-market.

To survive, mid-market companies have to work hard, deliver acceptable customer service, and market effectively. One slip-up, and the weaker mid-market company is a snack for a larger organization or a feast for a smaller predator.

Not surprisingly, selecting a behind-the-firewall search system boils down to one of three broad strategies. I’ve substantiated these via my survey and interview work. I am on the lookout for more anecdotes, including survey data, that can illuminate the interesting world of mid-market companies.

Written by Stephen E. Arnold · Filed Under Enterprise, Feature, Search | 3 Comments


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
