Information 2009: Challenges and Trends

December 4, 2008

Before I was once again sent back to Kentucky by President Bush’s appointees, I recall sitting in a meeting when an administration official said, “We don’t know what we don’t know.” When we think about search, content processing, assisted navigation, and text mining, that catchphrase rings true.

Successes

But we are learning how to deliver some notable successes. Let me begin by highlighting several.

Paginas Amarillas is the leading online business directory in Colombia. The company has built a new system using technology from a search and content processing company called Intelligenx. Similar success stories can be identified for Autonomy, Coveo, Exalead, and ISYS Search Software. Exalead has deployed a successful logistics information system which has made customers’ and employees’ information lives easier. According to my sources, the company’s chief financial officer is pleased as well because certain time-consuming tasks have been accelerated, which reduces operating costs. Autonomy has enjoyed similar success at the US Department of Energy.

Newcomers such as Attivio and Perfect Search also have satisfied customers. Open source companies can also point to notable successes; for example, Lemur Consulting’s use of Flax for a popular UK home furnishing Web site. In Web search, how many of you use Google? I can conclude that most of you are reasonably satisfied with ad-supported Web search.

Progress Evident

These companies underscore the progress that has been made in search and content processing. But there are some significant challenges. Let me mention several which trouble me.

These range from legal inquiries into financial improprieties at Fast Search & Transfer, now part of Microsoft, to open Web squabbles about the financial stability of a Danish company which owns Mondosoft, Ontolica, and Speed of Mind. Other companies have shut their doors; for example, Alexa Web search, Delphes, and Lycos Europe. One vendor in Los Angeles has had to slash its staff to three employees and take steps to sell the firm’s intellectual property, which rightly concerns some of the company’s clients.

User Concerns

Another warning may be found in the results from surveys such as the one I conducted for a US government agency in 2007 that found dissatisfaction with existing search systems in the 65 percent range. AIIM, a US trade group, reported slightly lower levels of dissatisfaction. Jane McConnell’s recently released study in Paris reports data in line with my findings. We need to be mindful that user expectations are changing in two different ways.

First, most people today know how to search with Google and get useful information most of the time. The fact that Google is search for upwards of 65 percent of North American users and almost 75 percent of European Union users means that Google is the search system by which users measure other types of information access. Google’s influence has been essentially unchecked by meaningful competition for 10 years. In my Web log, I have invested some time in describing Microsoft’s cloud computing initiatives from 1999 to the present day.

For me and maybe many of you, Google has become an environmental factor, and it is disrupting, possibly warping, many information spaces, including search, content processing, data management, applications like word processing, mapping, and others.

Microsoft is working to counter Google, and its strategy is a combination of software and low adoption costs. I believe that Microsoft’s SharePoint has become the dominant content management, collaboration, and search platform with 100 million licenses in organizations. SharePoint, however, is not widely understood to be technically complex and a work in progress. Anyone who asserts that SharePoint is simple or easy is misrepresenting the system. Here’s a diagram from a Microsoft Certified Gold vendor in New Zealand. Simple this is not.

Finally, there is a significant shift from key word indexing to different types of information access. The interest is fueled in part by what I call the “legacy of the past” in search and retrieval. Organizations want more than Boolean, and in the present financial climate, higher value and lower costs are imperatives. The buzzwords in my opinion provide a rallying point for a sales call; buzzwords do not communicate. As a result, confusion about search and its close cousins is rampant.

Three Contextual Issues

Before commenting on the outlook for information retrieval in 2009, let me mention three broader issues to provide you with the context for my concluding remarks.

First, organizations have different users, and different users have quite specific information access requirements. Customer support, for example, requires a combination of functions: assisted navigation so a customer can locate needed information on a public facing Web site, and automatic indexing and concept identification so a person in the customer support department can find specific information from an FAQ, a database, or technical documents in the engineering department. A “wiki” or other user generated content repository is becoming an important way for the company to obtain specific use case information. The need, therefore, is to provide answers, preferably in such a way that eliminates the need to guess which magic combination of words is required for the search engine to provide the needed facts.

Second, there are significant cultural shifts taking place. At the first online show held in London in 1979, the users were trained information professionals. The systems were not designed for a person needing information in the marketing department. The way it worked was that the information professional conducted a reference interview, ran the search, winnowed the results, and delivered the output to the end user. Today, most people in an organization run searches themselves and perceive themselves as good or excellent searchers. As the workforce takes in university graduates, the notion of search and Internet access as standard will bring more rapid information innovation.

Third, with information access pervasive, context becomes more important. In order to understand the reliability and value of an item of information, we need different types of metadata. For example, it is not enough to index the words in an email with a PowerPoint attachment. We need to know who saw the document, where specific snippets of data went, how long a reviewer spent on the document and what changes that person made, and so on.

Trends in Search, 2009

Now, let’s look at the trends to which I will attend in 2009.

First, the market for search and content processing is changing, quickly in some sectors like business intelligence and less quickly in others like taxonomy building. Most vendors will offer “vertical” solutions in order to have products tailored to each sector or business function. Vendors will participate in different sectors and try to move to sectors where there is funding or new opportunities. I find it difficult to know how to characterize a particular vendor’s market focus. I think this fuzziness will be more evident in 2009. My research has identified some no go zones; that is, vendors are avoiding certain types of business solutions because margins are thin or customers are not interested in solutions in these spaces. Newcomers, therefore, have some green fields to develop.

Second, organizations will be shifting from existing systems to combinations of on premises and cloud services. Economics will play a larger role in this shift than technology.

The “old” systems may be left in place for those users who depend on them. This means that there will be more information complexity. Costs will rise even if some of the newer systems are commoditized, drawn from open source, or delivered from the cloud. The consequences of this additive approach will be increased complexity and cost despite an organization’s efforts to eliminate cost and complexity. SAP is suffering from this problem at this time. The SAP disease is likely to spread to other vendors as well.

Third, Google will continue to push into new markets. With the downturn in online ad revenue growth, the company has little choice but to compete where it once was indifferent or benign. I find Google difficult to characterize because it attacks markets in ways that make it hard to answer the question, “What’s next?”

Let me give you one concrete example of Google’s potential energy in publishing.

Here is an image from Google patent application 2007/0198481. One Google employee told me that I created this output. Googlers perceive themselves as quite intelligent, but there is no excuse for being uninformed about what one’s own company is putting in its open source documents. But that arrogance may be Google’s Achilles’ heel.

The company will, therefore, squeeze companies in traditional business sectors such as publishing as well as those in search and information access.

Fourth, traditional business models will become caught in the new economic environment. Just as General Motors and Harcourt Brace find that their basic business assumptions are invalid, many information access, content processing, and search companies will face similar challenges. Knowing that a cherished business assumption is not working does not mean that the organization can change. Look at Dialog Information Services, now part of Cambridge Scientific Abstracts. The Dialog operation is essentially unchanged after 30 years; the demonstrations and approach are the same as they were when I attended my first online show a quarter century ago. Companies unable to change are destined to face the hard choices confronting GM and Ford. I find it ironic that Microsoft has compared its approach to data centers to Henry Ford’s manufacturing approach for the Model T.

Conclusion

Let me conclude with three observations about 2009. I have been incorrect enough times that even a Kentucky boy knows not to make predictions.

Observation One. Information access must address new modes of work. Mobility, context sensitive search, and answers—these shifts require that information access respond to continuous inputs from users, constituents, and colleagues.

Observation Two. An easy to use, appropriate interface is becoming more and more important. However, to assert that search or content processing is a simple procedure or that vendors are laying their systems bare is wrong headed. Vendors must protect their intellectual property while creating “hooks” that allow their systems’ functions to be integrated into other enterprise applications. Misjudging complexity means that some problems may arise and be difficult to remediate.

Observation Three. Rigor will be increasingly important in 2009. Martin White and I in our new study “Successful Enterprise Search Management” emphasize that expediency may create more problems than investing adequate time in defining requirements, figuring out what information must be processed, and how that information can be presented to allow users to solve real-world problems. Without a method, we will face only moving targets in information. One side effect is that we will repeat the procure, deploy, repeat cycle that characterizes many firms’ approach to electronic information.

Let me close by summarizing my key points:

Beyond search
Less slack for a bad decision
Consolidation and business failure
Financial issues
Confusion about “which system does what”

Let’s move through 2009 committed to solving users’ information problems. Solutions, not silliness.

Stephen Arnold, December 4, 2008

Written by Stephen E. Arnold · Filed Under Enterprise, Feature, Online (general), Search, Semantic, Text processing

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
