Mark Logic: Content Applications Fuel Company’s Growth

June 17, 2008

Mark Logic provides information access and delivery solutions that accelerate the creation of content applications. Customers across a range of industries rely on Mark Logic to repurpose content and deliver that information through channels. Some vendors describe this suite of functions as an enterprise publishing system.

The company has been growing at a furious pace. Dave Kellogg, a former Business Objects executive, said:

Mark Logic… is a database management system built to natively manage XML documents and optimized for handling vast numbers of them (I mean hundreds of terabytes) with high performance. It’s a read/write system. It has a query language (XQuery). It has transactions and logging. You can use it, by itself–without the need to bolt it on to either a relational database or an application server–as the basis for content applications.

The company’s customers include Oxford University Press, O’Reilly Media, and the Congressional Quarterly. The company builds relationships with its customers. Mr. Kellogg says, “Our philosophy is to sell solutions to problems and avoid the stereotypical ‘drive-by’ technology sale, where companies dump the software in the parking lot and leave.”

The full interview appears as part of the Search Wizards Speak series published by ArnoldIT.com. You can read the transcript of the interview with Mr. Kellogg here. The index to the full series of interviews is here.

Stephen Arnold, June 17, 2008

Microsoft Plans European Search Center

June 17, 2008

A news release popped into my mailbox this morning (June 17, 2008). The surprising announcement was that Microsoft will set up a search technology center in Europe. You can read the full announcement here.

(Access this story quickly. PR Newswire is one of the organizations whose content can be difficult to track down. Access to some content may require you to pay a fee. This link worked at 6:46 am on June 17, 2008.)

According to the announcement:

“The new center will be designed to help accelerate Microsoft’s investments in Live Search and disrupt the search and advertising marketplace to the benefit of both the consumer and the advertiser, in line with Microsoft’s recent announcement in the U.S. of Live Search cashback.”

What is startling to me is that Microsoft has not set a location for the search center. With the acquisition of Fast Search & Transfer, I thought that Trondheim, Norway, would have been the default choice. Microsoft has a big operation in Cambridge, England, as well. The news release does say, “We’re already doing some great work in Europe in the enterprise search space through our January 2008 acquisition of Fast Search & Transfer SA…”

Fast Search’s AllTheWeb.com Web site was arguably the only challenger to Google’s Web index prior to Fast Search’s selling the AllTheWeb.com site to Overture. My experience with Fast Search is that its Web crawling and indexing system is one of the firm’s core strengths. Furthermore, Fast Search had developed its own advertising technology, also sold to Overture. But the company reentered online advertising with internal work and the acquisition of Platfood.com.

One big question remains unanswered: Why not turn to Fast Search, a company that successfully challenged Google for Web search?

One thing is clear. Microsoft is making an effort to close the gap with Google. Perhaps the new research center will do the trick. But the center will not open until 2009. In the fast-moving Web world, the timing seems leisurely to me.

Stephen Arnold, June 17, 2008

Google: From the Disruptor to the Disrupted

June 17, 2008

I am a fan of ReadWriteWeb, and I found the essay by Bernard Lunn quite interesting. Mr. Lunn has identified the “11 Search Trends that May Disrupt Google.” ReadWriteWeb.com makes it easy to locate its articles, so you can track this story down easily.

I found the list of factors that may be moving Google into a different role useful: from disruptor to a company that is itself disrupted. On the whole, I agree with the ReadWriteWeb analysis. Of particular importance is the notion of “start ups using a new outsourced infrastructure.” Powerset is an example of a company taking a different approach. I have heard that Powerset makes use of Amazon Web Services, and I think this is an important aspect of the company to monitor if my information is accurate.

The other point that I found on target is the impact tagging may have upon Google. Not long ago Vivisimo announced that its system made it possible for a user to add a tag–that is, index term–to an item in a result list. Tagging is becoming one of the everyday activities for those who write Web logs. The Semantic Web has been slow in coming, but I think the “social tagging” function may be providing some opportunities that search engines, including Google, have yet to exploit fully.

I would add one other point to the factors that are likely to influence Google–the challenge of size. Google is now 10 years old, and it is getting big enough to encounter the friction that plagues any large organization. Google, therefore, changes more slowly even though certain innovations make users gasp. A competitor can exploit Google’s own inertia but that competitor must take care to stay clear of Google’s momentum.

A happy quack for a useful and thought provoking write up, ReadWriteWeb!

Stephen Arnold, June 17, 2008

Google and Mobile Search

June 17, 2008

I have been a fan of Ars Technica. Jacqui Cheng has an excellent essay about Google and mobile search. My eyes are too lousy for me to be enamored of small screens and smaller type. But I am out of the mainstream. Ms. Cheng reports that Google has captured 60 percent of the mobile search market. You should read the full story here.

But it is early in the mobile search world. Mobile search brings new challenges because “regular” search does not work too well. I prefer to get answers, pick choices from lists, or just accept the default link. I am lazy but the form factor inhibits me. Other demographics will behave differently. Over time, the mobile search market will undergo some rapid evolution. I will start paying more attention to this area.

Stephen Arnold, June 17, 2008

Microsoft’s Web Search Strategy Revealed: The Scoble Goldberg Interview

June 16, 2008

Online video does not match my mode of learning. Robert Scoble, a laurel leaf wearer in the new world of video and text Web logs, conducted an interview with Brad Goldberg.

The interview is part of the Fast Company videos, and it is available here. The interview is remarkable, and I urge you to spend 31 minutes and listen to Brad Goldberg, General Manager of Microsoft Search Business Group.

The interview reveals useful information about the time line for Microsoft to capture market share from Google and Microsoft’s ideas for differentiating itself from Google in Web search.

Surprisingly, there were no references that I could pick up to enterprise search, nor was there any indication that Mr. Goldberg was aware of the Fast Search & Transfer Web search technology which was quite good. As you may know, Fast Search withdrew from Web search in 2003, selling its AllTheWeb.com Web index to Overture. Yahoo gobbled Overture and used bits and pieces of the Fast Search technology recently. The “auto suggest” feature is still available from Yahoo’s AllTheWeb.com site. My tests suggest that today’s AllTheWeb.com uses the Yahoo Search index built by the Slurp crawler and the Fast Search technology for some of the bells and whistles on the site. The news search function is actually quite useful. If you are not familiar with it, you can try it here.

During the interview, Mr. Goldberg uses some sample queries to illustrate his claims about Live.com’s search performance, precision, and recall. I ran the “Paris” query on each of these systems, and I ran comparative queries on this Web log as well. After the interview, I took a look at the 2005 analysis of mainstream Web search systems here so I could gauge how much change has taken place in the last three years. Quick impression: Not much. You may want to perform similar as-you-listen tests. It is easy to see what search system responds most quickly, how the search results differ, and the features that each system makes available.
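If you want to put a number on “how the search results differ,” here is a minimal sketch of one way to do it. It assumes you have copied the top result URLs from two systems by hand. The URLs below are made up, the function names are my own, and the overlap measure (shared URLs divided by all distinct URLs) is just one reasonable choice, not anything Microsoft or Google uses.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to host plus path so trivial variations compare equal."""
    parts = urlsplit(url.strip().lower())
    host = parts.netloc.removeprefix("www.")
    path = parts.path.rstrip("/")
    return host + path


def overlap(results_a: list[str], results_b: list[str]) -> float:
    """Jaccard overlap of two result lists: shared URLs / all distinct URLs."""
    set_a = {normalize(u) for u in results_a}
    set_b = {normalize(u) for u in results_b}
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical top results for one query on two systems (made-up URLs).
system_one = [
    "http://www.example.com/paris/",
    "http://example.org/travel/paris",
    "http://example.net/paris-hotels",
]
system_two = [
    "http://example.com/paris",
    "http://example.net/paris-hotels",
    "http://example.edu/paris-history",
]

print(f"overlap: {overlap(system_one, system_two):.2f}")
```

A score near 1.0 means the two systems return nearly the same pages for the query. A score near 0.0 means they agree on almost nothing. The measure ignores rank order, so it is a rough gauge only.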

Three points in Mr. Goldberg’s remarks stuck in my mind. I want to mention each of these and then offer a few observations. Judging from the edgy comments to some of my essays, I want you to know that you may not agree with me. That’s okay with me. Please, use the comments section to set me straight. Providing some facts to go along with your push back is helpful to me.

Key Points for Me

1. Parity or Microsoft’s Relevancy Is As Good as Google’s

Mr. Goldberg asserted that the major search services were at parity in terms of relevance and coverage. I found this notion somewhat difficult to comprehend. The data about Web search market share undermine any argument about parity, which means, according to my understanding of the word, “equality” or “equivalence”. I have had difficulty interpreting comments by whiz kids before, so I may be off base. My thought was that Google continues to gain market share at the expense of both Microsoft and Yahoo. The dis-parity is significant because Google, according to data mavens, accounts for 60 percent or more of user queries in the US. In Europe, the market share is higher. US search systems do not hold commanding leads in China, Korea, and other Eastern markets.

If parity means visual appearance, then yes, Microsoft is looking more like Google. Here is the result of one of my test queries: “real estate baltimore maryland”.

[Screenshots: Google search results and Live Search results for the test query]

On the surface these look alike. Closer inspection reveals that Google includes a canned form so I can narrow my result by location and property type. Google eliminates a step in looking for real estate in Baltimore. Microsoft’s result does not offer this feature, preferring to show “related searches”. I like the Google approach. I don’t make much use of machine-generated related queries. I have specialized tools to discern relationships in result sets.


Search Security: Not a Chance

June 16, 2008

USA Today (June 16, 2008) contains a story by Michelle Kessler, “Some Employees Buy Own Laptops, Phones for Work.” Navigate to a USA Today-equipped news stand or try to locate the story on the USA Today Web site here. Gannett is one of the traditional publishers whose Web site often befuddles me. Chasing down the story is worth the effort.

The key point for me is that USA Today figured out that expediency, management laxity, and financial pressures have changed the rules for providing employees with company-purchased computers and mobile devices. The data in the article are the usual “big numbers for big impact” and nifty looking charts. So forget the assertion that 39 percent of employees buy a laptop for work or that 43 percent purchase their smartphone. Even if you divide these figures in half, the problem is clear.

If you have any illusions about the security of information in a search and retrieval system, you want to rethink your assumptions and question the assertions about secure behind-the-firewall search. In the work that I do, when an outside device is live and connected within an organization, a security problem exists. True, nothing may happen. But when a single outside device is behind the firewall and capable of receiving information from the organization’s system, a security risk exists. Feel free to pooh-pooh this if you wish.

In one investment bank, the information technology department locked down access to the Internet, instant messaging, and external mail accounts. What was the work around? A personal smartphone or a low cost access device with a secure digital slot or a USB connector.

I chuckle when search system vendors make their security features the key differentiator for their online search systems. Most vendors use the security procedures in place and do not try to layer more security on top of the organization’s existing security methods and systems. Verity’s token system was one of the better approaches. But even that method is vulnerable when users have their own gizmos behind the firewall.

The USA Today article makes it clear that organizations are allowing employees to purchase computing devices. The horse is out of the barn. A happy quack to the USA Today editor who okayed Ms. Kessler’s story.

Stephen Arnold, June 16, 2008

Google Customers, Actually Superstar Customers

June 15, 2008

Google is a secretive outfit, despite the river of information about various doings at the Googleplex in Mountain View, California. If you want to know who some of Google’s enterprise customers are, you can find six of them with profiles here. I assume the notion of a “superstar” is different from being a real live Googler, but the designation is interesting as is the first name familiarity. These superstars may warrant a contact at Google who takes their calls and answers their emails. Grab the names and profiles before the information drifts away. The enterprise applications range from federated search to collaboration.

Stephen Arnold, June 15, 2008

IBM Explains Text Analytics

June 15, 2008

A colleague called my attention to this April 2008 description of IBM’s view of text analysis. The essay “From Text Analytics to Data Warehousing” is about more than processing content. The article by Matthias Nicola, Martin Sommerlandt, and Kathy Zeidenstein points toward what I call a “metaplay” or “umbrella tactic”. (You will want to read the posting here. When accessing IBM content, it is important to keep in mind that it can be difficult, if not impossible, to locate IBM information via the IBM search function. Pages available via a direct link like this “From Text Analytics to Data Warehousing” may require that you register, obtain an IBM user name and password, and then relaunch your search to locate the information. Other queries will return false drops with the desired article nowhere to be found. I’m not sure if this is OmniFind, Fast Search, Endeca, or some other vendor’s handiwork. But search and retrieval of IBM information on the IBM site can be frustrating to me. Click here now.)

The authors state:

This article review[s] the text analysis capabilities of IBM OmniFind Analytics Edition, including an analysis of the XML format of the text analysis results, MIML. It then examined different approaches that can help you extend the value of OmniFind Analytics Edition text analysis by storing analysis results from the MIML file into DB2 to enable standard business intelligence operations and reporting using the full power of SQL or SQL/XML.

As I worked through this article, reviewed the diagram, and explored the See Also references, one point jumped out at me. The mark up generated by the IBM system can be verbose. The emphasis on the use of DB2, IBM’s database system, underscored for me that IBM text analytics requires software, DB2, and storage. In fact, without storage, the IBM text analysis system could grind to a halt. To increase the performance, the licensee may require additional IBM servers, management software, and other bits and pieces.

You have to store the “star schema for MIML” somewhere. Here’s what the structure looks like. Of course, the image is IBM’s and copyrighted by the company.

[Diagram: IBM architecture]

I want to point out that this is one of the simpler diagrams in the write up.
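If the phrase “star schema” is new to you, a minimal sketch may help. This is not IBM’s schema for MIML. The table names, column names, and sample rows are hypothetical, and I use SQLite from Python as a stand-in for DB2. The idea is that one central “fact” table records what the text analysis found, small “dimension” tables describe the documents and concepts, and ordinary SQL then produces the report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_document (doc_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE dim_concept  (concept_id INTEGER PRIMARY KEY, label TEXT);
    -- Fact table: one row per concept found in a document.
    CREATE TABLE fact_annotation (
        doc_id     INTEGER REFERENCES dim_document(doc_id),
        concept_id INTEGER REFERENCES dim_concept(concept_id),
        mentions   INTEGER
    );
""")

# Made-up sample data standing in for text analysis output.
conn.executemany("INSERT INTO dim_document VALUES (?, ?)",
                 [(1, "Service call 1001"), (2, "Service call 1002")])
conn.executemany("INSERT INTO dim_concept VALUES (?, ?)",
                 [(10, "battery"), (11, "refund")])
conn.executemany("INSERT INTO fact_annotation VALUES (?, ?, ?)",
                 [(1, 10, 2), (1, 11, 1), (2, 10, 3)])

# A standard reporting query: documents and total mentions per concept.
report = conn.execute("""
    SELECT c.label,
           COUNT(DISTINCT f.doc_id) AS documents,
           SUM(f.mentions)          AS mentions
    FROM fact_annotation AS f
    JOIN dim_concept AS c ON c.concept_id = f.concept_id
    GROUP BY c.label
    ORDER BY c.label
""").fetchall()

for label, documents, mentions in report:
    print(label, documents, mentions)
```

Even this toy version shows why storage matters. The fact table grows with every concept found in every document, so a verbose analysis of a large collection can produce a very large table.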

Observations

  1. This write up suggests to me that IBM is defining text analytics as a component in a much, much larger array of software, hardware, and systems. My hunch is that IBM wants to shut the barn door before more standalone text analytics tools are sold into IBM shops.
  2. IBM is making explicit that text analytics is an exercise in data management. Google, I think, has much the same notion based on my reading of its technical papers.
  3. IBM has done a good job of making clear that software alone won’t deliver text analytics. Without the ability to scale, text analysis can choke most systems. Now IBM has to get this message to the information technology professionals who assume that their existing servers and infrastructure can handle text analytics.
  4. IBM has done an excellent job of moving the concept of text analysis as an add on into a larger constellation of operations. The notion of a metaplay or an umbrella tactic is important because individual vendors often ignore or understate the broader impact of their content processing subsystems.

I think this is an important write up. A happy quack to the reader who called the information to my attention.

Stephen Arnold, June 15, 2008

Search Wizards Speak Interviews

June 15, 2008

One of the handful of people who read my musings in this Web log told me that it was hard to locate the interviews with influential people in the enterprise search market. You can find the index to the 18 interviews at http://www.arnoldit.com/search-wizards-speak/ or click on this link.

On Monday, we will post another interview. This one is particularly exciting because it takes a look at a company with sophisticated technology based on search, database and content processing. The firm provides a solution of which search and indexing are components. I will give you one clue: the firm is growing at a double-digit pace in the enterprise publishing system sector. The company competes with some of the Big Names in search as well as with firms that are off the radar of many information access vendors. We’ll post the new interview on Monday, June 16, 2008. With this interview, I am going to take a hiatus because it is becoming quite difficult to reach people over the summer period in the United States.

However, I will be posting brief profiles of companies in the search and content processing sector. These will be updated on an irregular basis, and I will post an index page to these profiles. I want to create several test profiles, gauge reader reaction, and then finalize a standard format for the information.

Here are the direct links to the interviews completed through June 12, 2008. These are listed in alphabetical order.

Company                 Interview Subject                    Date of Interview
Attivio                 Ali Riaz                             26-May-08
Bitext                  Antonio S. Valderrábanos             14-Apr-08
Blossom Software        Alan Feuer, Ph.D.                    18-Feb-08
Brainware               James Zubok                          31-Mar-08
Coveo Solutions Inc.    Laurent Simoneau                     11-Mar-08
Deep Web Technologies   Abe Lederman                         10-Jun-08
Endeca                  Pete Bell                            17-Mar-08
Exalead                 François Bourdoncle                  25-Feb-08
Intelligenx             Iqbal & Zubair Talib                 12-May-08
ISYS Search Software    Ian Davies                           5-Mar-08
Kroll                   David Chaplin                        28-Apr-08
Northern Light          David Seuss                          2-Jun-08
PolySpot                Olivier Lefassy                      19-May-08
Silobreaker             Mats Bjore                           12-Jun-08
Sinequa                 Jean Ferré                           21-Apr-08
Thunderstone            John Turnbull                        7-Apr-08
Vivísimo                Raul Valdes-Perez & Jerome Pesenti   24-Mar-08
ZyLAB                   Johannes Scholtes                    5-May-08

A special thanks to the executives who participated in the interviews. I extended invitations to Microsoft and Fast (sorry, we can’t talk to you), Google (no one responded including the top gun in enterprise search David Girouard), and Autonomy’s Andrew Kanter (no, can’t talk. Life is too manic). Too bad for me, I guess. For the folks who could talk, were thoughtful enough to respond to email, and were in control of their time–thank you and a contented quack, quack.

Stephen Arnold, June 15, 2008

The Duh Factor: Email, Distractions, and Workers

June 15, 2008

When I awakened at 6:30 am, I took a look at what my crawlers snagged as I slept. Email stories. Hundreds of email stories. You can sample the floods on Techmeme.com, Megite.com, and other aggregation services. The catalyst for this blog-astrophy appears to be this essay by Matt Richtel of the New York Times. I’m not sure you need a link in this Web log because this story has gone viral. Email, distraction, and digital addiction are, in my view, part of the furniture of living. These behaviors will be with us for some time.

Now, let me summarize “Lost in E-Mail, Tech Firms Face Self-Made Beast.” Email is a problem. Workers, companies, and anyone else in the message flow spend time fiddling around. Wasted time means an expensive, often futile, experience.

Some of the pundits commenting on this essay by Mr. Richtel have made the leap to distractions, of which email is one, in the modern work space. You can sample this line of thought in the Business Week article “May We Have Your Attention, Please?” by Maggie Jackson.

More, Not Fewer, Messages

I have a different view of the email problem, and I am not sure what to make of the furor over any digital messaging.

First, we have more types of electronic messaging that are exponentiating the attention problems and their costs in money and time. SMS, for example, adds to the message traffic. When Bear Stearns was still in business, my client sent me SMS messages, and these to him were must-answer communications. Email was too slow. SMS traffic, I learned in one of my studies, is larger than email traffic. I don’t have the motivation to dig out the 2007 data I have, but I recall being flabbergasted at the number of SMS messages fired off and the revenue these generate for telcos.

Second, we now must deal with micro blogging. Twitter.com, the Silicon Valley in crowd communication medium, allows one to broadcast information of great import to anyone interested in receiving a friend’s postings. Here’s a “tweet” I cadged from Popurls.com, a service which presents random tweets:

YoungnRich Apparel california. @holli I nap every Saturday there [sic] so rejuvenating

Very helpful this post, Holli.

Third, the notion of “soft interruptions” just adds to the flow of message traffic. Toss in instant messaging–a form of information that pops up–that runs across networks. I dislike instant anything, but that’s my 64-year-old biases coming to the fore.

Stepping Back

Distraction is a fact of life. When I leave my log cabin in rural Kentucky and venture to the big city, I find myself in meetings. One experience I had in Seattle in the last month is illustrative of the situations I encounter.

I am giving a talk about a company’s technology. I don’t work for the company whose technology I am describing. I don’t even care if it works or not. I’m describing what this company says its technology will do. There are five people in the room. Each has a laptop with a wireless connection, a smartphone, and a beverage. A person rushes into the room with a laptop, smartphone, and dog. The late comer is the “boss” and he says, “That diagram is wrong. That technology will never work.” He then sits down, attends to his laptop, and sends one message on his smartphone. He then interrupts and asserts, “That device can’t perform that function. It doesn’t have two radios. Quit telling us about that function. It won’t work.”

I’m not sure what to do, so I say, “No problem.” I go on to another slide. A short time later the “boss” leaves. After the presentation, the other attendees wander off.

This situation is representative of what I find in my work.

  • Many employees are bright and confident in their ability to multi task. I find that humans who multi task may be operating at less than 100 percent efficiency, not the 100 percent plus that these folks think doing two tasks at a time delivers. This old human does not multi task. I do one thing at a time. In fact, I have to work to keep my mind from wandering. Many of the people I encounter actively embrace wandering thoughts.
  • Over the years, I have found that people in knowledge jobs go to great lengths to appear busy, engaged, and in demand. At Booz, Allen & Hamilton in the late 1970s, the fellow who trained me–Dr. William P. Sommers–appeared calm and unhurried. It was a false front. He was busy, but he managed his time effectively and separated himself from the lesser beings at Booz, Allen because he was under control. Today, the appearance of busy-ness is highly valued. Intrusive crap is embraced because it connotes success.
  • The devices are fun for many people. I find small gadgets, including mini-notebook computers, maddening. I can’t see the screen. The keyboard is too small for my large, increasingly clumsy fingers. The gizmos are fragile, and I drop small slippery gadgets with great frequency. Younger folks enjoy the complexity and some watch videos on screens the size of match books. My hunch is that these professionals are playing with toys. Instead of Lego blocks, professionals today keep their childhood habits alive with digital play things.
  • Most professionals I encounter don’t know what the heck they are doing. Their expertise often lacks a broader business context. Reinventing the wheel is a popular pastime in many of the high-tech environments in which I find myself. The interest in mobile search is somehow new. Nope, like metatagging, it is the same old stuff gussied up with a new name. If they don’t know what to do to make a direct and immediate contribution in their work, humans generate fake smoke. The blue flickers on digital gizmos are the equivalent of laser light shows for a touring rock band’s stage dressing. Distraction is a bit of fakery.

You probably disagree with my take on this email discussion. My reaction is like the Cheers character who says, “Duh.” As professionals become more cut off from meaningful work, distractions become more attractive. I can gauge the focus of a meeting by counting the number of laptops and smartphones in the room. When there are more gizmos than people, I know the company is in a management whirlpool. In one meeting, I had an audience of 62 people. There were 109 devices. This company, if I were to name it, is one the media, investors, and customers believe is in a death spiral.

So, as the economy falters, the pressure on employees goes up. If the employee doesn’t have a clue about managing time and setting priorities, the distractions flow. Forget how many emails pile up. When I return from a trip outside the US, I winnow emails ruthlessly. I don’t waste time on email. If a person wants me to do something, there are ways to get my attention. One young consultant at an Internet research firm wrote me to participate in a survey. I wrote back, “No. I will now delete email from you and your company automatically.” End of problem from my point of view.

Distractions, therefore, provide a way to measure the intellectual and managerial skill of a worker. The best employees and colleagues know how to manage distractions. I like to think about Alexander, sitting in a stinking tent, somewhere east of modern Afghanistan and his ability to manage distractions. I can imagine his hearing, “The troops don’t have water” and “We don’t know where the enemy is” and “We don’t have enough fodder for the pack animals”. A New York Times writer observing this situation could easily report that Alexander is overwhelmed by yammering requests from his lieutenants, too many parchment or wax messages, and intrusions such as a raiding party intent on killing him. Alexander dealt reasonably well with distractions.

Is it possible that these squawks, cheeps, howls, and tweets about “the email problem” reveal the flaws of the individuals, not the problems of the messaging environment? Agree? Disagree? Let me know if you are not too distracted.

Update. Times of London introduces the notion of a “pond-skater mind” here. The fix may be to use other tools.

Stephen Arnold, June 15, 2008

Update 1: June 23, 2008 You may find “The Myth of Multitasking” by Christine Rosen germane. Writing in The New Atlantis, she summarizes the challenges of multi tasking. I particularly liked her use of the phrase “acquired inattention”. You can read the full essay here. Highly recommended.

Update 2: June 24, 2008 Ars Technica has a useful essay about multi tasking. J.M. Gitlin’s “The Boss Made Me Do It” is here.
