SAP CEO Revelation: Costs Are Bad

December 31, 2008

The only reason I pay any attention to SAP, the German software giant founded on IBM-type thinking, is that the company has a search system called TREX (few know much about this puppy), the company pumped several million into Endeca (a search vendor), and demonstrated an interest in hooking SAP into SharePoint for information access. I have to admit that there is not much information available about any of these search related activities. I do get quite a few news items pumped into my digital playpen about annoyed SAP customers and resellers. Based on my narrow view in Kentucky coal country, I think these news items underscore problems with bloated, expensive, time intensive middleware. Search is not much of a priority, if my reading of the news items is accurate. Imagine my surprise when I read Spiegel Online International’s interview with SAP co-founder Hasso Plattner here. The Spiegel article is in English, so maybe the translator muddied some of the crystal clear ideas. Herr Plattner is described as a “benefactor,” a surprising label for a fellow who can get millions from a client before the SAP system is up and running. Might be a translation issue? Several points warranted my jotting them down; to wit:

  1. No reference to search. I assume that its omission is related to the news hole into which the article had to squeeze.
  2. The present financial crisis is due to greed.
  3. Americans have minimal health insurance.
  4. Americans bought “massive cars”. (Some of these are German vehicles and some American companies bought massive, bloated, expensive middleware from German software companies I opined.)

For me the most significant comment was this whizzer:

The financial crisis was unbelievably quick to affect real business. Many companies are considering layoffs. Not even a success story like SAP can give a clear outlook for 2009.

SAP is a success story. I wonder if some of the stakeholders would agree with this view. Check out this interview. I wish information access, content processing, and search had been discussed. I think 2009 will be a challenging year for SAP and those who have a big position in the company. A London investment banker told me that he was certain SAP would not go away. I am not as confident.

Stephen Arnold, December 31, 2008

Duplicates and Deduplication

December 29, 2008

In 1962, I was in Dr. Daphne Swartz’s Biology 103 class. I still don’t recall how I ended up amidst the future doctors and pharmacists, but there I was sitting next to my nemesis Camille Berg. She and I competed to get the top grades in every class we shared. I recall that Miss Berg knew that there were five variations of twinning: three dizygotic and two monozygotic. I had just turned 17 and knew about the Doublemint Twins. I had some catching up to do.

Duplicates continue to appear in data just as the five types of twins did in Bio 103. I find it amusing to hear and read about software that performs deduplication; that is, the machine process of determining which item is identical to another. The simplest type of deduplication is to take a list of numbers and eliminate any that are identical. You probably encountered this type of task in your first programming class. Life gets a bit more tricky when the values are expressed in different ways; for example, a mixed list with binary, hexadecimal, and real numbers plus a few more interesting versions tossed in for good measure. Deduplication becomes a bit more complicated.
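To make the problem concrete, here is a minimal sketch in Python. It assumes the mixed values arrive as strings with conventional prefixes (0b for binary, 0x for hexadecimal); real data rarely labels itself so politely, which is where the fun begins.

```python
# Minimal sketch: deduplicating a list whose members are expressed in
# different bases. Assumes binary values carry a "0b" prefix and
# hexadecimal values an "0x" prefix; everything else parses as decimal.

def normalize(value: str) -> float:
    """Map each string to a canonical numeric value."""
    v = value.strip().lower()
    if v.startswith("0b"):
        return float(int(v, 2))
    if v.startswith("0x"):
        return float(int(v, 16))
    return float(v)  # handles integers and reals alike

def dedupe(values: list[str]) -> list[str]:
    """Keep the first spelling seen for each canonical value."""
    seen, kept = set(), []
    for v in values:
        key = normalize(v)
        if key not in seen:
            seen.add(key)
            kept.append(v)
    return kept

# "0x1a", "0b11010", and "26.0" are all the same number.
print(dedupe(["26", "0x1a", "0b11010", "26.0", "27"]))  # ['26', '27']
```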

At the other end of the scale, consider the challenge of examining two collections of electronic mail seized from a person of interest’s computers. There is the email from her laptop. And there is the email that resides on her desktop computer. Your job is to determine which emails are identical, prepare a single deduplicated list of those emails, generate a file of emails and attachments, and place the merged and deduplicated list on a system that will be used for eDiscovery.

Here are some of the challenges that you will face once you answer this question, “What’s a duplicate?” You have two allegedly identical emails and their attachments. One email is dated January 2, 2008; the other is dated January 3, 2008. You examine each email and find that the difference between the two emails is the inclusion of a single slide in one of the two PowerPoint decks. What do you conclude?

  1. The two emails are not identical, so include both emails and both attachments.
  2. The earlier email is the accurate one, so exclude the later email.
  3. The later email is the accurate one, so exclude the earlier email.

Now consider that you have 10 million emails to process. We have to go back to our definition of a duplicate and apply the rules for that duplicate to the collection of emails. If we get this wrong, there could be legal consequences. A system that generates a file of emails based solely on a mathematical determination that a record is different may be too crude for the problem in the context of eDiscovery. Math helps, but it is not likely to handle the onerous task of determining near matches and the reasoning required to decide which email is “the” email.
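Here is a minimal sketch of the exact-versus-near distinction. The Email fields and the “keep both” policy are illustrative assumptions, not a prescription for any real eDiscovery platform.

```python
# Sketch: exact duplicates hash identically; near duplicates need policy.
# All field names here are hypothetical.
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Email:
    subject: str
    body: str
    attachments: frozenset  # content digests of the attached files

def fingerprint(msg: Email) -> str:
    """Exact-duplicate key: identical subject, body, and attachments."""
    material = "\x00".join([msg.subject, msg.body] + sorted(msg.attachments))
    return hashlib.sha256(material.encode()).hexdigest()

def classify(a: Email, b: Email) -> str:
    if fingerprint(a) == fingerprint(b):
        return "exact duplicate"
    if a.subject == b.subject and a.body == b.body:
        # Same message text, different attachment sets -- e.g., one deck
        # gained a slide. Math alone cannot pick "the" email; a retention
        # rule or a human reviewer has to decide.
        return "near duplicate: keep both, escalate for review"
    return "distinct"

jan2 = Email("Q3 deck", "See attached.", frozenset({"sha1:aa", "sha1:bb"}))
jan3 = Email("Q3 deck", "See attached.", frozenset({"sha1:aa", "sha1:cc"}))
print(classify(jan2, jan3))  # near duplicate: keep both, escalate for review
```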


Which is Jill? Which is Jane? Parents keep both. Does data work like this? Source: http://celebritybabies.typepad.com/photos/uncategorized/2008/04/02/natalie_grant_twins.jpg

Here’s another situation. You are merging two files of credit card transactions. You have data from an IBM DB2 system and you have data from an Oracle system. The company wants to transform these data, deduplicate them, normalize them, and merge them to produce one master “clean” data table. No, you can’t Google for an offshore service bureau; you have to perform this task yourself. In my experience, the job is going to be tricky. Let me give you one example. You identify two records which agree in field name and data for a single row in Table A and Table B. But you notice that the telephone number varies by a single digit. Which is the correct telephone number? You do a quick spot check and find that half of the entries from Table B have this variant, or you can flip the analysis around and say that half of the entries in Table A vary from Table B. How do you determine which records are duplicates?
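A minimal sketch of that conflict follows. The field names and the one-digit telephone tolerance are assumptions chosen purely for illustration; a production data quality pipeline would need source-of-record rules, not a coin flip.

```python
# Sketch: two records agree except for one digit of the phone number.
# Field names are hypothetical.
import re

def normalize_phone(raw: str) -> str:
    return re.sub(r"\D", "", raw)  # keep digits only

def digit_distance(a: str, b: str) -> int:
    """Count differing digits between two normalized phone strings."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))

def compare(rec_a: dict, rec_b: dict) -> str:
    if rec_a == rec_b:
        return "duplicate"
    pa, pb = normalize_phone(rec_a["phone"]), normalize_phone(rec_b["phone"])
    if rec_a["name"] == rec_b["name"] and digit_distance(pa, pb) == 1:
        # Neither source is authoritative on its own. The honest output is
        # a conflict record for rule-based or human adjudication, not a
        # silent "winner".
        return "conflict: phone differs by one digit"
    return "distinct"

a = {"name": "J. Smith", "phone": "502-555-0142"}
b = {"name": "J. Smith", "phone": "502-555-0143"}
print(compare(a, b))  # conflict: phone differs by one digit
```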


Information 2009: Challenges and Trends

December 4, 2008

Before I was once again sent back to Kentucky by President Bush’s appointees, I recall sitting in a meeting when an administration official said, “We don’t know what we don’t know.” When we think about search, content processing, assisted navigation, and text mining, that catchphrase rings true.

Successes

But we are learning how to deliver some notable successes. Let me begin by highlighting several.

Paginas Amarillas is the leading online business directory in Colombia. The company has built a new system using technology from a search and content processing company called Intelligenx. Similar success stories can be identified for Autonomy, Coveo, Exalead, and ISYS Search Software. Exalead has deployed a successful logistics information system which has made customers’ and employees’ information lives easier. According to my sources, the company’s chief financial officer is pleased as well because certain time consuming tasks have been accelerated, which reduces operating costs. Autonomy has enjoyed similar success at the US Department of Energy.

Newcomers such as Attivio and Perfect Search also have satisfied customers. Open source companies can also point to notable successes; for example, Lemur Consulting’s use of Flax for a popular UK home furnishing Web site. In Web search, how many of you use Google? I can conclude that most of you are reasonably satisfied with ad-supported Web search.

Progress Evident

These companies underscore the progress that has been made in search and content processing. But there are some significant challenges. Let me mention several which trouble me.

These range from legal inquiries into financial improprieties at Fast Search & Transfer, now part of Microsoft, to open Web squabbles about the financial stability of a Danish company which owns Mondosoft, Ontolica, and Speed of Mind. Other companies have shut their doors; for example, Alexa Web search, Delphes, and Lycos Europe. One vendor in Los Angeles has had to slash its staff to three employees and take steps to sell the firm’s intellectual property, which rightly concerns some of the company’s clients.

User Concerns

Another warning may be found in the results from surveys such as the one I conducted for a US government agency in 2007, which found dissatisfaction with existing search systems in the 65 percent range. AIIM, a US trade group, reported slightly lower levels of dissatisfaction. Jane McConnell’s recently released study in Paris reports data in line with my findings. We need to be mindful that user expectations are changing in two different ways.

First, most people today know how to search with Google and get useful information most of the time. The fact that Google handles search for upwards of 65 percent of North American users and almost 75 percent of European Union users means that Google is the search system by which users measure other types of information access. Google’s influence has been essentially unchecked by meaningful competition for 10 years. In my Web log, I have invested some time in describing Microsoft’s cloud computing initiatives from 1999 to the present day.

For me and maybe many of you, Google has become an environmental factor, and it is disrupting, possibly warping, many information spaces, including search, content processing, data management, applications like word processing, mapping, and others.


Microsoft is working to counter Google, and its strategy is a combination of software and low adoption costs. I believe that Microsoft’s SharePoint has become the dominant content management, collaboration, and search platform with 100 million licenses in organizations. SharePoint, however, is not well understood; it is technically complex and a work in progress. Anyone who asserts that SharePoint is simple or easy is misrepresenting the system. Here’s a diagram from a Microsoft Certified Gold vendor in New Zealand. Simple this is not.

sharepoint-vendor-diagram


Search: Simplicity and Information Don’t Mix

December 1, 2008

In a conversation with a bright 30-something, I learned that a speaker had insisted that the Google Search Appliance was “simple and easy”. I asked the 30-something, “Did the speaker understand that information is inherently difficult, so search is not usually simple?”

The 30-something did not hesitate. “Google makes the difficult look easy.”

The potential search system customer might hear the word “simple” and interpret the word and its intent based on the listener’s experience, knowledge, and context. “Simple”, like transparency, is a word that covers a multitude of meanings.

My concern is that search has to deliver information to a user with a need for fact, opinion, example, or data. None of these notions is known to the software, electrical devices, and network systems without considerable technical work. Computers are generally pretty predictable. Smart software improves the gizmo, but the smarter software becomes, the less simple it is.

So, when a system like the Google Search Appliance or any search system for that matter is described as simple, I have questions. I don’t think the GSA is simple. The surface interface is simplified. The basic indexing engine is locked up and accessible via point and click interfaces or scripts that conform to the OneBox API. But anyone who has tried to cluster GSAs and integrate the system with proprietary file types knows that the word “simple” is pretty much wrong.

Now what about search becoming “simple and easy”?

Search seems simple because of the browser and the need to type some words in a search box or look at a list of links and click one. Search is not simple. I would go so far as to say that any system that purports to allow a user to access digital information is one of the most complex technical undertakings that engineers, programmers, and other specialists have attempted.

That’s why search is generally annoying to most of the people who have to use the systems.

Now let’s consider the notion of a “transparent search system.” I have to tell you that I don’t know why the word “transparency” has become a code word for “not secret”. When someone tells me that a company is transparent, I don’t believe them. A company cannot be transparent. Most outfits have secrets, market with ferocity first and facts second, and wheel and deal to the best of their ability. None of this “information” becomes available unless there’s a legal matter, a security breach, or a very careless executive.

Are search systems transparent? Nope. Consider Autonomy, Google, or any of the high profile vendors of information access systems. Google does not allow licensees to poke around the guts of the GSA. Autonomy keeps the inner workings of IDOL under wraps. I have heard one Autonomy wizard say, “Sometimes we need to get Mike Lynch to work some of his famous magic to resolve an issue.” I track about 350 companies in the search and content processing space. I make my living trying to figure out how these systems work. Sue Feldman and I wrote a 10-page paper about one small innovation that interests Google. Nothing about that innovation was transparent, nor was it “simple” I might add.

What’s Up?

I think that consultants and parvenus need an angle on search, content processing, text mining, and information access. Since search is pretty complicated, who can blame a young person with zero expertise for looking at the shopping list of issues addressed in Successful Enterprise Search Management and deciding to go the “simple” route?

I understand this. I worked at a nuclear consulting firm for a number of years. I always thought I was pretty good in math, physics, and programming (if the type of programming done in 1971 could be considered sophisticated). Was I wrong? I was so wrong it took me one year to understand that I knew zero about the recent work in nuclear physics. By the end of the second year, I had a new appreciation for the role of Monte Carlo calculations in nuclear fuel rod placement. For example, you don’t inspect nuclear rods in an online reactor. You would have some health problems. So you used math, and you needed to be confident that when you moved those bundles of nuclear fuel around, you got the used up ones where they were supposed to go. Forget the modest health problem. The issue would be a tad more severe.
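For readers who never sat through the lecture, here is a toy illustration of the Monte Carlo principle, nothing more: when you cannot open the system to inspect it, you sample a model of it many times. This sketch estimates pi by sampling; it is emphatically not reactor physics.

```python
# Toy Monte Carlo: estimate pi by sampling random points in the unit
# square and counting how many land inside the quarter circle.
import random

def quarter_circle_hit_rate(trials: int = 100_000) -> float:
    hits = 0
    for _ in range(trials):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits / trials  # converges to pi/4 as trials grow

print(4 * quarter_circle_hit_rate())  # roughly 3.14
```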

Search shares some complexity with nuclear physics. The essence of search today is hugely complex subsystems that must perform so the overall system works. Okay, that applies to a nuclear reactor too. You can’t really inspect what’s going on because there are too many data points. Yep, that’s similar to the need to know what’s happening in a reactor using math and models. A search system can exhibit issues that are tough to track down because no one human knows where a particular glitch may touch another function and cause a crash. Again, just like a nuclear reactor. Those control rooms you see in the films are complicated beasties for a reason. No one really knows exactly what is happening to cause an issue locally or remotely in the system.

Now who wants to say, “Nuclear engineering is simple”? I don’t see too many people stepping forward. In fact, I think that most people know enough not to offer an opinion when it comes to nuclear engineering and the other disciplines required to keep the local power generation plant benign.

I can’t say the same for search. Search is popular, and it has attracted a lot of people who want to make money, be famous like a rock star, or who know one way to beat the financial downturn is to cook up an interesting spin on a hot topic. I congratulate these people, but I think the likelihood of their creating trouble is going to be quite high.

I have learned in my 65 years one thing:

What looks simple isn’t.

Try to do what a professional does. You probably won’t be able to do it. Whether the skill is physical or intellectual, if you haven’t done the time, you can’t equal the professionals. Period.

At a conference, a speaker mentioned that for a person to become accomplished, the individual has to work at a particular skill or task for 10,000 hours. I know quite a few people who have spent 10,000 or more hours working on search. I wrote a book with one of these people, Martin White. I am a partner with another, Miles Kehoe. I know maybe 50 other people in the same select group. Most of the consultants and experts I meet are not experts in search. These people are expert at being friendly or selling. Those are great competencies, but they are not search related.

If you have read a few of my previous posts in this Web log, you know that any search or content processing system described as “simple” or “easy” is most definitely not either. Search is complicated. Marketing and sales “professionals” routinely go to meetings and say, “Search is simple. Our system is completely open. Your own technical team can maintain the system.” In most cases, I don’t believe the pitch.

That’s why the majority of users are annoyed with search in an organization. And why most of the search systems end up in quite a pickle. See the upside-down and backwards engine in the picture below. How did this happen? I haven’t a clue, and that is how I react when I see a crazy search and information access system at an organization.

Let me give you an example. A large not for profit and government subsidized think tank had the following search systems: Microsoft SharePoint, Open Text, multiple Google Search Appliances, and a couple of legacy systems I had not encountered for a decade. Now the outfit wants to provide a single interface to the content processed by this grab bag of systems. What makes this tough is that any one of these systems could be used to provide this access. The organization did not know how to do this and wanted to buy a new system to deliver the functionality. Crazy. What the outfit now has is another search system, and the problem is just more complicated. The “real fix” required thinking about the needs of the users and performing the intensive information audit needed to determine the scale of the project. This type of “grunt work” was not desirable. The person describing this situation to me said, “We want a simple solution.”

I am sure they do. I want to be 18 again and this time I want to look like Brad Pitt, not some troll from the catacombs in Paris. Won’t happen.


How did we get our search system in this predicament?

Three Types of Simple Search

Let me give you three examples:

  1. Boil the ocean easy. Some vendors pitch a platform. The idea is that a licensee plugs in information connectors, the system processes the content, and the user gets answers. Guano. In fact, double guano. This approach is managerially, technically, and financially complex. Boil the ocean solutions are the core reason why such outfits as IBM, Microsoft, Oracle, and SAP give away search. By wrapping complexity inside of complexity, the fees just keep rolling in. The multi-month or multi-year deployment cycles guarantee that the staff responsible for this solution will have moved on. Search in most boil the ocean solutions only works for some of the users.
  2. Buy ’em all. Use Web services to hook ’em up easy. Quite a few vendors take this approach. The verbal acrobatics of “federated search” or “metasearch” gloss over very real problems: acquiring disparate content without choking the network, spending a fortune on repository infrastructure, and transforming the content to a common representation. These problems are happily ignored or marginalized (see the sketch after this list). Unfortunately, federated solutions require investment, planning, and building. I wish I had a dollar for every time I have heard a vendor struggling to make significant sales say the words “federated” and “easy” in the same sentence.
  3. Unpack it, plug it in, and just search easy. This argument is now coming from vendors who ship search appliances and from vendors who ship software described as appliances. Hello, earth. Are you sentient? Plugging in an appliance delivers one thing: toast. These gizmos have to be set up. You have to plan for failure, which means two gizmos and maybe clusters of gizmos. In case you haven’t tried to create hot spares and failover search systems, the work is not easy. And you haven’t tackled the problem of acquiring, transforming, and processing the content. You haven’t fiddled with the interface that marketing absolutely has to have or the MBAs throw a hissy fit. Get real. When a modern appliance breaks, you don’t fix it. You buy another one. You don’t open a black box iPod or BlackBerry and repair it. You get a new one. The same applies to search. What’s “easy” is the action you take when the system doesn’t work.
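To show what the “easy” pitch in item two glosses over, here is a deliberately naive federated search in Python. The two engines, their field names, and their scores are invented; the comments mark where the real engineering hides.

```python
# Deliberately naive federation of two hypothetical engines.

def search_engine_a(query: str) -> list[dict]:
    return [{"id": "a-1", "title": "Q3 report", "score": 0.91}]

def search_engine_b(query: str) -> list[dict]:
    return [{"id": "b-7", "title": "Q3 report", "score": 4.2}]  # different scale!

def federated_search(query: str) -> list[dict]:
    results = search_engine_a(query) + search_engine_b(query)
    # Problem 1: the scores come from different engines on different
    # scales, so this global sort is close to meaningless.
    # Problem 2: the same document surfaces from both engines under
    # different ids; matching on title is a crude dedup heuristic.
    # Problem 3: nothing here handles security trimming, timeouts, or a
    # source that is slow or down -- which is where the money goes.
    seen, merged = set(), []
    for r in sorted(results, key=lambda r: r["score"], reverse=True):
        if r["title"] not in seen:
            seen.add(r["title"])
            merged.append(r)
    return merged

print(federated_search("quarterly results"))
```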


Microsoft and Pricing

November 19, 2008

I saw a news story in Seattle Tech Report here that Microsoft is making its OneCare security service free. A short time later I came across Microsoft’s own news release about this pricing change here. Bundling or giving away services free is not a new idea in software. The notion of giving customers a taste and then selling them more has worked many times. In the Microsoft news release, the company says:

Windows Live OneCare will continue to be sold for Windows XP and Windows Vista at retail through June 30, 2009. Direct sales of OneCare will be gradually phased out when “Morro” becomes available. Regardless of their method of purchase, Microsoft will ensure that all current customers remain protected through the life of their subscriptions.

The marketing technique is little more than shareware or freeware with a catch.

Then I remembered that Microsoft was reducing prices for its Dynamics products. The prices for its cloud services for Exchange and SharePoint were quite competitive as well. Even the Zune, according to CNET News, is getting new features and a lower price. You can read “Microsoft Chopping Zune Prices” here.

The question I asked myself was, “Will Microsoft’s price cutting and no fee initiatives extend to Microsoft Fast enterprise search?” My hunch is that the Fast ESP search technology may become more affordable in the months ahead. Here’s my reasoning:

  • A number of high profile vendors have rolled out more robust content processing solutions that “snap in” to SharePoint. Examples range from Autonomy to Coveo to Exalead to Interse to ISYS to dozens of other vendors. Companies that want to “work around” SharePoint search problems have an abundance of options. Microsoft Fast may have to use severe price cuts to keep customers from getting out of the corral.
  • As the economic noose tightens on organizations, some vendors may offer a two-fer deal; that is, sign up now, get one year free and pay only for the second year. This approach may be quite appealing in some organizations. In fact, in a recent review of Google prices for the US government, one could easily conclude that Google is keeping this option available to its resellers. The idea is to get shelf space or the camel’s nose into the tent.
  • New players may be willing to install a proof of concept for little or no money. These upstarts may provide “good enough” solutions that allow an organization to solve a tough content processing problem without spending much money.

I see the present economic climate forcing some Darwinian actions, and Microsoft Fast may have to move quickly or face escalating competition within the Microsoft ecosystem. After spending $1.2 billion for a Web part and a police raid, Redmond may have to consider some strategic pricing changes to adapt to the present enterprise market for search and content processing. If you are a Microsoft champion, please, help me understand if my analysis is on track or off track. Use the comments section and bring along some facts, please. I have enough uninformed inputs from my pals Barry and Cyrus to last the winter.

Stephen Arnold, November 19, 2008

Search Security: An Oxymoron

November 18, 2008

ZDNet runs a nifty feature called IT Facts. Every day, the ZD editors and writers post interesting statistics. Now if you have a knack for numbers, you can make those equations sit up, roll over, and play dead. Despite the “flexibility” of statistical methods, I found “42% of Organizations Reported Unauthorized Access to Their Active Directory” noteworthy. If you don’t know what an AD or Active Directory is, click here and drink deeply of the information. If you are in a hurry, it’s a gizmo Microsoft cooked up to make security quick and easy for certified Microsoft engineers to set up. Osterman Research found security loopholes in two fifths of the sample’s servers. Think about third party search systems that “look” at AD and use it to manage user access to search. In my opinion, problems with even one percent of the accesses warrant concern on my part. I know folks are rushing to SharePoint, but could Active Directory be this bad? I hope not. My hunch is that anyone with a Microsoft AD will want to do some checking.
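For the curious, here is a bare-bones sketch of directory-based security trimming, the mechanism a third party search system relies on when it “looks” at AD. The group names and document ACLs are invented; the point is that the trimming is only as trustworthy as the directory behind it.

```python
# Sketch: filter search hits by the intersection of the user's directory
# groups and each document's ACL. All names are hypothetical.

DOC_ACLS = {
    "salary_review.xlsx": {"HR-Admins"},
    "press_release.docx": {"All-Staff"},
}

def user_groups(user: str) -> set:
    """Stand-in for an Active Directory group lookup."""
    directory = {"mallory": {"All-Staff", "HR-Admins"}}  # a tampered entry?
    return directory.get(user, set())

def trim(user: str, hits: list) -> list:
    groups = user_groups(user)
    # If the directory was modified without authorization, this code
    # faithfully enforces the wrong answer.
    return [doc for doc in hits if DOC_ACLS.get(doc, set()) & groups]

print(trim("mallory", ["salary_review.xlsx", "press_release.docx"]))
```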

Stephen Arnold, November 18, 2008

Azure as Manhattan Project

November 3, 2008

I usually find myself in agreement with Dan Farber’s analyses. I generally agree with his “Microsoft’s Manhattan Project” write up here. Please, read his article, because I can be more skeptical than he is about Microsoft’s ability to follow through with some of its technical assertions. It is easy for a Microsoft executive to say that software will perform a function. It is quite another thing to ship software that actually delivers. Mr. Farber is inclined to see Microsoft’s statements and demos about Microsoft Azure as commitment. He wrote:

Microsoft’s cloud computing efforts have gotten off to a slow start compared with competitors, and it’s on the scale of a Manhattan Project for Windows. Azure is in pre-beta and who knows how it will turn out or whether consumers and companies will adopt it with enough volume to keep Microsoft’s business model and market share intact. But there is no turning back and Microsoft has finally legitimized Office in the cloud.

My take is similar, but there is an important difference between what Microsoft is setting out to do and what Google and Salesforce.com, among others, have done. Specifically, Google and Salesforce.com have developed new applications to run in a cloud environment. Google has many innovations, including MapReduce, and Salesforce.com has its multi-tenant architecture.
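As a concrete contrast, here is a toy word count in the MapReduce style. It is a caricature, not Google’s implementation, but it shows the shape of an application designed from day one to be split across many cheap machines rather than ported to the cloud after the fact.

```python
# Toy MapReduce-style word count. The "shuffle" between phases is
# implicit in this single-process caricature.
from collections import defaultdict

def map_phase(document: str) -> list:
    return [(word, 1) for word in document.lower().split()]

def reduce_phase(pairs: list) -> dict:
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["the cloud the", "cloud native code"]
pairs = [p for d in docs for p in map_phase(d)]
print(reduce_phase(pairs))  # {'the': 2, 'cloud': 2, 'native': 1, 'code': 1}
```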

Microsoft’s effort will, in part, involve moving existing applications to the cloud. I think this is going to be an interesting exercise. Some of the applications targeted for the cloud, like SharePoint, have their share of problems. Other applications do not integrate well in on premises locations, so those hiccups have to be calmed.

The big difference between Azure and what Google and other Microsoft competitors are doing may make Microsoft’s job more difficult than starting from ground zero. Unfortunately, time is not on Microsoft’s side. Microsoft also has the friction imposed by the bureaucracy of a $60 billion company. Agility and complexity may combine to pose some big challenges for the Azure Manhattan Project. The Manhattan Project was complex but focused on one thing. Microsoft’s Azure by definition has to focus on protecting legacy applications, annuity revenue, and existing functions in a new environment. That’s a big and possibly impossible job to get right on a timeline of a year and a half.

Stephen Arnold, November 3, 2008

Microsoft and Search in a Time Warp

November 2, 2008

In grade school in Illinois, I recall learning the meaning of the word anachronistic. For some reason, the explanation has stuck with me for more than 55 years. The teacher, Miss Chessman, told the class, “Ancient Greeks did not have an alarm clock.” The idea is that if you mix up when things occur, you run the risk of creating the equivalent of a dog’s breakfast.

My newsreader delivered CIO Magazine’s “Search Will Outshine KM” by Mike Altendorf. I don’t know Mr. Altendorf, and I have to admit that I disagree with a couple of the points he makes in this two part article, which you can read here. I am a tired goose, and I don’t want to trigger a squabble in the barn yard. I do want to point out where Mr. Altendorf and I part company.

First, the notion of search outshining KM is not something I have thought much about. KM is mostly baloney, one of those “trends” that promise much and deliver little that one can measure. When I watch intelligent people leaving one company for another, no software system captures what that person knows. IBM is trying to prevent a chip designer from leaving Big Blue to join Apple. If KM worked, IBM wouldn’t take such extreme action to prevent a person from changing jobs. That’s KM for you. It doesn’t deliver. And search? It is, in general, not too helpful either. In fact, search is one of the few software systems to engender a dissatisfaction rate among its users of 60 to 70 percent. In my opinion, search outshining KM is a silly assertion, and one that makes it seem that one lousy system can deliver information better than another lousy system. Both search and KM work best when applied to specific problems and bounded by realistic expectations and budgets.


Second, the reference to Microsoft’s acquisition of Fast Search & Transfer and Powerset is startling. First, Mr. Altendorf makes no reference to the police action in Oslo that threatens to undermine the credibility of the Fast Search technology, finances, and executives. Second, Powerset is not a complement to the Fast Search technology for two reasons: [a] Powerset technology is part of the Live.com service and has not, to my knowledge, been hooked to the Fast Search system at this time, and [b] the notion that Microsoft can “tie together” disparate technologies is out of touch with reality. Let me be clear. Microsoft has compatibility issues within its own product families; specifically, SharePoint and the Dynamics range of software. When you toss in the Fast Search conglomeration of original code, the bits of open source Fast Search has used in its system, and the technology from the acquisitions Fast Search made prior to its purchase by Microsoft, you have quite a bit of integrating to do. Now add the Powerset original code with the licensed technology from Xerox PARC, and you have even more work. Microsoft’s units can’t make the ribbon interface consistent across Outlook, Word, and Visio in Office 2007. Mr. Altendorf blithely reassures me that Microsoft can work out these incongruities. I beg to differ.


Microsoft Azure

October 28, 2008

The most useful write up about Microsoft’s cloud computing play is Mary Jo Foley’s. You can find “Microsoft’s Azure Cloud Platform: A Guide for the Perplexed” here. Her approach is to describe the layers of Azure, highlighting important components like Red Dog, the base operating system. Please, read her write up. It’s an excellent summary. On the other hand, Azure might be a big demo. Click here for this view. The Microsoft Azure splash page is here.

The questions that I have about price, licensing, service level agreements, and deployment data remain unanswered. I watched a couple of videos today, but the Microsoft engineers were too cheerful for me. I tuned the programs out, but I do recall the word “great” being used several times. The layers are not surprising at all. The engineering details about resolving bottlenecks, eliminating manual tasks by moving them to smart software, and getting away from expensive, high performance data center gear are lacking. I remain baffled about SharePoint search running from Azure. In my experience, performance is a challenge even when SharePoint runs locally, has resources, and has been tuned to the content. A generic SharePoint running from the cloud seems like an invitation to speeds similar to my Hayes 9600 baud dial-up modem. I am taking a wait-and-see approach. Clouds are wonderful as long as the user has bandwidth, the cloud does not crash, and software problems don’t make an application sit and wait while the operating system tries to figure out what to do when an unexpected event occurs. Some of the engineering issues are described in the Monsoon paper by Albert Greenberg, et al., which is available from the ACM as 978-1-60558-181-1/08/08. Azure has some interesting engineering short cuts baked into it if this paper, “Towards a Next Generation Data Center Architecture: Scalability and Commoditization,” is accurate.

Stephen Arnold, October 28, 2008

Search Meltdown at the Digital Studio 54

October 8, 2008

After a whirlwind series of meetings outside the U.S., I picked up information about growing problems in the enterprise search sector. Not surprisingly, my newsreader offered a trigger for this Web log post. Navigate to Dot.Life, one of the bewildering array of Web logs from the BBC, here. Rory Cellan-Jones, whom I have never met, wrote “Technology – The Party Really Is Over” on October 8, 2008. For me the most interesting point was this comment:

And, almost unnoticed, technology companies have been sucked into this vortex of gloom…. Autonomy, which specializes in search technology for big businesses, has recently entered the FTSE 100. But over the last week its shares have been tumbling as rapidly as the index as a whole. They started above £10, and last time I looked they were around £7.60. This despite a trading update in which the chief executive said Autonomy expected its third quarter results to be “significantly ahead of expectations”. The market has decided that the big enterprises which are Autonomy’s customers will be trimming their spending too.

Let’s step away from Autonomy and think about the mood at the Enterprise Search Summit held a fortnight ago in San Jose, California. Here are several “economic” observations that provide some color for my observations about what’s coming in search, content processing, and text mining:

  1. I learned of several executive shake-ups. These have not been announced, but when heads roll at a major conference, I sense that sales performance may be triggering the game of musical chairs. As I visited vendor stands, I had to ask, “Didn’t you work at X?” Old faces in new booths were common enough for me to notice. (I was jet lagged and bored, so it took some effort to light up my radar.)
  2. I heard that one vendor pulled out at the last minute, deciding that waiting for prospects to walk by its booth was less effective than making direct sales calls or working the email marketing system. I wondered why there were several yawning gaps amidst the exhibitors. Priorities or cash may have been the deciding factor.
  3. My hallway conversations with PR folks left me with a sense that this is not a good time to be in the buzz business. Several PR wizards told me that their clients wanted to get coverage on the hot Web logs. Too bad this Web log — Beyond Search — has only two or three readers. Corporate honchos and honchettes want their firm’s solutions on the major news aggregation sites and showing up on StumbleUpon.com and other trend makers. One person told me, “Web PR is tough.”
  4. Talk about Mercado and SurfRay suggested that both firms were trying to dog paddle in rough seas. The Mercado news, which I reported on this Web log, if true, may be ominous. SurfRay is best known for its Mondosoft SharePoint solution, and the financial reports from Denmark just arrived. Those documents may shed some light on that firm’s health, but two people asked me what I knew about the company.

Is the party really over? No, I don’t think so. There will be some severe dislocations and realignments. But in the search and closely related markets, the task of selling a complex system means long sales cycles. With cash drying up, the sluggish sales and executive churn are symptoms of a disease that has been spreading silently for a long time. The good news is that once the system adapts, new opportunities will poke their noses from behind the CFOs’ locked doors.

Investment firms with money in search and content processing firms will demand more from their stable of search stallions. I think more focus will be brought to the sales process. A good example will be search firms who deliver solutions that work without the customary three to six month set-up period. Units with specific problems will license solutions from firms who can deliver a system that works, not a bunch of jargon about intelligent systems, latent semantic indexing, and automatic taxonomy generation in real time.

The financial downturn will motivate customers to demand results and more for whatever the customer pays a vendor. The vendors will be working in a world “red in tooth and claw”. I for one will be delighted to cull down the list of 350 to 400 vendors who assert that their firm offers “enterprise search”. I don’t know what enterprise search means, and if some vendors are finding customers unwilling to write checks for their systems, I submit respectfully that the customers on a budget want to buy something that delivers a pay off, can be explained to its users, and solves a specific problem.

For these firms — what I call the vendors to watch in the Gilbane study — the party may just be beginning for a select few. For those who make the cut, the old world of Studio 54 will keep on trucking.

Stephen Arnold, October 8, 2008
