SurfRay, Dec 4 Update

December 4, 2008

A quick look at the Danish Company Registrar here turns up several SurfRay related listings. Using Google Translate, I thought one of the Mondosoft listings reported the firm as insolvent. Another uses the phrase ‘judicial winding’, which may suggest a legal process. If you have any information related to these listings, use the comments section of this Web log to keep me up to date. A happy quack to the reader who tipped me to this information.

Stephen Arnold, December 4, 2008

Google: More Aggressive Sales in DC

December 4, 2008

The Washington Post here revealed that Google wants to move the US government toward Google’s application platform. The newspaper and Google called this cloud computing, but the notion is that the US government can save money and improve its work flow by letting Google handle the computers and the software. Government professionals can just use the GOOG. “Google Goes to Washington, Gearing Up to Put Its Stamp on Government” by Kim Hart does a good job of reporting a push that has been increasing over the last 12 months. For me the most interesting portions of the article are those which provide data about Google’s reach; for example, the District of Columbia has more than 30,000 people using Google’s services. Another comment that caught my attention was:

Google’s foray into government business is not only a sign of the company’s expansion into other industries, it’s also a sign of the changes underway in Washington’s technology landscape. New firms are moving in, branching out and making deals, perhaps beginning to blur the line between the robust government contracting world and the consumer-minded firms that continue to take chances and thrive.

My take on this is slightly different, which is normal goose behavior. Specifically, Google’s office is busy because government agencies are contacting Google to talk about the firm’s products and services. In my experience, Google needs only to answer email and the phone. The firm’s magnetic pull is responsible for the strong uptake of Google Search Appliances, Google Maps, and other services. Other companies have to work much harder to get in to see top executives. Google is a big deal and the Googlers are viewed as minor celebrities.

As the ad business growth slows, Googzilla will become more of a disruptive force in the government sector and in the enterprise. Companies that once ignored the GOOG will have to adjust and fast.

Stephen Arnold, December 4, 2008

Cloud Downer

December 4, 2008

SAP, once desired by Microsoft, finds itself the baby in the super platform enterprise software game. SAP is trying to pump up revenues, deploy new on premises services, and make its investments in cloud computing pay off. Intelligent Enterprise has an interesting story about this juggling act here. (This is one of those wacky urls that could go dead at any moment.) The article “SAP Pays Price for SaaS Maturation” is by Rajan Chandras. For me the key point in the write up was:

SAP’s Business ByDesign, the newly introduced SaaS version for ERP, is “ready and done” and “the coolest app ever written,” according to Apotheker. Yet, he admits, it’s a bad time, financially, for doing a big market push — “hurting our margin, and hurting our stock,” is how he describes it

Blue-chip, white shoes consulting firm Forrester chipped in some insights for Mr. Chandras’ article. The BCWS outfit figured out that SAP was rethinking its commitment to SaaS or Software as a Service. The SAP financials make the plight of the German software company easy to grasp. Click here to see what I mean. The softening is evident in stock price, earnings, and soft inputs in the commentary about the company.

The SAP search system TREX is simply not pinging the radar of the people with whom I speak. In fact, some SAP customers are not up to speed on the system.

In my opinion, the shift from on premises software to cloud based services is going to impose some hefty penalties on companies like SAP. Cloud services offer cost savings to corporations that are struggling, often without success, to contain the information technology costs of on premises software. But if an on premises licensee jumps to the cloud, the incumbent vendor may suffer cannibalization of on premises revenue. SAP may sense the danger in a lousy financial climate and find itself unable or unwilling to push forward with its cloud initiative. SAP then becomes more vulnerable because cloud service vendors can try to poach SAP customers. In short, I think SAP is in a world in which the three other super platform enterprise software vendors will squeeze down on SAP. Cloud based vendors will push up on SAP. Unless SAP comes up with some viable options, R/3 may find itself as marginalized as TREX.

Stephen Arnold, December 4, 2008

Surprising YouTube.com Data Point

December 4, 2008

Yahoo News posted a story about YouTube’s crackdown on inappropriate content. You can read the Yahoo version of the AFP article here. For me the key data point in the story was this factoid: “YouTube… receives 13 hours of video from users every minute.” Since a reader groused about my extrapolating from Google numbers for which there is no verification, I will leave it to you, gentle reader, to calculate how much data Google gets every 24 hours. A very rough mental calculation performed by the addled goose suggests that this is one million videos every couple of weeks. Think this is too high? Divide my number in half. That is still a lot of digital video.
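For readers who want to run the numbers themselves, here is a minimal sketch of the arithmetic. The only figure taken from the story is the 13 hours per minute. The average video length is not in the article, so it is left as a parameter, and any value you plug in is your own assumption, not a YouTube statistic.

```python
# Figure quoted in the AFP story: 13 hours of video uploaded every minute.
HOURS_UPLOADED_PER_MINUTE = 13
MINUTES_PER_DAY = 24 * 60


def hours_uploaded_per_day(hours_per_minute: float = HOURS_UPLOADED_PER_MINUTE) -> float:
    """Hours of video received in 24 hours at a constant upload rate."""
    return hours_per_minute * MINUTES_PER_DAY


def videos_per_day(avg_minutes_per_video: float,
                   hours_per_minute: float = HOURS_UPLOADED_PER_MINUTE) -> float:
    """Rough video count per day.

    avg_minutes_per_video is an assumption supplied by the caller;
    the article gives no average length.
    """
    return hours_uploaded_per_day(hours_per_minute) * 60 / avg_minutes_per_video


daily_hours = hours_uploaded_per_day()
print(f"Hours of video per day: {daily_hours:,.0f}")
```

At the quoted rate the daily total works out to 18,720 hours of video. How many videos that represents depends entirely on the average length you assume.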

Stephen Arnold, December 4, 2008

Accelerating XML Parsing

December 4, 2008

I have received a number of comments about the high speed indexing referenced in the interview with Perfect Search. One reader asked me to call attention to the open source XML parser VTD-XML. The acronym means Virtual Token Descriptor for eXtensible Markup Language. The suite of open source software may not meet the needs of some content processing applications because the number of large documents imposes additional work for the developer. However, for database type and other types of records, the method can eliminate redundant parsing, which is computationally expensive. One reader sent me a link to a useful description of VTD-XML. Here are the links to this write up by James Zhang. The original series–“Index XML Documents with VTD-XML”–was published by SOA World Magazine, whose url is www.soa.sys-con.com. (Note: Sys-con has republished at least one of the articles from this Beyond Search Web log.) The explanation of the method is in five parts. The first section provides a general description and the last section spells out the performance improvements:

  1. How to turn the indexing capability on in your application here
  2. Part 2 here — Sample code
  3. Part 3 here — Sample code
  4. Part 4 here — A discussion of application scenarios
  5. Part 5 here — The benchmark table

The conclusion to the write up made this point:

It’s not uncommon that those overheads [redundant parsing of XML] account for 80%-90% or more of the total CPU cycles of running the application. VTD-XML obliterates those overheads since there’s not much overhead left to optimize. Using VTD-XML as a parser reduces XML parsing overhead by 5x-10x. Next VTD-XML’s incremental update uniquely eliminates the roundtrip overhead of updating XML. Moreover, this article shows VTD-XML’s innovative non-blocking, stateless XPath engine significantly outperforming Jaxen and Xalan. With the addition of the indexing capability, XML parsing has now become “optional.”

A happy quack to the reader who called the VTD-XML method to my attention.
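To make the "eliminate redundant parsing" idea concrete, here is a toy sketch of the general approach: scan a document once, keep only the offset and length of each value, and reuse those descriptors later instead of parsing again. This is not VTD-XML's API or its record format. The regular expression stands in for a real parser, and the names build_index and read_values are hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical, simplified descriptor: (offset, length) into the original bytes.
Descriptor = Tuple[int, int]


def build_index(doc: bytes, tag: bytes) -> List[Descriptor]:
    """Scan the document once and record where each <tag>...</tag> value sits.

    A regular expression is used only to keep the sketch short;
    it is not a substitute for a real XML parser.
    """
    pattern = re.compile(
        b"<" + re.escape(tag) + b">(.*?)</" + re.escape(tag) + b">",
        re.DOTALL,
    )
    return [(m.start(1), m.end(1) - m.start(1)) for m in pattern.finditer(doc)]


def read_values(doc: bytes, index: List[Descriptor]) -> List[str]:
    """Reuse saved descriptors: slice the buffer, with no second scan."""
    return [doc[off:off + length].decode("utf-8") for off, length in index]


doc = b"<records><name>alpha</name><name>beta</name></records>"
index = build_index(doc, b"name")   # done once; could be saved alongside the file
values = read_values(doc, index)    # later reads skip the scan entirely
print(index, values)
```

The saving comes from the second step: once the descriptors exist, getting at a value is a byte slice rather than a fresh pass over the markup.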

Stephen Arnold, December 4, 2008

Really MarkLogic

December 4, 2008

MarkLogic occupies an interesting market space. The company’s approach to information management appeals to publishers and other organizations wanting to convert wild content ponies into work horses. A reader alerted me to “RSuite Content Management System Based on XML Database”. You can read the full text of the story on IT.Enquirer here. Really Strategies, founded by former Reed Elsevier executives, implements “an enterprise scale publishing system” using MarkLogic’s technology. The article revealed that Audible.com uses the RSuite system to move content to Apple iTunes. For me the most significant comment in the article was:

Metadata in the system can be universal layered or contextual layered, which means you can either create metadata used across all content snippets, or only metadata that’s relevant for the current publication. Searching happens on a level where XML as well as PDF content is being run through, and queries—or rather their results—can be copied into new or existing articles as a sort of live “Smart content” snippets.

If you are not familiar with MarkLogic’s system, you can get more information here. What I noted when reading this article is that unlike traditional original equipment manufacturer arrangements in search and content processing, Really Strategies is quite up front about its building upon MarkLogic’s technology. I like knowing what’s under the hood of enterprise software systems.

Stephen Arnold, December 4, 2008

Surf Ray Rumblings and Questions

December 3, 2008

After a quiet period, my anonymous readers have provided me with a wide range of comments and “information” about SurfRay. I am not sure what to make of some of the inputs, so I want to address these issues as questions in the hope that one or two of Beyond Search’s readers might be able to provide links to information about the company. The most substantive comment appeared as a remark to one of my earlier SurfRay posts. The writer pointed out that there was strong demand for products like Mondosoft and Ontolica. I would agree. In my endnote on Thursday, December 5, 2008, I will mention the appetite in the 100 million licenses that make up the SharePoint market for products that address some of the shortcomings in SharePoint search. From my research for the first three editions of the now out of print Enterprise Search Report and my Gilbane study here, I thought the Mondosoft and Ontolica products offered some useful functionality. I don’t know if these two systems have been upgraded on a regular basis, however. If the remarks in the comments to my SurfRay posts are indicative of the market perception of these products, my hunch is that somewhere along the line either updates or information about the updates went off track.

The questions that appear to be more timely than annual software updates are:

  1. Has the Copenhagen, Denmark, office been closed?
  2. Are there employees working in some other locations on the SurfRay products?
  3. Are there financial issues involving bank or government officials and SurfRay?

Links to newspapers or other Web logs would be useful. Opinions are okay, but they are what they are.

Stephen Arnold, December 3, 2008

FastForward Search Blog on the Future of Blogs

December 3, 2008

I find sponsored Web logs fascinating. These quasi-promotional information services can be informative and quirky. Years ago, the Fast Search & Transfer company fired up the FastForward Web log to provide me and thousands of others with snippets of information about the Fast ESP user group conference. I asked about the user group focus a while back and learned that the FastForward Web log was reaching beyond that narrow focus.

That extension is quite evident. In fact, I recently read two posts here about the future of Web logs. One article was “The Uncertain Future of Blogging” by Jevon MacDonald; the other, “In 2010 What Will Replace Newspapers and Network TV?” I found the information in both interesting; the data about media in Mr. MacDonald’s piece found their way into my statistics file. Then I began to reflect on a sponsored Web log’s role in the future of media. Here’s my chain of reasoning:

  1. A company Web log morphs into a community Web log and the company that started the Web log is acquired by Microsoft. I have little doubt that financial support for the Web log will be available no matter what happens in the wide world of blogging in the months ahead.
  2. The future of media appears to be pretty grim with big companies embracing Web logs. Furthermore, the tools of blogging will now become powerful instruments in the hands of trained media professionals. If print newspapers can’t fly, the pilots will get a new airplane. That airplane may be blogging.
  3. Web log writers today have to innovate and shake blogging out of its doldrums. Big changes are coming fast.

I have oversimplified the arguments in these two posts, so you must read the original write ups. What troubles me is that I expect to read about search and content processing, not about the problems of newspapers and other media companies. I want to know about the method Microsoft Fast used to get a government installation in a Scandinavian company up and running to make its spotlight function work well. I want to know how Microsoft Fast will handle voice to text in media files. I want to know how Microsoft Fast will integrate with Dynamics’ information stores held in SQL Server tables. I want to know the status of the Microsoft Fast investigation underway in Norway and how to explain the issue to a contract officer who asks me for my view on the subject.

My opinion is that these search-centric topics are now out of bounds or out of information gas. I also think that the Web log is now a philosophical sounding board with a touch of consultant flummery added for color. To some, search is less exciting than thinking about the future of Web logs when more newspapers bite the dust. Not to me. I want to read about ESP.

I would be eager to read FastForward if it returned to its roots and presented more substantive information about Microsoft Fast search, content processing, and information technology. I may be too limited in my thinking, but a Web log anchored in Fast ESP should address topics germane to the software. But I’m an addled goose, easily confused by buzzwords like Enterprise 2.0 and analyses of the death of old media. What do you think? Should I reevaluate the FastForward blog?

Stephen Arnold, December 3, 2008

Open Source Search

December 3, 2008

A happy quack to the reader who called this write up about open source search to my attention. The author is Steve at the Web log Tek-Dev here. Steve has written a four-part series, and the information is quite useful.

Part One here is a quick run down of the core components of a search system. Part Two here takes a look at Lucene’s architecture and presents a chunk of the parsing code. Part Three here tackles crawling a Web site and adding to the Lucene index. Part Four here discusses the Lucene function to perform operations with the Lucene Index Reader and the Lucene Analyzer. Part Four contains useful code examples as well. You will want to download and retain this series.

Stephen Arnold, December 3, 2008

Search and Windows 7

December 3, 2008

Ars Technica has a good summary of the search features in Windows 7; that is, Vista, Service Pack 2. You can find Emil Protalinski’s “Search in Windows 7 Will Go Beyond Local”. The article is here. Ars Technica reports that the system performance is faster, but I don’t pay much attention to benchmarks. I want to know how a search performs against my test data. The major difference is that Windows 7 can search content on drives and repositories to which the user has access. For me the most important point in the write up was:

…developers can add Windows 7 compatible OpenSearch support to any existing searchable web application by adding RSS or ATOM output (the desktop client can then have a Search Connector for the service). SharePoint Search Server can also query these compatible OpenSearch services (as shown at PDC 2008).

The idea is that users can share searches with the help of a Microsoft Certified Professional. Other systems have had this ability for quite a while. But what crossed my mind is that Windows 7 is going to provide more search functionality than previous Microsoft embedded search systems have. What I want to watch is the way in which Microsoft hooks Windows 7 to SharePoint and SharePoint to Microsoft FAST ESP. Google has its approach to the enterprise, and I think Windows 7 shows one small part of Microsoft’s defensive tactics which include connectors to hook into proprietary content stores. The challenge for Microsoft will be to get its various systems to work with one another. In my experience, interoperability among Microsoft’s own applications remains a gold mine for Certified Professionals and the Redmond giant.
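The "adding RSS or ATOM output" step in the Ars Technica quote can be sketched roughly as follows. This is a minimal illustration that assumes a Web application which already has a list of hits in hand. The function name results_to_rss, the sample hits, and the intranet.example.com address are hypothetical, and the Search Connector description file that the Windows 7 desktop client would need is not shown.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def results_to_rss(query: str, results: List[Dict[str, str]], site_url: str) -> str:
    """Render search hits as a bare-bones RSS 2.0 feed.

    Each hit is a dict with 'title', 'link', and 'description' keys
    (an assumed shape, not one taken from the article).
    """
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Search results for {query}"
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Results from the site's existing search"

    for hit in results:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = hit["title"]
        ET.SubElement(item, "link").text = hit["link"]
        ET.SubElement(item, "description").text = hit["description"]

    return ET.tostring(rss, encoding="unicode")


# Hypothetical hits standing in for whatever the existing search returns.
sample_hits = [
    {"title": "Budget memo", "link": "http://intranet.example.com/docs/1",
     "description": "FY2009 budget notes"},
    {"title": "Budget slides", "link": "http://intranet.example.com/docs/2",
     "description": "Presentation deck"},
]
feed = results_to_rss("budget", sample_hits, "http://intranet.example.com/search")
print(feed)
```

The point of the sketch is that the search logic itself does not change. The application only needs one more output format wrapped around results it already produces.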

Stephen Arnold, December 2, 2008
