SharePoint: Improving Performance
September 23, 2008
In my opinion, SharePoint is a slow poke. Among the reasons:
- SQL Server bottlenecks
- My old pal IIS
- Churning when complex pages experience latency because needed data are scattered far and wide across the SharePoint landscape.
In what has to be the most amazing description of sluggish performance, Microsoft has released SharePoint Performance Optimization: How Microsoft IT Increases Availability and Decreases Rendering Time of SharePoint Sites. This is a 27-page Word document, which I was able to download here.
I scanned the white paper. I did not dig through it. The good stuff appears after the boilerplate about how to find out what part of the SharePoint system is the problem. In my experience, it’s not “one part”. Performance issues arise when there are lots of users, complex “sites”, and when some of the other required servers are tossed into the stew.
A happy quack to Nick MacKechnie who pointed to this Microsoft white paper in his Web log here.
Stephen Arnold, September 23, 2008
VideoSurf: Video Metasearch
September 23, 2008
I received an invitation to preview VideoSurf, a video metasearch provider, based in San Mateo, California. I tested the system whilst recovering from my wonderful Northwest Airlines flight from Europe to the US of A. When I fired up my laptop with the high speed Verizon service, I couldn’t get the video to run. When I switched to a high speed connection in my office, the search results were snappy and the videos I viewed ran without a hitch. Nice high speed network, Verizon.
The system offers a number of useful features:
- When I misspelled Google, the system offered a “did you mean” to fix up my lousy typing
- A handy checkbox in the left hand column allowed me to exclude certain video sites from the query. I noticed that the “world’s largest video search engine” Blinkx was not included.
- There’s a porn and no porn filter, which you can use to turn on porn. However, when I ran my test query “teen dancing” on the non-porn setting, I got some pretty exciting videos in my result set. I was too tired to watch more than a few seconds of gyrations to conclude that the non porn filter needs some fine tuning.
VideoSurf analyzes the contents of video. Most video search engines work with metadata and closed caption information. Googzilla, not surprisingly, has introduced its own technology to index the audio content of files. For now, I thought VideoSurf was useful for general purpose queries. I did not find it as helpful for locating Google lectures at universities or for pinpointing presentations given at various Microsoft events. But it’s early days for the service.
This is what I saw when I ran my test query “Bill Gates”.
The company says here:
VideoSurf has created a better way for users to search, discover and watch online videos. Using a unique combination of new computer vision and fast computation methods, VideoSurf has taught computers to “see” inside videos to find content in a fast, efficient, and scalable way. Basing its search on visual identification, rather than text only, VideoSurf’s computer vision video search engine provides more relevant results and a better experience to let users find and discover the videos they really want to watch. With over 10 billion (and rapidly growing!) visual moments indexed from videos found across the web, VideoSurf allows consumers to visually navigate through their results to easily find the specific scenes, people or moments they most want to see. Users can now spend less time searching and more time being entertained! VideoSurf was founded in 2006 by leading experts in search, computer vision and fast computation technology and aims to become the destination for users looking to find, discover and watch online videos. The company is based in San Mateo, California.
The company was founded by Lior Delgo of FareChase.com fame. The technical honcho is Achi Brandt, who is a certified math whiz. The rest of the company’s management team is here.
The service merits a closer look.
Stephen Arnold, September 23, 2008
Autonomy: Compliance Initiative
September 23, 2008
Autonomy bought Zantaz in July 2007 for $375 million. The company continues to enrich its compliance line of services. For example, Autonomy has been quick to roll out services that need information management, search, and content processing. Examples include the firm’s Zantaz bundle described here in April 2008, and its recent compliance with the UK’s FSA Conduct of Business Sourcebook (COBS) requirements. Competitors in the search, content processing, and records management markets will want to pay close attention to what Autonomy is doing. I’ve been convinced for several years that Autonomy is one of the quickest reacting search vendors. New opportunities appear in Autonomy’s marketing collateral and news releases with greater precision than in the mid range consultants’ reports about industry trends. Autonomy has a nose for trends and beats many of its competitors to these markets.
As I was thinking about Autonomy, I recalled an article that appeared in Silicon Valley Watcher in April 2008. I was able to locate a copy of that article here. Written by Tom Foremski, the write up had the zippy title “A Policeman Inside Your Computer and Inside Your Corporate Blog. Autonomy Releases Software that Flags Illegal Communications and Other Corporate Content.” For me, the most interesting comment in the article was:
There are some good and bad aspects to this software. The bad is a big brother type use for it…It could be used to restrict blogging. A lot of people tell me that large corporations are scared of blogs violating a regulation and so every corporate blog entry has to be run through lawyers– it has to be “lawyered.” This can take time, days, even weeks. Paradoxically, I think AIG could be used to clear a blog post in real-time and could thus increase the amount of good, legal information that company workers can share in public. Either way, it automates some of the tasks of a lawyer…. Less lawyering, means lower operating costs, which maximize share holder value, and that’s what corporate officers are required to do.
With the great concern about Google I heard in my various meetings in Europe last week, I was surprised that most of those Google critics were blissfully ignorant of vendors such as Autonomy who have robust monitoring tools available and in use. I suppose the difference is that an organization can monitor in order to comply with regulations. In the next month or so, I want to profile some of the companies with content monitoring systems. I will pick a handful of representative companies. Google’s not the only game in town, not by a long shot.
Stephen Arnold, September 22, 2008
Cognition’s Semantic Map
September 22, 2008
I profiled Cognition Technologies in my April 2008 “Beyond Search” report for the Gilbane Group here. I can’t reproduce the profile in my Web log, but you can find out about Cognition by reading the information on the company’s Web site. My take on the firm was that it was working to tame the semantic beast that is prowling around many procurement team meetings. The company has released a knowledge base that “teaches computers the meanings behind words.” You can read more about the semantic map in the RawStory.com article “Computers Figuring Out What Words Mean” here. Cognition has, according to RawStory, licensed the map to LexisNexis, one of the early entrants in online for-fee content access. If you are in the market for a semantic map, check out Cognition’s new offering. My view of semantic technology is that Google seems to be ideally positioned to become the Semantic Web. I provided details behind this assertion in the 2007 report I did for Bear Stearns before it went down in flames earlier this year. Google has quite a few of its Googley souls laboring in the semantic vineyard. As a result, the semantic efforts of smaller companies and larger outfits like Microsoft have to make significant progress and fast. Cognition’s Web site is here.
Stephen Arnold, September 22, 2008
Business Intelligence: Getting Smarter in a Class with Some Lousy Students
September 22, 2008
Business intelligence sounds more up town than search. Analytics resonates with quantitative goodness. Most employees look back on their classes in mathematics with a combination of nostalgia as in “I wish I would have taken more math” and horror as in “I hated Miss Blackburn’s algebra class”. I did a job for a major university to answer the question, “Can we be number one in computer science?” The answer was, “No.” There were not many math majors who planned on working in the US once the sheepskins were handed out. It’s tough to rise to the top when your future endowment funding sources are working in Wuhan or Mumbai. Loyalties and money may go to the local high school where the math wizards’ genius was first recognized and cultivated.
I find it amusing that search vendors are rushing to become players in the business intelligence arena. Now established business intelligence companies are encouraging the running of the bull-oney. SPSS, SAS, Cognos, and Business Objects have learned to love text because their customers demanded that structured and unstructured data be mined for insights. Comments on warranty cards, in emails, or in voice calls to a help desk do yield useful information, and ignoring them is a mistake. Some companies learn what customers loathe and then don’t fix the problem. Called your mobile provider lately? How about your bank, assuming it’s still in business? See what I mean?
When I read a good analysis of how business intelligence vendors are getting smarter, I learn something about how the market perceives business intelligence. But I wonder why these analyses don’t dig into the deeper issues associated with vendors who reinvent themselves in order to make sales. I’m not sure the product innovation is of the same quality as the marketing collateral. In short, vendors talk a good game, but the delivery remains much the way it always has. Math and programming people have to be taught the system. The business intelligence system is then set up with rules spelled out. The biggest change is that the traditional method is too expensive, so companies want short cuts to business intelligence goodness. Enter the search and content processing vendor. The idea is simple: index content and convert a user’s query to a form that generates a report. Now will the report have the same concern with the niceties and nuances of hand crafted statistical instructions operating on a well formed data cube? Maybe? But the new approaches are a heck of a lot easier, faster, and cheaper. Licensees are asked to conclude, “You get all three with our new system.”
Take a gander at the well written “Business Intelligence Gets Smart” published on September 5, 2008, by Intelligent Enterprise’s Doug Henschen here. You will have to put up with an annoying ad flop over, but the content is worth the annoyance. The key point of the write up is that business intelligence “improves business performance.” Most search and content processing systems don’t generate a hard return on investment. Business intelligence, according to the Information Week Research Business Intelligence Survey cited by Mr. Henschen, does. That’s good news, and it encourages vendors with non-ROI systems to repackage these products as bottom line centric solutions. For me, the most important parts of this write up were the charts and graphs. Mr. Henschen does a good job of pulling together the numbers that help put business intelligence in context.
I would like to offer several observations and, of course, invite comment:
- Business intelligence remains a complicated area, and it does not lend itself to facile solutions.
- Most business intelligence systems require that content be transformed, then processed, and finally analyzed. If the content processing goes off track, the fix can be time consuming and expensive. BI systems, like search and content processing systems, can experience cost overruns because the assumptions about the source information were wrong or shallow.
- Business intelligence, even when implemented with some of the search centric solutions on the market like Endeca’s Latitude, requires a math or programming wizard to configure the systems.
Quite a few search and text analytics companies are asserting that “we do business intelligence”. The statement is both true and false. In order to avoid coming down on the false side of the statement, short cuts should be avoided. Implementing business intelligence is similar to Miss Blackburn’s algebra class. It’s demanding, a great deal of work, and usually disliked by those without the appetite or the aptitude for the tasks.
Stephen Arnold, September 22, 2008
Virtual Servers: It Is Recrawl and Reindex Time
September 22, 2008
The malarky about virtualization has many information technology professionals courting chimeras. Some virtualization is good. For example, we have a couple of quad core, four gigabyte servers that are four to five times faster on our benchmark tests than the aged NetFinity 5500s we retired. The new servers have the moxie to run virtualization software. No problems so far. In fact, chopping boxes into separate virtual servers makes sense and is tame compared to some of the technologies that arrive at our office door.
Virtual storage, however, is another kettle of fish. Our experience has been that complex directory structures such as those spawned by SharePoint and certain enterprise applications are complicated. When these complex structures are mixed with virtual storage, we have encountered some excitement. We test software, so our trashed files provide us with useful data, not long weekends and sleepless nights.
InfoWorld on September 19, 2008, here called attention to some of the issues virtual storage drags along with the snappy marketing messages and rah rahs for cheaper administration. “Virtual Server Backups Prone to Failure, Survey Finds” makes clear that virtual solutions are not without some problems. The InfoWorld write up reports on a survey that asserts more than half the virtual server backups don’t restore. The article has some other data but I want to focus only on the backups not restoring.
Here’s the problem. Search is a storage intensive application. The indexes can be big. If an index doesn’t start out big, in a matter of months the index gets big. Logs get big. When a search or content processing system crashes or an index update corrupts the master index, an administrator turns to the backup system. If the search system is using a whizzy new virtual storage system, there is a good chance the backup won’t restore. The problem is that rebuilding the index is not always a five minute or even a five hour job.
Recrawling and reindexing can be tricky. Systems that perform significant content processing can crunch for a day, maybe more, generating metadata. Our suggestion is to skip virtual storage for search and content processing systems. Already have one? You may want to devirtualize and quickly.
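If you want to find out whether a backup restores before you need it, a restore drill is one way to check. The Python sketch below is only an illustration under stated assumptions: the directory names, the toy index files, and the helper names (file_digest, manifest, restore_matches, demo) are all made up for the example, and a plain file copy stands in for whatever backup and restore tooling a real search system uses. It hashes every file in the live index directory and in a restored copy, then lists anything missing or different.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of one file, read in chunks so large index files fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def restore_matches(index_dir: Path, restored_dir: Path) -> list[str]:
    """Return the relative paths that are missing or differ after a restore."""
    original = manifest(index_dir)
    restored = manifest(restored_dir)
    return sorted(
        name
        for name in original.keys() | restored.keys()
        if original.get(name) != restored.get(name)
    )


def demo() -> tuple[list[str], list[str]]:
    """Check a good restore and a damaged restore of a toy index (hypothetical data)."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        index = base / "index"
        (index / "segments").mkdir(parents=True)
        (index / "segments" / "seg0.bin").write_bytes(b"postings")
        (index / "crawl.log").write_text("crawled 3 documents\n")

        # A plain copy stands in for a real backup followed by a restore.
        good = base / "restore_good"
        shutil.copytree(index, good)

        # Simulate a restore that silently truncated one index file.
        bad = base / "restore_bad"
        shutil.copytree(index, bad)
        (bad / "segments" / "seg0.bin").write_bytes(b"postin")

        return restore_matches(index, good), restore_matches(index, bad)


if __name__ == "__main__":
    clean, damaged = demo()
    print("good restore mismatches:", clean)
    print("damaged restore mismatches:", damaged)
```

An empty list means the restored copy matches the live index file for file. A drill like this only tells you the bytes came back; it does not prove the search system will load the restored index, so a test query against the restored copy is still worth the time.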
Stephen Arnold, September 22, 2008
Microsoft Yahoo: Woulda, Coulda, Shoulda
September 21, 2008
Hindsight is 20 20. Actually hindsight is what college professors do. I’m no academic, but I write “woulda, coulda, shoulda” reports now and then. These are easy to do. The facts are right there, and I can almost always develop a nifty timeline. With a bit of math magic, I can output “what if” results. I have even sold some of these spreadsheet fever reports to investment banks and rich people with more money than the average Roman patrician who is a pal of Augustus Caesar.
I thought about “woulda, coulda, shoulda” when I read the well written essay “Yahoo Should Have Sold Search to Microsoft”. You can read Henry Blodget’s analysis here. I liked Mr. Blodget’s argument. My thoughts shifted away from “woulda, coulda, shoulda” to “it is what it is”. I snagged a notepad and jotted down three “it is what it is” points.
First, the Yahoo search is a clunker in my opinion. Now I know there are some people who do work for me who love Yahoo search. I don’t like it at all. When the Cluuz.com front end is slapped on Yahoo, I like Yahoo a lot better. Why? Yahoo does not deliver useful results without including some clunkers in each hit list. Whatever Cluuz.com is doing, Yahoo becomes more useful to me. Why doesn’t Yahoo improve its search? The company is trying to do too many things. No focus translates to search results that I avoid. Maybe I’m wrong, but when a tiny Canadian company can make Yahoo a lot better, I think the problem resides within the Yahoo search team. Bad management plus an aging search engine aren’t going to close the gap between Yahoo and Google. If Microsoft bought another clunker, it is what it is–a clunker.
Second, Microsoft is not making progress despite the reorganizations, the acquisitions, and the pay for traffic ploys. Why? Users have had a decade to get used to Google’s being “good enough”. Some people think Google’s great. I don’t. Much of Google’s success comes from having competitors who don’t know what to do to leapfrog Google. Instead of going for the jugular with privacy and usage tracking as the business end of a marketing sword, Microsoft has too many chiefs or cooks. Whatever these managers are, they are not able to respond to Google after years of trying. So if Microsoft buys Yahoo search what difference will it make? Not much. Google touches about two thirds of the searches in North America. Buying the aging Yahoo will be like the purchase of Fast Search, Powerset, and Ciao.com–too little too late. In short, it is what it is.
Third, I track 52 search vendors. I have a list of 300 or so companies competing in search and content processing. Know how many can compete with Google? Two. Why doesn’t Yahoo buy one of these outfits? Why doesn’t Microsoft? The answer is that neither company takes time from their busy meeting filled days to sit down and think about Google’s vulnerabilities, which technologies are able to out Google Google in some key area, and do a head-to-head analysis that considers significant issues, not older stuff like Fast Search’s number of customers or Yahoo’s nifty banner advertising system.
The “it is what it is” analysis is sorely needed at a number of companies, not just Microsoft or Yahoo. Google has been running free for a decade. Buying yesterday’s notions of great technology won’t do the job now or in the next six to nine months.
I agree with Mr. Blodget’s analysis, but I prefer the “it is what it is” approach. Getting real about Google is the first step toward a search response to Google.
Stephen Arnold, September 21, 2008
Google Yahoo: A Contrarian’s View
September 21, 2008
In high school, I would get into trouble by asking, “What if we look at this idea from a different point of view?” My high school teachers were kindly but not too eager to listen to a question and then a suggestion that their world view was out of kilter. I am not sure why I developed this habit of mind, but when I got to my first real job at Halliburton (Nuclear Utility Services), I discovered that the nuclear physicists and mathematicians who made up 80 percent of the unit liked my approach. Instead of ignoring me or putting my desk in the hall as my high school teacher Miss Sperling did, these guys and gals would light up like white LEDs and dig in, intellectually speaking.
After reading Randall Stross’s analysis here, I felt he was on the right track for 180 degree thinking, but he was hitting the snow covered peaks, ignoring the basalt layers on which his big idea rests. Then I read with enjoyment Michael Arrington’s “Why the Google Yahoo Ad Deal Is Something to Fear.” You can read that essay here. Not only did I enjoy his writing, several of his points resonated with me. Nevertheless, my contrarian approach levered both of these astute gentlemen’s comments into several ideas that are rotated a few degrees from each writer’s position.
First, the barn is on fire. It’s burning fast. The horses are gone. The hay is burning fiercely and the Harrod’s Creek fire engine aided by fire engines from elsewhere can’t douse the flames. So the fire fighting professionals hose some bushes, squirt water on the roof of an adjoining building, and watch the barn burn. My view is that Google was ignored in the period from 1995 to 1998 when Messrs. Brin and Page were fooling around with BackRub. Then in the period from 1998 to 2004, some smart money urged the Googlers and their small cohort of former DEC / AltaVista.com, Bell Labs, and Sun Microsystems’ colleagues forward. When the IPO loomed, Google settled with Yahoo for about a billion dollars. Yahoo realized that Google had learned from early GoTo.com, Overture.com, and other ad efforts. Instead of reinventing the stone wheel, Google had vulcanized a Michelin radial. Clumsy metaphor but if you have a stone wheel and your friendly competitor has Michelins you have limited choices. Yahoo sued and elected to keep using stone wheels. The result is that Yahoo has the kind of choice that BF Goodrich gives NASCAR teams. Use our tires or don’t race. Works in auto racing, and it is working in online advertising.
Who wants to stand in front of this and slow down this bullet train?
Second, Google really doesn’t “sell” advertising. Like the local utility monopoly or the local water company, you can sign up for power and water or spend your money drilling a geothermal hole, erecting solar panels, and buying Evian by the truck load. Google is a service, and if the users and the advertisers did not want to make whoopie, there’s not much Google can do about it. In fact, legislating to a company dependent on Google traffic that it can no longer advertise is probably one of those remarkable opportunities to explore the law of unintended consequences in detail. I don’t know about you, but I have yet to meet a government panel or regulatory committee that has a solid grasp that Google is a giant digital computer. Ad matching and users searching are just applications. If Google removes these functions, developers can use Google’s APIs to build their own systems and Google can charge a fee and take a piece of the action. The result? No changes and maybe even more money for Google because there is a great deal of interest in tapping into Google traffic.
Autonomy Tweaks French Vendors Again
September 21, 2008
When I was in Europe last week, I learned that Autonomy gobbled another juicy cerise from the mouths of French software vendors. Autonomy must be half French so keen is its sensitivity to the French market. This deal is for Autonomy to index France 24’s video archives. These archives at present contain more than 50,000 rich media files, consisting of the channel’s daily broadcasts in French, Arabic and English, as well as other purchased programs, and are added to daily. You can read more about this important financial cerise here.
Stephen Arnold, September 21, 2008