Google and Its Alleged SSD Innovations

October 18, 2010

I have shifted my attention from Google to Facebook. Nevertheless, my full scale Overflight continues to spit out information from open sources about Google. (This Overflight link shows a handful of features in the commercial version.) I wanted to capture what may be old news to my two or three readers in this blog post. As I shift from the uninteresting world of brute force indexing to the more easily manipulated world of social search, some technical innovations at Google remain interesting in a general way.

You will need to navigate to the USPTO, click Search, and then download these documents. I am not providing explicit links to the source documents due to the “free” nature of the blog.

The subject is the usefulness of solid state storage devices as a speed up and cost down method of dealing with the need to fetch and write data. Solid state devices or SSDs are a mixed blessing. There are some performance and failure benefits, but there is also flakiness, particularly with certain vendors’ products.

image

One example of an SSD for scale. Source: http://commons.wikimedia.org/wiki/File:MicroSD.jpg

Google has been looking for many years – probably as early as 2003 – at ways to get around the hassle of spinning disc failures, heat generation, and size. As silicon fabs push to smaller traces, the cost and availability of SSDs become increasingly attractive.

My Overflight system lit up with a series of patent applications that featured inputs from a Googler by the name of Albert T. Borchers, more easily findable as “Al Borchers.” Don’t get too revved up looking for information. He, like other post Alta Vista hires, is not a high profile type of guy in the Facebook sense of the word. With some poking around you can find some info like this bio at a conference site:

Al Borchers joined Google in 2004 in the Platforms group, developing system software for Google’s servers. In the last few years he has been working on high performance storage devices. He received a Ph.D. in theoretical computer science from the University of Minnesota in 1996, and has worked in industry for many years developing Unix and Linux device drivers and system software.

Read more

Floss Plone Information

October 4, 2010

I have been listening to podcasts when at the gym. New to the podcast world, I have been downloading programs to try and find out which ones have consistent, solid content. Yesterday I listened to Floss Weekly Number 137: Plone, produced by an outfit called Twit. You can get the show and information about Twit from the company’s Web site at http://twit.tv. I was surprised by the information revealed on this particular podcast, hosted by Randal Schwartz (aka merlyn), a Perl expert.

The guest on the program to which I listened was Alexander Limi, former Googler, employee at Mozilla, and user experience specialist for Plone. If you are not familiar with Plone, it is an open source content framework. You can use it to create content for industrial strength applications like the FBI and Discover Web sites. For more information about Plone, navigate to http://plone.org/.

I have no solid information about the accuracy of this particular podcast. I do want to highlight two points made in the podcast because I don’t want them to slip away.

image

The first point concerns Microsoft SharePoint. On the podcast I heard that Microsoft is not really selling or licensing SharePoint. Instead the model is shifting to providing the software and relying on services to generate revenue. I will have to poke around to find out if this is an early warning of a shift in the SharePoint business model or if there are only certain situations in which Microsoft is providing access to SharePoint in this way. The reason this is important is that SharePoint is, in my opinion, the fertile soil of an ecosystem that supports quite a few third-party vendors. These include Microsoft Certified Partners who produce software that snaps into or overlays SharePoint. Examples range from European vendors like Fabasoft to US firms like BA-Insight. In addition, there are many engineers who take some Microsoft classes and then support themselves making SharePoint work as the licensee requires. The notion of a “free” SharePoint or even a low cost SharePoint can explain why so many English majors, unemployed journalists, and third string business school MBAs are vociferously marketing their SharePoint expertise. This is a big ecosystem and it is going to get even bigger. I documented a study that suggested some SharePoint installations were challenges. The pricing implications are significant and the outlook for companies which can actually make SharePoint work is significant as well. I think most of the SharePoint snap in vendors could still be walking on a knife edge. The reason is that big accounts will be sucked up by Microsoft itself. Why let that revenue go to those who cultivated the cornfield? Just like big agriculture, the small farmer gets an opportunity to find a new future.

Read more

Search, Commoditization, and the Vulnerable Vendors

September 27, 2010

We are starting our fall swing for clients who want to know the outlook for search and content processing in 2011. I want to select one point from our briefing and relate it to a topic that some of the azurini are missing. I am not involved with any of the mid tier consulting firms, but these firms’ information has a way of turning up in quotes that may create an impression that all is well in search engine markets.

Search vendors are under pressure—enormous pressure. And the G forces are going up. At the same time, new players enter the market; for example, the academic spin out Sophia Search. Established players have obtained fresh infusions of capital to deal with the “opportunities” that exist in specific market segments like Microsoft SharePoint.

Let me point out that low end search solutions are now essentially free. A download of Lucene/Solr, FLAX, or some variant available with a bit of poking around via Bing.com is a click away. These systems work well, but you may want to have a friendly programmer at hand to help you over any bumps. For most organizations, open source works and works well. One doesn’t have to look much farther than Netflix to see how open source works in a high demand, high profile system. For more clues about which big firms are jumping on the open source search solution, navigate to www.lucenerevolution.com and look at the line up of speakers.

image

Image source: http://img.dailymail.co.uk/i/pix/2008/05_04/FlattenedCarG_850x649.jpg

Open source search is flowing into market sectors where certain historical trends have created a vacuum. Open source search is not so much muscling into these market sectors as being sucked in. Nature abhors a vacuum as most people learned in high school physics. (Modern physicists have different views, but this is a blog post about markets, not theoretical physics.)

The markets most directly affected are those where the perceived value of commercial search systems is low or modest and where the complexity of the search problem from the point of view of the licensee is manageable. Search is very complicated, but I am talking about perception of customers, procurement teams, and developers on staff who want to solve a problem. Open source is often the choice that our research suggests bubbles up from the technical members of the organization. The MBAs think IBM. The young engineers from CalTech think open source search.

So what happens?

Open source seeps into the organization, and when it works, it gains momentum. We did not find this trend particularly surprising because it replicates the diffusion of other technologies in other industries. I recall learning that the method of making hot air popcorn evolved from a hair drier. The hair drier from a discount store worked well enough to give the engineering team the insight required to build a very large business on a commodity component.

Who gets squished in this shift?

Read more

Search Industry Spot Changing: Risks and Rewards

September 20, 2010

I want to pick up a theme that has not been discussed from our angle in Harrod’s Creek. Marketers can change the language in news releases, on company blogs, and in PowerPoint pitches with a few keystrokes. For many companies, this is the preferred way to shift from selling a one-size-fits-all search solution described as a platform or framework to being a product vendor. I don’t want to identify any specific companies, but you will be able to recognize them as these firms load up on Google AdWords, do pay-to-play presentations at traditional conferences, and output information about the new products. To see how this works, just turn off Google Instant and run the query “enterprise search”, “customer support”, or “business intelligence.” You can get some interesting clues from this exercise.

image

Source: http://jason-thomas.tumblr.com/

Enterprise search, as a discipline, is now undergoing the type of transformation that hit suppliers to the US auto industry last year. There is consolidation, outright failure, and downsizing for survival. The auto industry needs suppliers to make cars. But when people don’t buy the US auto makers’ products, dominoes fall over.

What are the options available to a company with a brand based on the notion of “enterprise search” and wild generalizations such as “all your information at your fingertips”? As it turns out, the options are essentially those of the auto suppliers to the US auto industry:

  • The company can close its doors. A good example is Convera.
  • The search vendor can sell out, ideally at a very high price. A good example is Fast Search & Transfer SA.
  • The search vendor can focus on a specific solution; for example, indexing FAQs and other information for customer support. A good example is Open Text.
  • The vendor can dissolve back into an organization and emerge with a new spin on the technology. An example is Google and its Google Search Appliance.
  • The search vendor can just go quiet and chase work as a certified integrator to a giant outfit like Microsoft. Good examples are the firms who make “snap ins” for Microsoft SharePoint.
  • The search vendor can grab a market’s catchphrase like “business intelligence” and say me too. The search vendor can morph into open source and go for a giant infusion of venture funding. An example is Palantir.

Now there is nothing wrong with any of these approaches. I have worked on some projects and used many of the tactics identified above as rivets in an analysis.

What I learned is that saying enterprise search technology is now a solution has an upside and downside. I want to capture my thoughts about each before they slip away from me. My motivation is the acceleration in repositioning that I have noticed in the last two weeks. Search vendors are kicking into overdrive with some interesting moves, which we will document here. We are thinking about creating a separate news service to deal with some of the non-search aspects of what we think is a key point in the evolution of search, content processing and information retrieval.

The Upside of Repositioning One-Size-Fits-All-Search

Let me run down the facets of this view point.

First, repositioning, as I said above, is easy. No major changes have to be made except for the MBA-style and Madison Avenue type explanation of what the company is doing. I see more and more focused messages. A vendor explains that a solution can deliver an on point solution to a big problem. Good examples are the search vendors who are processing blogs and other social content for “meaning” that illuminates how a product or service is perceived. This is existing technology trimmed and focused on a specific body of content, specific outputs from specific inputs, and reports that a non-specialist can understand. No big surprise that search vendors are in the repositioning game as they try to pick up the scent of revenues like my neighbor’s hunting dog.

Read more

RSS Readers Dead? And What about the Info Flows?

September 13, 2010

Ask.com is an unlikely service to become a harbinger of change in content. Some folks don’t agree with this statement. For example, read “The Death Of The RSS Reader.” The main idea is that:

There have been predictions since at least 2006, when Pluck shut its RSS reader down that “consumer RSS readers” were a dead market, because, as ReadWriteWeb wrote then, they were “rapidly becoming commodities,” as RSS reading capabilities were integrated into other products like e-mail applications and browsers. And, indeed, a number of consumer-oriented RSS readers, including News Alloy, Rojo, and News Gator, shut down in recent years.

The reason is that users are turning to social services like Facebook and Twitter to keep up with what’s hot, important, newsy, and relevant.

image

An autumn forest. Death or respite before rebirth?

I don’t dispute that for many folks the RSS boom has had its sound dissipate. However, there are several factors operating that help me understand why the RSS reader has lost its appeal for most Web users. Our work suggests these factors are operating:

  1. RSS set up and management cause the same problems that the original Pointcast, Backweb, and Desktop Data created. There is too much for the average user to do and then too much ongoing maintenance required to keep the services useful.
  2. The RSS stream outputs a lot of baloney along with the occasional chunk of sirloin. We have coded our own system to manage information on the topics that interest the goose. Most folks don’t want this type of control. After some experience with RSS, my hunch is that many users find the feeds too much work and just abandon them. End users and consumers are not too keen on doing repetitive work that keeps them from kicking back and playing Farmville or keeping track of their friends.
  3. The volume of information in itself is one part of the problem. The high value content moves around, so plugging into a blog today is no guarantee that the content source will be consistent, on topic, or rich with information tomorrow. We have learned that lack of follow through by content creators is an issue. Publishers know how to make content. Dabblers don’t. The problem is that publishers can’t generate big money so their enthusiasm seems to come and go. Individuals are just individuals and a sick child can cause a blog writer to find better uses for any available time.
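
For readers who want a feel for the kind of topic control described in point 2, here is a minimal Python sketch. It is not the system mentioned above. The topic list, the sample feed, and the function name `matching_items` are all hypothetical, and the filter is a plain keyword match against RSS 2.0 titles and descriptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical topic list; a real filter would be tuned to the reader's interests.
TOPICS = ("enterprise search", "solr", "sharepoint")

def matching_items(rss_xml, topics=TOPICS):
    """Return (title, link) pairs for RSS 2.0 items that mention any topic."""
    root = ET.fromstring(rss_xml)
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        desc = item.findtext("description", default="")
        text = f"{title} {desc}".lower()
        # Keep the item only if at least one topic keyword appears in it.
        if any(topic in text for topic in topics):
            hits.append((title, item.findtext("link", default="")))
    return hits

# Made-up feed used only to exercise the filter.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Example feed</title>
<item><title>Moving from Fast to Solr</title><link>http://example.com/1</link>
<description>Notes on a migration.</description></item>
<item><title>Farmville tips</title><link>http://example.com/2</link>
<description>Crops and cows.</description></item>
</channel></rss>"""

hits = matching_items(SAMPLE_FEED)
for title, link in hits:
    print(title, link)
```

Even a toy like this shows the maintenance burden: someone has to keep the keyword list current, which is exactly the work most consumers will not do.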

Read more

Microsoft FAST and the Solr Wind: A Trend or Sun Spot Consequence

September 9, 2010

I have to be blunt. I find that LinkedIn’s enterprise search group is a service I rarely visit. I responded to one LinkedIn person who wanted advice on creating an enterprise search system from scratch. It seems my observation that the idea was not too good hurt the person’s feelings. I got an impassioned personal email wanting to know why I was so negative. Right. Negative on coding an enterprise search system from scratch. Maybe this was a great idea when STAIRS III, InQuire, and BRS ruled the roost. Not such a great idea today.

I was delighted to read a thread about moving from Microsoft Fast to Solr. The information was interesting, but I can locate vendors with a Google or Bing search. A closer look at the responses provided me with some insight into a potentially interesting “solar wind” metaphor.

image

Icarus and all that. Source: http://www.mentera.org/2007/01/12/flight-of-icarus/comment-page-1/

I have no idea if you can access this thread, but I will include the link that worked for me. Here you go: Link to Fast to Solr discussion. If it doesn’t work, join LinkedIn and poke around.

The suggestions for integrators who can shift information assets from Microsoft Fast to Solr included:

  • Cominvent. This is a company with which I am not familiar.
  • Lucid Imagination. This is the outfit who has me working for T shirts on the Lucene Revolution Conference, October 7-8, 2010, in Boston.
  • ESR Technologies. This is a company with which I am not familiar. (The logo reminded me of Digital Reasoning, another content processing firm working on a more advanced approach to digital information.)
  • Findwise (linked from Findabilityblog.se). This is a company with which I am not familiar.
  • New Idea Engineering. This is an outfit with whom I collaborated on an open source and shareware search utility software directory and which I have tapped for some special project work.
  • Search Technologies. This is an outfit that participated in my enterprise search podcast for ETM recently.

Read more

Language Computer: Why Now for Swingly and Extractiv

September 2, 2010

I did some fooling around on the Language Computer Corp. Web site. The PR blitz is on for Swingly, the question-answering service that was featured in blogs and on the quite remarkable podcast hosted by Jason Calacanis. I listened to the Swingly segment but exited once that interview concluded. Instead of wallowing in the “ask a question, get an answer” just like Ask.com, Yahoo Answers, Mahalo, Quora, Aardvark, and others, I thought I would navigate to the Overflight archive and check out the Web site. The first thing I noted was that a click on the WebFerret button now renamed “Ferret” returned a 404 error. Okay. So much for that. I then punched the entity recognition demo which I had also examined a while ago. More luck there, but I had to dismiss an “invalid security certificate,” which I supposed would have been a deal breaker for the Steve Gibson types visiting the Language Computer Web site.

I uploaded one of my for-fee columns to CiceroLite ML. The system accepted the file, stripped out the Word craziness, and invited me to process the file. I punched the “process” button. The system highlighted the different entities. What’s important is that Language Computer has for at least eight or nine years performed at or near the top of the heap on various US government tests of content processing systems. Here’s what the marked up text looked like. Each color represents a different type of entity. For example, red is an organization, blue a person, etc.

lcc entity

In operational use, the tagged entities are written to a file, not embedded in a document. But for demo purposes, it makes it easy to see that Language Computer did a pretty good job. Entity extraction is a big deal for some types of content activities. I find a tally of how many times an entity appears in a document quite useful. The big chunk of work, in my opinion, is mapping entities to synonyms and then to people and places. It’s great to know the entities in a document, but it is even more great to have these items hooked together. I quite like the ability to click and see the entities in the source document.
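
To make the tally and synonym mapping idea concrete, here is a minimal Python sketch. It is not Language Computer's method or output format. It assumes the extractor hands back (text, type) pairs, and the alias table, the type labels, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical alias table: maps lowercased surface forms to one canonical name.
ALIASES = {
    "language computer": "Language Computer Corporation",
    "language computer corp.": "Language Computer Corporation",
    "lcc": "Language Computer Corporation",
}

def canonical(mention):
    """Normalize whitespace and case, then look the mention up in the alias table."""
    key = " ".join(mention.lower().split())
    return ALIASES.get(key, mention.strip())

def tally_entities(mentions):
    """Count mentions per (canonical name, entity type) pair."""
    counts = Counter()
    for text, entity_type in mentions:
        counts[(canonical(text), entity_type)] += 1
    return counts

# Made-up extractor output: (text, type) pairs such as a tagger might write to a file.
sample = [
    ("Language Computer", "ORG"),
    ("LCC", "ORG"),
    ("Language Computer Corp.", "ORG"),
    ("Jason Calacanis", "PERSON"),
]

counts = tally_entities(sample)
for (name, entity_type), n in counts.most_common():
    print(n, entity_type, name)
```

The counting is trivial. The alias table is where the real labor sits, which is why hooking entities to synonyms, people, and places is the big chunk of work.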

Language Computer Corporation has been around since 1995. It has an excellent reputation, and, like other next generation content processing systems, has been used by specialists in quite specific niche markets. I won’t name these, but you can figure out what outfits are interested in:

  • Entity recognition
  • Event time stamping
  • Sentiment tracking
  • Document summarization.

The plumbing for these industrial-strength applications is what makes Swingly.com work. Swingly.com is a demo of the Language Computer question answering function. In my opinion, I am not likely to do much typing or speaking of questions into a search box or device. I type queries and I shout into a phone, often with considerable enthusiasm. (I hate phones.)

If you want to explore the Language Computer function to turn Web content (heterogeneous and semi-structured content) into structured data, navigate to www.extractiv.com. You will need to register. In order to use the service you have to create a content job, perform some steps, and then know what the heck you are looking at. The system works.

The larger issue to consider is, “Why are companies like Language Computer, Fetch Technologies, JackBe, and others from the niche government markets suddenly bursting into the broader enterprise and consumer sector?”

The pundits have not tackled this question. Most of the Swingly.com write ups are content to beat on the Q&A drum. I don’t think question answering is a mass market service except on devices that allow me to talk. In short, the Web angle is silly. So I am at odds with the azurini. I don’t care too much about English majors and journalists who are experts in search and content processing. Feel free to fall in love. Just brush up on your Shakespeare because the plumbing in systems like Language Computer’s will mean zero to this crowd.

Read more

Google and Its Global Street View Experiences

August 30, 2010

Special to Beyond Search

Innovative technological ideas have transformed our societies and lifestyles for the better since time immemorial, and they have also affected social norms and values. Such changes, as all changes do by default, go through a period of resistance before they are finally embraced. The recent Google Street View controversy in Germany is a perfect example, and it has led people and political parties to philosophize and finally polarize into two opposing camps.

image

Source for this great illustration: textually.org

The Spiegel.de article “Google Knows More about Us than the KGB, Stasi or Gestapo” delineates the various ideologies of the politicians, institutional leaders, thinkers and commentators of Germany, now that Google is just a few weeks away from launching Street View for 20 German cities on the Internet. The German government, criticized for its slow reaction, now wants to take a cautious approach, rejecting the proposed legislation against Google’s Street View, and instead wants to address geographically based Internet services in general.

Read more

Digital Information and Progress?

August 25, 2010

Progress is an interesting idea. I read “A Smartphone Retrospective” and looked at the pictures on August 19, 2010. To be candid, I didn’t give it much thought. Math Club types, engineers, and Type A marketers have been able to cook up the progress pie for many years. In fact, prior to the application of electricity, life was pretty much unchanged for millennia. A hekatontarch in Sparta could have been dropped into the Battle of Waterloo and contributed without much effort. Drop that same grunt into a SOCOM unit, and he wouldn’t know how to call in air cover.

Let’s take a trip down memory lane.

Most people in Farmington, Illinois, not far from where I grew up, believed that the world got better a little bit at a time. The curves most people believed and learned in grade school went up.

image

Well, most people believed that until the price for farm output stagnated. Then the strip mining companies made life a little better by pushing some money into the hands of farmers. Well, the money dried up and the land was not too useful for much after the drag lines departed.

image

Then the price of chemical fertilizer climbed. Well, then the government paid farmers not to farm so things looked better. Each year the automobiles got bigger and more luxurious and those who wanted to make the American dream a reality left for the big city. Now Farmington, Illinois, is a quiet town. Most of the stores are closed, and it is a commuter city for folks lucky enough to have a job in the economically-trashed central Illinois region about one hour south of Chicago.

Progress.

What’s happening in online and digital information is nothing particularly unusual. The notion of “progress”, at least in Farmington, is different today from what it was in 1960. Same with online, digital information, and technological gimcracks. I realized that most folks have not realized that “progress” may not be the bright, shiny gold treasure that those folks in Farmington accepted as the basic assumption of life in the U. S. of A.

Read more

Intel Embraces Security, Ignores Search

August 23, 2010

The pundits and azurini have been happier than a Poland China in a mud puddle. Intel spent $7 billion for McAfee. You can read many analyses of this deal. These include the Business Week “Intel Does Security, US Broadband Not So Fast” and the San Jose Mercury News’s “McAfee Deal Stirs Speculation about Symantec.” Leo Laporte’s network tackled the subject in a breezy way. I lost track of the number of blog posts that choked my feedreader. In a lousy economy like ours, any big deal is a great deal for those who made it happen. After the commissions are paid, I am not so sure about the upside for Intel.

Let’s step back. McAfee has an interesting past. Recently the company’s update killed XP machines. The price tag seems lofty. There are some cultural differences between the two companies. These range from where money comes from to the markets served. But deal hungry MBAs can concoct some weird and potent bar drinks in today’s less than exciting financial environment.

Intel has many interests but chief among them is generating lots of revenue. The world of expensive silicon is an extremely hostile place. One can imagine the owner of a delivery company in Peanut, California, trying to keep an outfit like Intel humming like a nuclear submarine’s reactor. The skills required are many and varied. Intel is now taking on another country’s fleet. Some nuclear powered, some coal fired, and others wind powered.

image

In 1999 or 2000, I can’t remember the exact year, I brushed against a project that included an analysis of Intel. At that time, Intel was thinking about what to do with the spare cycles on its future chips. CPUs, particularly the dudes with the skyscraper architecture and multiple cores, have more zoom than the software can exploit. One of the ideas the research project explored was putting content processing or information retrieval functions in silicon. I remember kicking around ideas for various types of on-the-fly processing and even combining online, offline, and near line functions to make it easy for an employee to locate needed information. Great fun, just an above average complexity type of problem.

In the last decade, Intel has nosed into a wide range of businesses. You can get a partial run down in the Wikipedia article “Intel Corporation.” Omitted from the Wikipedia write up was Intel’s experiment with Convera to put search and retrieval in data centers. I have documented some of the features of this interesting move in the first edition of my Enterprise Search Report (2003-2004, now out of print). Here’s the highlight: The effort failed and burned a big pile of greenbacks. Also omitted from the Wikipedia write up was Intel’s stake in Endeca. This amounted to a smaller pile of greenbacks than the Convera adventure, but nothing really came of that step either.

image

Based on my nosing around, these probes into search were part of Intel’s quest to find something really hot to slap into “little silicon” (a CPU type thing or a small box gizmo) or into “big silicon” (a data center type of thing). Either big or little, Intel would be able to offer more value-added functions and, so the theory goes, charge more. The object of the game is to make money, remember?

Read more
