Google and the Enterprise: The Point? Money
March 19, 2012
You must read “Google Enterprise chief Girouard Heads to Startup Upstart.com.” I wondered if a simple executive shuffle many months after a de facto demotion was news. Apparently the poobahs and “real” journalists find a Xoogler worthy of a headline. I have a different view about Google and the enterprise. I write about Google’s latest adventures in my Enterprise Technology Management column, published in the UK, each month.
Google pumped quite a bit of time, effort, money, and Google mouse pads into its enterprise initiative. In the salad days, Google could not learn enough about the companies dominating the enterprise search space. As I researched my Google monographs, I picked up anecdotal information from interview subjects about how little Googlers knew about what enterprise procurement teams required.
In one memorable, yet still confidential interaction, Google allegedly informed a procurement manager that Google disagreed with a requirement. If true, that is the sort of thing one hears about a kindergarten teacher scolding a recalcitrant five-year-old. Well, that may have been a fantasy, but there were enough rumblings about a lack of customer support, a “fluid” approach to partners, and a belief that whatever Google professionals did was the “one true path.” I never confused Google and Buddha, but for some pundits, Google was going to revolutionize the enterprise. Search was just the pointy end of the spear. The problem, of course, is that organizations are not Googley. In fact, Googley-type actions make some top dogs uncomfortable.
What happened?
Based on my research, which I shifted to the back burner, I learned:
- Google was unable to put on an IBM type suit. The Googley stuff opened doors, but the old Wendy’s hamburger ad sums up what happened after the mouse pads and sparkle pins were distributed: “Where’s the beef?”
- The products and services were not industrial strength and ready for prime time. The notion of an endless beta and taxi meter pricing, no matter how “interesting”, communicated a lack of commitment.
- The enterprise market likes the idea of paying money to be able to talk to a person who in most cases semi-cares about a problem. AT&T makes tons of dough by making clients pay four times an engineer’s salary to get a human on the phone any time. Google delegated support down to partners. Won’t work. A Fortune 100 company wants to call Google, not send an email.
- Pricing. If you are not sure what the ballpark cost is for indexing 100 million documents with a search appliance, ensuring 24×7 uptime, and backing up the system, navigate to www.gsaadvantage.gov and look up the price of a Google Search Appliance. Now figure out how much it will cost to process an additional one million documents. How’s that price grab you?
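The pricing exercise can be made concrete with a toy cost model. The tier capacities and dollar figures below are hypothetical placeholders, not actual GSA list prices; the point is the step function: appliance-style, per-document licensing means that adding one more million documents can bump a buyer into a far more expensive tier.

```python
# Back-of-the-envelope cost model for appliance-style, per-document licensing.
# All capacities and prices are hypothetical placeholders, not GSA list prices.

TIERS = [
    (500_000, 30_000),         # up to 500K docs (hypothetical)
    (5_000_000, 250_000),      # up to 5M docs (hypothetical)
    (100_000_000, 2_000_000),  # up to 100M docs (hypothetical)
]

def license_cost(doc_count):
    """Return the cheapest tier price that covers doc_count documents."""
    for capacity, price in TIERS:
        if doc_count <= capacity:
            return price
    raise ValueError("no tier covers %d documents" % doc_count)

def incremental_cost(current_docs, extra_docs):
    """Cost of growing the collection: zero while you stay inside a tier,
    a full step-function jump the moment you cross a tier boundary."""
    return license_cost(current_docs + extra_docs) - license_cost(current_docs)
```

With these made-up tiers, growing by 200,000 documents costs nothing inside a tier but a seven-figure jump across a boundary, which is exactly the procurement headache the list above complains about.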
When Larry Page assumed control of the company, I wrote about the wizards who were reporting directly to him. The head of the enterprise unit was not one of those folks. My conclusion: game over.
Like AOL, the notion of having a Google person on staff is darned appealing to some, but as the AOL experience makes clear, a Xoogler is not a surefire money maker.
Here’s the quote I jotted down from the GigaOM story:
Still, market share and revenue may never have been Google’s goal. By offering a lower-cost option to the Office/Exchange tandem, Google forced the market leader to respond, and that may have been the point all along.
Baloney. Google expected to have big outfits roll over and wag their tail. The US government did not roll over. Most big IBM, Microsoft, and Oracle customers did not roll over. More important, the new wave of enterprise service and solutions providers did not roll over. Why? A lack of focus and a dependence on online advertising, legal hassles, privacy chatter, and a failure to deliver competitive products and services made the enterprise initiative a tough sell. Betas may be great for market tests. For the enterprise, a beta may be a hindrance.
Stephen E Arnold, March 19, 2012
Sponsored by Pandia.com
Has Bing Caught Up to Rival Google?
March 15, 2012
Radical idea, right?
A Bing insider claims that Microsoft’s search engine has finally caught up to Google, technology-wise at least. Wired Enterprise reports, “Microsoft Says Decaffeinated Bing Tastes as Good as Google.” The Caffeine alluded to in the title refers to Google’s 2010 platform of that name, the purpose of which is to produce fresher search results. Microsoft’s Harry Shum, in charge of Bing’s research and development, says his team’s product is at least as good. Writer Cade Metz reports:
Harry Shum joined the Bing team in 2007, after eleven years with Microsoft’s research arm. The task at hand was enormous: catch up to Google. Five years on, Google is still the world’s dominant search engine — some estimates put its market share as high as 85 or 90 percent — but Shum believes that Bing has finally reached a point where it can compete with Google on a technical level.
The difference between Caffeine and its MapReduce-based predecessor is significant: the new platform allows sections of the search index to be updated continuously, rather than re-indexing the entire collection in huge batches. Shum hints that the current Bing approach is similar, but is guarded with the details. That’s understandable.
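Neither company publishes this code, of course, but the batch-versus-continuous distinction is easy to sketch. The toy index below is illustrative only: one class accepts documents one at a time in the Caffeine spirit, while the helper function mimics the old rebuild-everything batch pass.

```python
from collections import defaultdict

class IncrementalIndex:
    """Toy inverted index updated one document at a time,
    in the spirit of Caffeine-style continuous indexing."""
    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Fresh content becomes searchable as soon as it is added.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))

def batch_rebuild(corpus):
    """Old-style batch indexing: throw away the index and
    rebuild it from the whole corpus in one pass."""
    index = IncrementalIndex()
    for doc_id, text in corpus.items():
        index.add(doc_id, text)
    return index
```

The freshness argument is visible even at toy scale: the incremental index answers queries for a new document immediately, while the batch approach makes nothing visible until the next full rebuild completes.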
If Shum is right, this is a surprising development. We still believe, however, that Blekko and Yandex are better than other Web search systems. Google has bet on social. We think Google should have put more money on search. Hedging bets is often a good idea.
Cynthia Murrell, March 15, 2012
Sponsored by Pandia.com
PDF Search from Dieselpoint
March 14, 2012
We heard Dieselpoint offers a PDF search engine, so we decided to check it out. This company keeps a very low profile, but we find it is worth looking into.
Dieselpoint’s PDF Search is an enterprise product that can navigate large collections of PDFs, extracting both metadata and text for indexing. Metadata can be searched and used to build more sophisticated interfaces in conjunction with Dieselpoint’s Search platform.
Often, titles are left out of a document’s metadata, making searches more challenging; Dieselpoint has an innovative solution for that. The product overview states:
Quite often, authors of PDFs neglect to enter titles into the document’s metadata. This makes it difficult to display a good, descriptive title when a PDF appears on a search results page. Dieselpoint Search eliminates this problem by providing ‘Smart Titles’. The system analyzes each PDF looking for clues as what the title might be, and employs advanced heuristics to select one. Studies show that Dieselpoint’s algorithm selects a title which is the same as the one that a human would have selected over 90% of the time.
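Dieselpoint does not disclose its heuristics, but the flavor of a “Smart Titles” approach can be sketched. Everything below, scoring rules included, is an illustrative guess and not Dieselpoint’s algorithm: score the first few lines of extracted text and pick the most title-like one.

```python
def guess_title(lines):
    """Toy stand-in for a 'Smart Titles' heuristic: score the first few
    lines of extracted PDF text and pick the most title-like one.
    The scoring rules here are illustrative guesses, not Dieselpoint's."""
    best, best_score = None, float("-inf")
    for position, line in enumerate(lines[:10]):
        text = line.strip()
        words = text.split()
        if not words:
            continue
        score = 0.0
        score -= position            # earlier lines are more title-like
        if 2 <= len(words) <= 12:    # titles are short, but rarely one word
            score += 3
        if text[0].isupper():        # titles usually start capitalized
            score += 1
        if text.endswith("."):       # full sentences rarely are titles
            score -= 2
        if score > best_score:
            best, best_score = text, score
    return best
```

A production system would presumably weigh layout signals such as font size and page position, which plain text extraction discards; this sketch only shows why a heuristic can beat the (often empty) metadata title field.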
This tool also takes advantage of XMP data, which resides in an XML file embedded within a PDF file. This data can contain information on subjects such as authors, digital rights, categories, and keywords.
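The XMP packet is literally an XML document embedded in the PDF byte stream, delimited by `xpacket` processing instructions. As a minimal sketch, the snippet below scans raw PDF bytes for an uncompressed packet; real-world PDFs may store the packet inside compressed streams, in which case a proper PDF parser is required.

```python
import re
import xml.etree.ElementTree as ET

def extract_xmp(pdf_bytes):
    """Pull the first uncompressed XMP packet out of raw PDF bytes and
    parse it as XML. Many PDFs compress the metadata stream, so this
    byte-scanning approach is only a sketch, not production code."""
    match = re.search(rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=",
                      pdf_bytes, re.DOTALL)
    if match is None:
        return None  # no uncompressed XMP packet found
    return ET.fromstring(match.group(1).strip().decode("utf-8"))
```

Once parsed, the usual Dublin Core and XMP namespaces expose authors, rights, categories, and keywords as ordinary XML elements.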
Dieselpoint began developing the core indexing algorithms behind its search engine in 1999, and released version 1.0 the next year. Originally meant for use with engineered industrial goods, the product (and company) name reflects these origins.
Cynthia Murrell, March 14, 2012
Sponsored by Pandia.com
Reference Resource for Big Data Vendors
March 13, 2012
SoftArtisans and Riparian Data have been publishing a series that examines some of the key players in Boston’s emerging big data scene.
The recent article, “Boston’s Big Datascape, Part 2: Nasuni, VoltDB, Lexalytics, Tokutek, Cloudant,” is the second in the series and examines five companies that differ in their growth stages and approach but share the ideology that “big data is the castle, and their tools [are] the keys.”
The article breaks each company down by product, founder, technologies used, target industries, and location.
Tokutek’s mission is to transform the way data is stored and retrieved and deliver a quantum leap in the performance of databases and file systems. The company breakdown was:
“Product: TokuDB brings massive data processing and analysis capabilities to heretofore neglected MySQL. It’s a drop-in replacement for InnoDB that extends the capacity of MySQL databases from GBs to TBs.
Founders: Michael A. Bender, Martín Farach-Colton, Bradley C. Kuszmaul
Technologies used: MySQL, MVCC, ACID, Fractal Tree™ indexing
Target industries: Online Advertising, eCommerce, Social Networking, Mobile Solutions, ePublishing.”
We’re interested to see how this series develops and which innovative new companies it highlights.
Jasmine Ashton, March 13, 2012
Sponsored by Pandia.com
Open Source Projects at GitHub
March 5, 2012
We’ve run across a couple of interesting open source components on the collaboration site GitHub. Founded in 2008, the site hosts over two million code repositories and provides tools with which subscribers can manage their projects. The two we’d like to highlight are the Nutch-Elasticsearch-Indexer and MongoDB.
The Nutch-Elasticsearch-Indexer allows for the indexing of crawl data from the Nutch search system into the ElasticSearch system. The project’s Readme explains:
“This is similar in nature to that of the SolrIndexer that comes with Nutch which let you index directly into Solr. This provides a way directly index data into elasticsearch coming directly from Nutch.
This is just the code necessary to create the solution. You must start by having the Nutch codebase and have it setup in your development environment (Eclipse) see http://wiki.apache.org/nutch/RunNutchInEclipse for how do this. Once you are set up and is working well. You are ready to get started. The following files below are necessary to integrate into the Nutch base and then re-build Nutch.”
Participating developers must have access to the Nutch source and an ElasticSearch environment. See the GitHub project page for further details.
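The project’s own code lives on GitHub, but what any Nutch-to-Elasticsearch bridge ultimately produces can be sketched: a `_bulk` request body in Elasticsearch’s newline-delimited JSON format, one action line plus one source line per document. The record field names below are made up for the example, and contemporary Elasticsearch versions may also expect a `_type` in the action line.

```python
import json

def build_bulk_payload(index_name, records):
    """Build an Elasticsearch _bulk request body from crawl records.
    Each record is a dict like {"id": ..., "url": ..., "content": ...}
    (field names invented for this sketch). The bulk format is one JSON
    action line followed by one source line, newline-delimited."""
    lines = []
    for record in records:
        action = {"index": {"_index": index_name, "_id": record["id"]}}
        source = {"url": record["url"], "content": record["content"]}
        lines.append(json.dumps(action, sort_keys=True))
        lines.append(json.dumps(source, sort_keys=True))
    return "\n".join(lines) + "\n"
```

The resulting string would be POSTed to the cluster’s `_bulk` endpoint; the sketch stops short of the network call so it stays self-contained.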
The page on MongoDB is the other project that sparked our interest. It includes developer details such as related utilities; where to go for info on building a database; how to run Mongo; client drivers; documentation; build notes; and licensing information. We are pleased to see that this page sees a fair amount of activity.
Maybe the search mantra should be, “Go, open source”?
Cynthia Murrell, March 5, 2012
Sponsored by Pandia.com
Ontoprise GmbH: Multiple Issues Says Wikipedia
March 3, 2012
Now Wikipedia is a go-to resource for Google. I heard from one of my colleagues that Wikipedia turns up as the top hit on a surprising number of queries. I don’t trust Wikipedia, but then I don’t trust any encyclopedia produced by volunteers. Volunteers often participate in a spoofing fiesta.
Note: I will be using this symbol when I write about subjects which trigger associations in my mind about the use of words, bound phrases, and links to affect how results may be returned from Exalead.com, Jike.com, and Yandex.ru, among other modern Web indexing services supported by either government entities or commercial organizations.
I was updating my list of Overflight companies. We have added five firms to a new Overflight service called, quite imaginatively, Taxonomy Overflight, and we are going through the process of figuring out whether these outfits are in business or putting on a vaudeville act for paying customers.
The first five companies are:
We will be adding to the Taxonomy Overflight another group of companies on March 4, 2012. I have not yet decided how to “score” each vendor. For enterprise search Overflight, I use a goose method. Click here for an example: Overflight about Autonomy. Three ducks. Darned good.
I wanted to mention one quite interesting finding. We came across a company doing business as Ontoprise. The firm’s Web site is www.ontoprise.de. We are checking to see which companies have legitimate Web sites, no matter how sparse.
We noted that the Wikipedia entry for Ontoprise carried this somewhat interesting “warning”:
The gist of this warning is to instill a sense of caution, if not wariness, about this company, which offers products that deliver “ontologies.” The company’s research project is called “Ontorule,” which has a faintly ominous sound to me. The naming reminds me of Convera before its financial stress: Convera’s product naming was like science fiction, though less dogmatic than Ontoprise’s language choices. I cannot correlate Convera and Ontoprise on anything other than my personal “semantic” baloney detector, but Convera went south in a rather unexpected business action.
Exogenous Complexity 4: SEO and Big Data
February 29, 2012
Introduction
In the interview with Dr. Linda McIsaac, founder of Xyte, Inc., I learned that new analytic methods reveal high-value insights about human behavior. You can read the full interview in my Search Wizards Speak series at this link. The method involves an approach called Xyting plus sophisticated analytics.
One example of the type of data which emerges from the Xyte method is this set of insights about Facebook users:
- Consumers who are most in tune with the written word are more likely to use Facebook. These consumers are the most frequent Internet users and use Facebook primarily to communicate with friends and connect with family.
- They like to keep their information up-to-date, meet new people, share photos, follow celebrities, share concerns, and solve people problems.
- They like to learn about and share experiences about new products. Advertisers should key in on this important segment because they are early adopters. They lead trends and influence others.
- The population segment that most frequents Facebook has a number of characteristics; for example, showing great compassion for others, wanting to be emotionally connected with others, having a natural intuition about people and how to relate to them, adapting well to change, embracing technology such as the Internet, and enjoying gossip and messages delivered in story form and liking to read and write.
- Facebook constituents are emotional, idealistic and romantic, yet can rationalize through situations. Many do not need concrete examples in order to comprehend new ideas.
I am not into social networks. Sure, some of our for-free content is available via social media channels, but where I live in rural Kentucky yelling down the hollow works quite well.
I read “How The Era Of ‘Big-Data’ Is Changing The Practice Of Online Marketing” and came away confused. You should work through the text, graphs, charts, and lingo yourself. I got a headache because most of the data struck me as slightly off center from what an outfit like Xyte has developed. More about this difference in a moment.
The thrust of the argument is that “big data” is now available to those who would generate traffic to client Web sites. Big data is described as “a torrent of digital data.” The author continues:
large sets of data that, when mined, could reveal insight about online marketing efforts. This includes data such as search rankings, site visits, SERPs and click-data. In the SEO realm alone at Conductor, for example, we collect tens of terabytes of search data for enterprise search marketers every month.
Like most SEO baloney, there are touchstones and jargon aplenty: SERP, click data, enterprise search, and others. The intent is to suggest that one can pay a company to analyze big data and generate insights. The insights can be used to drive traffic to a Web page, make sales, or produce leads which can become sales. In a lousy business environment, such promises appeal to some people, and the desperate marketer may embrace the latest and greatest pitch. Little wonder there are growing numbers of unemployed professionals who failed to deliver the sales their employers wanted. Desperation marketing fosters a services business which asserts it can deliver sales and, presumably, job security for those who hire the SEO “experts.” I am okay with this type of business, and I am indifferent to the hollowness of the claims.
What interests me is this statement:
From our vantage point at Conductor, the move to the era of big data has been catalyzed by several distinct occurrences:
- Move to Thousands of Keywords: The old days of SEO involved tracking your top fifty keywords. Today, enterprise marketers are tracking up to thousands of keywords as the online landscape becomes increasingly competitive, marketers advance down the maturity spectrum and they work to continuously expand their zone of coverage in search.
- Growing Digital Assets: A recent Conductor study showed universal search results are now present in 8 out of 10 high-volume searches. The prevalence of digital media assets (e.g. images, video, maps, shopping, PPC) in the SERPs require marketers to get innovative about their search strategy.
- Multiple Search Engines: Early days of SEO involved periodically tracking your rank on Google. Today, marketers want to expand not just to Yahoo and Bing, but also to the dozens of search engines around the world as enterprise marketers expand their view to a global search presence.
All the above factors combined mean there are significant opportunities for an increase in both the breadth and volume of data available to search professionals.
Effective communication, in my experience, is not measured in “thousands of key words.” The notion of expanding the “zone of coverage” means that meaning is diffused. Of course, the intent of the key words is not getting a point across. The goal is to get traffic and make sales. This is the 2012 equivalent of America Online’s carpet bombing of CD ROMs decades ago. Good business for CD ROM manufacturers, I might add. Erosion of meaning opens the door, I assert, to some exogenous complexity excitement.
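Stripped of the vendor lingo, “tracking thousands of keywords” is mostly bookkeeping: take periodic rank snapshots per keyword and flag the movers. A minimal sketch, with made-up data, might look like this:

```python
def rank_movers(previous, current, threshold=5):
    """Compare two {keyword: rank} snapshots (rank 1 = top result)
    and return keywords whose rank moved by at least `threshold`
    positions, with the signed change (negative = improved)."""
    movers = {}
    for keyword, new_rank in current.items():
        old_rank = previous.get(keyword)
        if old_rank is None:
            continue  # keyword not tracked in the earlier snapshot
        change = new_rank - old_rank
        if abs(change) >= threshold:
            movers[keyword] = change
    return movers
```

Run this over thousands of keywords across a dozen search engines and you have the “big data” torrent the article describes; whether the resulting report improves communication is, as noted above, another matter.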
Discover Point: Search Shows the Unseen
February 16, 2012
Discover Point comes at search and retrieval with the promise to “automatically connect people with highly relevant information.” I find this interesting because it makes search into a collaborative type of solution. Different from a search enabled application, Discover Point pops up a conceptual level. After all, who wants another app? When I need information, I usually end up talking to an informed individual.
Government Computer News reported on this approach in the write up “An Info and Expertise Concierge for the Office.” GCN perceives Discover Point as having a solution for the US government which “prevents agencies from constantly reinventing the wheel and instead helps users move forward with new tasks and projects…” This is an interesting marketing angle because it shifts from assertions that few understand such as semantics, ontologies, and facets.
GCN continues:
DiscoverPoint from Discover Technologies is designed to point users in the direction of the most relevant information and subject-matter experts within the shared platform environment. As your job focus changes, so do the searches that DiscoverPoint makes….But the really cool things start happening after you’ve been using the system for a while. As more personnel and documents relevant to what you are doing become available on the system, they will show up on your discovery page.
The idea of having a system “discover” information appeals to the GCN professionals giving Discover Point a test drive.
Discover Point is compatible with SharePoint, Microsoft’s ubiquitous content management, collaboration, search, and kitchen sink solution. Discover Point’s news release emphasizes that the firm’s approach is unique. See “Discover Point Software Selected Product of the Month by Government Computer News.” The Discover Point Web site picks up this theme:
Discover Technologies’ approach is truly unique, in that we do not require the manual creation of databases or MySites or other repositories to understand the needs of each and every user. We continuously analyze the content they dwell in, and establish an understanding of the users’ interests based on that content. Once this user understanding is gained, and this happens very quickly, then the proactive delivery of information and ‘people’ is enabled and the cost savings and quality benefits are realized.
Unique is a strong word. The word suggests to me something which is the only one of its kind or without an equal or an equivalent. There are many SharePoint search, retrieval, and discovery solutions in the market at this time. The president’s letter tells me:
‘Discover’ is able to understand what your users need, in terms of both information and ‘experts’ with whom they should be collaborating. This understanding is gained via our patent pending algorithms, which are able to examine user related content and ‘understand’ the subject matter being addressed, and therefore the subject matter that each and every one of your employees is focused on. Once this takes place, our products can deliver both info and people to your users, personalized to match their individual needs. The bottom line is that you need your experts, your most highly paid and critical personnel, to minimize the amount of time they spend doing administrative or manual activities and to maximize the time spent tackling the key problems that they are uniquely qualified to address. That is what DiscoverPoint does for you, and it pays for itself in very short order!
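The patent pending algorithms are not public, so what follows is a deliberately crude stand-in for the general idea the letter describes: build a term-frequency profile from the content a user touches, then match the user to the “expert” whose profile is most similar, here via cosine similarity. All names, data, and scoring choices are illustrative.

```python
import math
from collections import Counter

def profile(texts):
    """Build a term-frequency 'interest profile' from documents a user
    has touched. A crude stand-in for whatever analysis Discover
    Technologies actually performs."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts

def cosine(a, b):
    """Cosine similarity between two term-count profiles."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values())) *
            math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def nearest_expert(user_profile, expert_profiles):
    """Return the name of the expert most similar to the user."""
    return max(expert_profiles,
               key=lambda name: cosine(user_profile, expert_profiles[name]))
```

Even this toy version shows why no manual MySites or databases are needed: the profiles fall out of the content itself, updated whenever the user touches new documents.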
The company offers an Extensible Search Framework and an Advanced Connector Engine. The company also performs customer UIS (an acronym with which I am unfamiliar). The firm also has a software integration business, performs “high performance data indexing”, and offers professional services.
The company has an interesting marketing message. I noticed that Google’s result page includes a reference to IDOL, Autonomy’s system. We will monitor the firm’s trajectory because it looks like a hybrid which combines original software, a framework, consulting, and services. Maybe Forrester, Gartner, and Ovum will emulate Discover Technologies’ Swiss Army knife approach to findability and revenue generation?
Stephen E Arnold, February 16, 2012
Sponsored by Pandia.com
Exogenous Complexity 2: The Search Appliance
February 15, 2012
I noted a story about Fujitsu and its search appliance. What was interesting is that the product is being rolled out in Germany, a country where search and retrieval are often provided by European vendors. In fact, when I hear about Germany, I think about Exorbyte (structured data), Ontoprise (ontologies), SAP (for what it is worth, TREX and Inxight), and Lucene/Solr. I also know that Fabasoft Mindbreeze has some traction in Germany, as does Microsoft with its Fast Search & Transfer solution. Fast operated a translation and technical center in Germany for a while. Reaching farther into Europe, there are solutions in Norway, France, Italy, and Spain. Each of these countries’ enterprise search and retrieval vendors has customers in Germany. Even Oracle, with its mixed search history with Germany’s major newspaper, has customers. IBM is on the job as well, although I don’t know if Watson has made the nine hour flight from JFK to Frankfurt yet. Google’s GSA or Google Search Appliance has made the trip, and, from what I understand, the results have been okay. Google itself commands more than 90 percent of the Web search traffic.
The key point. The search appliance is supposed to be simple. No complexity. An appliance. A search toaster which my dear, departed mother could operate.
The work is from Steinman Studios. A happy quack to http://steinmanstudios.com/german.html for the image which I finally tracked down.
In short, if your company operates in Germany, you have quite a few choices for a search and retrieval solution. The question becomes, “Why Fujitsu?” My response, “I don’t have a clue.”
Here’s the story which triggered my thoughts about exogenous complexity: “New Fujitsu Powered Enterprise Search Appliance Launched in Europe Through Stordis.” News releases can disappear, so you may have to hunt around for this article; my link is already dead.
Built on Fujitsu high performance hardware, the new appliance combines industry leading search software from Perfect Search Corporation with the Fujitsu NuVola Private Cloud Platform, to deliver security and ultimate scalability. Perfect Search’s patented software enables user to search up to a billion documents using a single appliance. The appliance uses unique disk based indexing rather than memory, requiring a fraction of the hardware and reducing overall solution costs, even when compared to open source alternatives solutions…Originally developed by Fujitsu Frontech North America, the PerfectSearch appliance is now being exclusively marketed throughout Europe by high performance technology distributor Stordis. PerfectSearch is the first of a series of new enterprise appliances based on the Fujitsu NuVola Private Cloud Platform that Stordis will be bringing to the European market during 2012.
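Perfect Search’s on-disk index structures are proprietary, but the general idea of disk-based rather than memory-resident postings can be sketched with SQLite standing in for the custom storage layer. This is only an illustration of the trade-off the release alludes to (less RAM, more disk I/O); a real deployment would pass a file path instead of `:memory:`.

```python
import sqlite3

class DiskIndex:
    """Minimal disk-resident inverted index: postings live in a SQLite
    table rather than in RAM. A sketch of the general idea only;
    Perfect Search's actual on-disk structures are proprietary."""
    def __init__(self, path=":memory:"):
        # Pass a real file path to keep the postings on disk.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS postings "
            "(term TEXT, doc_id INTEGER, PRIMARY KEY (term, doc_id))")

    def add(self, doc_id, text):
        rows = [(term, doc_id) for term in set(text.lower().split())]
        self.db.executemany(
            "INSERT OR IGNORE INTO postings VALUES (?, ?)", rows)
        self.db.commit()

    def search(self, term):
        cur = self.db.execute(
            "SELECT doc_id FROM postings WHERE term = ? ORDER BY doc_id",
            (term.lower(),))
        return [row[0] for row in cur]
```

The appeal for an appliance vendor is obvious: disk is cheap and plentiful, so index capacity is bounded by storage rather than by installed memory.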
No problem with the use of a US technology in a Japanese product sold in the German market via an intermediary with which I was not familiar. The Japanese are savvy managers, so this is a great idea.
What’s this play have to do with exogenous complexity?
Chiliad: Virtual Information Sharing
February 14, 2012
In 1999, Christine Maxwell, who created the “Magellan” search engine, Paul McOwen, co-founder of the National Center for Intelligent Information Retrieval for the National Science Foundation, and Howard Turtle, former chief scientist at West Publishing, formed Chiliad with the intention of creating a business-to-consumer shopping site with a natural language search engine.
And then September 11, 2001, happened. Chiliad turned its attention to the intelligence community. In 2007, with the FBI as its largest client, the company received $1.6 million in funding from a joint development project with various intelligence and military agencies to enhance Chiliad’s cross-agency knowledge fusion capability by tightly integrating cross-domain “trusted guard” capabilities to support distributed multi-level-security and by enhancing collaboration tools. For the past several years, every time someone at the FBI wanted to search for a name in its Investigative Data Warehouse, technology from Chiliad was working in the background.
Another outfit which connects dots. But Chiliad connects all the dots. Hmm. A categorical affirmative, and I don’t think this is possible.
Chiliad has solved two challenging problems. The first is the ability to rapidly search data collections at greater scale than any other offering in the market. The second is to allow search formulation and analysis in natural language. Its offerings include:
- Chiliad Discovery/Alert, a platform for search and knowledge discovery that operates in parallel across distributed repositories of unstructured and structured data
- Peer-to-Peer Architecture, which allows organizations to distribute instances of the search, indexing, and analysis engine in a network of cooperating nodes across local or remote distributed networks
- Distributed Search, which provides a search capability that works seamlessly across large amounts of structured and unstructured data
- Filtering and Alerting Service, for tracking and receiving alerts about new data in real time
- Discover Knowledge service, an integral component of the Discovery/Alert platform used for navigation and discovery
- Discovery/Alert Geospatial Service, an organizing concept for information
- Global Knowledge Discovery technology
Rather than moving data across the network to a central indexing system, Chiliad’s technology allows organizations to put a Discovery/Alert node wherever information is managed. Each node is part of a secure peer-to-peer network that allows a query to be executed in parallel across all locations.
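Chiliad does not publish Discovery/Alert internals, so the sketch below only illustrates the architectural idea: fan a query out in parallel to nodes placed where the data lives, then merge the hits, instead of shipping all data to a central index. The in-process “nodes” and agency names are toy stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

class Node:
    """One 'node' holding a local index, standing in for a
    Discovery/Alert instance placed where the data lives."""
    def __init__(self, name, docs):
        self.name = name
        self.docs = docs  # {doc_id: text}

    def query(self, term):
        term = term.lower()
        return [(self.name, doc_id)
                for doc_id, text in sorted(self.docs.items())
                if term in text.lower().split()]

def federated_search(nodes, term):
    """Fan the query out to every node in parallel and merge the hits,
    rather than moving the data to one central indexing system."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        results = pool.map(lambda node: node.query(term), nodes)
    return [hit for node_hits in results for hit in node_hits]
```

In the real system each node would sit behind its agency’s security boundary and the merge step would respect multi-level security; the toy version only shows the query-in-place pattern.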
The company serves investigative analysis, information security, and research and development applications; and government and intelligence, insurance, law enforcement, and life sciences healthcare industries. Because Chiliad’s product is a platform, it faces competition in the enterprise market from large, better known vendors, such as Microsoft, IBM, Oracle, and SAP.
Stephen E Arnold, February 14, 2012
Sponsored by Pandia.com