IBM Buys Vivisimo Allegedly for Its Big Data Prowess

April 25, 2012

Big data. Wow. That’s an angle only a public relations person with a degree in 20th century American literature could craft. Vivisimo is many things, but a big data system? News to me for sure.

IBM has been a strong consumer and integrator of open source search solutions. Watson, the game show winner, used Lucene with IBM wrapper software to keep the folks in Jeopardy post production on their toes.

A screen shot of the Vivisimo Velocity system displaying search results for the RAND organization. Notice the folders in the left hand panel. The interface reveals Vivisimo’s roots in traditional search and retrieval. The federating function operates behind the scenes. The newest versions of Velocity permit a user to annotate a search hit so the system will boost it in subsequent queries if the comment is positive. A negative rating on a result suppresses that result.

I learned that IBM allegedly purchased Vivisimo, a company which I have covered in my various monographs about search and content processing. Forbes ran a story which was at odds with my understanding of what the Vivisimo technology actually does. Here’s the Forbes title: “IBM To Buy Vivisimo; Expands Bet On Big Data Analytics.” Notice the phrase “big data analytics.”

Why do I point out the “big data” buzzword? The reasons include:

Vivisimo has a clustering method which takes search results and groups them, placing similar results identified by the method in “folders.”
Vivisimo has a federating method which, like Bright Planet’s and Deep Web Technologies’, takes a user’s query and sends the query to two or more indexing systems, retrieves the results, and displays them to the user.
Vivisimo has a clever de-duplication method which collapses duplicate hits so the results list presents a single item. This is important when one encounters a news story which appears on multiple Web sites.
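
The three functions in the list can be pictured with a toy sketch. This is not Vivisimo’s code or its algorithms, which I have not seen. The `Hit` record, the two stand-in indexes, and the hand-supplied folder labels are all assumptions made up for illustration. A real clustering method derives its labels from the results rather than taking them as input.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search result (hypothetical record, not a Vivisimo structure)."""
    title: str
    url: str
    source: str


def federate(query, engines):
    """Send the same query to every index and pool whatever comes back."""
    hits = []
    for search in engines:
        hits.extend(search(query))
    return hits


def dedupe(hits):
    """Keep the first hit for each title, ignoring case and extra spaces."""
    seen = set()
    unique = []
    for hit in hits:
        key = " ".join(hit.title.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique


def cluster(hits, labels):
    """File each hit in the folder of the first label found in its title."""
    folders = defaultdict(list)
    for hit in hits:
        words = hit.title.lower().split()
        folder = next((label for label in labels if label in words), "other")
        folders[folder].append(hit)
    return dict(folders)


# Two stand-in indexes; a real deployment would call remote systems.
def news_index(query):
    return [Hit("Cell biology primer", "http://a.example/1", "news"),
            Hit("Cell battery recall", "http://a.example/2", "news")]


def web_index(query):
    return [Hit("cell  battery recall", "http://b.example/9", "web"),
            Hit("Prison cell overcrowding", "http://b.example/3", "web")]


pooled = federate("cell", [news_index, web_index])
unique = dedupe(pooled)
folders = cluster(unique, ["biology", "battery", "prison"])
```

The point of the sketch is scale: every step runs over a results list, not over a petabyte store. That is why the “big data” label strikes me as a stretch.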

According to the write up in Forbes, a “real” news outfit:

IBM this morning said it has agreed to acquire Vivisimo, a Pittsburgh-based provider of big data access and analysis tools.

Okay, but in Beyond Search we have documented that Vivisimo followed this trajectory in its sales and marketing efforts since the company opened for business in 2000. In fact, the Wikipedia write up about Vivisimo says this:

Vivisimo is a privately held enterprise search software company in Pittsburgh that develops and sells software products to improve search on the web and in enterprises. The focus of Vivisimo’s research thus far has been the concept of clustering search results based on topic: for example, dividing the results of a search for “cell” into groups like “biology,” “battery,” and “prison.” This process allows users to intuitively narrow their search results to a particular category or browse through related fields of information, and seeks to avoid the “overload” problem of sorting through too many results.

Written by Stephen E. Arnold · Filed Under Acquisition, Feature, Federated search, Search | 2 Comments

Search Vendors and Web Traffic

April 4, 2012

I was fiddling around with Compete.com, a Web analytics outfit. One of the company’s services is to provide usage statistics for URLs. I have no idea if these data are in line with the vendors’ Web logs, but one can look at a number of sites’ traffic and do some rough comparisons. You can try the service at www.compete.com. The tables below present some of the data I examined. Vendors, don’t write to tell me Compete data are incorrect. We are looking at data generated by a method. I am not selecting data to make your Web site’s magnetism look weak.

Company	Traffic (Feb 2012)	Comment
Attensity	974	Conferences and news releases
Attivio	973	White papers
Autonomy	12597	Full court press method
BA Insight	0	Microsoft and webinars
Brainware	857	News releases
Concept Searching	0	Webinars
Connotate	5025	News releases and analyst support
Content Analyst	321	Some conference participation
Coveo	2034	Traditional PR
Dieselpoint	1880	Web site
dtSearch	5208	Free downloads
Easy Ask	472	News releases
Endeca	3900	Analyst support
Exalead	20494	Full court press method
Exorbyte	11486	News releases
Funnelback	2640	News releases and conference participation
Hakia	7891	News releases
Lexalytics	710	News releases
Linguamatics	0	Web site
MarkLogic	1088	Full court press method
Recommind	3964	News releases aimed at trade publications
SearchBlox	0	Web site
Sinequa	0	Web site
Vivisimo	7324	News releases
ZyLAB	0	Full court press method
X1	5589	Web site
X1 Discovery	2575	Web site

Here is this sample’s alleged traffic from most traffic to least traffic:

Company	Traffic (Feb 2012)	Comment
Exalead	20,494	Full court press method
Autonomy	12,597	Full court press method
Exorbyte	11,486	News releases
Hakia	7,891	News releases
Vivisimo	7,324	News releases
X1	5,589	Web site
dtSearch	5,208	Free downloads
Connotate	5,025	News releases and analyst support
Recommind	3,964	News releases aimed at trade publications
Endeca	3,900	Analyst support
Funnelback	2,640	News releases and conference participation
X1 Discovery	2,575	Web site
Coveo	2,034	Traditional PR
Dieselpoint	1,880	Web site
MarkLogic	1,088	Full court press method
Attensity	974	Conferences and news releases
Attivio	973	White papers
Brainware	857	News releases
Lexalytics	710	News releases
Easy Ask	472	News releases
Content Analyst	321	Some conference participation
BA Insight	0	Microsoft and webinars
Concept Searching	0	Webinars
Linguamatics	0	Web site
SearchBlox	0	Web site
Sinequa	0	Web site
ZyLAB	0	Full court press method

Several observations:

There is little correlation between having a Web site and traffic. If you build it, no one may come. Search engine optimization experts are not able to deliver the chunky granola bar stuffed with sales leads, I surmise.
Traffic, even for the leaders Exalead (Dassault Systèmes) and Autonomy (Hewlett Packard), is modest when compared to the traffic to the main Dassault and HP Web sites. Dassault’s traffic is reported as 56,164 and HP’s, 13,856,775. My thought is that search and content processing is not the home run some folks assume it will be. I don’t think effective search is a commodity. I think search is not as hot as other fields; e.g., analytics.
The different marketing methods in use work in some cases and in others not at all. Good examples are those companies with Web site traffic so low that Compete.com reports zero traffic. The little known Exorbyte generates an alleged 11,486 while the marketing calisthenics of Webinar centric marketing (BA Insight and Concept Searching) and the open source approach (SearchBlox) yield little traffic. What happens when one marketing method doesn’t perform? Most vendors just try something else. When something works, vendors just keep doing it until it no longer works. Rinse, repeat.
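
To put the parent company comparison in numbers, here is a back-of-the-envelope sketch. The figures are the Compete.com numbers quoted in this post. The `share` helper is my own illustration, not anything Compete provides.

```python
# Compete.com figures for February 2012, as quoted in this post.
vendor_visits = {"Exalead": 20494, "Autonomy": 12597}
parent_visits = {"Exalead": ("Dassault", 56164), "Autonomy": ("HP", 13856775)}


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


for vendor, visits in vendor_visits.items():
    parent, total = parent_visits[vendor]
    print(f"{vendor}: {share(visits, total):.2f}% of {parent} traffic")
```

By this arithmetic Exalead’s traffic is a bit more than a third of Dassault’s, while Autonomy’s is under a tenth of one percent of HP’s. The two “leaders” sit in very different positions relative to their owners.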

More work needs to be done to figure out what generates traffic to a group of companies which appear to have modest traction even with the backing of major companies (e.g., Endeca is owned by Oracle). Some of these companies have expensive public relations programs in place. I don’t know how much firms such as MarkLogic and Recommind spend on the flow of news releases, special events, and bylined articles in trade publications, but the traffic seems to be an issue.

My thought is that most of the vendors in the search and content processing space face what I would characterize as a “crisis” in marketing. None of the activities produce blockbuster traffic. Obviously, if a Web site has a single unique visitor and that visitor places a $20 million order, the Web site worked. My hunch is that some of these companies are going to kick into what I call “desperation marketing” mode. I see this when I get mindless faux “news” announcements from PR firms stuffed with failed middle school teachers, unemployed sociology graduate students, and “real” journalists whose magazine or newspaper nuked itself.

“Desperation marketing” is the outcome of watching costs for each sales contact go up without a comparable increase in the close rate. If sales come only when there is cost cutting to win the job, then the financial situation becomes increasingly charged. Desperation marketing leads to rich search engine optimization consultants and white paper work for the “pay to play” consultants. (Yes, this is the coloring agent for the azure chip consulting herd.) Have you heard of webinar fatigue? I have it. How many SharePoint webinars do I need to sit in on to know that SharePoint is a complex and often irritable beastie? Webinar fatigue is what causes a potential attendee to sign up, listen for five minutes, and then move on to checking a Facebook page. Some companies would do better to put the thing on video and go with a YouTube channel with a “register to win a bagel” inducement on a special landing page.

Search and content processing vendors need more than hit-and-miss marketing activities. If I were responsible for a search and content processing company, I would be looking for different ways to generate visibility and sales leads.

A Web site alone won’t do the job. Traditional PR seems to work in some cases and does not work in others. The variable, of course, is the talent of the PR professional and the value proposition the company sets forth. Get the PR wrong or the message wrong, and PR becomes a hit and miss investment. SEO appeared to me to be part of each of these companies’ Web presence. I also have a tally of which of these companies make use of blogs and social media like Twitter, which I may summarize in another blog post. But the data suggest to me that social media is no panacea either. Traditional marketing and PR are expensive, unreliable, and an okay reaction to competitors’ actions.

Is there another path? We think there are some new methods. One interesting one is the Augmentext service. It is worth a quick look.

In the meantime, if you are trying to close deals using a Web site and some old fashioned methods, you may find yourself under increasing pressure. Replacing company presidents, hiring a Mad Ave agency, or signing on with a slick self appointed expert—these are standard methods. The issue becomes making them work in a tough economic and competitive environment.

Stephen E Arnold, April 4, 2012

Sponsored by Pandia.com

Written by Stephen E. Arnold · Filed Under Business strategy, Marketing, News, Search | Comments Off on Search Vendors and Web Traffic

Altova Noses into XML Semantics

March 27, 2012

IT Jungle’s Alex Woodie recently announced some good news for IBM DB2/400 fans in the article “Altova Adds Support for DB2/400 Logical Files in MissionKit.”

According to the article, the latest release of MissionKit, called 2012r2, adds support for DB2/400 logical files to the XMLSpy, MapForce, UModel, DatabaseSpy, and DiffDog products, which already supported DB2/400.

Woodie writes:

MissionKit includes eight handy utilities that allow IT professionals to accomplish a range of XML, data, and unified modeling language (UML)-related tasks. Anchoring the kit is its popular XML editor, called XMLSpy. MapForce, meanwhile, provides data conversion and related capabilities, UModel allows developers to visually design their application flows in UML, while DatabaseSpy allows users to design, query, and compare multiple databases. Rounding out the suite are StyleVision, DiffDog, SchemaAgent, and SemanticWorks.

These new features are bound to attract IBM i customers because of MissionKit’s powerful data manipulation tools. For more information and free trial downloads, check out www.altova.com. Since I am no longer receiving spam from MarkLogic and AtomicPR, I am not sure how that XML centric company is responding.

Jasmine Ashton, March 27, 2012

Written by Stephen E. Arnold · Filed Under News, Semantic, XML | Comments Off on Altova Noses into XML Semantics

Intellisophic: Formerly Indraweb

January 26, 2012

Founded in 1999 as Indraweb and changing its name in 2005, Intellisophic, Inc., is a privately-funded technology company that is the world’s largest provider of taxonomic content. Its technology, originating from the work of founders Henry Kon, PhD, George Burch, and Michael Hoey, is based on the premise that concepts within unstructured information can be systematically derived by leveraging the trusted taxonomies of the reference book community. Building on this core idea, Intellisophic developed and patented the Orthogonal Corpus Indexing algorithm for extracting and using taxonomies from reference and education books.

During a stint as principal investigator for MIT’s Context Interchange, CTO Kon researched and implemented methodologies for enterprise integration of structured and semi-structured data over independently managed and disparate schema databases. He researched, designed, and prototyped integration engines for distributed multi-database query and caching over heterogeneous, distributed, and partially connected databases. As a member of MIT’s Composite Information Systems Laboratory, Kon published on multi-database integration engines and the use of ontology for bridging database schema. With Intellisophic, he has pioneered innovation in the conceptual management of unstructured information and in the integration of structured, semi-structured and unstructured content.

Intellisophic content is machine-developed, leveraging knowledge from respected reference works. The taxonomies are unbounded by subject coverage and are cost-effective to create. The taxonomy library covers several million topic areas defined by hundreds of millions of terms. In addition to taxonomic content, the company offers intelligent solutions, such as enterprise search and retrieval, business intelligence, categorization and classification, compliance management, portal infrastructure, social networking, content and knowledge management, electronic discovery, data warehousing, and government intelligence.

Its strategic alliance partners include Mark Logic, DataLever, SchemaLogic, DFI International, and Mosaic, Inc. Competitors include Sandpiper, Intellidimension, and HighFleet. The depth and breadth of Intellisophic’s taxonomies, along with its support of the leading text mining, search, and categorization applications, make it a good solution for many industries. (I would not include Concept Searching or Ontoprise in this short list due to exogenous complexity factors.)

Stephen E Arnold, January 26, 2012

Written by Stephen E. Arnold · Filed Under Indexing, News, Profile, Text processing | Comments Off on Intellisophic: Formerly Indraweb

Amazon: Will DynamoDB Electrocute the Big Boys?

January 18, 2012

I want to capture a few business related observations about Amazon’s now public DynamoDB. The blog post by Amazon’s chief technical officer provides a good overview of the home grown NoSQL data management service. Navigate to “Amazon DynamoDB–A Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications.” For a run down of some of the features, point your browser at “Notes About Amazon DynamoDB.” The basic idea is that Amazon has created its own NoSQL database, matched it to the Amazon cloud environment, and packaged it with taxi meter pricing.

Why didn’t Amazon use Hadoop or some other NoSQL, open source, Codd free system? My hunch is that Amazon sees big money in a ready-to-roll, automatic sharding, solid state disk based data management solution. Rolling its own solution gives Amazon control. In fact, Amazon is cranking up the dial on its Controlometer.

The issue that interests me is the business angle of the DynamoDB. Here are several preliminary thoughts.

First, Amazon is getting frisky but slowly. My sources report that work on the DynamoDB system began several years ago. Microsoft got wind of the project and was unable to respond. Right now, Amazon’s an engineering magnet, attracting talent from outfits once considered the best in the soggy city. With higher quality engineering horsepower, the dowdy retailer is shifting from a horse and wagon to a far more capable vehicle.

Second, MarkLogic had the idea that it could impinge on Oracle. Well, we know how that turned out with AtomicPR (the content fallout kids), management change, and wild and crazy marketing. Now Amazon is on the path to make life tough for Oracle. Amazon had Oracle as a steady date, but senior year is coming. Amazon may be marrying the DynamoDB, leaving Oracle without a homecoming date. If Amazon pulls off this new hitch up, Amazon may be ready to go for the enterprise gold. I think this is better than a 50-50 deal but I may change my mind.

Third, Amazon has demonstrated the value of a “Google Legacy.” Google plunged forward, diffused its resources, and ended up with its lovely self snared in legal and social thorns. Amazon, on the other hand, has avoided some of the traps into which Google threw itself. In the process, Amazon used Android to move its branded hardware forward. There is nothing like a friend who plans on evicting you from your home. Amazon is, once again, going beyond Google.

I have a number of other thoughts, but the goose’s liver needs a rest. Oh, oh, here comes a scowling nurse. Will she rescue the electrocuted big boys of database? I doubt it.

Stephen E Arnold, January 18, 2012

Written by Stephen E. Arnold · Filed Under Database, Management, News, Technology | 2 Comments

SAP: Lemons from Lemonade for Search Vendors

January 18, 2012

A couple of years ago I did a series of columns about SAP, the German software company which is imbued with the DNA of IBM and the more unpredictable genes of the “let ‘er rip” approach to generating revenues. Change is difficult, and SAP interests me because the firm’s machinations are the embodiment of the dislocations that old style software vendors face in the cloudy world of Amazon, Google, and even old Big Blue herself, IBM. Keep in mind, one of SAP’s strategic moves was to purchase Sybase.

HANA emerged two years ago as a solution to the woes of organizations struggling with big data, the need to make sense of them, and the complexity which threatens to sink traditional enterprise applications. Consider SAP itself. The company owns Business Objects, once the leader in business analytics. Today I don’t think of Business Objects, which may say more about my awareness than SAP’s marketing. But I hear zero about Inxight Software which performs entity extraction and other text operations and I have heard little or nothing about TREX, SAP’s information retrieval system. I lost track of the SAP investment in Endeca long before SAP’s rival Oracle snagged the 1998 technology to “enhance” its own struggling search solutions.

What is HANA?

According to an SAP friendly blog, SAP describes HANA in this way:

“HANA is the foundation and the core of all that we do now and going forward for existing products, new products and entirely new frontiers. We are transforming enterprise software with HANA, and we are transforming our entire product portfolio,” Sikka said in a statement earlier this week announcing that SAP HANA is now generally available worldwide. “But HANA is more than a product,” Sikka continued. “It is a new paradigm, an entirely new way to build applications. It is the basis for our own intellectual renewal internally at SAP—where we rethink how we design, build, deploy, service and sell products—and the basis for our customers’ and partners’ intellectual renewal—where we help customers rethink existing business problems and help them solve entirely new challenges using design-thinking.” (Source: The Top 10 Reasons SAP HANA Is Disrupting Larry Ellison’s Grand Plans)

To me, HANA is a next generation database and it now has to differentiate itself from the XML next generation database from the likes of MarkLogic, from Cloudera, from other NoSQL solutions, and from the new and improved versions of data management systems from IBM, Microsoft, and even Amazon. Big job. Maybe an impossible job?

In December 2011, I snipped the write up “Can SAP be the #2 database vendor by 2015?” I found this passage particularly interesting:

Why doesn’t SAP HANA have deeper market penetration? Put simply it is because SAP wanted it this way. Whilst HANA truly is a general-purpose database, SAP first announced it as an analytics appliance for the 1.0 release. They also priced it really high and didn’t offer a discount – list pricing can be as high as €180,000 for a 64GB HANA “unit”, depending on which version you require. And what’s more, SAP sells solutions and HANA is a platform, so the global sales force doesn’t quite know how to sell it in volume – yet. They didn’t want to sell it in volume in any case because they wanted to introduce it slowly to market – building stability, references along the way and avoiding expensive and embarrassing global escalations. So by the end of 2011 we should expect $100-150m of HANA sales, which is 3-5% of SAP’s total revenue. Not particularly significant, right? Well in September they released HANA as being supported for SAP’s Business Warehouse software, which allows large-scale data warehouses. And this is where it gets interesting: there are 17,000 existing BW customers, and HANA would provide business benefit to all of them.

If you are interested in HANA, you can access SAP’s primer about the solution at this link.

In the midst of the HANA hype, Seeking Alpha’s “SAP Is No Longer The Leader It Once Was” stated in December 2011:

The current most promising innovation is SAP HANA, an appliance with columnar in-memory technology enabling fast processing and near real-time analytics. According to SAP, HANA has the potential to become the next-generation system architecture, removing the use of middleware and relational databases. However, the root causes of the downturn appear outside the perimeter of the company transformations: product development, continuous customer complaints, and the 20-year aging ERP that represents the core of the customer base seem to remain unchanged. Agile is probably not enough to address the long-term issues of product development. Most likely, Agile is not the solution to fifteen years of trying to get CRM right, or to making three platform mistakes in three on-demand initiatives (CRM on-demand in 2006, Business byDesign in 2007, and SaaS Enterprise in 2009).

The Seeking Alpha analysis then makes these machine gun like statements:

Is SAP getting it right? Here is a summary of the points to keep in mind to answer this question:

SAP R&D has yet to deliver its first truly successful product since 1992 (it could be HANA over time)

The core of ERP that holds the customer base is outdated

There seem to be no plans to develop a modern replacement product

Development of a potential new ERP would take years

Sales have declined stepping back by 3 to 4.5 years

SAP’s leadership is questionable

According to Gartner, the revenue from relicensing R/3 to ERP 6.0 is ending

Customers and employees have lost trust

Executives have been leaving

On-demand is not making progress

The customer base is increasingly at risk

Analysts estimate that HANA could produce just 10% of the revenue by 2013.

There is a gap between the buzz and the hard facts.

What does this mean for vendors who hitch their wagons to the SAP “star” as ISYS Search Software did with the announcement “ISYS Wins Software Deal with SAP”? Three points:

Search vendors are looking at their technology and packaging it in ways to generate incremental revenue. ISYS, it appears, is in the connector game, competing with firms such as EntropySoft
SAP seems to be lagging further and further behind the NoSQL players who are now facing headwinds despite early market leads. My example is MarkLogic, the XML database outfit
The broader market seems to be splitting into quite different segments. SAP is going to have difficulty in the IBM and Oracle space, and it is going to have trouble with the open source NoSQL crowd, which seems to prefer having Hadoop rather than HANA on its T shirts.

SAP remains interesting, but it is now in some danger of further marginalization. SAP needs a search system still.

Stephen E Arnold, January 18, 2012

Written by Stephen E. Arnold · Filed Under Business strategy, Connectors, Database, News, Search | 2 Comments

Article Marketing Confused with Article Spinning

December 30, 2011

We receive quite a few missives from hot, maybe radioactive, public relations outfits. A good example is AtomicPR, the MarkLogic information output service. I have a tough time figuring out what is editorial opinion, such as the information I generate when a topic like azure chip consultants or LinkedIn enterprise search blather comes up. I also never know when a Forbes or Bloomberg news story is a recycled news release. Marketwatch also baffles me. I am deeply suspicious of any information from Marketwatch which is displayed with copious amounts of green, which is supposed to suggest money to me, I think. I skip the public relations nuclear waste, the company sponsored blogs which provide me with tips to cope with eDiscovery as ZyLAB is doing, and sponsored blogs like our own HighGainBlog.com operation. (Oh, we will announce a new sponsor in January 2012, and we will deliver useful, curated information too.) I find company blogs endlessly amusing. Google operates more than four score blogs, each outputting “content.” Now the SEO crowd has figured out “content.” Hooray.

Writing for Search Engine Journal, Suzanne Edwards puts her spin on article marketing in “Eight Good Reasons Why Spinning Articles is Bad for your Website.” The writer, who also writes for Cash for Gold, of all places, makes some sweeping generalizations. There is a wide range between summarizing and pointing the way to helpful links and using “spinning” software. The article describes these abominable applications:

Tons of article spinning software have flooded the Internet. Needless to say, article marketing has become an efficient way of building hundreds if not thousands of backlinks. However, automatic articles spinning with the use of a spinning software is deemed as a black hat SEO technique that can seriously hurt a website’s search rankings and page rank.

While we agree with her point on robo-writers, she paints with too broad a brush. Is reason number nine financial advisors’ ability to discern truth and accuracy? Financial services firms make SEO firms look like the Vatican’s College of Cardinals on a Bible study weekend.

Super. This goose will immediately snap to content ideas from a financial advisor. It’s okay. We trust financial advisors like Bear Stearns and Lehman Brothers, right? We believe everything we read on the Internet even when the content is delivered by predictive methods developed by dear old Google and Microsoft.

Cynthia Murrell, December 30, 2011

Written by Stephen E. Arnold · Filed Under Marketing, News, Publishing | Comments Off on Article Marketing Confused with Article Spinning

Open Text Social Framework

November 21, 2011

The dips and glides of the enterprise and content processing sectors fascinate me. I noticed that Open Text, based in Waterloo, Ontario, is on track to remain a $1.0 billion company. As I write this, the company’s stock is nosing toward $60 a share. With Hewlett Packard’s acquisition of Autonomy, Open Text inherits the title of a “billion dollar search and content processing company.”

In the 1990s, I tracked Open Text. As the company evolved into a collection of properties, I shifted to companies which were sticking closer to the “findability” sector. As you probably know, the core of Open Text today sits upon technology which I associate with Dr. Tim Bray. Dr. Bray worked at Digital Equipment and worked at the University of Waterloo on the New Oxford English Dictionary project. He founded Open Text Corporation, which commercialized an XML search system which I believe was used in the dictionary. Open Text created a Web index which was available as the Open Text Index and then morphed into “Tuxedo,” a Web index no longer available at the link I had on the Open Text Web site. Web search is an expensive proposition, and I understand why a company like Open Text would exit the free Web search service business.

Today’s Open Text owns the SGML search technology, and the company has acquired a number of other search and content processing systems. My view is that Open Text perceived search as a good business in which to compete. With the ready availability of open source search solutions and low cost “good enough” systems, I wonder if the company’s enthusiasm for search and retrieval has dwindled.

Open Text has a number of search technologies. For example, Open Text acquired Information Dimensions in 1998. Information Dimensions’ BASIS search system was a database management system. My colleague Howard Flank and I used BASIS to build the original Bellcore MARS billing system on the platform shortly after the AT&T breakup was announced. Open Text also acquired Fulcrum, a Microsoft centric search and retrieval system based in Ottawa, Ontario. I remember that one could use Fulcrum to search Siebel Systems content. Hummingbird was acquired by Open Text in 2006. Open Text used the Fulcrum technology in its Hummingbird Search Server product, now a connectivity solution. Open Text also acquired BRS Search (Bibliographic Retrieval Services) in 2001. As you know, BRS was a competitor to Dialog Information Services. BRS was a variant of IBM STAIRS technology, ran on IBM mainframe systems, and could handle sophisticated queries. I recall hearing that BRS technology was used in the Open Text LiveLink product. I think of LiveLink as an early version of SharePoint, blending content, collaboration, and search in a single system.

In 2010, Open Text purchased the Nstein content processing firm, which was based in Montréal, Québec. I think one of my team contacted Nstein to profile them for one of my reports. The firm was too busy. Then in 2009, an Nstein executive scheduled an appointment with me in London, UK, and “forgot” the meeting. Nifty.

Open Text has a basket of technologies to use to solve prospect and client problems. Is the company a model for other search and content processing firms trying to generate top line growth in a tough economic setting?

Since Dr. Bray’s departure, Open Text has been rolling up search and content processing firms. Much of the company’s growth has been fueled by acquisitions and cross selling, not raw innovation. In fact, Open Text has a bewildering array of content management technologies, including PS Software (records management), Gauss (Web content management systems), RedDot (Web content management systems with an embedded Autonomy search functionality), IXOS AG (SAP-centric archiving systems), Captaris (document capture systems which gave Open Text Brainware and ZyLAB functionality), Spicer (file viewing technology), Vizible (an interface company), StreamServe (an enterprise publishing system vendor of direct mail and other collateral), Metastorm (business process software), weComm (mobile device software developer), and Global 360 Holding Corp. (case management solutions).

Written by Stephen E. Arnold · Filed Under Business strategy, Feature, Financial, Search, Technology, Text processing | 1 Comment

ISYS Has 16,000 Customers. Did I Goof?

November 11, 2011

I covered six vendors of enterprise search systems in my June 2011 The New Landscape of Enterprise Search. An azure chip consulting firm borrowed a key word from my monograph’s title and put out a report covering twice as many vendors.

Today I read “16,000 Organizations Worldwide Now Boost Their Productivity with the ISYS 1-Click FileFinder.” In a write up about AtomicPR’s spam attack on me and the MarkLogic “reinvention of itself as more than a file markup and repository outfit,” I mentioned ISYS Search Software was licensing its connectors, essentially software widgets that allow one system to ingest the files from an incompatible system. So ISYS, ISYS, ISYS.

Years ago I met the founder of ISYS Search Software in Crow’s Nest, a suburb of Sydney, Australia. I recall a very interesting lunch in a restaurant that was almost next to the ISYS headquarters. Very interesting those Australian engineers. At the time, I was doing something for some outfit sponsoring the international chief of police conference or some similar intelligence-type event. I was one of the speakers and a guest of the Australian government. In my spare time, I was either watching folks shoot red kangaroos or visiting search and information retrieval experts. After the visit, I did some work for Ian Davies, the founder. His role has changed, and I have lost track of him, his senior sales professional, and the senior engineer whom I met that day. Distance and time I suppose.

I have drifted away from ISYS because I learned that the company, despite a new president, new lines of business like licensing connectors, and new file finding utilities, was not hitting my radar with the sort of information I am now tracking. No problem, of course. Quite a few search vendors have changed their spots or at least their marketing pitch faster than a rap star who signs a movie deal. Examples range from Coveo becoming a customer support solution provider to Vivisimo’s puzzling “information optimization.” Other vendors have gone quiet like Dieselpoint, an XML centric search system vendor. Others have found themselves on the receiving end of a dump truck filled with cash. Think InQuira, Autonomy, Endeca, and RightNow to name four vendors who are now happily within giant corporate shells thinking about which island to buy.

My understanding is that ISYS generates about one third of its revenue from the US and the balance from elsewhere. Although the UK is a good market for ISYS, the company’s stronghold is Australia. This raises what I call “the Canadian question.” Ah, you ask, “What’s Canada got to do with Australia and ISYS?”

Here’s my point. When determining how much revenue one of my ventures can generate in Canada, I take the US revenue and then figure that Canada will produce 10 percent of that amount. The reason has to do with population, appetite for the sort of products my team produces, and experience. The 10 percent can be five percent, or it could be 15 percent. However, 10 percent is a good rule of thumb.

Therefore, if a company in Australia generates $10 million a year in that country of 23 million people, then it follows that the US with its population of 308 million should produce revenue of about 12 to 13 times the Australian revenue. If we assume that ISYS is generating $10 million from the land down under, I would expect $120 million from the land up above.
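The arithmetic behind this rule of thumb can be sketched in a few lines. This is only a back-of-the-envelope illustration: it assumes revenue scales linearly with population, and the function name `scaled_revenue` is made up for the example. The $10 million Australian figure is the hypothetical from the paragraph above, not reported ISYS revenue.

```python
def scaled_revenue(home_revenue, home_population, target_population):
    """Scale revenue by the ratio of populations (a rough rule of thumb,
    assuming the two markets have a similar appetite for the product)."""
    return home_revenue * target_population / home_population


# Figures taken from the text: a hypothetical $10 million in Australia,
# 23 million Australians, 308 million Americans.
australia_revenue = 10_000_000
australia_population = 23_000_000
us_population = 308_000_000

ratio = us_population / australia_population
expected_us_revenue = scaled_revenue(
    australia_revenue, australia_population, us_population
)

print(f"Population ratio: {ratio:.1f}x")
print(f"Expected US revenue: ${expected_us_revenue / 1e6:.0f} million")
```

The straight ratio works out to a bit over 13, so the $120 million figure sits at the conservative end of what the rule of thumb would predict.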

I may be off base, but in our research for The New Landscape of Enterprise Search, I did not find data to support that ISYS was generating revenue in this range. Therefore, I decided to exclude the company from my monograph.

The azure chip consulting firm replete with home economics majors, a handful of former journalists, and a couple of failed webmasters sees the world differently. I think the reason is that the azure chip outfit uses its reports as sales collateral. I don’t have any first hand experience with the “real” consultants in enterprise search, but after reading some of these reports, I formed my own opinion. Yours may differ.

To answer the question, “Did I goof by not including ISYS alongside Autonomy, Endeca, Exalead, Google, Microsoft, and Vivisimo?” my answer is, “I don’t think so.”

Hoping a vendor is competing with the likes of Autonomy, Endeca, Exalead, etc. is one thing. Actually beating these firms in major accounts is a different one. Just my opinion, and I look forward to the push back from the “experts” who know more than I, aggrieved company executives who want me to revisit my conclusions about which companies are altering the landscape of search, and the “real” consultants who will swarm over my view point.

Have at it kids. Sales revenues matter. When someone plops down $1.2 billion as Microsoft did for the Fast Search & Technology system or the interesting $10 billion for Autonomy, I will make another pass over the “big six.” Until then, I need to hear first hand about how non US firms cope with my Canadian rule of thumb. I quite like the ISYS technology. But for Landscape, revenues play more of a role than technology.

Stephen E Arnold, November 11, 2011

Sponsored by Pandia.com

Written by Stephen E. Arnold · Filed Under Connectors, Enterprise, Enterprise search, News, Search, Technology, Text processing, Tools | Comments Off on ISYS Has 16,000 Customers. Did I Goof?

Will a Silver Bullet Save Sci-Tech Publishers?

November 11, 2011

I poked around my Overflight service and noticed a recent news release with the meaty title “Scientific Publisher Saving Hundreds of Thousands of Dollars with MarkLogic.” The subtitle was compelling as well: “New Mobile Applications Let Researchers Study in the Field.”

I thought a moment about the logic of the two statements. I am okay with the idea that a scientific publisher faces some significant challenges. The traditional markets for scientific and technical information in traditional journal form are under severe budget pressure. In response to some scientific publishers’ pricing policies, libraries and some not for profit outfits no longer renew certain journal subscriptions. Others have joined consortia in order to get better value for available budgets.

But STM (scientific, technical, and medical) publications have other issues with which to cope as well. First, technology may not be a core competency. Why would it be? Publishers get authors to write. Publishers package and sell. Technology is talked about but even giants like Thomson Reuters buy print publishing companies in Argentina. So much for embracing the digital revolution. Even more interesting is that some STM publishers ask authors to pay for journal typesetting, correction, and maybe some production costs. As headcount comes under pressure in research institutes and universities, some scientific publishers are finding that authors are either not willing to pay or not able to get a third party to pony up the money. In short, STM in the traditional mode is fighting for oxygen.

The mobile angle baffled me as well.

In my experience, many scientists work in what might be called “controlled environments.” In the pharmaceutical sector, certain firms operate their research facilities the way South African gold mine superintendents monitor workers at the end of a shift. If this type of security does not resonate with you, you need to do some backfilling on gold and diamond mining security protocols. Think naked. Think weighing workers before and after a shift. Think requiring showers and filtering the gray water. You get the idea. Other types of research do require mobile devices; for example, cleaning up a gone-wrong nuclear reactor, which is not a job for an outfit like AtomicPR, in my experience. Public relations “experts” write about radiation and often have limited experience with micro-contamination and chemical decontamination. The point? Mobile often has specific requirements which stretch beyond creating an “app for that.”

In a nutshell, here’s the nub of the news release from my point of view:

Taking research into the field has a new, literal meaning with the launch of new mobile applications built on MarkLogic that are helping scientists better understand soil and crops. MarkLogic Corporation, the company empowering organizations to make high stakes decisions on Big Data in real time, today announced the American Society of Agronomy (ASA) launched Science Pubs, developed for iPad, iPhone, Android, and BlackBerry devices. Science Pubs utilizes MarkLogic to give subscribers and non-subscribers the freedom to dig deep into ASA’s journals, magazines, and eBooks while conducting first-hand research and observations in the field.

The point is that a markup language makes it possible to do an app. Puzzled, I plunged forward:

“MarkLogic will save us at least $150,000 per year. That is a lot of money for any publisher, especially a non-profit like the American Society of Agronomy,” said Ian Popkewitz, director, Information Technology & Operations, American Society of Agronomy. “We originally implemented MarkLogic to cut the cost of providing critical publications to our subscribers, but we quickly realized several intangible benefits such as speed, ease of use, and flexibility. The flexibility allowed us to focus on the deployment of Science Pubs. ASA is very pleased to be able to quickly launch these services for subscribers and non-subscribers, and we expect them to generate revenue.”

I understand. However, I want to offer several observations based on my modest experience in publishing. Note I did work for a newspaper that was once one of the Top 25 in the world, but the paper is a starved dog now. I also worked for Bill Ziff, mastermind of multiple empires and the magnate other New York publishers loved to loathe, which is what I learned when I was escorted from the New York Times’s president’s office when he learned I worked for the interesting Mr. Ziff.

First, publishers absolutely have to reduce their costs and in a big way. Saving $150,000 is great, but my question is, “How much does it cost to implement a cost saving system such as a MarkLogic or JSON solution (the fat free alternative to chubby XML) and keep it up and running at a scientific publisher such as the American Society of Agronomy?” If a system costs $50,000, $100,000, or even $300,000, the publisher has to pay off the system, its maintenance fee, and whip out some products that sell. With revenues at many scientific publishers flat lining or shriveling, the savings are important and may light a fire under the agronomists to cope with a big expense in the name of cost savings. That type of race can be brutal. And it is one that I would be reluctant to enter.
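To make the payback question concrete, here is a rough sketch of the break-even arithmetic. It uses the $150,000 annual savings quoted in the news release and the three system prices floated above. The function name `payback_years` and the `annual_maintenance` parameter are assumptions for illustration; the release gives no maintenance figure, so the default is zero.

```python
def payback_years(system_cost, annual_savings, annual_maintenance=0.0):
    """Years needed for net savings to cover the up-front system cost.

    annual_maintenance is a hypothetical recurring fee; the source does
    not state one, so it defaults to zero.
    """
    net_savings = annual_savings - annual_maintenance
    if net_savings <= 0:
        # The system never pays for itself if upkeep eats all the savings.
        return float("inf")
    return system_cost / net_savings


ANNUAL_SAVINGS = 150_000  # figure quoted in the news release

# The three system prices are the hypotheticals from the paragraph above.
for cost in (50_000, 100_000, 300_000):
    years = payback_years(cost, ANNUAL_SAVINGS)
    print(f"${cost:,} system: {years:.1f} years to break even")
```

With no maintenance fee, even the $300,000 case breaks even in two years. Any recurring fee stretches that out, and a fee equal to the savings means the system never pays for itself.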

Second, many not for profit organizations and “charities” in the UK are facing declining memberships. Unthinkable five years ago, professional organizations have to market to their members and then spend money to collect on slow paying professionals. Even the certification angle in the UK is not working as it once did. Unemployment among professionals is making it difficult for some experts to pay to be in a must-have organization. Faced with rising costs across the board and decreasing or flat revenue, some not for profit outfits are looking at a nuclear winter, not AtomicPR with a very short half life.

Third, the notion that scientific research has to be peer reviewed in a lengthy, antiquated manner is being questioned. Also, the long publication cycles for some STM journals are out of step with the real time culture in fast moving fields. Not surprisingly, the no-cost or low-cost alternatives to traditional journal publishing refuse to go away. In some fields like mathematics and physics, blogs and even social media have become the important channels for dissemination of technical information and making or breaking careers. Even grants can be determined by a Facebook-type of presence. Quite a shift.

My take on this “news story” is that it makes a possibly compelling case that an XML repository can help reduce certain costs. But without the context of total cost burdens, I have a question, “Why not use JSON?” XML is darned useful, but so is JSON. My concern is whether JSON is a viable option for many scientific, technical, and medical publishers.

The ArnoldIT team is finishing a report about the outlook for a major publishing company. With more than $5 billion in revenues, this well known firm may be forced to sell its STM business to generate cash. Not even cost cutting can prevent the dislocations that some publishing companies face. The digital revolution has arrived and is now moving in new directions. Many traditional publishers face stark choices and very difficult financial challenges. Alas, no silver bullets today in my opinion.

Stephen E Arnold, November 11, 2011

Sponsored by Pandia.com

Written by Stephen E. Arnold · Filed Under News, Publishing, Technology, Text processing, Tools, Work flow, XML | Comments Off on Will a Silver Bullet Save Sci-Tech Publishers?


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.

