Geospatial Intelligence: Autonomy and SharePoint

March 16, 2012

I must admit I don’t associate Hewlett Packard Autonomy with Microsoft. I know I should. Autonomy technology has been adding functionality to Microsoft SharePoint for years. I was reminded of Autonomy’s ability to “play well with others” when I read “Information Discovery Improves Search Capability for the Largest Database of Geospatial Intelligence.” If you are not involved in intelligence activities, you may not know what “geospatial intelligence” embraces. If you don’t know, I am not going to explain it to you.

The write up makes three points.

First, the use case described in the document performs what I call data fusion. For the azure chip crowd and the self appointed search experts, you can probably figure out that Autonomy technology is facilitating the integration of images, data, and other information. Without Autonomy, the merged outputs would not be possible.

Second, the use case makes clear that search is an essential component of information discovery. Everyone wants the outputs to tell the user what she or he needs to know. Won’t work. So outputs lead to search and search leads to more outputs. The use case explains that text and source data have to be “augmented”; specifically, entity extraction, categorization, geo-tagging, and reverse geo-tagging.

Third, the system handles open source and secure content in compliance with a Department of Defense metadata specification. If you like codes, here’s the one you need: DDMS 2.0.
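For readers who want something concrete, here is a minimal sketch of two of the augmentation steps named in the second point: geo-tagging and reverse geo-tagging. It is not Autonomy's method. The tiny gazetteer, the function names, and the nearest-place rule are all assumptions made for illustration.

```python
import math

# Hypothetical gazetteer: place name -> (latitude, longitude) in degrees.
GAZETTEER = {
    "Kabul": (34.5553, 69.2075),
    "Baghdad": (33.3152, 44.3661),
    "Louisville": (38.2527, -85.7585),
}


def geo_tag(text):
    """Geo-tagging: return (place, lat, lon) for each gazetteer name found in the text."""
    lowered = text.lower()
    return [
        (name, lat, lon)
        for name, (lat, lon) in GAZETTEER.items()
        if name.lower() in lowered
    ]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def reverse_geo_tag(lat, lon):
    """Reverse geo-tagging: return the gazetteer place nearest to a coordinate."""
    return min(GAZETTEER, key=lambda name: haversine_km(lat, lon, *GAZETTEER[name]))
```

Calling geo_tag on a sentence that mentions Kabul returns the Kabul entry with its coordinates, and reverse_geo_tag on a coordinate returns the closest named place. A production system would use a far larger gazetteer and would have to disambiguate place names; this sketch does neither.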

Net net: Autonomy has some interesting capabilities for outfits who use Microsoft SharePoint.

Stephen E Arnold, March 16, 2012

Sponsored by Pandia.com

PDF Search from Dieselpoint

March 14, 2012

We heard Dieselpoint offers a PDF search engine, so we decided to check it out. This company keeps a very low profile, but we find it is worth looking into.

Dieselpoint’s PDF Search is an enterprise product that can navigate large collections of PDFs, extracting both metadata and text for indexing. Metadata can be searched and used to build more sophisticated interfaces in conjunction with Dieselpoint’s Search platform.

Often, titles are left out of a document’s metadata, making searches more challenging; Dieselpoint has an innovative solution for that. The product overview states:

Quite often, authors of PDFs neglect to enter titles into the document’s metadata. This makes it difficult to display a good, descriptive title when a PDF appears on a search results page. Dieselpoint Search eliminates this problem by providing ‘Smart Titles’. The system analyzes each PDF looking for clues as what the title might be, and employs advanced heuristics to select one. Studies show that Dieselpoint’s algorithm selects a title which is the same as the one that a human would have selected over 90% of the time.

This tool also takes advantage of XMP data, which resides in an XML file embedded within a PDF file. This data can contain information on subjects such as authors, digital rights, categories, and keywords.
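To make the XMP point concrete, here is a rough sketch of reading Dublin Core fields from an XMP packet with Python's standard library. It is not Dieselpoint's code. The sample packet and the xmp_fields helper are made up for illustration, and pulling the packet out of the PDF file itself is not shown.

```python
import xml.etree.ElementTree as ET

# Standard RDF and Dublin Core namespaces used in XMP packets.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical XMP packet, as it might appear once extracted from a PDF.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:title><rdf:Alt><rdf:li xml:lang="x-default">Pump Catalog</rdf:li></rdf:Alt></dc:title>
   <dc:creator><rdf:Seq><rdf:li>J. Smith</rdf:li></rdf:Seq></dc:creator>
   <dc:subject><rdf:Bag><rdf:li>pumps</rdf:li><rdf:li>valves</rdf:li></rdf:Bag></dc:subject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def xmp_fields(xmp_xml):
    """Return title, creator, and subject values found in an XMP packet."""
    root = ET.fromstring(xmp_xml)
    fields = {}
    for field in ("title", "creator", "subject"):
        # Each Dublin Core property wraps its values in rdf:li elements.
        items = root.findall(f".//dc:{field}//rdf:li", NS)
        fields[field] = [item.text for item in items]
    return fields
```

Fields pulled out this way can feed the index as searchable metadata. When the title list comes back empty, a title heuristic of the sort the overview describes would have to fill the gap.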

Dieselpoint began developing the core indexing algorithms behind its search engine in 1999, and released version 1.0 the next year. Originally meant for use with engineered industrial goods, the product (and company) name reflects these origins.

Cynthia Murrell, March 14, 2012

Sponsored by Pandia.com

Are Search Vendors Embracing Desperation PR?

March 12, 2012

The addled goose is in recovery mode. I have been keeping my feathers calm and unruffled. I am maintaining a low profile. I have undertaken no travel for 2.5 months. I just float amidst the detritus of my pond filled with mine drainage run off. I don’t send spam. I don’t make sales calls. I don’t talk on the phone unless someone pays me. In short, I am out of gas, at the end of the trail, and ready for my goose to be cooked, but I want to express an opinion about desperation marketing as practiced by public relation professionals and PR firms’ search related clients.

A mine drainage pond. I stay here. I mind my business. I don’t spam unlike AtomicPR and Voce Communications type firms. In general, I bristle at desperation marketing, sales, and public relations.

Imagine my reaction when I get unsolicited email from a PR firm such as Porter Novelli. This Porter Novelli PR firm is my newest plight. Mercifully, AtomicPR has either removed me from its spam list or figured out that I sell time just like an attorney but write about PR spam with annoying regularity.

The Porter Novelli outfit owns something called Voce Communications. Voce thinks I am a “real” journalist. I have been called many things, but “journalist” is a recent and inaccurate appellation. The only problem is that I sell time. In my opinion, “real” journalists mostly look for jobs, pretend to be experts on almost anything in the dictionary, worry about getting fired, or browse the franchise ads looking for the next Taco Bell type opportunity.

Spamming me and then reprimanding me for charging for my time are characteristics of what I call desperation marketing. In the last six months, desperation marketing has become the latest accoutrement of the haute couture in PR saucisson. Desperation PR is either in vogue or a concomitant of what some describe as a “recovering economy.”

Here’s the scoop: I received an email wanting me to listen to some corporate search engine big wigs tell me about their latest and greatest software widget. The idea, even when I am paid to endure such pression de gonflage, is tiring. When someone wants me to participate in a webinar, the notion is downright crazy. I usually reply, “Go away.”

To the Voce “expert” I fired off an email pointing the spammer to my cv (www.arnoldit.com/sitemap.html) and the About page of this blog. Usually the former political science major or failed middle school teacher replies, “We saw that you were a journalist in a list of bloggers.” After this lame comment, the PR Sasha or Trent stops and moves on. No such luck with Voce’s laser minded desperation PR pro.

Here’s what I received:

[Bunny Rabbit] on behalf of EntropySoft.  We have not yet had an opportunity to work together formally yet, but I wanted to reach out to you to see if we can arrange a conversation for you with the executives at EntropySoft – a company that I know you are familiar with given your recent conversation with BA Insights (which is an EntropySoft partner that uses our connector technology) and I know that you have mentioned EntropySoft in past articles for KMWorld.

The advantage that EntropySoft brings to the market is that through the use of its connector technology, EntropySoft can help companies make sense of unstructured data (or as you describe it – tsunami!)  ensuring that IT teams can not only access or connect to just about everything worth connecting to in the KM universe, but that they can also act on it.  Each EntropySoft connection is bi-directional: teams can access and act on everything. So from a SharePoint standpoint, EntropySoft’s connectors can now connect to everything in SharePoint (we just made news on this last week at the SPTechConference in San Francisco, see press release below) enabling search through FAST for SharePoint or the SharePoint Search Server Portal engine, as well as any other enterprise content management systems.

Do you have some available time on your calendar in the next week to have a phone conversation with EntropySoft?  Please let me know what works for you.

Okay, I sure do know about EntropySoft. I have French clients. The two top chiens at EntropySoft know me and find me less than delectable. In the last year, the company has gunned its engine with additional financing, but for me, the outfit provides code widgets which hook one system to another. Useful stuff, but I am not going to flap my feathers in joy about this type of technology. Connectors are available from Oracle, open source, and outfits in Germany and India. Connector technology is important, but it is like many utility-centric technologies: out of sight and out of mind until the exception folder overflows. Then connectors get some attention. How often do you think about exporting an RTF 1.6 file from FrameMaker? Exactly. Connectors. There but not the belle of the ball at the technology prom.

A book promoted on the Voce Communications Web site. I was not offered a “free” copy. I bet the book is for sale, just like my time. What would happen if I called the author and asked for a free copy? Hmmm.


BPM and Big Data

February 8, 2012

Search and business process management: a shotgun marriage. The two can’t help but come together, as IT Business Edge reports in “Two Examples of BPM’s Role in Data Integration.” Writer Loraine Lawson writes that Talend intends to integrate BonitaSoft’s business process management solution with its Unified Platform.

Lawson is pretty sure Talend is the first data integration specialist vendor to venture into this area. She asked Talend’s VP of marketing, Yves de Montcheuil, how the use of BPM contributes to integration. The write up states:

Data governance. To use a master data hub as a system of record, you’ll need to load it from multiple sources, which will have conflicting data. The MDM and data quality tool will resolve many of these conflicts automatically through matching, but for more complicated conflicts, you’ll need a workflow. BPM can drive this workflow, sending the data to business users who can resolve the conflict by validating the correct data.

That does sound more efficient. Another way Talend expects BPM to help is to manage and automate data integration and data quality services.
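The data governance idea in the quote can be roughed out in a few lines. This sketch is neither Talend's nor BonitaSoft's implementation. The record layout, the normalize rule, and the review queue are assumptions; in a real deployment the queue would be a BPM workflow that hands tasks to business users.

```python
from collections import defaultdict


def normalize(value):
    """Crude matching rule (an assumption): collapse whitespace and ignore case."""
    return " ".join(value.split()).lower()


def consolidate(records, key="customer_id"):
    """Build master records; auto-resolve fields that match, queue the rest for review."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key]].append(record)

    golden, review_queue = {}, []
    for record_key, group in grouped.items():
        merged = {key: record_key}
        fields = {f for r in group for f in r if f not in (key, "source")}
        for field in sorted(fields):
            values = {normalize(r[field]) for r in group if r.get(field)}
            if len(values) == 1:
                # All sources agree after normalization: resolve automatically.
                merged[field] = values.pop()
            elif len(values) > 1:
                # Real conflict: hand it to a person via the workflow.
                review_queue.append(
                    {"key": record_key, "field": field, "candidates": sorted(values)}
                )
        golden[record_key] = merged
    return golden, review_queue
```

Feed it a CRM record and an ERP record for the same customer. A name that differs only in case or spacing merges on its own, while a city of Paris in one source and Lyon in the other lands in the review queue for a person to settle.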

Talend provides both open source and SaaS big data solutions to organizations around the world. BonitaSoft also offers both open source and paid solutions, but its realm is business process management. Best wishes to the happy couple!

Cynthia Murrell, February 8, 2012

Sponsored by Pandia.com

Wordmap Introduces Taxonomy Connectors

January 30, 2012

According to the Wordmap.com article “Wordmap Taxonomy Connectors for SharePoint and Endeca,” users will be able to use its new Taxonomy Connectors directly with Endeca. Endeca Taxonomy Connector users will have the ability to use Wordmap to handle “all of their daily management tasks.”

A few notable benefits of the Taxonomy connector are:

No configuration needed for consuming systems. It can manage the taxonomy centrally and push out only relevant sections for indexing, navigation and search and taxonomy is seamlessly integrated into the content lifecycle.

The Wordmap Taxonomy platform definitely seems to be a viable tool when it comes to managing Endeca systems and seems like a no brainer for those using the platform. However, a few questions do come to mind. If open source connectors enter the scene, will there still be a market for Wordmap connectors? And what if Oracle decides to become a little stingier with its system access policies?

Users could still find that the Wordmap Taxonomy Connectors hit the spot or they could find the platform too cumbersome and go elsewhere. Guess it depends on “Which way the wind blows.”

We have heard of a push to make open source connectors available. With some firms charging as much as $20,000 for a connector, lower cost options or open source connectors could have a significant impact on the content processing sector.

April Holmes, January 30, 2012

Sponsored by Pandia.com

SAP: Lemons from Lemonade for Search Vendors

January 18, 2012

A couple of years ago I did a series of columns about SAP, the German software company which is imbued with the DNA of IBM and the more unpredictable genes of the “let ‘er rip” approach to generating revenues. Change is difficult, and SAP interests me because the firm’s machinations are the embodiment of the dislocations that old style software vendors face in the cloudy world of Amazon, Google, and even old Big Blue herself, IBM. Keep in mind, one of SAP’s strategic moves was to purchase Sybase.

HANA emerged two years ago as a solution to the woes of organizations struggling with big data, the need to make sense of them, and the complexity which threatens to sink traditional enterprise applications. Consider SAP itself. The company owns Business Objects, once the leader in business analytics. Today I don’t think of Business Objects, which may say more about my awareness than SAP’s marketing. But I hear zero about Inxight Software which performs entity extraction and other text operations and I have heard little or nothing about TREX, SAP’s information retrieval system. I lost track of the SAP investment in Endeca long before SAP’s rival Oracle snagged the 1998 technology to “enhance” its own struggling search solutions.

What is HANA?

According to an SAP friendly blog, SAP describes HANA in this way:

“HANA is the foundation and the core of all that we do now and going forward for existing products, new products and entirely new frontiers. We are transforming enterprise software with HANA, and we are transforming our entire product portfolio,” Sikka said in a statement earlier this week announcing that SAP HANA is now generally available worldwide. “But HANA is more than a product,” Sikka continued. “It is a new paradigm, an entirely new way to build applications. It is the basis for our own intellectual renewal internally at SAP, where we rethink how we design, build, deploy, service and sell products, and the basis for our customers’ and partners’ intellectual renewal, where we help customers rethink existing business problems and help them solve entirely new challenges using design-thinking.” (Source: The Top 10 Reasons SAP HANA Is Disrupting Larry Ellison’s Grand Plans)

To me, HANA is a next generation database and it now has to differentiate itself from the XML next generation database from the likes of MarkLogic, from Cloudera, from other NoSQL solutions, and from the new and improved versions of data management systems from IBM, Microsoft, and even Amazon. Big job. Maybe an impossible job?

In December 2011, I snipped the write up “Can SAP be the #2 database vendor by 2015?” I found this passage particularly interesting:

Why doesn’t SAP HANA have deeper market penetration? Put simply it is because SAP wanted it this way. Whilst HANA truly is a general-purpose database, SAP first announced it as an analytics appliance for the 1.0 release. They also priced it really high and didn’t offer a discount – list pricing can be as high as €180,000 for a 64GB HANA “unit”, depending on which version you require. And what’s more, SAP sells solutions and HANA is a platform, so the global sales force doesn’t quite know how to sell it in volume – yet. They didn’t want to sell it in volume in any case because they wanted to introduce it slowly to market – building stability, references along the way and avoiding expensive and embarrassing global escalations. So by the end of 2011 we should expect $100-150m of HANA sales, which is 3-5% of SAP’s total revenue. Not particularly significant, right? Well in September they released HANA as being supported for SAP’s Business Warehouse software, which allows large-scale data warehouses. And this is where it gets interesting: there are 17,000 existing BW customers, and HANA would provide business benefit to all of them.

If you are interested in HANA, you can access SAP’s primer about the solution at this link.

In the midst of the HANA hype, Seeking Alpha’s “SAP Is No Longer The Leader It Once Was” stated in December 2011:

The current most promising innovation is SAP HANA, an appliance with columnar in-memory technology enabling fast processing and near real-time analytics. According to SAP, HANA has the potential to become the next-generation system architecture, removing the use of middleware and relational databases. However, the root causes of the downturn appear outside the perimeter of the company transformations: product development, continuous customer complaints, and the 20-year aging ERP that represents the core of the customer base seem to remain unchanged. Agile is probably not enough to address the long-term issues of product development. Most likely, Agile is not the solution to fifteen years of trying to get CRM right, or to making three platform mistakes in three on-demand initiatives (CRM on-demand in 2006, Business byDesign in 2007, and SaaS Enterprise in 2009).

The Seeking Alpha analysis then makes these machine gun like statements:

Is SAP getting it right? Here is a summary of the points to keep in mind to answer this question:

SAP R&D has yet to deliver its first truly successful product since 1992 (it could be HANA over time)

The core of ERP that holds the customer base is outdated

There seem to be no plans to develop a modern replacement product

Development of a potential new ERP would take years

Sales have declined stepping back by 3 to 4.5 years

SAP’s leadership is questionable

According to Gartner, the revenue from relicensing R/3 to ERP 6.0 is ending

Customers and employees have lost trust

Executives have been leaving

On-demand is not making progress

The customer base is increasingly at risk

Analysts estimate that HANA could produce just 10% of the revenue by 2013.

There is a gap between the buzz and the hard facts.

What does this mean for vendors who hitch their wagons to the SAP “star” as ISYS Search Software did with the announcement “ISYS Wins Software Deal with SAP”? Three points:

  1. Search vendors are looking at their technology and packaging it in ways to generate incremental revenue. ISYS, it appears, is in the connector game, competing with firms such as EntropySoft.
  2. SAP seems to be lagging further and further behind the NoSQL players who are now facing headwinds despite early market leads. My example is MarkLogic, the XML database outfit.
  3. The broader market seems to be splitting into quite different segments. SAP is going to have difficulty in the IBM and Oracle space, and it is going to have trouble with the open source NoSQL crowd, which seems to prefer having Hadoop rather than HANA on its T shirts.

SAP remains interesting, but it is now in some danger of further marginalization. SAP needs a search system still.

Stephen E Arnold, January 18, 2012

Sponsored by Pandia.com

Kapow Releases Katalyst Version 8.2

January 2, 2012

Kapow Software moves in a new direction that is a bit of a surprise to us. EWeek reports, “Kapow Software Punches Out Update of Cloud-Based Analytics Service.” Kapow is positioning its Katalyst version 8.2 as a self-service, subscription model analytics tool with an intuitive user interface. It also boasts 100% data accuracy. According to the write up:

Katalyst 8.2 can organize, integrate and analyze data from streams as diverse as legacy, on-premise, social media, partner, B2B, competitor, e-commerce, blogs and news sites, as well as location-based and mobile data, [founder Stefan] Andreasen said. The Kapow service is one that speaks to both IT and line-of-business people at an enterprise, and thus can bring them together (when they most often work separately) to solve common research needs.

Headquartered in Palo Alto, CA, Kapow Software has offered innovative technology solutions for a decade. The company prides itself on bridging the divide between IT departments and business users. It now has over 500 customers worldwide, but its heart remains in Copenhagen. Take your conceptual umbrella, we suggest.

Cynthia Murrell, January 2, 2012

Sponsored by Pandia.com

Talend Pitches Holistic Integration

December 21, 2011

Connectors get some new lingo; holistic integration is a term we learned from Talend’s press release, “Talend V5: Democratizing Holistic Integration.” The company defends its coinage of the term:

Frankly, IT often uses loosely some terms from the general corpus. But in this case, holistic does the trick. . . . The promise of Talend v5 is to enable IT organizations to converge traditionally disparate integration efforts and practices through a common set of products, tools and best practices. When an organization deploys Talend v5, it will deploy essentially one platform, regardless of the integration need: data integration, application integration, process integration.

That does fit the definition of the term, but it is a little grand, don’t you think? Hmm, maybe not in a field titled “Big Data.”

Talend positions this release as the result of the changes its products have undergone since it bought the German Sopera this time last year. The company is quick to point out that this comprehensive approach does not result in bloatware. Each product included in the platform works independently; customers need deploy only the parts they require.

The write up emphasizes that Talend’s products are still based on the open source underpinnings on which they were founded. The company boasts of being a leader in the open source data management market.

Cynthia Murrell, December 21, 2011

Sponsored by Pandia.com


Exclusive Interview: Gilles Andre, PolySpot

December 13, 2011

I interviewed Gilles Andre, the chief executive officer of PolySpot, late in November and again last week. Mr. Andre joined PolySpot in June 2010. Prior to this, he was co-founder and CEO of Augure, a company engaged in e-reputation management and services. In 1997, Mr. Andre also founded Leonard’s Logic, the software publisher behind the Genio ETL suite; Hummingbird acquired the company in 1999. Mr. Andre is a board member at Talend, a recognized market leader in open source middleware solutions.

PolySpot is a provider of open search solutions. The company offers a robust and innovative architecture which supports search-centric applications accessible from any device connected to a client’s network.

I was interested in Mr. Andre’s view of PolySpot. The search and content processing sector is in transition, and the role of open source solutions continues to gain traction. He told me:

PolySpot’s agile framework, its use of open source technology like Lucene, and a focus on putting information in the business work flow. Olivier Lefassy, David Fischer – our CTO – and I had designed some interesting ideas, and I was eager to fine tune these elements into a business model that would propel PolySpot over the hurdles which cause many enterprise information solutions to fail.

With open source making inroads at IBM and other major technology providers, I asked about Mr. Andre’s involvement in the “communities” which play an important role in the sector. He told me:

When I was board member at Talend, a very successful French initiative in the ETL [extract, transform, load] segment from inception in 2006 to December 2010, I came to understand the potential of open source software. PolySpot gives me a chance to leverage my knowledge about fast growth, high potential companies, open source software, and the “big data” opportunity around us. I think you can say that data management and information are woven throughout my business fabric.

The PolySpot approach boasts a robust framework. I asked what PolySpot has constructed around Lucene, the open source search system:

We build the connectors I mentioned before and a connector software development kit. We engineered our proprietary transformation and enrichment platform (that’s the Sense Builder components) which adds intelligence to raw information. We also developed a very innovative end to end administration console enabling to design and maintain search applications with no particular technical skill, this eases Lucene and Solr configuration but also amplifies the search functionalities provided by Solr. Last, we have added display modules, information views, and graphical user interfaces. These can easily be customized. To make it brief, PolySpot delivers the first end-to-end packaged search infrastructure over Lucene and SOLR core technologies.

After seeing several demonstrations of client deployments, I was impressed with the PolySpot technology. To learn more about PolySpot’s solutions and technical approach, navigate to www.polyspot.com. The full text of the interview with Mr. Andre is located in ArnoldIT’s Search Wizards Speak series at this link.

Stephen E Arnold, December 13, 2011

Sponsored by Pandia.com, publishers of The New Landscape of Enterprise Search
