Data Harmony: Sweet Tune for Knowledge Management Experts
January 10, 2012
Short honk: Here in Harrod’s Creek, we find meet ups, hoe downs, and webinars plentiful and out of tune with our needs. We want to put on your calendar an event that seems to offer a sweet tune about knowledge management.
The Eighth Annual Data Harmony Users Group (DHUG) meeting, scheduled February 7 to 9, 2012, in Albuquerque, New Mexico, will focus on helping users get the most from their investment in the knowledge management software suite, which helps users organize information resources based on a well-built and systematically applied taxonomy or thesaurus.
We learned:
“This meeting is an exciting opportunity to learn how to fully utilize the power of Data Harmony software to maximize the effectiveness and profitability of your organization for your members, customers and staff,” said Marjorie M.K. Hlava, president of Access Innovations.
You can get complete details from Access Innovations. The widely read Web log Taxodiary is encouraging anyone who wishes to share their story at the meeting to contact Data Harmony at this link. Registrations are also now being accepted. For more information about the Eighth Annual Data Harmony Users Group meeting, click here or call (505)998-0800 or 1-800-926-8328. We hope that Access Innovations captures their knowledge in a monograph. Too many amateur taxonomists and knowledge mavens are pumping out inaccurate or incomplete information. In our experience, the go-to experts gravitate to the performances by the Mozarts of mark up.
Sounds excellent to us.
Stephen E Arnold, January 10, 2012
Sponsored by Pandia.com
Will a Silver Bullet Save Sci-Tech Publishers?
November 11, 2011
I poked around my Overflight service and noticed a recent news release with the meaty title “Scientific Publisher Saving Hundreds of Thousands of Dollars with MarkLogic.” The subtitle was compelling as well: “New Mobile Applications Let Researchers Study in the Field.”
I thought a moment about the logic of the two statements. I am okay with the idea that a scientific publisher faces some significant challenges. The traditional markets for scientific and technical information in traditional journal form are under severe budget pressure. In response to some scientific publishers’ pricing policies, libraries and some not for profit outfits no longer renew certain journal subscriptions. Others have joined consortia in order to get better value for available budgets.
But STM (scientific, technical, and medical) publications have other issues with which to cope as well. First, technology may not be a core competency. Why would it be? Publishers get authors to write. Publishers package and sell. Technology is talked about but even giants like Thomson Reuters buy print publishing companies in Argentina. So much for embracing the digital revolution. Even more interesting is that some STM publishers often ask authors to pay for journal typesetting, correction, and maybe some production costs. As headcount comes under pressure in research institutes and universities, some scientific publishers are finding that authors are either not willing to pay or not able to get a third party to pony up the money. In short, STM in the traditional mode is fighting for oxygen.
The mobile angle baffled me as well.
In my experience, many scientists work in what might be called “controlled environments.” In the pharmaceutical sector, certain firms operate the research facilities the way South African gold mine superintendents monitor workers at the end of a shift. If this type of security does not resonate with you, you need to do some backfilling on gold and diamond mining security protocols. Think naked. Think weighing workers before and after a shift. Think requiring showers and filtering the gray water. You get the idea. Other types of research do require mobile devices; for example, cleaning up a gone-wrong nuclear reactor, which is not a job for an outfit like AtomicPR, in my experience. Public relations “experts” write about radiation and often have limited experience with micro-contamination and chemical decontamination. The point? Mobile often has specific requirements which stretch beyond creating an “app for that.”
In a nutshell, here’s the nub of the news release from my point of view:
Taking research into the field has a new, literal meaning with the launch of new mobile applications built on MarkLogic that are helping scientists better understand soil and crops. MarkLogic Corporation, the company empowering organizations to make high stakes decisions on Big Data in real time, today announced the American Society of Agronomy (ASA) launched Science Pubs, developed for iPad, iPhone, Android, and BlackBerry devices. Science Pubs utilizes MarkLogic to give subscribers and non-subscribers the freedom to dig deep into ASA’s journals, magazines, and eBooks while conducting first-hand research and observations in the field.
The point is that a markup language makes it possible to do an app. Puzzled, I plunged forward:
“MarkLogic will save us at least $150,000 per year. That is a lot of money for any publisher, especially a non-profit like the American Society of Agronomy,” said Ian Popkewitz, director, Information Technology & Operations, American Society of Agronomy. “We originally implemented MarkLogic to cut the cost of providing critical publications to our subscribers, but we quickly realized several intangible benefits such as speed, ease of use, and flexibility. The flexibility allowed us to focus on the deployment of Science Pubs. ASA is very pleased to be able to quickly launch these services for subscribers and non-subscribers, and we expect them to generate revenue.”
I understand. However, I want to offer several observations based on my modest experience in publishing. Note I did work for a newspaper that was once one of the Top 25 in the world, but the paper is a starved dog now. I also worked for Bill Ziff, mastermind of multiple empires and the magnate other New York publishers loved to loathe, which is what I learned when I was escorted from the New York Times’s president’s office when he learned I worked for the interesting Mr. Ziff.
First, publishers absolutely have to reduce their costs and in a big way. Saving $150,000 is great, but my question is, “How much does it cost to implement a cost saving system such as a MarkLogic or JSON solution (the fat free alternative to chubby XML), and then keep it up and running at a scientific publisher such as the American Society of Agronomy?” If a system costs $50,000, $100,000, or even $300,000, the publisher has to pay off the system, its maintenance fee, and whip out some products that sell. With revenues at many scientific publishers flat lining or shriveling, the savings are important and may light a fire under the agronomists to cope with a big expense in the name of cost savings. That type of race can be brutal. And it is one that I would be reluctant to enter.
Second, many not for profit organizations and “charities” in the UK are facing declining memberships. Unthinkable five years ago, professional organizations have to market to their members and then spend money to collect on slow paying professionals. Even the certification angle in the UK is not working as it once did. Unemployment among professionals is making it difficult for some experts to pay to be in a must-have organization. Faced with rising costs across the board and decreasing or flat revenue, some not for profit outfits are looking at a nuclear winter, not AtomicPR with a very short half life.
Third, there is the notion that scientific research has to be peer reviewed in a lengthy, antiquated manner. Also, the long publication cycles for some STM journals are out of step with the real time culture in fast moving fields. Not surprisingly, the no-cost or low-cost alternatives to traditional journal publishing refuse to go away. In some fields like mathematics and physics, blogs and even social media have become the important channels for dissemination of technical information and making or breaking careers. Even grants can be determined by a Facebook-type of presence. Quite a shift.
My take on this “news story” is that it makes a possibly compelling case that an XML repository can help reduce certain costs. But without the context of total cost burdens, I have a question, “Why not use JSON?” XML is darned useful, but so is JSON. My concern is whether JSON is a viable option for many scientific, technical, and medical publishers.
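The total cost question can be made concrete with a little arithmetic. What follows is a minimal sketch, not anyone's actual budget: the $150,000 annual savings comes from the news release, the three system costs are the ones floated in the first observation, and the 20 percent annual maintenance rate and the `payback_years` helper are assumptions invented for illustration.

```python
def payback_years(system_cost: float, annual_savings: float,
                  annual_maintenance: float = 0.0) -> float:
    """Years needed for net savings to cover the up-front system cost."""
    net_savings = annual_savings - annual_maintenance
    if net_savings <= 0:
        return float("inf")  # the system never pays for itself
    return system_cost / net_savings


ANNUAL_SAVINGS = 150_000  # figure quoted in the news release
MAINTENANCE_RATE = 0.20   # hypothetical: 20% of system cost per year

for cost in (50_000, 100_000, 300_000):  # system costs floated in the post
    years = payback_years(cost, ANNUAL_SAVINGS, cost * MAINTENANCE_RATE)
    print(f"${cost:,} system: payback in {years:.2f} years")
```

Under these assumed numbers the cheapest system pays for itself within a year, while the $300,000 system needs more than three years of savings before it breaks even. Change the maintenance rate and the picture shifts, which is why the savings figure alone says little.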
The ArnoldIT team is finishing a report about the outlook for a major publishing company. With more than $5 billion in revenues, this well known firm may be forced to sell its STM business to generate cash. Not even cost cutting can prevent the dislocations that some publishing companies face. The digital revolution has arrived and is now moving in new directions. Many traditional publishers face stark choices and very difficult financial challenges. Alas, no silver bullets today in my opinion.
Stephen E Arnold, November 11, 2011
Sponsored by Pandia.com
A Coming Dust Up between Oracle and MarkLogic?
November 7, 2011
Is XML the solution to enterprise data management woes? Is XML a better silver bullet than taxonomy management? Will Oracle sit on the sidelines or joust with MarkLogic?
Last week, an outfit named AtomicPR sent me a flurry of news releases. I wrote a chipper Atomic person mentioning that I sell coverage and that I thought the three news releases looked a lot like Spam to me. No answer, of course.
A couple of years ago, we did some work for MarkLogic, a company focused on Extensible Markup Language or XML. I suppose that means AtomicPR can nuke me with marketing fluff. At age 67, getting nuked is not my idea of fun via email or just by aches and pains.
Since August 2011, MarkLogic has been “messaging” me. The recent 2011 news releases explained that MarkLogic was hooking XML to the buzz word “big data.” I am not exactly sure what “big data” means, but that is neither here nor there.
In September 2011, I learned that MarkLogic had morphed into a search vendor. I was surprised. Maybe, amazed is a more appropriate word. See Information Today’s interview with Ken Bado, formerly an Autodesk employee. (Autodesk makes “proven 3D software that accelerates better design.” Autodesk was the former employer of Carol Bartz when Autodesk was an engineering and architectural design software company. I have a difficult time keeping up with information management firms’ positioning statements. I refer to this as “fancy dancing” or “floundering” even though an azure chip consultant insists I really should use the word “foundering”. I love it when azure chip consultants and self appointed experts input advice to my free blog.)

In a joust between Oracle and MarkLogic, which combatant will be on the wrong end of the pointy stick thing? When marketing goes off the rails, the horse could be killed. Is that one reason senior executives exit the field of battle? Is that one reason veterinarians haunt medieval re-enactments?
Trade Magazine Explains the New MarkLogic
I thought about MarkLogic when I read “MarkLogic Ties Its Database to Hadoop for Big Data Support.” The PCWorld story stated:
MarkLogic 5, which became generally available on Tuesday, includes a Hadoop connector that will allow customers to “aggregate data inside MarkLogic for richer analytics, while maintaining the advantages of MarkLogic indexes for performance and accuracy,” the company said.
A connector is a software widget that allows one system to access the information in another system. I know this is a vastly simplified explanation. Earlier this year, Palantir and i2 Group (now part of IBM) got into an interesting legal squabble over connectors. I believe I made the point in a private briefing that “connectors are a new battleground.” The MarkLogic story in PCWorld indicated that MarkLogic is chummy with Hadoop via connectors. I don’t think MarkLogic codes its own connectors. My recollection is that ISYS Search Software licenses some connectors to MarkLogic, but that deal may have gone south by now. And, MarkLogic is a privately held company funded, I believe, by Lehman Brothers, Sequoia Capital, and Tenaya Capital. I am not sure “open source” and these financial wizards are truly harmonized, but again I could be wrong, living in rural Kentucky and wasting my time in retirement writing blog posts.
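For readers who want the connector idea in something firmer than "software widget," here is a generic sketch. It is not MarkLogic's Hadoop connector or anyone else's product; the `Connector` protocol, the `InMemorySource` class, and the `load_into_index` function are all hypothetical names made up to show the shape of the thing: one side exposes records, the other side pulls them in.

```python
from typing import Iterable, Iterator, Protocol


class Connector(Protocol):
    """Anything that can pull records out of a source system."""

    def fetch(self) -> Iterator[dict]: ...


class InMemorySource:
    """Hypothetical stand-in for an external system such as a file store."""

    def __init__(self, rows: Iterable[dict]) -> None:
        self._rows = list(rows)

    def fetch(self) -> Iterator[dict]:
        yield from self._rows


def load_into_index(connector: Connector, index: dict) -> int:
    """Copy every record from the source into a toy index keyed by id."""
    count = 0
    for record in connector.fetch():
        index[record["id"]] = record
        count += 1
    return count


source = InMemorySource([{"id": 1, "text": "soil"}, {"id": 2, "text": "crops"}])
index: dict = {}
print(load_into_index(source, index))
```

The receiving system only needs to know about `fetch()`. Swapping in a different source means writing a new class with the same method, which is roughly why connectors can be licensed and fought over as separate pieces of software.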
Will XML Save Your Job?
October 29, 2011
If you work on enterprise search, enterprise content repurposing, or high end business intelligence systems, you may want to consider this question.
Is the Extensible Markup Language the ticket to first class retirement at a giant multi national firm?
At least one gosling asked me this morning, “What’s with the interest in XML?” I told him:
XML is complicated and can be explained in such a way that a CFO will write a check to save money due to the benefits of “intelligent content.”
If you believe that, then you are going to answer the question, “Will XML save your job?” yourself and probably before the end of 2011.
You will want to take a look at data2type’s AntillesXML tool. The product will definitely help lock in your expertise, making you indispensable to your employer. The story “A Unique Combination of XML Tools” asserts:
“AntillesXML a perfectly equipped toolbox for dealing with XML documents. Thanks to the new graphical user interface which is easy and intuitively to handle, the numerous features are suitable for developers and users alike,” explains Manuel Montero, managing director of data2type GmbH.
If that does not bolster your confidence, you can follow the new White House Chief Information Officer, Steven VanRoekel. He is on board with XML. Navigate to “Federal CIO Unveils Initiatives to Push XML, Virtualization, Agile IT”. Imagine all government documents in XML.
Will this happen?
Well, US government initiatives seem to come and go. When was the last time you used USA.gov or Data.gov? Hmm.
Stephen E Arnold, October 29, 2011
Sponsored by Pandia.com
Is XML Looking at JSON Tail Lights?
September 21, 2011
Extensible Markup Language has a long and distinguished lineage. Think CALS and SGML. We try to pay attention to XML centric search and content processing companies. Examples include the very quiet Dieselpoint and the repositioned Mark Logic Corp.
We have heard anecdotes about some disenchantment with XML, which has been stretched to perform a wide range of content acrobatics. Now it seems that some Twitter features will not support XML. Many older applications rely on XML support for functionality, but Twitter could likely force developers to make updates. In Programmable Web’s article, “Twitter API Ditches XML For Trends: New Features Are JSON-Only,” Twitter’s Jason Costa explained why Twitter is removing XML:
As well as standardizing the trends URL we are also planning to switch the trends API to JSON only. The reason for this is because the use of XML on the trends API is significantly low and removing support would allow us to free up resources for other developments. Running down the data formats supported by Twitter’s various APIs, there is still plenty of XML support (as well as RSS and Atom), but some of the newer features are JSON-only.
What’s JSON? The acronym means JavaScript Object Notation. According to JSON.org, it is:
a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition – December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language.
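To see what "lightweight" means in practice, here is a small sketch that writes the same record both ways using only the Python standard library. The trend record and its field names are invented for illustration and are not Twitter's actual API output.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical trend record; the field names are invented for illustration.
trend = {"name": "#xml", "query": "%23xml", "promoted": False}

# JSON: one call serializes the native dict.
as_json = json.dumps(trend)

# XML: the same data has to be wrapped in elements by hand.
root = ET.Element("trend")
for key, value in trend.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

print(as_json)
print(as_xml)
print(len(as_json), "characters of JSON versus", len(as_xml), "of XML")
```

For a flat record like this one, the XML version is longer because every field name appears twice, once in the opening tag and once in the closing tag. XML earns its keep with mixed content, attributes, and schemas, none of which a simple trends list needs.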
Will XML have a future at Twitter? Right now it appears that Twitter streaming is already JSON-only. This move by Twitter may presage an important shift in the Web from XML to JSON.
XML is a complex beastie and publishing companies have embraced XML because it makes slicing and dicing of content easier. But an investment is required to make XML deliver. Chopping out complexity may put pressure on vendors who emphasize the XML ingredients in their enterprise solutions.
If light weight JSON gains traction, some disruptions may be triggered in a forceful way.
Andrea Hayden, September 22, 2011
Sponsored by Pandia.com
Is XML Running Out of Steam in Search?
July 22, 2011
XML is probably the most well known web technology in the world but users are discovering that depending on their needs other technologies can be quite valuable. According to the XML article “JSON vs XML – A Jason vs Freddie Sequel,” JSON is a functional and feature friendly web technology. XML/XMLHttpRequest refers to “the world wide XML standard for data” and is used to describe data format as well as transportation pattern.
The XMLHttpRequest is needed in order to obtain information from servers. Unless a proxy server featuring an AJAX XML toolkit is used, “the server has to be in the same domain as the web page.” JSON stands for JavaScript Object Notation, and this option of data formatting makes information obtained from any server native JavaScript. When information is obtained from the server it is already in JavaScript object format and ready to be used. In addition, users can add tools such as methods and procedures to JavaScript depending on their needs.
JSON allows users to gain flexibility and build technology that meets their specific needs. “You can call it ‘serverless’ programming.” Users drop small pieces of JavaScript into their HTML to get big functionality. “XML is still highly used and is a good choice but JSON definitely gives them a run for their money.”
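The "already in object format and ready to be used" point can be sketched outside the browser. The example below uses Python as a stand-in for JavaScript, with `json.loads` playing the role of the browser's JSON parsing; the payloads and the `Article` class are hypothetical and exist only to show the difference in handling.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass

JSON_PAYLOAD = '{"title": "Soil Science", "pages": 12}'
XML_PAYLOAD = "<article><title>Soil Science</title><pages>12</pages></article>"

# JSON arrives as native types: a dict holding a str and an int.
record = json.loads(JSON_PAYLOAD)

# XML arrives as a tree of text nodes; types must be restored by hand.
node = ET.fromstring(XML_PAYLOAD)
pages_from_xml = int(node.findtext("pages"))


@dataclass
class Article:
    """Hypothetical wrapper that adds a method to the parsed data."""

    title: str
    pages: int

    def is_long(self) -> bool:
        return self.pages > 10


# The parsed JSON maps straight onto the class fields.
article = Article(**record)
print(article.is_long())
```

The JSON side needs one call before the data is usable and typed. The XML side needs a lookup and a conversion per field, which is the extra work the article is gesturing at.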
So what?
With certain XML centric vendors repositioning or taking a low, low, low profile, maybe XML for search is running out of steam or in search of a new way to generate revenues? Want me to identify some XML search engines which have drifted out of the spotlight? Well, I won’t. Let’s just look for vendors who are repositioning or telling me, “Just because we have no blog posts and no tweets, we are really cruising along.” Okay with me.
Stephen E Arnold, July 22, 2011
Sponsored by Pandia.com, publishers of the New Landscape of Enterprise Search
MarkLogic, FAST, Categorical Affirmatives, and a Direction Change
July 5, 2011
I awakened this morning (July 4, 2011) with a marketing Fourth of July boom. I received one of those ever present LinkedIn updates putting a comment from the Enterprise Search Engine Professionals Group in front of me.
The MarkLogic positioning exploded on my awareness like a Fourth of July skyrocket’s burst.
Most of the comments on the LinkedIn group are ho hum. One hot topic has been Microsoft’s failure to put much effort in its blogs about Fast Search & Transfer’s technology. Snore. Microsoft put down $1.2 billion for Fast, made some marketing noises, and had a fellow named Mr. Treo-something talk to me about the “new” Fast Search system. Then search turned out to be more like a snap in but without the simplicity of a Web part. Microsoft moved on and search is there, but like Google’s shift to Android, search is not where the action is. I am not sure who “runs” the enterprise search unit at Microsoft. Lots of revolving door action is my impression of Microsoft’s management approach in the last year.
The noise died down and Fast has become another component in the sprawling Shanghai of code known as SharePoint 2010. Making Fast “fast” and tuning it to return results that don’t vary with each update has created a significant amount of business for Microsoft partners “certified” to work on Fast Search. Licensees of the Linux/Unix version of ESP are now like birds pushed from the nest by an impatient mother.
New MarkLogic Market Positioning?
Set Microsoft aside for a moment and look at this post from a MarkLogic professional who once worked at Fast Search and subsequently at Microsoft. I am not sure how to hyperlink to LinkedIn posts without generating a flood of blue and white screens begging for log in, sign up, and money. I will include a link, but you are on your own.
Here’s the alleged MarkLogic professional’s comment:
Many organizations are replacing FAST with MarkLogic. MarkLogic offers a scalable enterprise search engine with all the features of FAST plus more…
Wow.
An XML engine with wrappers is now capable of “all” the Fast features. In my new monograph “The New Landscape of Enterprise Search”, I took some care to review information presented by Fast at CERN, the wizard lair in Europe, about Fast Search’s effort to rewrite Fast ESP, which was originally a Web search engine. The core was wrapped to convert Web search into enterprise search. This was neither quick nor particularly successful. Fast Search & Transfer ran into some tough financial waters, ended up the focus of a government investigation, and was quickly sold for a price that surprised me and the goslings in Harrod’s Creek.
You can get the details of the focus of the planned reinvention of the Fast system and the link to the source document at CERN which I reference in my Landscape study. A rewrite indicates that some functions were not in 2007 and 2008 performing in a manner that was acceptable to someone in Fast Search’s management. Then the acquisition took place. The Linux/Unix support was nuked. Fast under Microsoft’s wing has become a utility in the incredible assemblage of components that comprises SharePoint 2010. I track the SharePoint ecosystem in my information service SharePointSemantics.com. If you haven’t seen the content, you might want to check it out.
DITA and SharePoint Are Now Compatible
June 29, 2011
Scott Abel, the Content Wrangler, never believed that Microsoft and SharePoint were the best tools to work with the Darwin Information Typing Architecture (DITA). That has changed thanks to a nifty program called DITA Exchange, which is the equivalent of putting SharePoint documents on steroids. “Yes, You Can Do DITA With Microsoft Office and SharePoint!” explains how DITA Exchange is a software tools suite designed for Microsoft Office and SharePoint to create and manage DITA content, taking away the hassle of other programs.
End users create XML content or DITA topics within Microsoft Word, where instructional text and the user interface guide them. Once finished, the content is uploaded to SharePoint, which is configured as a component library. To publish it, the user selects the DITA library, turns it into a topic map, then publishes it from within Microsoft Word. It’s advertised as easy to use and deploy, an excellent collaboration tool that requires no significant training, and a fix for XML business problems.
“DITA Exchange presents a practical alternative. The primary components — Microsoft Office and Microsoft SharePoint — are widely deployed around the globe and probably already in use by employees of these firms.”
XML programming software often hinders development rather than helps it. DITA Exchange makes the process easier, while bumping up SharePoint documents to the point where they need to be tested for illegal drug use. SurfRay’s search-based technology is also a practical solution for SharePoint.
Torben Ellert, June 29, 2011
You can read more about enterprise search and retrieval in The New Landscape of Enterprise Search, published by Pandia in Oslo, Norway, in June 2011.
OpenText to Unify Data with an Integration Center
June 24, 2011
I am easily confused. I thought OpenText’s original SGML data management system performed integration. Guess not.
In “Content Integration Software unifies data across enterprise,” ThomasNet News serves up welcome news from OpenText about its latest product, Integration Center. Ah, unification; such a lovely concept. We learned:
Most integration technologies focus on either structured data in databases or content in document repositories, but not both. Now with OpenText Integration Center, which inherently understands both structured data and unstructured content, customers can give business decision makers easy access to corporate information assets through ECM Suite 2010.
Being able to go to one source for all its data would certainly be a boon for most companies, saving both time and money. It could unlock the value of data that has been sitting dormant because wrangling it was not deemed worth the effort.
OpenText also boasts about several other advantages of its software. For example, simplified content migration and data archiving and, consequently, the ability to decommission legacy systems.
Yay, efficiency! It also integrates diverse systems, from data warehouses to content management.
The company has been helping clients manage their data for a couple of decades now, and does so around the globe.
Cynthia Murrell, June 23, 2011
Sponsored by ArnoldIT.com, the resource for enterprise search information and current news about data fusion
Darwin Information Typing Architecture (DITA) and SharePoint
June 13, 2011
“Yes, You can do DITA with Microsoft Office and SharePoint!” has the latest scoop on the Darwin Information Typing Architecture used for publishing and writing with OASIS standard XML.
DITA, once thought to be incompatible with SharePoint or Microsoft Office, now resolves the most common problems in these two programs. DITA Exchange comprises three components: the DITA Exchange Foundation, which has SharePoint recognize and understand DITA content; DITA Exchange Transformation Services, which enables the exchange of DITA content between MS Office format and DITA; and DITA Exchange Desktop, which turns MS Word into a DITA authoring tool.
We learned that DITA Exchange works by:
software components are installed that enable Microsoft Office and Microsoft SharePoint to provide content contributors with the ability to create structured, XML content (DITA topics) from within the ubiquitous Microsoft Word environment. Once the content is created, DITA Exchange saves it into SharePoint Server, which has been configured to act as a content component (topic) library from which content can be managed at a granular level.
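For anyone wondering what a "DITA topic" looks like once the Word front end has done its job, here is a minimal sketch of the basic topic shape: an `id`, a title, and a body of paragraphs. The `make_topic` helper and the sample text are hypothetical, and this says nothing about how DITA Exchange itself generates or stores content.

```python
import xml.etree.ElementTree as ET


def make_topic(topic_id: str, title: str, paragraphs: list[str]) -> str:
    """Build a minimal DITA topic: a title plus body paragraphs."""
    topic = ET.Element("topic", id=topic_id)
    ET.SubElement(topic, "title").text = title
    body = ET.SubElement(topic, "body")
    for text in paragraphs:
        ET.SubElement(body, "p").text = text
    return ET.tostring(topic, encoding="unicode")


print(make_topic("install", "Installing the widget", ["Unpack the box."]))
```

Each topic is a small, self-contained unit, which is what lets a library manage content "at a granular level" and lets a topic map assemble topics into a publication.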
DITA resolves many problems for XML authoring and is easy for any end user to grasp, not to mention it saves a bundle on purchasing further XML authoring tools. This is a practical office application to have if you rely on XML content for your SharePoint portal. SurfRay’s SharePoint Search solutions are built from the ground up to leverage XML in search and collaboration. To learn more, click our search experience link.
Torben Ellert, June 13, 2011
SurfRay

