Temis and MarkLogic: Timid? Not on the Semantic Highway
April 12, 2013
My in box overfloweth. Temis has rolled out a number of announcements in the last 10 days. The company is one of the many firms offering “semantic” technology. Due to the vagaries of language, Temis is in the “content enrichment” business. The idea is that technology indexes key words and concepts even though a concept may not be expressed in a text document. I call this indexing, but “enrichment” is certainly okay.
The first announcement which caught my attention was a news release I saw on the Marketwatch for fee distribution service. The title of the article was “TEMIS Completes Successful Wide Scale Semantic Content Enrichment Test in Windows Azure.” A news release about a test struck me as unusual. The key point for me was that Temis is positioning itself to go after the SharePoint add in market. That sector has some established players like Smartlogic, so the pay off from this “test” announcement will be interesting to watch.
The second announcement was a news story distributed by Eureka Alert called “Wiley Selects Temis for Semantic Big Data Initiative.” The key point is that a traditional publishing company has licensed software to do what humans used to do in a venerable publishing company which, until recently, was sticking with traditional methods and products. Will Temis propel John Wiley to the top of the leader board of professional publishers? Hopefully some information will become available quickly.
The third announcement which I noted was “Temis and MarkLogic Strengthen Strategic Alliance.” The write up hits the concepts of semantics and big data. Here’s the passage which intrigued me:
MarkLogic® Server is the only enterprise NoSQL database designed for building reliable, scalable and secure search, analytics and information applications quickly and easily. The platform includes tools for fast application development, powerful analytics and visualization widgets for greater insight, and the ability to create user-defined functions for fast and flexible analysis of huge volumes of data.
I am uncomfortable with the notion of “only”. MarkLogic is an XML centric data management system. Software wrappers can use the XML back end for a range of applications. These include something as exotic as a Web site for the US Army to more sophisticated applications for publishing technical documents for an aircraft manufacturing firm. However, there are a number of ways to accomplish these tasks and some of the options make use of somewhat similar technology; for example, eXist-db. While not perfect, the fact that an alternative exists only increases my discomfort with an “only”.
So what’s up? My hunch is that both MarkLogic and Temis are in flat out marketing mode. Clusters of announcements are, in my experience, an indication that the pipeline needs to be filled. Equally surprising is that MarkLogic is morphing into a big data player and an enterprise search system, not a publishing system. Most vendors are morphing. The tie up with Temis suggests that Temis’ back end needs some beefing up. The MarkLogic positioning is that it is now a player in semantics and big data. I think that partnering is a quick way to fill gaps.
Will MarkLogic blast through the $100 million in revenue ceiling? Will Temis emerge as a giant slayer in semantic big data? MarkLogic recently raised $25 million to become a player in big data. (See “Big Data Boon: MarkLogic Pulls In $25 Million In VC Funding”.) Converting $25 million into high margin revenue could tax the likes of Jack Welch in his prime.
My hunch is that both firms’ management teams have this as a 2013 goal. With the patience of investors wearing thin for many search and content processing vendors, closed deals are a must. The economy may be improving for analysts on CNBC, but for search vendors, making Autonomy-scale or Endeca-scale revenues may be difficult, if not impossible.
In my opinion, the labels “big data” and semantics do not by themselves deliver revenue the way Google delivers Adwords. As more search firms chase additional funding, has the world of search switched from finding information for customers to getting money to stay in business?
No timidity visible as these two firms race down the semantic interstate.
Stephen E Arnold, April 12, 2013
Understanding JSON
April 8, 2013
The Altova Blog piece “Editing, Converting and Generating JSON” provides a helpful guide to using JSON. The use of JSON as a data interchange format has been on the rise, and so has the debate about the advantages of JSON vs. XML. The debate has been raging on, but the author actually sums it up fairly well.
“But when you boil it down, there are simply some cases for which JSON is the best choice, and others where XML makes more sense. While you might need to choose between JSON and XML depending on the development task at hand, you don’t have to choose between code editors – XMLSpy supports both technologies and will even convert between the two.”
Altova has extended its intelligent XML editing features to its JSON editor in order to make JSON editing as simple as possible. Users who begin editing JSON in text view will get lots of help along the way from XMLSpy in the form of syntax coloring, bracket matching, source folding, entry helper windows, menus and other helpful tools. A one click option on the XMLSpy convert menu makes converting XML to or from JSON quick and easy. The ability not only to edit but also to convert items directly within the XML editor program is extremely useful. JSON lovers will definitely have something to look forward to.
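For readers who want to see what an XML to JSON conversion involves, here is a minimal sketch using only the Python standard library. It is not how XMLSpy performs the conversion; the function names and the sample document are hypothetical, and the mapping assumed here (attributes ignored, repeated tags turned into arrays) is only one of several possible conventions.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an element into plain Python data (attributes are ignored)."""
    children = list(element)
    if not children:
        # Leaf element: keep its text content, or an empty string.
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # A repeated tag becomes a JSON array.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_text):
    """Parse an XML string and return a JSON string keyed by the root tag."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_dict(root)}, sort_keys=True)


# Hypothetical sample document, used only for illustration.
SAMPLE_XML = (
    "<post><title>Understanding JSON</title>"
    "<tag>json</tag><tag>xml</tag></post>"
)

if __name__ == "__main__":
    print(xml_to_json(SAMPLE_XML))
```

The choice to drop attributes keeps the sketch short; a real converter has to decide how to represent attributes, mixed content, and namespaces, which is where the JSON vs. XML trade-offs start to show.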
April Holmes, April 08, 2013
Sponsored by ArnoldIT.com, developer of Augmentext
MarkLogic Takes Olympic Coverage From Probable Nightmare to Practical Success
February 26, 2013
Most people never really think about how news organizations transmit data across continents when there is a big event. For the Summer Olympics in 2012 The Press Association relied on MarkLogic’s XML repository’s ability to store and query hundreds of thousands of pieces of metadata per second.
In “How PA Cleared The Big Data Hurdle At The London Olympics” the Press Association’s director of technical architecture, John O’Donovan, gives consumers an in depth look at how the office was able to cope with more than 50,000 requests per second.
“The problem with that is having to sit down and design a relational database model that can represent everything that’s in the XML. That takes quite a lot of time, you have to build all of your input/output extenders and map XML objects into relational stores.”
At first look it seems like an impossible task: organizing all of the photos, biographical information, statistics, and competition results for thousands of athletes and beaming them to televisions, phones and computers everywhere. But by removing the relational database, the PA made it possible. The data stays in an XML store instead of being stored in a relational database and then transferred back to XML.
The approach cut the effort needed to get the delivery system off the ground from 100 man-days to 34, and it was so successful that The Press Association will be utilizing the new system for all of its wire and output communications.
Big thumbs up to MarkLogic’s ability to handle the process and to the PA for finding a new way to utilize an already reliable resource.
Leslie Radcliff, February 26, 2013
Altova Releases New Version of MissionKit
November 30, 2012
Altova, a data management solutions provider and creator of XMLSpy, recently published the news release, “Altova Announces the Release of Version 2013 of MissionKit” on its website.
According to the article, Altova has released an integrated suite of XML, SQL, and UML tools. It offers automatic error correction and support for SQL stored procedures in data mapping projects. Prices start at $59 per product, and the tools are available for purchase in the Altova online shop.
The release states:
“Among the many updates and new features we incorporated into the Version 2013 release, one of the most significant is Smart Fix. Smart Fix is unique to XMLSpy 2013 and is a huge leap forward in intelligent XML editing. It provides options for fixing validation errors that developers can apply automatically, with a single click. It’s true XML alchemy,” said Alexander Falk, President and CEO for Altova. “With increased demands on developers today we are always looking for ways to incorporate efficiencies into our products. You simply won’t find this functionality in other tools.”
Altova’s MissionKit is certainly affordable and the suite offers great tools. However, it only saves you money if you plan on using equal numbers of XMLSpy and MapForce.
Jasmine Ashton, November 30, 2012
Easy XML Converter for Sale
September 23, 2012
Perhaps this is useful: Sofotex offers through its site an Easy XML Converter. The downloadable software runs $119, but there is a twenty day trial period. The product description reads:
“Easy XML Converter helps to convert XML files into a variety of formats. Easy XML Converter also has a help screen that tells you which tables (elements) that are related to each other. What you want to convert, choose from a tree view, select the desired columns that you want, making it very easy to set up. The converter also supports batch job. Paths and all conversion functions are set and stored in a schema, which you activate when you are in need of conversion of the XML file. Supported formats: Excel 2003 and 2007, Text, Access (.mdb), HTML and XML”
The page goes on to list these functions: the software can convert several XML files, then merge them into one output file; users can filter converted data; a detail view of the file allows the software to double as a handy XML viewer; and backup folders are available.
We haven’t given the converter a spin yet, but it could be useful if it works as advertised. If you think such a product could help you, try it out for about nineteen days, then decide.
Cynthia Murrell, September 23, 2012
XML Exhausting Possibly Too Complex to Last
August 19, 2012
A post on DevXtra Editors’ Blog, “Is XML Too Big? Does Anyone Care?,” poses an interesting sentiment on the size and possibilities of XML.
XML, or the Extensible Markup Language, is too big and can be quite complex depending on the size and purpose of the documents. Syntactic analysis of XML documents is time consuming and difficult, not only for the people completing the task but also for the CPU. The World Wide Web Consortium says that XML “is a simple, very flexible text format.”
The blog post disagrees, stating:
“[…]it’s actually more difficult to parse a large document than to create one. If an XML document is damaged or malformed, software can become very confused, and often, even trivial errors or corruption in the XML document can stop processing. Working with schema extensions can be difficult, and older documents written using DTDs (Document Type Definitions) and Document Object Models (DOMs) can be incomprehensible.”
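The point about trivial errors stopping processing is easy to reproduce. The sketch below uses Python's standard `xml.etree.ElementTree` parser; the two sample strings and the `count_items` helper are hypothetical and exist only to show that a single unclosed tag makes the whole document unreadable to a conforming parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, used only for illustration.
WELL_FORMED = "<doc><item>one</item><item>two</item></doc>"
MALFORMED = "<doc><item>one</item><item>two</doc>"  # second <item> is never closed


def count_items(xml_text):
    """Return the number of <item> children, or None if the XML cannot be parsed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        # One missing end tag is enough to reject the entire document.
        print(f"parse failed: {error}")
        return None
    return len(root.findall("item"))


if __name__ == "__main__":
    print(count_items(WELL_FORMED))
    print(count_items(MALFORMED))
```

Note that the parser does not return the first, intact `<item>`; it gives up on the document as a whole, which is the behavior the blog post complains about.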
We think the better question is: “Will people care about XML in two years?” Currently, XML is crucial for exchanging data and documents, but will the complexity of the system make it an untenable solution? It is hard to justify using such extensive resources. A simplified system is surely, hopefully, on the way.
Andrea Hayden, August 19, 2012
IBM Asserts Its i Technology Can Handle XML
May 9, 2012
IBM asserts that DB2 can do big data, including XML, in IBM Systems Magazine’s “i Can Use XML in a Relational World.” Blogger and IBM employee Nick Lawrence writes:
“In this most recent round of announcements, IBM has included support for the XMLTABLE table function in SQL. XMLTABLE is designed to convert an XML document into a relational result set (rows and columns) using popular XPath expressions. This function has been referred to as the Swiss army knife for working with XML because it can help solve a wide variety of XML related problems.”
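The idea in the quoted passage, turning an XML document into rows and columns with path expressions, can be illustrated outside DB2. The following is a rough Python analogue built on the standard library, not IBM's XMLTABLE function or its SQL syntax; the `xml_to_rows` helper, the column mapping convention (a leading `@` for attributes), and the sample orders document are all assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
ORDERS_XML = """
<orders>
  <order id="1"><customer>Acme</customer><total>19.50</total></order>
  <order id="2"><customer>Globex</customer><total>7.25</total></order>
</orders>
""".strip()


def xml_to_rows(xml_text, row_path, columns):
    """Build one dict per element matched by row_path.

    columns maps a column name to either "@attr" (an attribute of the row
    element) or a child path understood by ElementTree's limited XPath.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.findall(row_path):
        row = {}
        for name, path in columns.items():
            if path.startswith("@"):
                row[name] = node.get(path[1:])
            else:
                row[name] = node.findtext(path)
        rows.append(row)
    return rows


ROWS = xml_to_rows(
    ORDERS_XML,
    "./order",
    {"id": "@id", "customer": "customer", "total": "total"},
)

if __name__ == "__main__":
    for row in ROWS:
        print(row)
```

All values come back as strings here; a database function would also apply declared column types, which this sketch leaves out.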
Lawrence recommends a good XMLTABLE tutorial, located in the SQL XML Reference in IBM’s Info Center. He also identifies and elaborates upon areas that he says could use some more clarification. One example is a way to create an XML response document that involves creating the document “inside out.” I guess that’s a technical term?
It’s a helpful piece if that’s the route you want to travel. However, it involves lots of code, lots of fiddling. A bit like mining asteroids we think.
Our question: Why not use a NoSQL data management system? After all, big data is what those do best.
Cynthia Murrell, May 9, 2012
Sponsored by PolySpot
Altova Noses into XML Semantics
March 27, 2012
IT Jungle’s Alex Woodie recently announced some good news for IBM DB2/400 fans in the article “Altova Adds Support for DB2/400 Logical Files in MissionKit.”
According to the article, Altova has now added support for DB2/400 logical files in MissionKit. In the latest release of MissionKit, called 2012r2, support for DB2/400 logical files has been added to the XMLSpy, MapForce, UModel, DatabaseSpy, and DiffDog products, which already supported DB2/400.
Woodie writes:
MissionKit includes eight handy utilities that allow IT professionals to accomplish a range of XML, data, and unified modeling language (UML)-related tasks. Anchoring the kit is its popular XML editor, called XMLSpy. MapForce, meanwhile, provides data conversion and related capabilities, UModel allows developers to visually design their application flows in UML, while DatabaseSpy allows users to design, query, and compare multiple databases. Rounding out the suite are StyleVision, DiffDog, SchemaAgent, and SemanticWorks.
These new features are bound to attract IBM i customers due to their powerful data manipulation tools. For more information and free trial downloads, check out www.altova.com. Since I am no longer receiving spam from MarkLogic and AtomicPR, I am not sure how that XML centric company is responding.
Jasmine Ashton, March 27, 2012
Sponsored by Pandia.com
Data Harmony: Sweet Tune for Knowledge Management Experts
January 10, 2012
Short honk: Here in Harrod’s Creek, we find meet ups, hoe downs, and webinars plentiful and out of tune with our needs. We want to put on your calendar an event that seems to offer a sweet tune about knowledge management.
The Eighth Annual Data Harmony Users Group (DHUG) meeting, scheduled February 7 to 9, 2012, in Albuquerque, New Mexico will focus on helping users get the most from their investment in the knowledge management software suite, which helps users organize information resources based on a well-built and systematically applied taxonomy or thesaurus.
We learned:
“This meeting is an exciting opportunity to learn how to fully utilize the power of Data Harmony software to maximize the effectiveness and profitability of your organization for your members, customers and staff,” said Marjorie M.K. Hlava, president of Access Innovations.
You can get complete details from Access Innovations. The widely read Web log Taxodiary is encouraging anyone who wishes to share their story at the meeting to contact Data Harmony at this link. Registrations are also now being accepted. For more information about the Eighth Annual Data Harmony Users Group meeting, click here or call (505)998-0800 or 1-800-926-8328. We hope that Access Innovations captures their knowledge in a monograph. Too many amateur taxonomists and knowledge mavens are pumping out inaccurate or incomplete information. In our experience, the go-to experts gravitate to the performances by the Mozarts of mark up.
Sounds excellent to us.
Stephen E Arnold, January 10, 2012
Will a Silver Bullet Save Sci-Tech Publishers?
November 11, 2011
I poked around my Overflight service and noticed a recent news release with the meaty title “Scientific Publisher Saving Hundreds of Thousands of Dollars with MarkLogic.” The subtitle was compelling as well: “New Mobile Applications Let Researchers Study in the Field.”
I thought a moment about the logic of the two statements. I am okay with the idea that a scientific publisher faces some significant challenges. The traditional markets for scientific and technical information in traditional journal form are under severe budget pressure. In response to some scientific publishers’ pricing policies, libraries and some not for profit outfits no longer renew certain journal subscriptions. Others have joined consortia in order to get better value for available budgets.
But STM (scientific, technical, and medical) publications have other issues with which to cope as well. First, technology may not be a core competency. Why would it be? Publishers get authors to write. Publishers package and sell. Technology is talked about but even giants like Thomson Reuters buy print publishing companies in Argentina. So much for embracing the digital revolution. Even more interesting is that some STM publishers often ask authors to pay for journal typesetting, correction, and maybe some production costs. As headcount comes under pressure in research institutes and universities, some scientific publishers are finding that authors are either not willing to pay or not able to get a third party to pony up the money. In short, STM in the traditional mode is fighting for oxygen.
The mobile angle baffled me as well.
In my experience, many scientists work in what might be called “controlled environments.” In the pharmaceutical sector, certain firms operate the research facilities the way South African gold mine superintendents monitor workers at the end of a shift. If this type of security does not resonate with you, you need to do some backfilling on gold and diamond mining security protocols. Think naked. Think weighing workers before and after a shift. Think requiring showers and filtering the gray water. You get the idea. Other types of research do require mobile devices; for example, cleaning up a gone-wrong nuclear reactor, which is not a job for an outfit like AtomicPR, in my experience. Public relations “experts” write about radiation and often have limited experience with micro-contamination and chemical decontamination. The point? Mobile often has specific requirements which stretch beyond creating an “app for that.”
In a nutshell, here’s the nub of the news release from my point of view:
Taking research into the field has a new, literal meaning with the launch of new mobile applications built on MarkLogic that are helping scientists better understand soil and crops. MarkLogic Corporation, the company empowering organizations to make high stakes decisions on Big Data in real time, today announced the American Society of Agronomy (ASA) launched Science Pubs, developed for iPad, iPhone, Android, and BlackBerry devices. Science Pubs utilizes MarkLogic to give subscribers and non-subscribers the freedom to dig deep into ASA’s journals, magazines, and eBooks while conducting first-hand research and observations in the field.
The point is that a markup language makes it possible to do an app. Puzzled, I plunged forward:
“MarkLogic will save us at least $150,000 per year. That is a lot of money for any publisher, especially a non-profit like the American Society of Agronomy,” said Ian Popkewitz, director, Information Technology & Operations, American Society of Agronomy. “We originally implemented MarkLogic to cut the cost of providing critical publications to our subscribers, but we quickly realized several intangible benefits such as speed, ease of use, and flexibility. The flexibility allowed us to focus on the deployment of Science Pubs. ASA is very pleased to be able to quickly launch these services for subscribers and non-subscribers, and we expect them to generate revenue.”
I understand. However, I want to offer several observations based on my modest experience in publishing. Note I did work for a newspaper that was once one of the Top 25 in the world, but the paper is a starved dog now. I also worked for Bill Ziff, mastermind of multiple empires and the magnate other New York publishers loved to loathe, which is what I learned when I was escorted from the New York Times’s president’s office when he learned I worked for the interesting Mr. Ziff.
First, publishers absolutely have to reduce their costs and in a big way. Saving $150,000 is great, but my question is, “How much does it cost to implement a cost saving system such as a MarkLogic or JSON solution (the fat free alternative to chubby XML) and keep it up and running at a scientific publisher such as the American Society of Agronomy?” If a system costs $50,000, $100,000, or even $300,000, the publisher has to pay off the system, its maintenance fee, and whip out some products that sell. With revenues at many scientific publishers flat lining or shriveling, the savings are important and may light a fire under the agronomists to cope with a big expense in the name of cost savings. That type of race can be brutal. And it is one that I would be reluctant to enter.
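To put rough numbers on that question, here is a back-of-the-envelope payback sketch in Python. The $150,000 annual saving and the three system prices come from the paragraph above; the maintenance fee parameter is hypothetical, since the news release gives no figure, and the function name is mine.

```python
ANNUAL_SAVINGS = 150_000  # figure quoted in the news release


def payback_years(system_cost, annual_savings=ANNUAL_SAVINGS, annual_maintenance=0):
    """Years needed for net savings to cover the system cost.

    annual_maintenance is a hypothetical figure; the source does not state one.
    Returns None when maintenance eats the whole saving, so payback never happens.
    """
    net_savings = annual_savings - annual_maintenance
    if net_savings <= 0:
        return None
    return system_cost / net_savings


if __name__ == "__main__":
    for cost in (50_000, 100_000, 300_000):
        print(f"${cost:,} system: {payback_years(cost):.2f} years to pay back")
```

With no maintenance fee, a $300,000 system takes two years of the quoted savings to pay for itself; any recurring fee stretches that out, which is the context the news release leaves out.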
Second, many not for profit organizations and “charities” in the UK are facing declining memberships. Unthinkable five years ago, professional organizations have to market to their members and then spend money to collect on slow paying professionals. Even the certification angle in the UK is not working as it once did. Unemployment among professionals is making it difficult for some experts to pay to be in a must-have organization. Faced with rising costs across the board and decreasing or flat revenue, some not for profit outfits are looking at a nuclear winter, not AtomicPR with a very short half life.
Third, the notion that scientific research has to be peer reviewed in a lengthy, antiquated manner is under pressure. Also, the long publication cycles for some STM journals are out of step with the real time culture in fast moving fields. Not surprisingly, the no-cost or low-cost alternatives to traditional journal publishing refuse to go away. In some fields like mathematics and physics, blogs and even social media have become the important channels for dissemination of technical information and making or breaking careers. Even grants can be determined by a Facebook-type of presence. Quite a shift.
My take on this “news story” is that it makes a possibly compelling case that an XML repository can help reduce certain costs. But without the context of total cost burdens, I have a question, “Why not use JSON?” XML is darned useful, but so is JSON. My question for many scientific, technical, and medical publishers is this: Is JSON a viable option?
The ArnoldIT team is finishing a report about the outlook for a major publishing company. With more than $5 billion in revenues, this well known firm may be forced to sell its STM business to generate cash. Not even cost cutting can prevent the dislocations that some publishing companies face. The digital revolution has arrived and is now moving in new directions. Many traditional publishers face stark choices and very difficult financial challenges. Alas, no silver bullets today in my opinion.
Stephen E Arnold, November 11, 2011




