Will XML Save Your Job?

October 29, 2011

If you work on enterprise search, enterprise content repurposing, or high-end business intelligence systems, you may want to consider this question.

Is the Extensible Markup Language the ticket to first-class retirement at a giant multinational firm?

At least one gosling asked me this morning, “What’s with the interest in XML?” I told him:

XML is complicated and can be explained in such a way that a CFO will write a check to save money due to the benefits of “intelligent content.”

If you believe that, then you are going to answer the question, “Will XML save your job?” yourself and probably before the end of 2011.

You will want to take a look at data2type’s AntillesXML tool. The product will definitely help lock in your expertise, making you indispensable to your employer. The story “A Unique Combination of XML Tools” asserts:

“AntillesXML a perfectly equipped toolbox for dealing with XML documents. Thanks to the new graphical user interface which is easy and intuitively to handle, the numerous features are suitable for developers and users alike,” explains Manuel Montero, managing director of data2type GmbH.

If that does not bolster your confidence, you can follow the new White House Chief Information Officer, Steven VanRoekel. He is on board with XML. Navigate to “Federal CIO Unveils Initiatives to Push XML, Virtualization, Agile IT”. Imagine all government documents in XML.

Will this happen?

Well, US government initiatives seem to come and go. When was the last time you used USA.gov or Data.gov? Hmm.

Stephen E Arnold, October 29, 2011

Sponsored by Pandia.com

Is XML Looking at JSON Tail Lights?

September 21, 2011

Extensible Markup Language has a long and distinguished lineage. Think CALS and SGML. We try to pay attention to XML centric search and content processing companies. Examples include the very quiet Dieselpoint and the repositioned Mark Logic Corp.

We have heard anecdotes about some disenchantment with XML, which has been stretched to perform a wide range of content acrobatics. Now it seems that some Twitter features will not support XML. Many older applications rely on XML support for functionality, but Twitter could likely force developers to make updates. In Programmable Web’s article, “Twitter API Ditches XML For Trends: New Features Are JSON-Only,” Twitter’s Jason Costa explained why Twitter is removing XML:

“As well as standardizing the trends URL we are also planning to switch the trends API to JSON only. The reason for this is because the use of XML on the trends API is significantly low and removing support would allow us to free up resources for other developments. Running down the data formats supported by Twitter’s various APIs, there is still plenty of XML support (as well as RSS and Atom), but some of the newer features are JSON-only.”

What’s JSON? The acronym means JavaScript Object Notation. According to JSON.org, it is:

a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition – December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language.
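To make the JSON.org description concrete, here is a minimal Python sketch of the parse-and-generate round trip. The payload and its field names are invented for illustration; they are not Twitter's actual trends response.

```python
import json

# Hypothetical payload; field names are illustrative, not Twitter's real schema.
raw = '{"as_of": "2011-09-21", "trends": [{"name": "#xml", "rank": 1}, {"name": "#json", "rank": 2}]}'

# Parsing: the text becomes ordinary dicts, lists, strings, and ints.
data = json.loads(raw)
names = [trend["name"] for trend in data["trends"]]

# Generating: native objects serialize back to text in one call.
round_trip = json.dumps(data, sort_keys=True)

print(names)
print(round_trip)
```

The point of the sketch is only that no schema, tree walk, or type conversion is needed on either side of the exchange.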

Will XML have a future at Twitter? Right now it appears that Twitter streaming is already JSON-only. This move by Twitter may presage an important shift in the Web from XML to JSON.

XML is a complex beastie and publishing companies have embraced XML because it makes slicing and dicing of content easier. But an investment is required to make XML deliver. Chopping out complexity may put pressure on vendors who emphasize the XML ingredients in their enterprise solutions.

If lightweight JSON gains traction, some disruptions may be triggered in a forceful way.

Andrea Hayden, September 22, 2011


Is XML Running Out of Steam in Search?

July 22, 2011

XML is probably one of the best-known web technologies in the world, but users are discovering that, depending on their needs, other technologies can be quite valuable. According to the article “JSON vs XML – A Jason vs Freddie Sequel,” JSON is a functional and feature friendly web technology. XML/XMLHttpRequest refers to “the world wide XML standard for data” and is used to describe data format as well as transportation pattern.

The XMLHttpRequest is needed in order to obtain information from servers. Unless a proxy server featuring an AJAX XML toolkit is used, “the server has to be in the same domain as the web page.” JSON stands for JavaScript Object Notation, and this data format makes information obtained from any server native JavaScript. When information arrives from the server, it is already in JavaScript object format and ready to be used. In addition, users can add methods and procedures to the JavaScript as their needs require.
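The article's point is about JavaScript in the browser, but the same contrast shows up in other languages. Here is a minimal Python sketch using only the standard library; the record and its field names are made up for the comparison.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, shown in both formats for comparison.
xml_text = "<result><title>XML in search</title><hits>42</hits></result>"
json_text = '{"title": "XML in search", "hits": 42}'

# XML: parse into a tree, look up each element, convert types by hand.
root = ET.fromstring(xml_text)
from_xml = {
    "title": root.findtext("title"),
    "hits": int(root.findtext("hits")),  # element text is always a string
}

# JSON: one call yields native objects with numbers already typed.
from_json = json.loads(json_text)

print(from_xml == from_json)
```

Both paths end at the same dictionary. The difference is the amount of application code needed to get there.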

JSON allows users to gain flexibility and build technology that meets their specific needs. “You can call it ‘serverless’ programming.” Users drop small pieces of JavaScript into their HTML to get big functionality. “XML is still highly used and is a good choice but JSON definitely gives them a run for their money.”

So what?

With certain XML centric vendors repositioning or taking a low, low, low profile, maybe XML for search is running out of steam or in search of a new way to generate revenues? Want me to identify some XML search engines which have drifted out of the spotlight? Well, I won’t. Let’s just look for vendors who are repositioning or telling me, “Just because we have no blog posts and no tweets, we are really cruising along.” Okay with me.

Stephen E Arnold, July 22, 2011

Sponsored by Pandia.com, publishers of the New Landscape of Enterprise Search

MarkLogic, FAST, Categorical Affirmatives, and a Direction Change

July 5, 2011

I awakened this morning (July 4, 2011) to a marketing Fourth of July boom. I received one of those ever present LinkedIn updates putting a comment from the Enterprise Search Engine Professionals Group in front of me.


The MarkLogic positioning exploded on my awareness like a Fourth of July skyrocket’s burst.

Most of the comments on the LinkedIn group are ho hum. One hot topic has been Microsoft’s failure to put much effort in its blogs about Fast Search & Transfer’s technology. Snore. Microsoft put down $1.2 billion for Fast, made some marketing noises, and had a fellow named Mr. Treo-something talk to me about the “new” Fast Search system. Then search turned out to be more like a snap in but without the simplicity of a Web part. Microsoft moved on and search is there, but like Google’s shift to Android, search is not where the action is. I am not sure who “runs” the enterprise search unit at Microsoft. Lots of revolving door action is my impression of Microsoft’s management approach in the last year.

The noise died down and Fast has become another component in the sprawling Shanghai of code known as SharePoint 2010. Making Fast “fast” and tuning it to return results that don’t vary with each update has created a significant amount of business for Microsoft partners “certified” to work on Fast Search. Licensees of the Linux/Unix version of ESP are now like birds pushed from the nest by an impatient mother.

New MarkLogic Market Positioning?

Set Microsoft aside for a moment and look at this post from a MarkLogic professional who once worked at Fast Search and subsequently at Microsoft. I am not sure how to hyperlink to LinkedIn posts without generating a flood of blue and white screens begging for log in, sign up, and money. I will include a link, but you are on your own.

Here’s the alleged MarkLogic professional’s comment:

Many organizations are replacing FAST with MarkLogic. MarkLogic offers a scalable enterprise search engine with all the features of FAST plus more…


An XML engine with wrappers is now capable of “all” the Fast features. In my new monograph “The New Landscape of Enterprise Search”, I took some care to review information presented by Fast at CERN, the wizard lair in Europe, about Fast Search’s effort to rewrite Fast ESP, which was originally a Web search engine. The core was wrapped to convert Web search into enterprise search. This was neither quick nor particularly successful. Fast Search & Transfer ran into some tough financial waters, ended up the focus of a government investigation, and was quickly sold for a price that surprised me and the goslings in Harrod’s Creek.

You can get the details of the focus of the planned reinvention of the Fast system and the link to the source document at CERN which I reference in my Landscape study. A rewrite indicates that some functions were not in 2007 and 2008 performing in a manner that was acceptable to someone in Fast Search’s management. Then the acquisition took place. The Linux/Unix support was nuked. Fast under Microsoft’s wing has become a utility in the incredible assemblage of components that comprises SharePoint 2010. I track the SharePoint ecosystem in my information service SharePointSemantics.com. If you haven’t seen the content, you might want to check it out.


Protected: DITA and SharePoint Are Now Compatible

June 29, 2011


OpenText to Unify Data with an Integration Center

June 24, 2011

I am easily confused. I thought OpenText’s original SGML data management system performed integration. Guess not.

In “Content Integration Software unifies data across enterprise,” ThomasNet News serves up welcome news from OpenText about its latest product, Integration Center. Ah, unification; such a lovely concept. We learned:

Most integration technologies focus on either structured data in databases or content in document repositories, but not both. Now with OpenText Integration Center, which inherently understands both structured data and unstructured content, customers can give business decision makers easy access to corporate information assets through ECM Suite 2010.

Being able to go to one source for all its data would certainly be a boon for most companies, saving both time and money. It could unlock the value of data that has been sitting dormant because wrangling it was not deemed worth the effort.

OpenText also boasts about several other advantages of its software. For example, simplified content migration and data archiving and, consequently, the ability to decommission legacy systems.

Yay, efficiency! It also integrates diverse systems, from data warehouses to content management.

The company has been helping clients manage their data for a couple of decades now, and does so around the globe.

Cynthia Murrell, June 23, 2011

Sponsored by ArnoldIT.com, the resource for enterprise search information and current news about data fusion

Protected: Darwin Information Typing Architecture (DITA) and SharePoint

June 13, 2011


Will Schema.org Limit Web Developer Choices?

June 10, 2011

We just don’t know. We noted on Slashdot the article “Schema.org—Google, Microsoft and Yahoo! Agree on Markup Vocabulary.” At first glance, this is another technical hoe down. The goal of standardization promised by Schema.org looks like a good move. The stated goal is improved search results. What could be wrong with that?

In reality, it’s a case of the big boys collaborating to make decisions for the rest of us, like in the good old days with Boss Tweed and Commodore Vanderbilt.

The Slashdot blurb points to Manu Sporny’s piece “The False Choice of Schema.org.” Sporny details the choices that will be lost by adopting this model. RDFa and Microformats would become unsupported, unnecessarily narrowing developer choice to Microdata only. The stated advantages of reducing complexity do not outweigh the losses:

Those [RDFa] features aren’t just there to be purely complex – they were specifically requested by the Web community when building RDFa. Microdata is lacking many of those community-requested features, which does make it simpler, but it also makes it so that it doesn’t solve the problems that the ‘complex’ features were designed for. RDFa is designed to solve a wider range of problems than just those of the search companies. Yes, complexity is bad – but so is cutting features that the Web community has specifically requested and needs to make structured data on the Web everything that it can be.
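For readers who have not seen the two syntaxes side by side, here is a rough Python sketch. The two HTML snippets are hypothetical and mark up the same fact, once with Microdata attributes and once with RDFa 1.1 style attributes. The `PropertyCollector` class is a toy written for this illustration, not a conforming parser for either syntax.

```python
from html.parser import HTMLParser

# Hypothetical snippets marking up the same fact in each syntax.
MICRODATA = '<div itemscope itemtype="http://schema.org/Person"><span itemprop="name">Jane Doe</span></div>'
RDFA = '<div vocab="http://schema.org/" typeof="Person"><span property="name">Jane Doe</span></div>'


class PropertyCollector(HTMLParser):
    """Toy collector of (property, text) pairs for one attribute name."""

    def __init__(self, attr_name):
        super().__init__()
        self.attr_name = attr_name
        self.pending = None  # property name awaiting its text content
        self.found = []

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get(self.attr_name)
        if value:
            self.pending = value

    def handle_data(self, data):
        if self.pending and data.strip():
            self.found.append((self.pending, data.strip()))
            self.pending = None


def extract(markup, attr_name):
    collector = PropertyCollector(attr_name)
    collector.feed(markup)
    collector.close()
    return collector.found


print(extract(MICRODATA, "itemprop"))
print(extract(RDFA, "property"))
```

For a simple case like this one, the two syntaxes carry the same information. Sporny's argument concerns the harder cases, which this sketch does not attempt to cover.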

Because business success today depends so much on search ranking, few businesses are likely to resist the changes once in place. It’s possible, though, that enough protest now will cause the bosses to rethink their edict. As Sporny declares, “this is not how we do things on the Web.”

We also recall that Google has some serious standards horsepower working in the Googleplex. Is it possible that Google wants to move more quickly than standard practice allows? Worth watching.

Stephen E Arnold, June 10, 2011


Dieselpoint: Described in a Fuzzy Manner

April 29, 2011

The MartinButler Research “fact with opinion” piece “Dieselpoint” states:

“Dieselpoint is something of a Porsche in the Enterprise Search space. It is very fast, well-engineered, doesn’t carry much excess weight, and its text based searching technology can be made to satisfy almost any search requirement.”

Though the Porsche comparison is somewhat unconventional, it makes the company sound as if it deserves a closer look. At first glance the Dieselpoint Web site seems routine, but a closer look shows that it lists no news or events from the last several years, even though the company claims to be a leader in its field. The article says some great things about Dieselpoint but ultimately leaves more questions than answers. Questions such as “What type of system does Dieselpoint offer?” and “What prices and options does the company offer?” come up. With more questions than answers, it may be that this “Porsche” is parked on the shoulder of the information superhighway.

Check out our Overflight profile of Dieselpoint. Quiet seems it.

Stephen E Arnold, April 29, 2011


Content Not One Dimensional

April 11, 2011

In the digital age, it is no longer enough to advertise your content through a traditional media outlet such as a newspaper or radio. For your content to be used and your message to be heard, you have to use many different channels. To expand their digital awareness and offerings, many content providers are recycling existing content to create new products. The result of such recycling is a “mash-up” of many different content retrieval systems. Done correctly, this “mash-up” can create consistency between the outlets that use the information. This is where XML comes in.

XML stands for Extensible Markup Language. It functions somewhat like HTML, but while HTML is designed to display data, XML is designed to carry data, using self-descriptive tags that the user defines. While XML doesn’t actually DO anything, it is a useful tool for storing and transporting information for widespread use and a complement to HTML.
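A small Python sketch shows what “user-defined, self-descriptive tags” look like in practice. The tag names (`article`, `headline`, `pubdate`, `body`) are invented for this example and do not come from any standard schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical content item; the tag names are ours, not from a standard schema.
article = ET.Element("article", attrib={"id": "a1"})
ET.SubElement(article, "headline").text = "Content Not One Dimensional"
ET.SubElement(article, "pubdate").text = "2011-04-11"
ET.SubElement(article, "body").text = "XML carries data; HTML displays it."

# Serialize for storage or transport.
xml_text = ET.tostring(article, encoding="unicode")

# Any consumer can read the fields back by name.
parsed = ET.fromstring(xml_text)
headline = parsed.findtext("headline")

print(xml_text)
print(headline)
```

Nothing here says how the article should look on screen. That job is left to HTML or to whatever application consumes the data.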

Now you may be wondering why you should care about something as small as XML. Here’s the answer: XML combined with HTML is what makes the mash-up of existing content possible. Without well-structured XML, it is very difficult to transfer information from one platform, such as an eReader, to another, such as a cell phone application.

XML makes it possible to move PDFs into design software such as InDesign and Quark, it is what allows you to download books to your Kindle or Nook in an instant, and it is what helps a Google search return the correct results for your query. XML is a cross-directional language that makes it easier for users to access the product. The easier and more accessible a product is, the more money it is going to generate and, let’s face it, that’s what you really want to know…isn’t it?

Stephen E Arnold, April 11, 2011

