Handy Dandy XML Gizmo

February 19, 2011

Coding takes hours and is a thankless task. XML is one of the worst coding languages, especially when you have to convert to CSV. You know it will take just as long when you have to create other transformations. My Content Builder offers a way to do the job “with Speed and Accuracy.”

“Advanced XML Converter is an converter utility that solves a typical problem many users have when they want to export XML to CSV. Quickly and automatically, XML converter export XML to HTML, XML to CSV, XML to DBF, XML to XLS and XML to SQL – all with much accuracy and in seconds.”

According to the story, the conversion is as easy as it sounds. You take a file, press the convert button, and it is exported and saved to your computer. There are also custom export parameters and batch conversion. If this is an everyday tribulation for you, Advanced XML Converter is worth a closer look.
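For readers who would rather script the job than press a button, here is a minimal sketch of an XML-to-CSV step in Python. It does not show how Advanced XML Converter works internally. The element names (`catalog`, `item`), the sample data, and the function `xml_to_csv` are assumptions made up for illustration, and the sketch only handles flat, repeating records.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample input: a flat list of repeating <item> records.
SAMPLE_XML = """
<catalog>
  <item><sku>A1</sku><name>Widget</name><price>9.99</price></item>
  <item><sku>B2</sku><name>Gadget, large</name><price>19.50</price></item>
</catalog>
"""


def xml_to_csv(xml_text, record_tag):
    """Flatten each <record_tag> element into one CSV row.

    Column names are the child tag names, in order of first appearance.
    """
    root = ET.fromstring(xml_text)
    records = [
        {child.tag: (child.text or "").strip() for child in record}
        for record in root.iter(record_tag)
    ]
    # Collect column names across all records, preserving first-seen order.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # missing fields are written as empty cells
    return out.getvalue()


if __name__ == "__main__":
    print(xml_to_csv(SAMPLE_XML, "item"))
```

The `csv` module handles quoting, so a value that contains a comma stays in one column. Nested elements and attributes would need extra handling, which is presumably where a dedicated converter earns its keep.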

Whitney Grace, February 19, 2011

Freebie

Written by Stephen E. Arnold · Filed Under News, Technology, XML | Comments Off on Handy Dandy XML Gizmo

XML Carnage

January 31, 2011

We noted “Learning from our Mistakes: The Failure of OpenID, AtomPub, and XML on the Web.” What caught our attention was this statement:

So next time you’re evaluating a technology that is being much hyped by the web development blogosphere, take a look to see whether the fundamental assumptions that led to the creation of the technology actually generalize to your use case. An example that comes to mind that developers should consider doing with this sort of evaluation given the blogosphere hype is NoSQL.

The article points out that the enthusiasm for OpenID, AtomPub, and XML for “the Web” has cooled. What looks like the next big thing, I concluded, may not be.

What are the implications for search and content processing vendors?

For those who don’t know what the three technologies are or do, the answer is, “Not much.” Many vendors handle security, intakes, and formats via connectors. I wrote a for fee column about the importance of connectors, filters, and code widgets that make one outfit’s proprietary or tricky file formats easily tappable / importable by another vendor’s system. I know that you have been following the i2 Ltd. and Palantir legal hassle closely. If you haven’t, you can get some color in the stories in www.inteltrax.com and my for fee columns.

But, if you are a vendor who has a big investment in one or more of these technologies, the loss of “enthusiasm”—if the source article is accurate—could mean higher costs. Here’s why:

The marketing positioning and collateral will have to be adapted. Probably not a big deal in the pre-crash days, but now this is a cost and it can be a time sink. Not good when pressure for sales goes up each day. One vendor told me, “We’re really heads down.” No kidding. I don’t think it is work; I think it is survival. A marketing distraction is not a positive.
Credibility with some customers may be eroded. If you beat a drum for one or more of these three technologies, the client assumes that everyone likes the rhythm. Articles that suggest three “next big things” are really three day old brook trout may beg for air freshener.
Partners who often just buy the software vendors’ pitches have invested. Now those investments may not have the type of value one associates with certification from Microsoft or the sheer staying power of a wild and crazy push by IBM or Oracle. If partners bail out, recovery can be difficult in some markets.

Worth reading the article and thinking about its implications for search and content processing vendors. Might not ruffle your feathers; could tear off a wing.

Stephen E Arnold, January 31, 2011

Freebie

Written by Stephen E. Arnold · Filed Under Database, Marketing, News, Publishing, Technology, Text processing, XML | 1 Comment

MarkLogic and Change

January 10, 2011

Short honk: I read “You Say Goodbye, I Say Hello.” The write up by Dave Kellogg reported that he will be leaving MarkLogic. MarkLogic Corp. has gained considerable traction in publishing and a couple of other business sectors with the MarkLogic Server product. You can get more information about that product at this link. MarkLogic Server is a database built for unstructured information. Depending on the licensee’s use of MarkLogic Server, the resulting implementation can “look like” search, business intelligence, a custom-publishing system, or other information application.

In his blog post, Mr. Kellogg said:

I am proud of what we accomplished during my six years at the MarkLogic: acquiring over 200 enterprise customers, growing annual revenues at a 75% CAGR, raising $27.5M in venture capital, and growing the company from 40 to over 230 employees. I am particularly happy to say that I will be leaving the company in a position of strength, having exceeded the 2010 revenue plan targets and with nearly $20M cash in the bank.

What’s next for MarkLogic? We have been impressed with the MarkLogic technology for years. We will keep you posted.

Stephen E Arnold, January 10, 2011

Written by Stephen E. Arnold · Filed Under Business strategy, Database, Financial, News, Technology, Text processing, XML | 2 Comments

Word, Flawed as It Is, Embraces XML

December 2, 2010

I have fiddled with a number of editors that look like Microsoft Word or slap code on, over, and in Word to make it work like an XML editor. Sigh. Now, Word with its wacky automatic features has another function tossed in its 1990 retrorod truck bed.

“IXIASOFT Integrates Quark XML Author with its DITA CMS Solution” announces that Quark has partnered with IXIASOFT to integrate Quark XML Author for Microsoft Word into their content management solution. This will make it easy for anyone using Microsoft Word to edit or create content in XML. In short, “The combined solution improves cross-departmental collaboration on the production of technical documentation by making it possible for non-technical subject matter experts to create structured content without using complicated DITA or XML editors.” This solution will make life easier for the teams of people already using MS Word. Microsoft may score a big hit with this one. There you go. Love that autonumbering too.

Alice Wasielewski, December 2, 2010

Written by Stephen E. Arnold · Filed Under Business strategy, News, Publishing, Technology, XML | 1 Comment

Access Innovations Aligns with MarkLogic

November 22, 2010

Looks like MarkLogic users are in luck. “Access Innovations Announces a New Series of Enhancements for Data Harmony Suite For MarkLogic Server Users” explains to us exactly that, and just in time for the MarkLogic Government Summit on November 17, 2010, in Washington, DC.

Like other semantic platforms, the Data Harmony tools create and integrate metadata based on controlled vocabulary, but it’s Access Innovations’ trademarked Machine Aided Indexer (M.A.I.), adding a human element to the search process, that sets the company apart. After 31 years of experience, it isn’t surprising they’ve found a way to give customers a productivity boost in the realm of 700 percent.

A few of the latest enhancements discussed in the article include:

Improving website navigation by enabling users to browse the full “tree” of broader, narrower, and related terms in the taxonomy;
Unlocking the value of “long-tail” content by increasing discovery within deep archives;
Enabling visualization and interactive functionality, including enriched “Tag Clouds” – a map of concepts extracted from a set of documents – a gateway to further discovery; and
Providing users with easy-to-use current awareness tools that alert them to new content in their specific areas of interest.

Access Innovations is not kidding around. Next week at the summit, varied representatives of Government agencies will be privy to these improvements too. There’s little doubt this will be the first step in implementation across the board for existing customers.

Sarah Rogers, November 24, 2010

Written by Stephen E. Arnold · Filed Under Business strategy, News, Text processing, XML | Comments Off on Access Innovations Aligns with MarkLogic

JustSystems Expresses Its Love for XML

July 26, 2010

JustSystems, now a unit of Keyence Corporation, posted “Beyond PDFs – Reach Your Audience with Multiple Output Formats” to pump up excitement for XML. The goslings and I love XML. We even have one or two clients who think XML is the way to handle content, not a “bohica”. The hitch seems to be getting legacy content into well formed XML without plunging the information technology department’s budget into the red inkwell.

According to one of my correspondents, the main point is:

“XML, and particularly the Darwin Information Typing Architecture (DITA) XML language, enables you to optimize content for different media. Because an XML-based system handles formatting and content separately, it lets you create one set of source files and then generate PDF files for print and HTML files for the web.” Background: “PDF is a print-oriented format, and what works for print often doesn’t work for the web, for mobile devices, or for other electronic media. PDF is not the answer to every content delivery question.”
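The single-source idea in that passage can be sketched in a few lines of Python. This is an illustration under loud assumptions, not DITA and not JustSystems’ tooling: the `topic`, `title`, and `p` elements are a made-up, stripped-down stand-in for a real topic, the function names are hypothetical, and plain text stands in for PDF because the standard library has no PDF writer.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, heavily simplified topic; real DITA topics carry far more structure.
TOPIC_XML = """<topic>
  <title>Setup &amp; Use</title>
  <p>Unpack the archive.</p>
  <p>Run the installer.</p>
</topic>"""


def load_topic(xml_text):
    """Parse the source once and return content with no formatting attached."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    paragraphs = [p.text or "" for p in root.findall("p")]
    return title, paragraphs


def to_html(title, paragraphs):
    """Render the content for the web."""
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return f"<h1>{escape(title)}</h1>{body}"


def to_text(title, paragraphs):
    """Render the same content as plain text (standing in for a print format)."""
    return "\n".join([title.upper(), ""] + paragraphs)


if __name__ == "__main__":
    title, paragraphs = load_topic(TOPIC_XML)
    print(to_html(title, paragraphs))
    print(to_text(title, paragraphs))
```

The content is parsed once and each output format gets its own small renderer, so adding a new delivery format means adding a function, not rewriting the source files.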

The article has a number of useful links, including the pointer to DITA on Wikipedia, which I can never remember. The run down of output features may be useful if you don’t think of information objects assembled into “documents.”

Worth a look.

Stephen E Arnold, July 26, 2010

Freebie

Written by Stephen E. Arnold · Filed Under Database, News, Publishing, Technology, XML | Comments Off on JustSystems Expresses Its Love for XML

XML Tangled in Its Knickers?

July 7, 2010

eHow’s “Disadvantages of an XML Database” is going to toss some tinder into the XML campers’ summer outing. The goslings and I love XML. We find that it is sufficiently complex to make some clients weep for joy when we perform some scripting magic. However, some organizations find XML too much hassle, preferring to deal with the oddities of Codd databases and the familiar costs of scaling technology with four decades strapped around its ample waist. What are the weaknesses the eHow write up spells out? Let me highlight three and you can navigate to the original write up for the full scoop.

The author, Tamara Wilhite, an eHow Contributing Writer, asserts that XML is a quasi loser due to:

Performance issues. Hmmm. Interesting because I think Codd databases have some performance challenges as well. Perhaps some head to head performance metrics would bolster eHow’s argument. The assertion is offered without much foundation in my opinion.
XML Database Conversion. Yep, extract, transform, and load is tough. The problem is that ETL and file conversion headaches are not limited to XML implementations. Like performance, there are a number of links in the conversion chain, and the write up does not tell me enough to feel comfortable with this assertion.
Security. I am not sure if XML itself is a security problem. Perhaps some color would help me understand this point.

In short, eHow has generated a write up that will get clicks, but it won’t change my view of XML. Not enough meat for this spider food to be filling. Now who owns eHow? Is it Demand Media? Thoughts?

Stephen E Arnold, July 7, 2010

Freebie

Written by Stephen E. Arnold · Filed Under Database, News, Technology, XML | 2 Comments

Business Intelligence: Optimism and Palantir

June 28, 2010

Business intelligence is in the news. Memex, the low profile UK outfit, sold to SAS. Kroll, another low profile operation, became part of Altegrity, another organization with modest visibility among the vast sea of online experts. Now Palantir snags $90 million, which I learned in “Palantir: the Next Billion Dollar Company Raises $90 Million.” In the post financial meltdown world, there is a lot of money looking for a place that can grow more money. The information systems developed for serious intelligence analysis seem to be a better bet than funding another Web search company.

Palantir has some ardent fans in the US defense and intelligence communities. I like the system as well. What is fascinating to me is that smart money believes that there is gold in them there analytics and visualizations. I don’t doubt for a New York minute that some large commercial organizations can do a better job of figuring out the nuances in their petabytes of data with Palantir-type tools. But Palantir is not exactly Word or Excel.

The system requires an understanding of such nettlesome points as source data, analytic methods, and – yikes – programmatic thinking. The outputs from Palantir are almost good enough for General Stanley McChrystal to get another job. I have seen snippets of some really stunning presentations featuring Palantir outputs. You can see some examples at the Palantir Web site or take a gander (no pun intended by the addled goose) at the image below:

Palantir is an open platform; that is, a licensee with some hefty coinage in their knapsack can use Palantir to tackle the messy problem of data transformation and federation. The approach features dynamic ontologies, which means that humans don’t have to do as much heavy lifting as required by some of the other vendors’ systems. A licensee will want to have a tame rocket scientist around to deal with the internals of pXML, the XML variant used to make Palantir walk and talk.

You can poke around at these links which may go dark in a nonce, of course: https://devzone.palantirtech.com/ and https://www.palantirtech.com/.

Several observations:

The system is expensive and requires headcount to operate in a way that will deliver satisfactory results under real world conditions.
Extensibility is excellent, but this work is not for a desk jockey no matter how confident that person is in his undergraduate history degree and Harvard MBA.
The approach is industrial strength which means that appropriate resources must be available to deal with data acquisition, system tuning, and programming the nifty little extras that are required to make next generation business intelligence systems smarter than a grizzled sergeant with a purple heart.

Can Palantir become a billion dollar outfit? Well, there is always the opportunity to pump in money, increase the marketing, and sell the company to a larger organization with Stone Age business intelligence systems. If Oracle wanted to get serious about XML, Palantir might be worth a look. I can name some other candidates for making the investors’ day, but I will leave those to your imagination. Will you run your business on a Palantir system in the next month or two? Probably not.

Stephen E Arnold, June 27, 2010

Freebie

Written by Stephen E. Arnold · Filed Under Business intelligence, Database, Enterprise, Financial, News, Online (general), Technology, Text analytics, visualization, XML | 1 Comment

XSLT 2.1 Draft Available

May 14, 2010

Short honk: I don’t want to cover too much programming information in this blog. I do want to document important developments. Navigate to “First Draft of XSL Transformations (XSLT) Version 2.1 Draft Published”. With XSLT, a programmer can perform some interesting manipulations of XML content. More info here.

Stephen E Arnold, May 14, 2010

Freebie.

Written by Stephen E. Arnold · Filed Under News, Text processing, XML | Comments Off on XSLT 2.1 Draft Available

ZyLAB, SharePoint, and XML Content Archiving

May 9, 2010

ZyLAB has been a frequent visitor to my newsreader in the last week or so. The company is hopping on the rich media bandwagon with podcasts. That’s okay, but I am not a rich media goose. The idea of a serial information intake session is not too appealing to this old waterfowl. I leave the videos to the much smarter, more agile wizards, the new masters of the financially-challenged universe.

What did catch my attention was a news item in German called “Microsoft SharePoint-Paket von ZyLAB unterstützt jetzt auch Wikis und Blogs”. The idea is that ZyLAB’s technology and Microsoft SharePoint mesh together. Of particular interest to me is that the ZyLAB product now supports wikis and blogs. ZyLAB has nosed into the XML space as well with its storage service. With ZyLAB an already happy SharePoint customer will be able to extend that goodness with:

Search scanned documents in different languages.
Tap the benefits of XML storage of SharePoint content.
Eliminate the need for additional SQL Server licenses.

One interesting feature is that in eDiscovery some SharePoint documents can go missing. The ZyLAB system can create a SharePoint archive with a comprehensive content set.

More information is available at www.zylab.com.

Stephen E Arnold, May 9, 2010

Unsponsored post.

Written by Stephen E. Arnold · Filed Under Microsoft, News, Search, SharePoint, XML | Comments Off on ZyLAB, SharePoint, and XML Content Archiving


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.
