A New and Improved Content Delivery System

September 7, 2017

Personalized content delivery is the name of the game in PRWEB’s “Flatirons Solutions Launches XML DITA Dynamic Content Delivery Solutions.” Flatirons Solutions, a leading XML-based publishing and content management company, recently released its Dynamic Content Delivery Solution. The solution uses XML-based technology that allows enterprises to deliver more personalized content, and it is advertised as reducing publishing and support costs. The new solution is built on the MarkLogic Server.

By partnering with Mark Logic and incorporating their industry-leading XML content server, the solution conducts powerful queries, indexing, and personalization against large collections of DITA topics. For our clients, this provides immediate access to relevant information, while producing cost savings in technical support, and in content production, maintenance, review and publishing. So whether they are producing sales, marketing, technical, training or help documentation, clients can step up to a new level of content delivery while simultaneously improving their bottom line.

The Dynamic Content Delivery Solution is designed for government agencies and enterprises that publish XML content to various platforms and formats.  Mark Logic is touted as a powerful tool to pool content from different sources, repurpose it, and deliver it to different channels.

MarkLogic finds success in its core use case: slicing and dicing for publishing.  It is back to the basics for them.

Whitney Grace, September 7, 2017


Factoids about Toutiao: Smart News Filtering Service

August 28, 2017

The filtering service Toutiao is operated by Bytedance. The company attracted attention because it is generating money (allegedly) and has lots of users, with “daily average users” in the 120 million range. (If you are acronym minded, the daily average user count is a DAU. Holy DAU!)

Forget Google’s “translate this page” for Toutiao; the service is blind to Toutiao’s content. A workaround is to cut and paste snippets into FreeTranslations.org or to get someone who reads Chinese to explain what is on Toutiao’s pages.

Other items of interest include the following. (Oh, the hyperlinks point to the source of each factoid.)

  • $900 million in revenue (allegedly). Wall Street Journal, August 28, 2017, with a pay wall for your delectation
  • Funding of $3 billion. Crunchbase
  • Valuation of $20 billion or more. Reuters
  • Toutiao means “headlines.” Wikipedia
  • What it does, from Wikipedia:

Toutiao uses algorithms to select different quality content for individual users. It has created algorithmic models that understand information (text, images, videos, comments, etc.) in depth, and developed large-scale machine learning systems for personalized recommendation that surface content users have not necessarily signaled preference for yet. Using Natural Language Processing and Computer Vision technologies in AI, Toutiao extracts hundreds of entities and keywords as features from each piece of content. When a user first opens the app, Toutiao makes a preliminary recommendation based on the operating system of his mobile device, his location, and other factors. With users’ interactions with the app, Toutiao fine-tunes its models and makes better recommendations.

  • Founded by Zhang Yiming, age 34, in 2012 Reuters
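The loop Wikipedia describes — extract keyword features from each item, score candidates against a user profile, and refine the profile with each interaction — can be sketched in a few lines of Python. This is a toy content-based recommender; the names and the bag-of-words “feature extraction” are illustrative, not Toutiao’s actual system:

```python
from collections import Counter
import math

def features(text):
    # Toy "entity/keyword extraction": a bag of lowercased words.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse feature vectors.
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(profile, articles):
    # Rank candidate articles by similarity to the user's profile vector.
    return max(articles, key=lambda art: cosine(profile, features(art)))

def update_profile(profile, article, weight=1):
    # Fine-tune the profile with each interaction (click, read, share).
    for word, count in features(article).items():
        profile[word] += weight * count

profile = Counter()
update_profile(profile, "mobile phone review camera battery")
articles = ["new phone camera tested", "stock market rises today"]
print(recommend(profile, articles))  # "new phone camera tested"
```

A production system would of course use learned embeddings and real entity extraction rather than word counts, but the cold-start step (recommend from device and location before any clicks) and the feedback loop follow the same shape.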

Technode’s “Why Is Toutiao, a News App, Setting Off Alarm Bells for China’s Giants?” suggests that Toutiao may be the next big Chinese online success. The reason is that the service aggregates “news” from disparate content sources; for example, text, video, images, and data.

Toutiao may be the next big thing in algorithmic, mobile-centric information access solutions. The company generates revenues from online ads. The company’s secret sauce includes smart software plus some extra ingredients:

  • Social functions
  • Search
  • Video
  • User generated “original” content
  • Global plans

Net net: Worth watching.

Stephen E Arnold, August 28, 2017

Smartlogic: A Buzzword Blizzard

August 2, 2017

I read “Semantic Enhancement Server.” Interesting stuff. The technology struck me as a cross between indexing, good old enterprise search, and assorted technologies. Individuals who are shopping for an automatic indexing system (either one with expensive, time-consuming, hand-coded rules or a more Autonomy-like automatic approach) will want to kick the tires of the Smartlogic system. In addition to the echoes of the SchemaLogic approach, I noted a Thompson submachine gun firing buzzwords; for example:

best bets (I’m feeling lucky?)
dynamic summaries (like Island Software’s approach in the 1990s)
faceted search (hello, Endeca?)
model
navigator (like the Siderean “navigator”?)
real time
related topics (clustering like Vivisimo’s)
semantic (of course)
taxonomy
topic maps
topic pages (a Google report as described in US29970198481)
topic path browser (aka breadcrumbs?)
visualization

What struck me after I compiled this list about a system that “drives exceptional user search experiences” was that Smartlogic is repeating the marketing approach of traditional vendors of enterprise search. The marketing lingo and “one size fits all” triggered thoughts of Convera, Delphes, Entopia, Fast Search & Transfer, and Siderean Software, among others.

I asked myself:

Is it possible for one company’s software to perform such a remarkable array of functions in a way that is easy to implement, affordable, and scalable? There are industrial strength systems which perform many of these functions. Examples range from BAE’s intelligence system to the Palantir Gotham platform.

My hypothesis is that Smartlogic might struggle to process a real-time flow of WhatsApp messages, YouTube content, and intercepted mobile phone voice calls. Toss in the multi-language content which is becoming increasingly important to enterprises, and the notional balloon I am floating says, “Generating buzzwords and associated overinflated expectations is really easy. Delivering high-accuracy, affordable, and scalable content processing is a bit more difficult.”

Perhaps Smartlogic has cracked the content processing equivalent of the Voynich manuscript.


Will buzzwords crack the Voynich manuscript’s inscrutable text? What if Voynich is a fake? How will modern content processing systems deal with this type of content? Running some content processing tests might provide some insight into systems which possess Watson-esque capabilities.

What happened to those vendors like Convera, Delphes, Entopia, Fast Search & Transfer, and Siderean Software, among others? (Free profiles of these companies are available at www.xenky.com/vendor-profiles.) Oh, that’s right. The reality of the marketplace did not match the companies’ assertions about technology. Investors and licensees of some of these systems were able to survive the buzzword blizzard. Some became the digital equivalent of Ötzi, the 5,300-year-old iceman.

Stephen E Arnold, August 2, 2017

Academic Publisher Retracts Record Number of Papers

June 20, 2017

To the scourge of fake news we add the problem of fake research. Retraction Watch announces “A New Record: Major Publisher Retracting More Than 100 Studies from Cancer Journal over Fake Peer Reviews.” We learn that Springer has just retracted 107 papers from a single journal after discovering their peer reviews had been falsified. Faking the integrity of cancer research? That’s pretty low. The article specifies:

To submit a fake review, someone (often the author of a paper) either makes up an outside expert to review the paper, or suggests a real researcher — and in both cases, provides a fake email address that comes back to someone who will invariably give the paper a glowing review. In this case, Springer, the publisher of Tumor Biology through 2016, told us that an investigation produced “clear evidence” the reviews were submitted under the names of real researchers with faked emails. Some of the authors may have used a third-party editing service, which may have supplied the reviews. The journal is now published by SAGE. The retractions follow another sweep by the publisher last year, when Tumor Biology retracted 25 papers for compromised review and other issues, mostly authored by researchers based in Iran.

The article shares Springer’s response to the matter, some from its official statement and some from a spokesperson. For example, we learn the company cut ties with the Tumor Biology owners, and that the latest fake reviews were caught during a process put in place after that earlier debacle. See the story for more details.

Cynthia Murrell, June 20, 2017

Algorithms Are Getting Smarter at Identifying Human Behavior

June 19, 2017

Algorithms deployed by large tech firms are getting better at understanding human behavior, reveals a former Google data scientist.

In an article published by Business Insider titled “A Former Google Data Scientist Explains Why Netflix Knows You Better Than You Know Yourself,” Seth Stephens-Davidowitz says:

Many gyms have learned to harness the power of people’s over-optimism. Specifically, he said, “they’ve figured out you can get people to buy monthly passes or annual passes, even though they’re not going to use the gym nearly enough to warrant this purchase.”

Companies like Netflix use this to their benefit. For instance, in its early years, Netflix encouraged users to create playlists. However, most users ended up watching the same run-of-the-mill content. Netflix thus made changes and started recommending content similar to what users had actually watched. It proves one thing: algorithms are getting smarter at understanding and predicting human behavior, and that is both good and bad.

Vishal Ingole, June 19, 2017

U.S. Government Keeping Fewer New Secrets

February 24, 2017

We have good news and bad news for fans of government transparency. In their Secrecy News blog, the Federation of American Scientists reports, “Number of New Secrets in 2015 Near Historic Low.” Writer Steven Aftergood explains:

The production of new national security secrets dropped precipitously in the last five years and remained at historically low levels last year, according to a new annual report released today by the Information Security Oversight Office.

There were 53,425 new secrets (‘original classification decisions’) created by executive branch agencies in FY 2015. Though this represents a 14% increase from the all-time low achieved in FY 2014, it is still the second lowest number of original classification actions ever reported. Ten years earlier (2005), by contrast, there were more than 258,000 new secrets.

The new data appear to confirm that the national security classification system is undergoing a slow-motion process of transformation, involving continuing incremental reductions in classification activity and gradually increased disclosure. …

Meanwhile, ‘derivative classification activity,’ or the incorporation of existing secrets into new forms or products, dropped by 32%. The number of pages declassified increased by 30% over the year before.

A marked decrease in government secrecy—that’s the good news. On the other hand, the report reveals some troubling findings. For one thing, costs are not going down alongside classifications; in fact, they rose by eight percent last year. Also, response times to mandatory declassification requests (MDRs) are growing, leaving over 14,000 such requests to languish for over a year each. Finally, fewer newly classified documents carry the “declassify in ten years or less” specification, which means fewer items will become declassified automatically down the line.

Such red-tape tangles notwithstanding, the reduction in secret classifications does look like a sign that the government is moving toward more transparency. Can we trust the trajectory?

Cynthia Murrell, February 24, 2017

Investment Group Acquires Lexmark

February 15, 2017

We read with some trepidation the Kansas City Business Journal’s article, “Former Perceptive’s Parent Gets Acquired for $3.6B in Cash.” The parent company referred to here is Lexmark, which bought up one of our favorite search systems, ISYS Search, in 2012 and placed it under its Perceptive subsidiary, based in Lenexa, Kansas. We do hope this valuable tool is not lost in the shuffle.

Reporter Dora Grote specifies:

A few months after announcing that it was exploring ‘strategic alternatives,’ Lexmark International Inc. has agreed to be acquired by a consortium of investors led by Apex Technology Co. Ltd. and PAG Asia Capital for $3.6 billion cash, or $40.50 a share. Legend Capital Management Co. Ltd. is also a member of the consortium.

Lexmark Enterprise Software in Lenexa, formerly known as Perceptive Software, is expected to ‘continue unaffected and benefit strategically and financially from the transaction’ the company wrote in a release. The Lenexa operation — which makes enterprise content management software that helps digitize paper records — dropped the Perceptive Software name for the parent’s brand in 2014. Lexmark, which acquired Perceptive for $280 million in cash in 2010, is a $3.7 billion global technology company.

If the Lexmark Enterprise Software (formerly known as Perceptive) division will be unaffected, it seems they will be the lucky ones. Grote notes that Lexmark has announced that more than a thousand jobs are to be cut amid restructuring. She also observes that the company’s buildings in Lenexa have considerable space up for rent. Lexmark CEO Paul Rooke is expected to keep his job, and headquarters should remain in Lexington, Kentucky.

Cynthia Murrell, February 15, 2017

How to Quantify Culture? Counting the Bookstores and Libraries Is a Start

February 7, 2017

The article titled “The Best Cities in the World for Book Lovers” on Quartz conveys data collected by the World Cities Culture Forum. That organization works to facilitate research and promote cultural endeavors around the world. And what could be a better measure of a city’s culture than its books? The article explains how the data collection works:

Led by the London mayor’s office and organized by UK consulting company Bop, the forum asks its partner cities to self-report on cultural institutions and consumption, including where people can get books. Over the past two years, 18 cities have reported how many bookstores they have, and 20 have reported on their public libraries. Hong Kong leads the pack with 21 bookshops per 100,000 people, though last time Buenos Aires sent in its count, in 2013, it was the leader, with 25.

New York sits comfortably in sixth place, but London, surprisingly, is near the bottom of the ranking with roughly 360 bookstores. Another measure the WCCF uses is libraries per capita. Edinburgh of all places surges to the top without any competition. New York is the only US city to even make the cut with an embarrassing 2.5 libraries per 100K people. By contrast, Edinburgh has 60.5 per 100K people. What this analysis misses out on is the size and beauty of some of the bookstores and libraries of global cities. To bask in these images, visit Bookshelf Porn or this Mental Floss ranking of the top 7 gorgeous bookstores.

Chelsea Kerwin, February 7, 2017

JustOne: When a Pivot Is Not Possible

February 4, 2017

CopperEye hit my radar when I did a project for the now-forgotten Speed of Mind search system. CopperEye delivered high speed search in a patented hierarchical data management system. The company snagged some In-Q-Tel interest in 2007, but by 2009, I lost track of the company. Several of the CopperEye senior managers teamed to create the JustOne database, search and analytic system. One of the new company’s inventions is documented in “Apparatus, Systems, and Methods for Data Storage and/or Retrieval Based on a Database Model-agnostic, Schema-Agnostic, and Workload-Agnostic Data Storage and Access Models.” If you are into patent documents about making sense of Big Data, you will find US20140317115 interesting. I will leave it to you to determine if there is any overlap between this system and method and those of the now low profile CopperEye.

Why would In-Q-Tel get interested in another database? From my point of view, CopperEye was interesting because:

  1. The system and method was ideal for finding information from large collections of intercept information
  2. The tech whiz behind the JustOne system wanted to avoid “band-aid” architectures; that is, software shims, wrappers, and workarounds that other data management and information access systems generated like rabbits
  3. The method of finding information achieved or exceeded the performance of the very, very snappy Speed of Mind system
  4. The system sidestepped a number of the problems which plague Oracle-style databases trying to deal with floods of real time information from telecommunication traffic, surveillance, and Internet of Things transmissions or “emissions.”

How important is JustOne? I think the company is one of those outfits which has a better mousetrap. Unlike the champions of XML, JustOne uses JSON and other “open” technologies. In fact, a useful version of the JustOne system is available for download from the JustOne Web site. Be aware that the name “JustOne” is in use by other vendors.


The fragmented world of database and information access. Source: Duncan Pauly

A good, but older, write-up explains some of the strengths of the JustOne approach to search and retrieval, couched in the lingo of the database world. The key points from “The Evolution of Data Management” strike me as helpful in understanding why Jerry Yang and Scott McNealy invested in the CopperEye veterans’ start-up. I highlighted these points:

  • Databases have to be operational and analytical; that is, storing information is not enough
  • Transaction rates are high; that is, real time flows from telecommunications activity
  • Transaction size varies from the very small to hefty; that is, the opposite of the old school records associated with old school IBM IMS system
  • High concurrency; that is, more than one “thing” at a time
  • Dynamic schema and query definition
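The last two bullets — high concurrency with dynamic schema and query definition — are the schema-agnostic angle the patent title emphasizes. A toy Python sketch (purely illustrative; this is not JustOne’s implementation) of a store that accepts JSON documents of any shape and lets each query define its own predicate on the fly:

```python
import json

class SchemaAgnosticStore:
    # Hypothetical sketch: accept documents of any JSON shape, declare no
    # schema up front, and let each query supply its own predicate.
    def __init__(self):
        self.docs = []

    def insert(self, doc):
        # Round-trip through JSON to enforce "any JSON-serializable shape".
        self.docs.append(json.loads(json.dumps(doc)))

    def query(self, predicate):
        # Dynamic query definition: the caller decides what fields matter.
        return [d for d in self.docs if predicate(d)]

store = SchemaAgnosticStore()
store.insert({"type": "call", "duration": 42})   # telecom-style record
store.insert({"type": "sms", "chars": 160})      # a differently shaped record
long_calls = store.query(
    lambda d: d.get("type") == "call" and d.get("duration", 0) > 30)
print(long_calls)  # [{'type': 'call', 'duration': 42}]
```

The point of the caricature: no table definition ever happened, yet records of different shapes coexist and remain queryable, which is the property an Oracle-style fixed schema struggles to deliver against real-time telecom and IoT flows.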

I highlighted this statement as suggestive:

In scaled-out environments, transactions need to be able to choose what guarantees they require – rather than enforcing or relaxing ACID constraints across a whole database. Each transaction should be able to decide how synchronous, atomic or durable it needs to be and how it must interact with other transactions. For example, must a transaction be applied in chronological order or can it be allowed out of time order with other transactions providing the cumulative result remains the same? Not all transactions need be rigorously ACID and likewise not all transactions can afford to be non-atomic or potentially inconsistent.
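The per-transaction choice of guarantees described in the quote can be sketched as an API in which each commit carries its own flags rather than inheriting one database-wide ACID setting. The names below are hypothetical; JustOne’s actual interface is not documented in this write-up:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Guarantees:
    # Each transaction declares its own requirements.
    synchronous: bool = True   # wait for the commit to be acknowledged?
    durable: bool = True       # flush to stable storage before acking?
    ordered: bool = True       # must it apply in chronological order?

class Store:
    def __init__(self):
        self.log = []

    def commit(self, txn_id, writes, g: Guarantees):
        self.log.append((txn_id, writes, g))
        if g.durable:
            self._fsync()  # pay the durability cost only when asked
        return "acked" if g.synchronous else "queued"

    def _fsync(self):
        pass  # stand-in for a real disk flush

db = Store()
print(db.commit(1, {"k": "v"}, Guarantees()))  # "acked"
print(db.commit(2, {"k2": "v2"},
                Guarantees(synchronous=False, durable=False)))  # "queued"
```

The design choice the quote argues for is exactly this granularity: a billing transaction can demand synchronous, durable, ordered application, while a sensor reading in the same database can trade those guarantees for throughput.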

My take on this CopperEye wind down and JustOne wind up is that CopperEye, for whatever management reason, was not able to pivot from where CopperEye was to where CopperEye had to be to grow. More information is available from the JustOne Database Web site at www.justonedb.com.

Is Duncan Pauly one of the most innovative engineers laboring in the database search sector? Could be.

Stephen E Arnold, February 4, 2017

HonkinNews for January 17, 2017 Now Available

January 17, 2017

This week’s HonkinNews takes a look at Yahoo’s post-Verizon name. No, our suggestion of yabba dabba hoo (or was it “hoot”?) was not ignored by Yahoo’s marketing wizards. We also highlight Alphabet Google’s erasure of two letters from its “alphabet.” Goners are “S” and “T”. Palantir is hiring a people-centric person. The fancy title may have an interesting spin. Two enterprise search vendors kick off 2017 with a blizzard of buzzwords. The depth of the cacophony is remarkable because search by any other name would return results with questionable precision and recall. The featured story is the Mitre Corporation’s Jason report. If you have an interest in artificial intelligence and warfighting, the report provides some insight into what the US Department of Defense may be considering. But the highlight of the unclassified document is a helpful description of Google’s TPU. The seven-minute program is at this link. For fans of XQuery, we have a bit of input for you too. Proprietary XQuery too. The program is produced in old-fashioned black and white and enhanced with theme music from the days of the Stutz Bearcat. From the hotbed of search and content processing, HonkinNews is different. We’re presenting information other big-time outfits ignore. Mitre is a variant of Massachusetts Institute of Technology Research. There you go. Live from Harrod’s Creek.

Kenny Toth, January 17, 2017
