IBM Watson Deep Learning: A Great Leap Forward

August 16, 2017

I read in the IBM marketing publication Fortune Magazine (oh, sorry, I meant the independent real business news outfit Fortune) the following article: “IBM Claims Big Breakthrough in Deep Learning.” (I know the write up is objective because the headline includes the word “claims.”)

The main point, that the IBM Watson super game winning thing can now do certain computational tasks more quickly, is mildly interesting. I noticed that one of our local tire discounters has a sale on a brand called Primewell. That struck me as more interesting than this IBM claim.

First, what’s the great leap forward the article touts? I highlighted this passage:

IBM says it has come up with software that can divvy those tasks among 64 servers running up to 256 processors total, and still reap huge benefits in speed. The company is making that technology available to customers using IBM Power System servers and to other techies who want to test it.

How many IBM Power 8 servers does it take to speed up Watson’s indexing? I learned:

IBM used 64 of its own Power 8 servers—each of which links both general-purpose Intel microprocessors with Nvidia graphical processors with a fast NVLink interconnection to facilitate fast data flow between the two types of chips

A couple of questions:

  1. How much does it cost to outfit 64 IBM Power 8 servers to perform this magic?
  2. How many Nvidia GPUs are needed?
  3. How many Intel CPUs are needed?
  4. How much RAM is required in each server?
  5. How much time does it require to configure, tune, and deploy the set up referenced in the article?

My hunch is that this set up is slightly more costly than buying a Chromebook or signing on for some Amazon cloud computing cycles. These questions, not surprisingly, are not of interest to the “real” business magazine Fortune. That’s okay. I understand that one can get only so much information from a news release, a PowerPoint deck, or a lunch. No problem.

The other thought that crossed my mind as I read the story was, “Does Fortune think that IBM is the only outfit using GPUs to speed up certain types of content processing?” Ah, well, IBM is probably so sophisticated that it is working on engineering problems that other companies cannot conceive of, let alone tackle.

Now the second point: Content processing to generate a Watson index is a bottleneck. However, the processing is what I call a downstream bottleneck. The really big hurdle for IBM Watson is the manual work required to set up the rules which the Watson system has to follow. Compared to the data crunching, training and rule making are the giant black holes of time and complexity. Fancy Dan servers don’t get to strut their stuff until the days, weeks, months, and years of setting up the rules are completed, tuned, and updated.

Fortune Magazine obviously considers this bottleneck of zero interest. My hunch is that IBM did not explain this characteristic of IBM Watson or the Achilles’ heel of figuring out the rules. Who wants to sit in a room with subject matter experts and three or four IBM engineers talking about what’s important, what questions are asked, and what data are required?

AskJeeves demonstrated decades ago that human crafted rules are Black Diamond ski runs. IBM Watson’s approach is interesting. But what’s fascinating is the uncritical acceptance of IBM’s assertions and the lack of interest in tackling substantive questions. Maybe lunch was cut short?

Stephen E Arnold, August 16, 2017

Tidy Text the Best Way to Utilize Analytics

August 10, 2017

Even though text mining is nothing new, natural language processing seems to be the hot new analytics craze. In an effort to understand the value of each, along with the differences, and (most importantly) how to use either efficiently, O’Reilly interviewed text miners Julia Silge and David Robinson to learn about their approach.

When asked what advice they would give those drowning in data, they replied,

…our advice is that adopting tidy data principles is an effective strategy to approach text mining problems. The tidy text format keeps one token (typically a word) in each row, and keeps each variable (such as a document or chapter) in a column. When your data is tidy, you can use a common set of tools for exploring and visualizing them. This frees you from struggling to get your data into the right format for each task and instead lets you focus on the questions you want to ask.
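Silge and Robinson work in R (their tidytext package), but the tidy text idea translates to a few lines of any language. Here is a minimal Python sketch, with invented sample documents, of one token per row with the document as a variable:

```python
from collections import Counter

# Invented sample "documents" for illustration.
documents = {
    "chapter1": "The tidy text format keeps one token per row",
    "chapter2": "Tidy data principles make text mining simpler",
}

# Tidy format: one row per token, with the document as a variable.
tidy_rows = [
    (doc_id, word.lower())
    for doc_id, text in documents.items()
    for word in text.split()
]

# Once the data is tidy, generic tools apply, e.g. counting words per document.
counts = Counter(tidy_rows)
```

With the data in this shape, the same counting, filtering, and visualization tools work for any question, which is precisely the point the authors make.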

The duo admits text mining and natural language processing overlap in many areas, but both are useful tools for different issues. They relegate text mining to statistical analysis and natural language processing to the relationship between computers and language. The difference may seem minute, but with data volumes exploding and companies drowning in data, such advice is crucial.

Catherine Lamsfuss, August 10, 2017

Palantir Technologies: Recycling Day Old Hash

July 31, 2017

I read “Palantir: The Special Ops Tech Giant That Wields As Much Real World Power as Google.” I noticed these hot buttons here:

“Special ops” for the Seal Team 6 vibe. Check.

“Wields” for the notion of great power. Check.

“Real world.” A reminder of the here and now, not an airy fairy digital wonkiness. Check.

“Google.” Yes. Palantir as potent as the ad giant Google. Check.

That’s quite a headline.

The write up itself is another journalistic exposé of software which ingests digital information and outputs maps, reports, and visualizations. Humans index too. Like the i2 Analyst Notebook, the “magic” is mostly external. Making these Fancy Dan software systems work requires computers, of course. Humans are needed too. Trained humans are quite important, essential, in fact.

The Guardian story seems to be a book review presented as a Gladwell-like revisionist anecdote. See, for example, Done: The Secret Deals That Are Changing Our World by Jacques Peretti (Hodder & Stoughton, £20). You can buy a copy from bookshop.theguardian.com. (Online ad? Maybe?)

Read the Palantir story which stuffed my Talkwalker alert with references to the article. Quite a few bloggers are recycling the Guardian newspaper story. Buzzfeed’s coverage of the Palo Alto company evoked the same reaction. I will come back to the gaps in these analyses in a moment.

The main point of the Guardian’s July 30, 2017, story strikes me as:

Palantir tracks everyone from potential terrorist suspects to corporate fraudsters…child traffickers, and what they refer to as subversives. But it is all done using prediction.

Right. Everyone! Potential terrorist suspects! And my favorite “all”. Using “prediction” no less.

Sounds scary. I am not sure the platforms work with the type of reliability that the word “all” suggests. But this is about selling books, not Palantir and similar companies’ functionality, statistical methods, or magical content processing. Confusing Hollywood with reality is easy today, at least for some folks.

Palantir licenses software to organizations. Palantir is an “it,” not a they. The company uses the lingo of its customers. “Subversives” is one term, but it is more suggestive in my opinion than “bad actor,” “criminal,” “suspect,” or “terrorist.” I think the word “tracks” is pivotal. Palantir’s professionals, like Pathfinder, look at deer tracks and nail the beastie. I want to point out that “prediction”—partly the Bayesian, Monte Carlo, and Markovian methods pioneered by Autonomy in the mid 1990s—is indeed used for certain processes. What’s omitted is that Palantir is just one company in the content processing and search and retrieval game. I am not convinced that its systems and methods are the best ones available today. (Check out Recorded Future, a Google and In-Q-Tel funded company, for some big league methods. And there are others.) In my CyberOSINT book and my Dark Web Notebook I identify about two dozen companies providing similar services. Palantir is one admittedly high profile example of next generation information access providers.

The write up does reveal at the end of the article that the Guardian is selling Jacque Peretti’s book. That’s okay. What’s operating under the radar is a write up that seems to be reporting but is, in the real world, a nifty book promotion.

In closing, the information presented in the write up struck me as a trifle stale. I am okay with collections of information that have been assembled to make it easy for a reader to get the gist of a system quickly. My Dark Web Notebook is a Cliff’s Notes about what one Tor executive suggests does not exist.

When I read about Palantir, I look for information about:

  • Technical innovations within Gotham and Palantir’s other “products”
  • Details about the legal dust up between i2 and Palantir regarding file formats, an issue which has some here and now relevance with the New York police department’s Palantir experience
  • Interface methods which are designed to make it easier to perform certain data analysis functions
  • Specifics about the data loading, file conversion, and pre-processing index tasks and how these impact timeliness of the information in the systems
  • Issues regarding data reconciliation when local installs lose contact with cloud resources within a unit and across units
  • Financial performance of the company as it relates to stock held by stakeholders and those who want the company to pursue an initial public offering
  • Specific differences among systems on offer from BAE, Textron, and others with regard to Palantir Gotham

Each time I read about Palantir these particular items seem to be ignored. Perhaps these are not sufficiently sexy or maybe getting the information is a great deal of work? The words “hash” and “rehash” come to my mind as one way to create something that seems filling but may be empty calories. Perhaps a “real journalist” will tackle some of the dot points. That would be more interesting than a stale reference to special effects in a star vehicle.

NB. I was an adviser to i2 Group Ltd., the outfit that created the Analyst’s Notebook.

Stephen E Arnold, July 31, 2017

ArnoldIT Publishes Technical Analysis of the Bitext Deep Linguistic Analysis Platform

July 19, 2017

ArnoldIT has published “Bitext: Breakthrough Technology for Multi-Language Content Analysis.” The analysis provides the first comprehensive review of the Madrid-based company’s Deep Linguistic Analysis Platform or DLAP. Unlike most next-generation multi-language text processing methods, Bitext has crafted a platform. The document can be downloaded from the Bitext Web site via this link.

Based on information gathered by the study team, the Bitext DLAP system outputs metadata with an accuracy in the 90 percent to 95 percent range. Most content processing systems today typically deliver metadata and rich indexing with accuracy in the 70 to 85 percent range.

According to Stephen E Arnold, publisher of Beyond Search and Managing Director of Arnold Information Technology:

“Bitext’s output accuracy establishes a new benchmark for companies offering multi-language content processing systems.”

The system performs, in near real time, more than 15 discrete analytic processes. The system can output enhanced metadata for more than 50 languages. The structured stream provides machine learning systems with a low cost, highly accurate way to learn. Bitext’s DLAP platform integrates more than 30 separate syntactic functions. These include segmentation and tokenization (word segmentation, frequency, and disambiguation, among others). The DLAP platform analyzes more than 15 linguistic features of content in any of the more than 50 supported languages. The system extracts entities and generates high-value data about documents, emails, social media posts, Web pages, and structured and semi-structured data.
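Bitext’s platform is proprietary, so none of the following comes from its API. It is only a toy illustration of two of the functions named above, sentence segmentation and tokenization, to show the shape of such a pipeline:

```python
import re

# Toy sentence segmentation: split after terminal punctuation.
def segment(text):
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

# Toy tokenization: lowercase word tokens only.
def tokenize(sentence):
    return re.findall(r"\w+", sentence.lower())

doc = "Bitext processes text. It outputs metadata."
sentences = segment(doc)
tokens = [tokenize(s) for s in sentences]
```

A production system layers disambiguation, lemmatization, and entity extraction on top of steps like these; the sketch shows only the skeleton.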

DLAP applications range from fraud detection to identifying nuances in streams of data; for example, the sentiment or emotion expressed in a document. Bitext’s system can output metadata and other information about processed content as a feed stream to specialized systems such as Palantir Technologies’ Gotham or IBM’s Analyst’s Notebook. Machine learning systems such as those operated by such companies as Amazon, Apple, Google, and Microsoft can “snap in” the Bitext DLAP platform.

Copies of the report are available directly from Bitext at https://info.bitext.com/multi-language-content-analysis, and information about Bitext is available at www.bitext.com.

Kenny Toth, July 19, 2017

Bitext and MarkLogic Join in a Strategic Partnership

June 13, 2017

Strategic partnerships are one of the best ways for companies to grow, and diamond in the rough company Bitext has formed a brilliant one. According to a recent press release, “Bitext Announces Technology Partnership With MarkLogic, Bringing Leading-Edge Text Analysis To The Database Industry.” Bitext has enjoyed a number of key license deals. The company’s ability to process multi-lingual content with its deep linguistics analysis platform reduces costs and increases the speed with which machine learning systems can deliver more accurate results.


Both Bitext and MarkLogic are helping enterprise companies drive better outcomes and create better customer experiences. By combining their respective technologies, the pair hopes to reduce data’s text ambiguity and produce high quality data assets for semantic search, chatbots, and machine learning systems. Bitext’s CEO and founder said:

“With Bitext’s breakthrough technology built-in, MarkLogic 9 can index and search massive volumes of multi-language data accurately and efficiently while maintaining the highest level of data availability and security. Our leading-edge text analysis technology helps MarkLogic 9 customers to reveal business-critical relationships between data,” said Dr. Antonio Valderrabanos.

Bitext is capable of conquering the most difficult language problems and creating solutions for consumer engagement, training, and sentiment analysis. Bitext’s flagship product is its Deep Linguistics Analysis Platform, which Kantar, GFK, Intel, and Accenture favor. MarkLogic used to be one of Bitext’s clients, but now the two are partners and are bound to invent even more breakthrough technology. Bitext takes another step to cement its role as the operating system for machine intelligence.

Whitney Grace, June 13, 2017

Antidot: Fluid Topics

June 5, 2017

I find French innovators creative. Over the years I have found the visualizations of DATOPS, the architecture of Exalead, the wonkiness of Kartoo, the intriguing Semio, and the numerous attempts to federate data and workflows the way digital librarians and subject matter experts do. The Descartes- and Fermat-inspired engineers created software and systems which try to trim the pointy end off many information thorns.

I read “Antidot Enables ‘Interactive’ Tech Docs That Are Easier To Publish, More Relevant To Users – and Actually Get Read.” Antidot, for those not familiar with the company, was founded in 1999. Today the company bills itself as a specialist in semantic search and content classification. The search system is named Taruqa, and the classification component is called “Classifier.”

The Fluid Topics product combines a number of content processing functions in a workflow designed to provide authorized users with the right information at the right time.

According to the write up:

Antidot has updated its document delivery platform with new features aimed at making it easier to create user-friendly interactive docs.  Docs are created and consumed thanks to a combination of semantic search, content enrichment, automatic content tagging and more.

The phrase “content enrichment” suggests to me that multiple indexing and metadata identification subroutines crunch on text. The idea is that a query can be expanded, tap into entity extraction, and make use of text analytics to identify documents which keyword matching would overlook.
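Query expansion of the kind described can be sketched in a few lines. The synonym map and terms below are invented for illustration; a real system would draw expansions from a curated thesaurus, entity extraction, or usage data:

```python
# Hypothetical synonym map; a real system would use a curated thesaurus.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "fix": ["repair"],
}

def expand_query(query):
    """Return the query's terms plus any mapped synonyms."""
    terms = query.lower().split()
    expanded = set(terms)
    for term in terms:
        expanded.update(SYNONYMS.get(term, []))
    return expanded
```

Matching documents against the expanded term set is how a system surfaces items that strict keyword matching would overlook.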

The Fluid Topic angle is that documentation and other types of enterprise information can be indexed and matched to a user’s profile or to a user’s query. The result is that the needed document is findable.

The slicing and dicing of processed content makes it possible for the system to assemble snippets or complete documents into an “interactive document.” The idea is that most workers today are not too thrilled to get a results list and the job of opening, scanning, extracting, and closing links. The Easter egg hunt approach to finding business information is less entertaining than looking at Snapchat images or checking what’s new with pals on Facebook.

The write up states:

Users can read, search, navigate, annotate, create alerts, send feedback to writers, with a rich and intuitive user experience.

I noted this list of benefits from the Fluid Topics approach:

  • Quick, easy access to the right information at the right time, making searching for technical product knowledge really efficient.
  • Combine and transform technical content into relevant, useful information by slicing and dicing data from virtually any source to create a unified knowledge hub.
  • Freedom for any user to tailor documentation and provide useful feedback to writers.
  • Knowledge of how documentation is actually used.

Applications include:

  • Casual publishing, which means a user can create a “personal” book of content and share it.
  • Content organization, which organizes the often chaotic and scattered source information.
  • Markdown, which means formatting information in a consistent way.

Fluid Topics is a hybrid which combines automatic indexing and metadata extraction, search, and publishing.

More information about Fluid Topics is available at a separate Antidot Web site called “Fluid Topics.” The company provides a video which explains how you can transform your world when you tackle search, customer support, and content federation and repurposing. Fluid Topics also performs text analytics for the “age of limitless technical content delivery.”

Hewlett Packard invested significantly in workflow based content management technology. MarkLogic’s XML data management system can be tweaked to perform similar functions. Dozens of other companies offer content workflow solutions. The sector is active, but sales cycles are lengthy. Crafty millennials can make Slack perform some content tricks as well. Those on a tight budget might find that Google’s hit and miss services are good enough for many content operations. For those in love with SharePoint, even that remarkable collection of fragmented services, APIs, and software can deliver good enough solutions.

I think it is worth watching how Antidot’s Fluid Topics performs in what strikes me as a crowded, volatile market for content federation and information workflow.

Stephen E Arnold, June 5, 2017

Bitvore: The AI, Real Time, Custom Report Search Engine

May 16, 2017

Just when I thought information access had slumped quietly through another week, I read this article in the capitalist tool which you know as Forbes, the content marketing machine:

This AI Search Engine Delivers Tailored Data to Companies in Real Time.

This write up struck me as more interesting than the most recent IBM Watson prime time commercial about smart software for zealous professional basketball fans or Lucidworks’ (really?) acquisition of the interface builder Twigkit. Forbes Magazine’s write up did not point out that the company seems to be channeling Palantir Technologies; for example, Jeff Curie, the president, refers to employees as Bitvorians. Take that, you Hobbits and Palanterians.


A Bitvore 3D data structure.

The AI, real time, custom report search engine is called Bitvore. Here in Harrod’s Creek, we recognized the combination of the computer term “bit” with a syllable from one of our favorite morphemes “vore” as in carnivore or omnivore or the vegan-sensitive herbivore.


Machine Learning Going Through a Phase

May 10, 2017

People think that machine learning is like an algorithmic magic wand.  It works by someone writing the algorithmic code, popping in the data, and the computer learning how to do a task.  It is not that easy.  The Bitext blog reveals that machine learning needs assistance in the post, “How Phrase Structure Can Help Machine Learning For Text Analysis.”

Machine learning techniques used for text analysis are not that accurate.  The post explains that instead of learning the meaning of words in a sentence according to its structure, all the words are tossed into a bag and translated individually.  The context and meaning are lost.  A real world example is Chinese and Japanese, which use pictographic characters (kanji, in Japanese) whose meaning changes based on context.  The result is that both languages have a lot of puns and are a nightmare for text analytics.

As you can imagine there are problems in Germanic and Latin-based languages too:

Ignoring the structure of a sentence can lead to various types of analysis problems. The most common one is incorrectly assigning similarity to two unrelated phrases such as “Social Security in the Media” and “Security in Social Media” just because they use the same words (although with a different structure).

Besides, this approach has stronger effects for certain types of “special” words like “not” or “if”. In a sentence like “I would recommend this phone if the screen was bigger”, we don’t have a recommendation for the phone, but this could be the output of many text analysis tools, given that we have the words “recommendation” and “phone”, and given that the connection between “if” and “recommend” is not detected.
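The quoted “Social Security” example is easy to reproduce. Here is a minimal sketch (the stopword list is mine, for illustration) of how a bag of words makes the two phrases indistinguishable:

```python
# Words discarded before comparison; this list is invented for the example.
STOPWORDS = {"in", "the"}

def bag_of_words(phrase):
    # Word order is thrown away; only the set of content words survives.
    return frozenset(phrase.lower().split()) - STOPWORDS

a = bag_of_words("Social Security in the Media")
b = bag_of_words("Security in Social Media")
# a and b are identical, even though the phrases mean different things.
```

Any similarity measure built on top of these bags will score the two phrases as a perfect match.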

If you rely solely on the “bag of words” approach for text analysis, the problems only get worse.  That is why phrase structure is very important for text and sentiment analysis.  Bitext incorporates phrase structure and other techniques in its analytics platform, used by a large search engine company and another tech company that likes fruit.

Whitney Grace, May 10, 2017

Enterprise Search and a Chimera: Analytical Engines

May 1, 2017

I put on my steam punk outfit before reading “Leading Analytical Engines for Enterprise Search.” Now there was one small factual error; specifically, the Google Search Appliance is a goner. When it was alive and tended to by authorized partners, it was not particularly adept at doing “analytical engine” type things.

What about the rest of the article? Well, I found it amusing.

Let me get to the good stuff and then deal with the nasty reality which confronts the folks who continue to pump money into enterprise search.

What companies does this “real journalism” outfit identify as purveyors of high top shoes for search? Yikes, sorry. I meant to say enterprise search systems which do analytical engine things.

Here’s the line up:

The Google Search Appliance. As noted, this is a goner. Yep, the Google threw in the towel. Lots of reasons, but my sources say, cost of sales was a consideration. Oh, and there were a couple of Google FTEs plus assorted costs for dealing with those annoyed with the product’s performance, relevance, customization, etc. Anyway. Museum piece.

Microsoft SharePoint. I find this a side splitter. Microsoft SharePoint is many things. In fact, armed with Visual Studio one can actually make the system work in a useful manner. Don’t tell the HR folks who wonder why certified SharePoint experts chew up a chunk of the budget and “fast.” Insider joke. Yeah, Excel is the go to analysis tool no matter what others may say. The challenge is to get the Excel thing to interact in a speedy, useful way with whatever the SharePoint administrator has managed to get working in a reliable way. Nuff said.

Coveo. Interesting addition to the list because Coveo is doing the free search thing, the Salesforce thing, the enterprise search thing, the customer support thing, and I think a bunch of other things. The Canadian outfit wants to do more than surf on government inducements, investors’ trust and money, and a key word based system. So it’s analytical engine time. I am not sure how the wrappers required to make key word search do analytics help out performance, but the company says it is an “analytical engine.” So be it.

Attivio. This is an interesting addition. The company emerged from some “fast” movers and shakers. The baseball data demo was nifty about six years ago. Now the company does search, publishing, analytics, etc. The shift from search to analytical engine is somewhat credible. The challenge the company faces is closing deals and generating sustainable revenue. There is that thing called “open source”. A clever programmer can integrate Lucene (Elasticsearch), use its open source components, and maybe dabble with Ikanow. The result? Perhaps an Attivio killer? Who knows.

Lucidworks (Really?). Yep, this is the Avis to the Hertz in the open source commercial sector. Lucidworks (Really?) is now just about everything sort of associated with Big Data, search, smart software, etc. The clear Lucid problem is Shay Banon and Elastic. Not only does Elastic have more venture money, Elastic has more deployments and, based on information available to me, more revenue, partners, and clout in the open source world. Lucidworks (Really?) has a track record of executive and founder turnover and the thrill of watching Amazon benefit from a former Lucid employee’s inputs. Exciting. Really?

So what do I think of this article in CIO Review? Two things:

  1. It is not too helpful to me and those looking for search solutions in Harrod’s Creek, Kentucky. The reason? The GSA error and gasping effort to make key word search into something hot and cool. “Analytical engines” does not rev my motor. In fact, it does not turn over.
  2. CIO Review does not want me to copy a quote from the write up. Tip to CIO Review: anyone can copy the wildly crazy analytical engines article by viewing source and copying the somewhat uninteresting content.

Stephen E Arnold, May 1, 2017

Palantir Technologies: A Beatdown Buzz Ringing in My Ears

April 27, 2017

I have zero contacts at Palantir Technologies. The one time I valiantly contacted the company about a speaking opportunity at one of my wonky DC invitation-only conferences, a lawyer from Palantir referred my inquiry to a millennial who had a one word vocabulary, “No.”

There you go.

I have written about Palantir Technologies because I used to be an adviser to the pre-IBM incarnation of i2 and its widely used investigation tool, Analyst’s Notebook. I did write about a misadventure between i2 Group and Palantir Technologies, but no one paid much attention to my commentary.

An outfit called Buzzfeed, however, does pay attention to Palantir Technologies. My hunch is that the online real news outfit believes there is a story in the low profile, Peter Thiel-supported company. The technology Palantir has crafted is not that different from the Analyst’s Notebook, Centrifuge Systems’ solution, and quite a few other companies which provide industrial-strength software and systems to law enforcement, security firms, and the intelligence community. (I list about 15 of these companies in my forthcoming “Dark Web Notebook.” No, I won’t provide that list in this free blog. I may be retired, but I am not giving away high value information.)

So what’s caught my attention. I read the article “Palantir’s Relationship with the Intelligence Community Has Been Worse Than You Think.” The main idea is that the procurement of Palantir’s Gotham and supporting services provided by outfits specializing in Palantir systems has not been sliding on President Reagan’s type of Teflon. The story has been picked up and recycled by several “real” news outfits; for example, Brainsock. The story meshes like matryoshkas with other write ups; for example, “Inside Palantir, Silicon Valley’s Most Secretive Company” and “Palantir Struggles to Retain Clients and Staff, BuzzFeed Reports.” Palantir, it seems to me in Harrod’s Creek, is a newsy magnet.

The write up about Palantir’s lousy relationship with the intelligence community pivots on a two year old video. I learned that the Big Dog at Palantir, Alex Karp, said, in a non public meeting which some clever Hobbit type videoed on a smartphone, words presented this way by the real news outfit:

The private remarks, made during a staff meeting, are at odds with a carefully crafted public image that has helped Palantir secure a $20 billion valuation and win business from a long list of corporations, nonprofits, and governments around the world. “As many of you know, the SSDA’s recalcitrant,” Karp, using a Palantir codename for the CIA, said in the August 2015 meeting. “And we’ve walked away, or they walked away from us, at the NSA. Either way, I’m happy about that.” The CIA, he said, “may not like us. Well, when the whole world is using Palantir they can still not like us. They’ll have no choice.” Suggesting that the Federal Bureau of Investigation had also had friction with Palantir, he continued, “That’s de facto how we got the FBI, and every other recalcitrant place.”

Okay, I don’t know the context of the remarks. It does strike me that 2015 was more than a year ago. In the zippy doo world of Sillycon Valley, quite a bit can change in one year.

I don’t know if you recall Paul Doscher, who was the CEO of Exalead USA and Lucid Imagination (before the company asserted that its technology actually “works”). Mr. Doscher is a good speaker, but he delivered a talk in 2009, captured on video, during which he was interviewed by a fellow in a blue sport coat and shirt. Mr. Doscher wore a baseball cap in gangsta style, a crinkled unbuttoned shirt, and evidenced a hipster approach to discussing travel. Now if you know Mr. Doscher, he is not a manager influenced by gangsta style. My hunch is that he responded to an occasion, and he elected to approach travel with a bit of insouciance.

Could Mr. Karp, the focal point of the lousy relationship article, have been responding to an occasion? Could Mr. Karp have adopted a particular tone and style to express frustration with US government procurement? Keep in mind that a year later, Palantir sued the US Army. My hunch is that views expressed in front of a group of employees may not be news of the moment. Interesting? Sure.

What I find interesting is that the coverage of Palantir Technologies does not dig into the parts of the company which I find most significant. To illustrate: Palantir has a system and method for an authorized user to add new content to the Gotham system. The approach makes it possible to generate an audit trail to make it easy (maybe trivial) to answer these questions:

  1. What data were added?
  2. When were the data added?
  3. What person added the data?
  4. What index terms were added to the data?
  5. What entities were added to the metadata?
  6. What special terms or geographic locations were added to the data?

You get the idea. Palantir’s Gotham brings to intelligence analysis the type of audit trail I found so compelling in the Clearwell system and other legal oriented systems. Instead of a person in information technology saying, in response to a question like “Where did this information come from?”, “Duh. I don’t know.”

Gotham gets me an answer.

For me, explaining the reasoning behind Palantir’s approach warrants a write up. I think quite a few people struggling with problems of data quality and what is called by the horrid term “governance” would find Palantir’s approach of some interest.
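To make the audit trail idea concrete, here is a hypothetical record whose fields mirror the six questions above. This is an illustrative sketch, not Palantir’s actual schema; every name in it is invented:

```python
import datetime

def make_audit_entry(user, data_id, index_terms, entities, locations):
    """Record who added what data, when, and with which metadata."""
    return {
        "data_id": data_id,          # what data were added
        "added_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),  # when
        "added_by": user,            # which person added the data
        "index_terms": index_terms,  # index terms applied to the data
        "entities": entities,        # entities added to the metadata
        "locations": locations,      # special terms or geographic locations
    }

entry = make_audit_entry("analyst7", "doc-123", ["fraud"], ["Acme Corp"], ["Palo Alto"])
```

With records like this attached to every ingest, answering “Where did this information come from?” becomes a lookup rather than a shrug.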

Now do I care about Palantir? Nah.

Do I care about bashing Palantir? Nah.

What I do care about is tabloidism taking precedence over substantive technical approaches. From my hollow in rural Kentucky, I see folks looking for “sort of” information.

How about more substantive information? I am fed up with podcasts which recycle old information with fake good cheer. I am weary of leaks. I want to know about Palantir’s approach to search and content processing and have its systems and methods compared to what its direct competitors purport to do.

Yeah, I know this is difficult to do. But nothing worthwhile comes easy, right?

I can hear the millennials shouting, “Wrong, you dinosaur.” Hey, no problem. I own a house. I don’t need tabloidism. I have picked out a rest home, and I own 60 cemetery plots.

Do your thing, dudes and dudettes of “real” journalism.

Stephen E Arnold, April 27, 2017
