Featured

Data Are a Problem? And the Solution Is?

I attended a conference about managing data last year. I sat in six sessions and listened as enthusiastic people explained that in order to tap the value of data, one has to have a process. Okay? A process is good.

Then in each of the sessions, the speakers explained the problem and outlined that knowing about the data and then putting it in a system is the way to derive value.

Neither Pros Nor Cons: Just Consulting Talk

This morning I read an article called “The Pros and Cons of Data Integration Architectures.” The write up concludes with this statement:

Much of the data owned and stored by businesses and government departments alike is constrained by the silos it’s stuck in, many of which have been built over the years as organizations grow. When you consider the consolidation of both legacy and new IT systems, the number of these data silos only increases. What’s more, the impact of this is significant. It has been widely reported that up to 80 per cent of a data scientist’s time is spent on collecting, labeling, cleaning and organizing data in order to get it into a usable form for analysis.

Now this is mostly true. However, the 80 percent figure is not backed up. An IDG expert whipped up some percentages about data and time, and these, I suspect, have become part of the received wisdom of those struggling with silos for decades. Most of a data scientist’s time is frittered away in meetings, struggling with budgets and other resources, and figuring out what data are “good” and what to do with the data identified by person or machine as “bad.”

The source of this statement is MarkLogic, a privately held company founded in 2001 and a magnet for $173 million from funding sources. That works out to an 18 years young start up if DarkCyber adopts a Silicon Valley T shirt.

A modern silo is made of metal and impervious to some pests and most types of weather.

One question the write up begs is, “After 18 years, why hasn’t the methodology of MarkLogic swept the checker board?” But the same question can be asked of other providers’ solutions, open source solutions, and the home grown solutions creaking in some government agencies in Europe and elsewhere.

Several reasons:

  1. The technical solution offered by MarkLogic-type companies can “work”; however, proprietary considerations linked with the issues inherent in “silos” have caused data management solutions to become consultantized; that is, process becomes the task, not delivering on the promise of data, either dark or sunlit.
  2. Customers realize that the cost of dealing with the secrecy, legal, and technical problems of disparate, digital plastic trash bags of bits cannot be justified. Like odd duck knickknacks one of my failed publishers shoved into his lumber room, ignoring data is often a good solution.
  3. Individuals tasked with organizing data begin with gusto and quickly morph into bureaucrats who treasure meetings with consultants and companies pitching magic software and expensive wizards able to make the code mostly work.

DarkCyber recognizes that with boundaries like budgets, timetables, and measurable objectives, federation can deliver some zip.

Silos: A Moment of Reflection

The article uses the word “silo” five times. That’s the same frequency of its use in the presentations to which I listened in mid December 2019.

So you want to break down this missile silo which is hardened and protected by autonomous weapons? That’s what happens when a data scientist pokes around a pharma company’s lab notebook for a high potential new drug.

Let’s pause a moment to consider what a silo is. A silo is a tower or a pit used to store corn, wheat, or some other grain. Dust in silos can be exciting. Tip: Don’t light a match in a silo on a dry, hot day in a state where farms still operate. A silo can also be a structure used to house a ballistic missile, but one has to be a child of the Cold War to appreciate this connotation.

As applied to data, it seems that a silo is a storage device containing data. Unlike a silo used to house maize or a nuclear capable missile, the data silo contains information of value. How much value? No one knows. Are the data in a digital silo explosive? Who knows? Maybe some people should not know? Who wants to flick a Bic and poke around?

Interviews

Exclusive: DataWalk Explained by Chris Westphal

“An Interview with Chris Westphal” provides an in-depth review of a company now disrupting the analytic and investigative software landscape.

DataWalk is a company shaped by a patented method for making sense of different types of data. The technique is novel and makes it possible for analysts to extract high value insights from large flows of data in near real time with an unprecedented ease of use.

In late June 2019, DarkCyber interviewed Chris Westphal, the innovator who co-founded Visual Analytics. That company’s combination of analytics methods and visualizations was acquired by Raytheon in 2013. Now Westphal is applying his talents to a new venture, DataWalk.

Westphal, who monitors advanced analytics, learned about DataWalk and joined the firm in 2017 as the Chief Analytics Officer. The company has grown rapidly and now has client relationships with corporations, governments, and ministries throughout the world. Users of the DataWalk technology include investigators focused on fraud, corruption, and serious crimes.

Unlike most investigative and analytics systems, users can obtain actionable outputs by pointing and clicking. The system captures these clicks on a ribbon. The actions on the ribbon can be modified, replayed, and shared.

In an exclusive interview with Mr. Westphal, DarkCyber learned:

The [DataWalk] system gets “smarter” by encoding the analytical workflows used to query the data; it stores the steps, values, and filters to produce results thereby delivering more consistency and reliability while minimizing the training time for new users. These workflows (aka “easy buttons”) represent domain or mission-specific knowledge acquired directly from the client’s operations and derived from their own data; a perfect trifecta!

One of the differentiating features of DataWalk’s platform is that it squarely addresses the shortage of trained analysts and investigators in many organizations. Westphal pointed out:

…The workflow idea is one of the ingredients in the DataWalk secret sauce. Not only do these workflows capture the domain expertise of the users and offer management insights and metrics into their operations such as utilization, performance, and throughput, they also form the basis for scoring any entity in the system. DataWalk allows users to create risk scores for any combination of workflows, each with a user-defined weight, to produce an overall, aggregated score for every entity. Want to find the most suspicious person? Easy, just select the person with the highest risk-score and review which workflows were activated. Simple. Adaptable. Efficient.
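
The weighted scoring Westphal describes can be sketched in a few lines of Python. This is an illustration only, not DataWalk’s implementation: the workflow names, the weights, the function names, and the choice of a plain weighted sum as the aggregation are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    """A saved analytical workflow with a user-defined weight (hypothetical model)."""
    name: str
    weight: float


def risk_scores(workflows, activations):
    """Return an aggregated score per entity.

    activations maps an entity id to the set of workflow names that
    flagged it. The score is assumed to be the sum of the weights of
    the workflows that fired for that entity.
    """
    weights = {w.name: w.weight for w in workflows}
    return {
        entity: sum(weights[name] for name in fired if name in weights)
        for entity, fired in activations.items()
    }


def most_suspicious(scores):
    """Pick the entity with the highest aggregated score."""
    return max(scores, key=scores.get)


# Invented workflows and weights, for illustration only.
workflows = [
    Workflow("shared_address", 2.0),
    Workflow("rapid_transfers", 5.0),
    Workflow("watchlist_match", 10.0),
]

# Which workflows fired for which entity (made-up data).
activations = {
    "person_a": {"shared_address"},
    "person_b": {"rapid_transfers", "watchlist_match"},
    "person_c": set(),
}

scores = risk_scores(workflows, activations)
top = most_suspicious(scores)
print(top, scores[top], sorted(activations[top]))
```

Keeping the set of activated workflows next to the score is what lets an analyst “review which workflows were activated” for the top-ranked entity rather than trusting a bare number.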

Another problem some investigative and analytic system developers face is user criticism. According to Westphal, DataWalk takes a different approach:

We listen carefully to our end-user community. We actively solicit their feedback and we prioritize their inputs. We try to solve problems versus selling licenses… DataWalk is focused on interfacing to a wide range of data providers and other technology companies. We want to create a seamless user experience that maximizes the utility of the system in the context of our client’s operational environments.

For more information about DataWalk, navigate to www.datawalk.com. For the full text of the interview, click this link. You can view a short video summary of DataWalk in the July 2, 2019, DarkCyber Video available on Vimeo.

Stephen E Arnold, July 9, 2019

Latest News

Google Allegedly Ostracized

I worked in the San Francisco area once affectionately known as Plastic Fantastic. My recollection is that most of the people with whom I worked and socialized were... Read more »

January 18, 2020 | Comment

Amazon and Microsoft: Different Ways to Leverage $1 Billion

Author and big gun Brad Smith, president of Microsoft, allegedly wrote “Microsoft Will Be Carbon Negative by 2030.” To achieve this goal, the company will spend... Read more »

January 17, 2020 | Comment

The New Doing Gooder Google

Google’s cheerleading unit likes to remind us, amid the constant criticisms, that the company makes some positive contributions to society. For example, it seems... Read more »

January 17, 2020 | Comment

US China Deal: The Honeymoon Will Not Last Long

DarkCyber spotted a write up called “China Bracing for US Tech War with Plan to Cut Reliance on Imports of Key Components to Just 25 Per Cent.” If the information... Read more »

January 17, 2020 | Comment

Library Software Soutron Version 4.1.4 Now Available

Library automation and cataloging firm Soutron introduces its “Latest Software Update—Soutron Version 4.1.4.” The announcement describes the updates and features,... Read more »

January 17, 2020 | Comment

Software: Duct Tape Is the Fabric of Solutions

Polygon published “The Truth Is That Many Games Are Held Together by Duct Tape.” The write up explains that software is messy. Here’s one statement from the... Read more »

January 16, 2020 | Comment

NSO Does Not Play the Facebook Game

We spotted a write up in Techdirt, an interesting publication indeed. The story is “Malware Marketer NSO Group Looks Like It’s Blowing Off Facebook’s... Read more »

January 16, 2020 | Comment

VideoStudio 19 Ultimate Installation Failure: This Procedure May Help You

DarkCyber has never in our previous 16,000 posts provided a fix for a problem with commercial software. We are providing a fix for Corel’s VideoStudio 19 Ultimate... Read more »

January 16, 2020 | Comment

A Taxonomy Vendor: Still Chugging Along

Semaphore Version 5 from Smartlogic coming soon. An indexing software company— now morphed into a semantic AI outfit — Smartlogic promises Version 5... Read more »

January 15, 2020 | Comment

An Interesting Hypothesis about Google Indexing

We noted “Google’s Crawl-Less Index.” The main idea is that something has changed in how Google indexes. We circled in yellow this statement from the article: [Google’... Read more »

January 15, 2020 | 1 Comment

