Enterprise Document Management: A Remarkable Point of View

March 3, 2020

DarkCyber spotted “What Is an Enterprise Document Management (EDM) System? How to Implement Full Document Control.” The write up is lengthy, running about 4,000 words. There are pictures like this one:

image

ECM is enterprise content management, and in the middle is Enterprise Document Management, which is abbreviated DMS, not EDM.

The idea is that documents have to be managed, and DarkCyber assumes that most organizations do not manage their content — regardless of its format — particularly well until the company is involved in a legal matter. Then document management becomes the responsibility of the lawyers.

In order to do any type of document or content management, employees have to follow the rules. The rules are the underlying foundation of the article. A company manufacturing interior panels for an automaker will have to have a product management system, a system to deal with drawings (paper and digital), supplier data, and other bits and pieces to make sure the “door cards” are produced.

The problem is that guidelines often do not translate into consistent employee behavior. One big reason is that the guidelines don’t fit into the work flows and the incentive schemes do not reward the time and effort required to make sure the information ends up in the “system.” Many professionals write something, text it, and move on. Enterprise systems typically do not track fine grained information very well.

Like enterprise search vendors, the “document management” folks try to make workers comply. But workers who may be concerned about becoming redundant, a sick child, an angry boss, or any other perturbation in the consultant’s checklist ignore many information rules.

There is an association focused on records management. There are companies concerned with content management. There are vendors who focus on images, videos, audio, and tweets.

The myth that an EDM, ECM, or enterprise search system can create an affordable, non invasive, legally compliant, and effective way to deal with the digital fruit cake in organizations is worth lots of money.

The problem is that these systems, methods, guidelines, data lakes, federation technologies, smart software, etc. etc. don’t work.

The article does a good job of explaining what a consultant recommends. The information it presents provides fodder for the marketing animals who are going to help sell systems, training, and consulting.

The reality is that humans generate information and use a range of systems to produce content. Tweets about a missed shipment from a person’s mobile phone may be prohibited. Yeah, explain that to the person who got the order in the door and kept the commitment to the customer.

There are conferences, blogs, consulting firms, reports, and BrightPlanet videos about managing information.

The write up states:

There is no use documenting and managing poor workflows, processes, and documentation. To survive in business, you have to adapt, change and improve. That means continuously evaluating your business operations to identify shortfalls, areas for improvements, and strengths for continuous investment. Regular internal audits of your management systems will enable you to evaluate the effectiveness of your Enterprise Document Management solution.

Right. When these silver bullet, pie-in-the-sky solutions cost more than budgeted, employees quit using them, and triage costs threaten the survival of the company — call in the consultants.

Today’s systems do not work with the people actually doing information creation. As a result, most fail to deliver. Sound familiar? It should. You, gentle reader, will never follow the information rules unless you are specifically paid to follow them or given an ultimatum like “do this or get fired.”

Tweet that and let me know if you managed that information.

Stephen E Arnold, March 3, 2020

After Decades of Marketing Chaff, Data Silos Thrive

March 2, 2020

Here’s another round of data silo baloney—“Top 4 Ways to Eliminate Data Fragmentation Within Your Organization” from IT Brief. Surveys have found that many businesses are not making the most of all that data they’ve been collecting, and it has become common to blame data silos. It is true that some organizations could store and access their data more efficiently. There’s just one problem, and it is one we have mentioned before—there are some very good reasons to keep some data fragmented. Silos exist because of things like government requirements, legal processes, sensitive medical data, experts protecting their turf, and basic common sense.

The article asserts:

“Many organizations are finding it difficult to extract meaningful value from their data due to one endemic problem: mass data fragmentation. With mass data fragmentation, data volumes continue to rise exponentially, but companies struggle to manage that data because it’s scattered across locations and infrastructure silos, both in on-premises data centers and in the cloud. Organizations often don’t know what data exists, where it is and whether it’s being stored securely and in compliance with regulations.”

Of course, entities must ensure data is stored securely and that they comply with regulations. Also, the write-up’s recommendations to keep redundancies to a minimum and to understand how one’s data is being stored and accessed in the cloud are good ones. However, the exhortation to eliminate silos entirely is off the mark; trying to do so can be a fruitless exercise in expense and frustration.

Why?

  1. A person wants to hoard his or her information
  2. Rules or regulations prevent sharing to those “not in the fox hole”
  3. Lawyers and HR professionals don’t want legal documents available and “people” managers definitely do not want employee health and salary data flying around like particles motivated by Brownian motion.

Net net: Reality has silos. Accept it. Omit the marketing silliness.

Stephen E Arnold, March 2, 2020

GraphQL: The Future Five Years Later

February 28, 2020

GraphQL “is a query language for APIs and a runtime for fulfilling those queries with your existing data.” The technology allegedly was a result of Facebook’s technical wizardry in 2012. The digital information weapon vendor released GraphQL to open source in 2015. You can get insights, links, and techno babble on the GraphQL Foundation Web site.

DarkCyber noted that Hasura snagged about $10 million to make GraphQL easier to use. The story appeared in TechCrunch on February 26, 2020. Is Hasura a frillback pigeon?

Or is the company one of those lovable creatures found in Washington Square Park in the spring?

As it turns out, GraphQL is becoming a mini boomlet in the database universe. There are companies supporting Facebook’s GraphQL innovation; for example:

Plus others like IBM and the PR world’s fave Twitter.

However, there are other companies in the “graph” business; for example:

Also, another dozen or so innovators.

Altexsoft asserts that GraphQL is good for complex systems. Other upsides include:

  • Retrieves data with a single call
  • Delivers just what’s needed
  • Permits validation and type checks
  • Auto generates API documentation
  • Supports rapid application prototyping (the move fast and break things approach perhaps?)

There are some downsides; for example:

  • Complexity
  • Performance
  • The ever helpful HTTP status code of 200 (helpful indeed)
  • Complexity (Oh, sorry, I mentioned that).
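For readers who want the upsides and the status code gripe in concrete form, here is a minimal Python sketch. It is an illustration under stated assumptions, not Hasura’s or Facebook’s code. The query, the field names (order, status, customer), and the helper names build_payload and unwrap are all hypothetical. The one behavior relied on is the common GraphQL convention that a failed query can still come back as HTTP 200 with an “errors” list in the JSON body.

```python
import json

# Hypothetical query: a single request names exactly the fields wanted.
QUERY = """
query ($id: ID!) {
  order(id: $id) {
    status
    customer { name }
  }
}
"""


def build_payload(order_id):
    """Serialize the query and its variables as the JSON body of one POST."""
    return json.dumps({"query": QUERY, "variables": {"id": order_id}})


def unwrap(status_code, body_text):
    """Return the "data" member, raising if the call failed in either layer.

    Checking the HTTP status alone is not enough: many GraphQL servers
    answer 200 even when the query itself failed.
    """
    if status_code != 200:
        raise RuntimeError(f"transport error: HTTP {status_code}")
    body = json.loads(body_text)
    if body.get("errors"):
        messages = "; ".join(e.get("message", "unknown") for e in body["errors"])
        raise RuntimeError(f"GraphQL error despite HTTP 200: {messages}")
    return body["data"]
```

The point of the sketch: the client gets one call and only the fields it asked for, but it must inspect the body on every response because the status code says little about whether the query worked.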

Now back to the TechCrunch story about Hasura. The reason the company was funded may relate to the firm’s unique selling proposition: Our approach makes GraphQL easy.

Will easy sell? Worth watching in order to determine what breed of pigeon is flying through disparate sets of big data.

Stephen E Arnold, February 28, 2020

NoSQL DBMS: A Surprising Inclusion

February 12, 2020

“Top Databases Used in Machine Learning Project” is a listicle. The information in the write up is similar to the lists of “best” products whipped up by Silicon Valley type publications, mid tier consulting firms (a shade off the blue chip outfits like McKinsey, Booz, and BCG), and 20 somethings fresh from university.

The interesting inclusion in the list of DBMS is?

If you said Elasticsearch, you would be correct. Elasticsearch is an open source play doing business as Elastic. The open source version is at its core a search and retrieval system. (Does this mean the index is the data and the database?)

DarkCyber is not going to get into a discussion of whether an enterprise search system can be a database management system. Both sides in the battle are less interested in resolving the fuzzy language than making sales.

Maybe Elasticsearch is just doing what other enterprise search systems have done since the 1980s? Vendors describe search and retrieval as the solution to the world’s data management Wu Flu.

Net net: Without boundaries, why make distinctions? Just close the deal. Distinctions are irrelevant for some business tasks.

Stephen E Arnold, February 12, 2020

Blockchain: Now What Is That Use Case?

February 7, 2020

The DarkCyber team invested some time in figuring out Amazon’s blockchain-related inventions. (A free executive summary is available at this link.) There were some interesting use cases explained in these public documents. But blockchain at Amazon is different from blockchain in the world of a specialist blockchain firm if the information in “Major Blockchain Developer ConsenSys Announces Job Losses” is accurate.

The write up states:

Major blockchain developer ConsenSys has laid off around 14% of its workforce, it said on Tuesday, a move that comes as companies around the world frantically search for applications for the much-hyped technology.

Blockchain in frantic search for applications? Yikes.

The issues blockchain faces range from “good enough”, better known alternatives to scaling.

The write up explains:

Companies from banks and oil traders to retailers and tech vendors, drawn to its promise of making cumbersome processes more efficient and secure, have invested billions as they look to find uses for the technology. Many have turned to blockchain development startups in the process for technical expertise. Yet there have so far been few major breakthroughs in the practical application of blockchain, despite the spate of tests and pilots.

Complexity, performance, cost, and security may be barriers. Just what catches Amazon’s attention?

Stephen E Arnold, February 7, 2020

Buzzword Alert: Programmable Networks

February 5, 2020

DarkCyber noted “University Researchers Succeed in Boosting Computer Speeds by 2.5 Times.” The headline suggests zippy computers. Well, sort of. One bottleneck is accessing data written to a storage device. The innovation or insight, if it is economically and technically implementable, trims data access bottlenecks. DarkCyber noted:

Current data storage systems use only one storage server to process information, making them slow to retrieve information to display for the user. A backup server only becomes active if the main storage server fails. The new approach, called FLAIR, optimizes data storage systems by using all the servers within a given network. Therefore, when a user makes a data request, if the main server is full, another server automatically activates to fill it, the scientists state.

The approach exploits programmable networks. A network of servers is like a microprocessor. The shift is to meta-think about these components. Therefore, create a wrap up layer like the one described in the write up.

Popping up a level sometimes makes sense. Marketing a meta-play may be even more beneficial.

Stephen E Arnold, February 5, 2020

A Solution to the Blockchain Trilemma?

February 3, 2020

Struggling to deliver a blockchain application which is decentralized, secure, and scalable? A solution may have been developed. Navigate to “Ex-Microsoft Researcher Says He’s Solved the Blockchain Scalability Problem.” Despite the hype about blockchain, there’s a problem mixed with the promise:

…It says that it’s easy to have a blockchain with two of three key attributes: decentralization, security, and scalability. What’s difficult is getting all three; so far, cranking up the volume has always meant sacrificing on another.

The alleged solution comes from Asensys, led by former Microsoft lead researcher JiaPing Wang. The approach avoids “going off-chain or sharding transactions.” The idea is to eliminate duplicative processes:

instead spreading the workload across the entire network by creating multiple “zones” within it that work independently and asynchronously.

For now, this is a work in progress. And those marketing assurances about decentralization, security, and scalability? Yeah, right.

Stephen E Arnold, February 3, 2020

Amazon Blockchain: How Secure?

January 27, 2020

This write up does not address Amazon’s blockchain innovations. We have a summary of our Amazon blockchain technology report which points out specific systems and methods the online bookstore has “invented” to make blockchain more secure. (Keep in mind, Amazon is the inventor of S3 buckets, which, in some circumstances, are somewhat leaky.) You can get a copy of the free DarkCyber Amazon Blockchain report using the information at the end of this blog post.

The article “Trust No One. Not Even a Blockchain” suggests that one of the most hyped data management technologies may have a weakness. Technology experts are not fond of weaknesses. Technology is a solution, and solutions must not have fatal flaws like mere humans working at a giant company or in the semi isolation of a coffee shop.

The write up points out:

Similarly, just because a person claims to have uploaded all of her photographs to a blockchain—like Mila’s mother in Parker’s story—does not mean there are no other pictures from her life. Omitted data, bad data, too much data: These dynamics rob a blockchain of the claim of being a source of truth. Garbage in, garbage out. This concept in computer science means that an input consisting of flawed data will generate a flawed output. So it is with blockchain technology. We can record false claims on a blockchain. We can omit data. Suddenly, that source of truth does not appear so honest.
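The garbage in, garbage out point in the passage above can be sketched with a toy hash chain in Python. This is an assumption-laden illustration, not any production blockchain and not Amazon’s method. The function names (block_hash, append, is_intact) and the sample claims are hypothetical. The check confirms only that recorded blocks have not been altered. It has no way to tell whether a claim was true when it was written.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, claim, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "claim": claim, "prev": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain, claim):
    """Add a claim to the chain. Nothing here checks whether the claim is true."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    index = len(chain)
    chain.append({
        "index": index,
        "claim": claim,
        "prev": prev_hash,
        "hash": block_hash(index, claim, prev_hash),
    })


def is_intact(chain):
    """Verify the links and hashes: detects tampering, not falsehood."""
    prev_hash = GENESIS
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["claim"], block["prev"]):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append(chain, "These are all of my photographs")  # possibly false, still accepted
append(chain, "Shipment arrived on time")
```

A false or incomplete claim passes is_intact as easily as a true one. Editing a recorded claim after the fact is what makes the check fail.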

The essay concludes with this observation:

Distortion of reality is a growing threat. Deepfakes, synthetic videos that replace an image of one person with that of another, may soon become indistinguishable from authentic videos. Today, deepfakes may largely be used in the making of memes, face-swapping celebrities, but their proliferation will undoubtedly have major implications on everything from political campaigns to policies around pornography. What makes the threat of deepfakes so profound is that they render a medium formerly viewed as reliable—namely video—undependable. We cannot trust the very thing that we are supposed to trust. This constitutes the most substantial danger to a society’s notion of reality. If we are supposed to trust whatever is on a blockchain, then we are in trouble indeed. After all, the blockchain is only as good as the data we put on it.

Amazon’s blockchain inventions address the “control” of the information placed in the blockchain. That may give Amazon an advantage in the policeware market.

If you want a copy of the DarkCyber executive summary for our 54 page report about Amazon’s blockchain and some of the implications of these inventions, send an email to darkcyber333 at yandex dot com. No charge for the summary. The full report, however, is not free.

Stephen E Arnold, January 27, 2020

Data Are a Problem? And the Solution Is?

January 8, 2020

I attended a conference about managing data last year. I sat in six sessions and listened as enthusiastic people explained that in order to tap the value of data, one has to have a process. Okay? A process is good.

Then in each of the sessions, the speakers explained the problem and outlined that knowing about the data and then putting it in a system is the way to derive value.

Neither Pros Nor Cons: Just Consulting Talk

This morning I read an article called “The Pros and Cons of Data Integration Architectures.” The write up concludes with this statement:

Much of the data owned and stored by businesses and government departments alike is constrained by the silos it’s stuck in, many of which have been built over the years as organizations grow. When you consider the consolidation of both legacy and new IT systems, the number of these data silos only increases. What’s more, the impact of this is significant. It has been widely reported that up to 80 per cent of a data scientist’s time is spent on collecting, labeling, cleaning and organizing data in order to get it into a usable form for analysis.

Now this is mostly true. However, the 80 percent figure is not backed up. An IDG expert whipped up some percentages about data and time, and these, I suspect, have become part of the received wisdom of those struggling with silos for decades. Most of a data scientist’s time is frittered away in meetings, struggling with budgets and other resources, and figuring out what data are “good” and what to do with the data identified by person or machine as “bad.”

The source of this statement is MarkLogic, a privately held company founded in 2001 and a magnet for $173 million from funding sources. That works out to an 18 years young start up if DarkCyber adopts a Silicon Valley T shirt.

image

A modern silo is made of metal and impervious to some pests and most types of weather.

One question the write up begs is, “After 18 years, why hasn’t the methodology of MarkLogic swept the checker board?” But the same question can be asked of other providers’ solutions, open source solutions, and the home grown solutions creaking in some government agencies in Europe and elsewhere.

Several reasons:

  1. The technical solution offered by MarkLogic-type companies can “work”; however, proprietary considerations linked with the issues inherent in “silos” have caused data management solutions to become consultantized; that is, process becomes the task, not delivering on the promise of data, either dark or sunlit.
  2. Customers realize that the cost of dealing with the secrecy, legal, and technical problems of disparate, digital plastic trash bags of bits cannot be justified. Like odd duck knickknacks one of my failed publishers shoved into his lumber room, ignoring data is often a good solution.
  3. Individuals tasked with organizing data begin with gusto and quickly morph into bureaucrats who treasure meetings with consultants and companies pitching magic software and expensive wizards able to make the code mostly work.

DarkCyber recognizes that with boundaries like budgets, timetables, and measurable objectives, federation can deliver some zip.

Silos: A Moment of Reflection

The article uses the word “silo” five times. That’s the same frequency of its use in the presentations to which I listened in mid December 2019.

image

So you want to break down this missile silo which is hardened and protected by autonomous weapons? That’s what happens when a data scientist pokes around a pharma company’s lab notebook for a high potential new drug.

Let’s pause a moment to consider what a silo is. A silo is a tower or a pit used to store corn, wheat, or some other grain. Dust in silos can be exciting. Tip: Don’t light a match in a silo on a dry, hot day in a state where farms still operate. A silo can also be a structure used to house a ballistic missile, but one has to be a child of the Cold War to appreciate this connotation.

As applied to data, it seems that a silo is a storage device containing data. Unlike a silo used to house maize or a nuclear capable missile, the data silo contains information of value. How much value? No one knows. Are the data in a digital silo explosive? Who knows? Maybe some people should not know? Who wants to flick a Bic and poke around?

Blockchain: A Loser in 2020?

December 31, 2019

I recently completed a report about Amazon’s R&D work in blockchain. If you want a free summary of the report, write darkcyber333 at yandex dot com. If not, no problem. You will want to read “Please Blockchain, Prove Me Wrong.” The author likes to use words on some online services’ stop lists, but that’s okay. The writer is passionate about the perceived failings of blockchain.

Blockchain is, according to the write up:

“a solution looking for a problem.”

More proof needed, you gentle but skeptical reader? How about this?

According to Gartner’s Hype Cycle, blockchain is still “sliding into the trough of disillusionment,” meaning the technology is struggling to live up to the expectations created by the hype around it.

There you go. Proof from a marketing company.

DarkCyber’s view is that encryption is likely to continue to toddle forward. Also, the charm of the distributed database continues to woo some people.

There may be hope, and perhaps that is why Amazon has more than a dozen patents related to blockchain technology. We learn from the impassioned analysis:

Blockchain’s purported promise is such that everyone is willingly taking a multi-faceted approach, not giving much thought to the possibility that its potential may, in fact, be limited. Or maybe blockchain is just the first iteration of something far more powerful, a base we can build on to restore our faith in decentralized systems.

To sum up, for a dead duck, there are some feathers afloat. And there are those Amazon patents? Maybe Mr. Bezos is just off base and should stick to bulldozing outfits like mom and pop stores and outfits like FedEx?

Stephen E Arnold, December 31, 2019
