A Field of Data Silos: No Problem

May 5, 2021

The hype about silos has followed data to the cloud. IT Brief grumbles, “How Cloud Silos Are Holding Organisations Back.” Although the brief write-up acknowledges that silos can be desirable, it issues the familiar call to unify the data therein. PureStorage CTO Mark Jobbins writes:

“Overcoming the challenges presented by having cloud silos requires organisations to develop a robust data architecture. Having a common data platform should form the foundation of the data architecture, one that decouples applications and their data from their underlying infrastructure, preventing organizations from being locked into a single delivery model. Working with a multi-cloud architecture is valuable because it helps organizations utilize best-in-breed services from the various cloud service providers. It also reduces vendor lock-in, improves redundancy, and lets businesses choose the ideal features they need for their operations. It’s important to have a strong multi-cloud strategy to ensure the business gets the right mix of security, performance, and cost. The strategy should include the tools and technologies that consolidate cloud resources into a single, cohesive interface for managing cloud infrastructure. Hybrid clouds bring public and private clouds together.”

Such “hybrid clouds” allow an organization to retain the advantages of a multi-cloud architecture while gaining the blessed unified platform. Of course, this is no simple task, so we are told one must recruit a gifted storage specialist to help. We presume this is where Jobbins’ company comes in.

Cynthia Murrell, May 5, 2021

TikTok: A Good Point about Data Collection

April 21, 2021

I wish I could recall which addled Silicon Valley podcaster explained that TikTok was not a problem. I would urge this individual to read in the British paper the article “Case Launched Against TikTok over Collection of Children’s Data.” The essay explains:

Despite a minimum age requirement of 13, Ofcom found last year that 42% of UK eight to 12-year-olds used TikTok. As with other social media companies such as Facebook, there have long been concerns about data collection and the UK’s Information Commissioner’s Office is investigating TikTok’s handling of children’s personal information. Longfield said: “We’re not trying to say that it’s not fun. Families like it. It’s been something that’s been really important over lockdown, it’s helped people keep in touch, they’ve had lots of enjoyment. But my view is that the price to pay for that shouldn’t be there – for their personal information to be illegally collected en masse, and passed on to others, most probably for financial gain, without them even knowing about it. “And the excessive nature of that collection is something which drove us to [challenge] TikTok rather than others.

The cloud of unknowing swirling around individuals who insist that data collection from children is no big deal is large and possibly impenetrable.

TikTok says it is an outfit staying within the bright white lines. Nevertheless, according to the write up:

In February last year, ByteDance, the Chinese company legally domiciled in the Cayman Islands that owns TikTok, was fined a record £4.2m ($5.7m) in the US for illegally collecting personal information from children under 13.

In addition to the actions which triggered the fine, TikTok is an outfit associated with China. The data from TikTok might add some useful insights about user predilections if those data flow into a Chinese aggregation system.

To the cheerleaders for TikTok, I would suggest a rethink of your position. However, it is possible that funding for some cheerleading squads may be coming from interesting sources and carry along some other agendas. Bad actors can operate within a regulation-lax environment. That’s a reality.

Stephen E Arnold, April 21, 2021

Google: Cookies Not Enough! More More More!

April 6, 2021

Cookies are a necessary Internet evil. They are annoying, but they power Internet commerce at the expense of user privacy. As users demand more privacy, tech giants are already designing technology and the Internet for a post-cookie world. Google, says One Zero via Medium, wants to control everything a user does on the Internet: “Google’s ‘Privacy-First Web’ Is Really A Google-First Web.”

Google promised that third-party cookies would disappear by 2022. The company also promises not to support ad technology that tracks user information across the Web. Google is not doing this to be kind; instead, Google wants to become a better contender in private Internet browsing. Apple and Mozilla, companies that do not rely on targeted advertising revenue, already protect users from cookies with their Internet browsers.

Google’s business strategy is to use its status as the world’s most popular search engine and provider of many free Internet services to its advantage. That means Google has access to loads of first-party data aka the stuff that advertisers want to create targeted ads.

Google is also working on alternate tracking frameworks, but some tech experts see it as a bad idea. These alternate tracking frameworks would delete the old cookie problems and replace them with a brand new set of problems.

It appears cookies will become obsolete by the middle of the 2020s, but how does that translate into money and user privacy?

“Merits aside, it’s clear that Google is positioning itself for a more privacy-conscious future in ways that seek to preserve its dominance — likely at the expense of a slew of smaller rivals. There is a whole value chain built around third-party cookies and individual user tracking, and a lot of that value is likely to go poof…. The big picture here is that a handful of giants — in this case, Apple and Google — are powerful enough to essentially dictate the terms of the modern internet to everyone else. That they’re now moving toward models that are (arguably) better for consumer privacy is welcome. The problem is that they’re also quite obviously remolding the playing field in their own interests.”

Users will effectively have better privacy protections, but their information will be in the hands of a few powerful companies. Is that good? Is that bad? History shows it is better for there to be competition to ensure stability in a mixed capitalist economy.

Whitney Grace, April 6, 2021

Who Spends $69 Million on a Digital String? Pals Do.

April 1, 2021

The buyer of Beeple’s digital art is Metakovan. One suggestion is a person allegedly named Vignesh Sundaresan. NBC, the real news outfit, was not convinced and reported: “Metakovan’s real identity is not known.”

Sure, but don’t tell The Straits Times, which reported in the story “I Don’t Have a Car or House” that the savvy buyer of a digital string is allegedly Vignesh Sundaresan, an entrepreneur, a technopreneur in fact. Plus, I love the quote attributed to the digital Warren Buffett type:

I don’t have a car or house.

Makes sense. Singapore has apartments, lots of apartments. A rental in Marina Bay makes it easy to get around. No encumbrances to haul around like some Roman statues from a covert dig near Naples (Italy, not lovely Florida). A Grab ride is good enough when physical movement is required.

Yep, a digital Warren Buffett.

Stephen E Arnold, April 1, 2021

Why Use an Open Source Database? Brilliant Inadvertent Explanation

February 15, 2021

I thought, “Why bother to read ‘Everything You Should Know about the Oracle Database’?” I am delighted that I did. I read the article in The Tech Block twice! The information attempts to explain some of Oracle’s licensing guidelines. The author does a workmanlike job of explaining number of users; for example:

If you create an account for five hundred individuals, and only fifty individuals use it, you still need about five hundred licenses. This means that you’ve got to pay utmost attention to who is accessing the software. In addition, you may require a separate license not only for people but also for devices that directly or indirectly access the database. It’s also essential that you constantly check who needs access and who doesn’t. This will help you not only reduce your risk of exposure but also save you money. Being found contravening Oracle licensing agreements can be very costly. In some extreme cases, organizations have been fined millions of dollars.

The point is Oracle charges for people who don’t use the database. On one hand, this makes sense. Oracle has to do “work” to configure a database to handle users. (Remember the good old days of having to allocate more memory to a table. Ho ho ho. Wait. The good old days are today’s days.)
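The arithmetic in the quoted passage can be sketched in a few lines. This is only an illustration of the counting rule as the article describes it (licenses follow accounts created, not accounts used); the function names and the price per license are hypothetical and are not Oracle figures.

```python
def licenses_required(accounts_created: int, active_users: int, device_accounts: int = 0) -> int:
    """Count licenses per account created, as the quoted passage describes.

    active_users is accepted only to show that it does not change the total.
    device_accounts covers devices that access the database and may need
    their own license, per the same passage.
    """
    return accounts_created + device_accounts


def idle_license_cost(accounts_created: int, active_users: int, price_per_license: float) -> float:
    """Money spent on accounts nobody uses. price_per_license is a made-up figure."""
    idle_accounts = max(accounts_created - active_users, 0)
    return idle_accounts * price_per_license


# The article's example: 500 accounts created, 50 actually in use.
required = licenses_required(accounts_created=500, active_users=50)
wasted = idle_license_cost(accounts_created=500, active_users=50, price_per_license=100.0)
print(required, wasted)
```

With any per-license price, 450 of the 500 licenses in the example pay for people who never log in, which is the bean counter’s problem in a nutshell.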

The write up describes eight more missteps over which an Oracle customer can trip, breaking the bean counter’s financial ankles.

Net net: The explanation makes it quite clear why some organizations use open source databases. Perhaps the author did not intend to anti-market Oracle’s database? From my point of view, that is exactly what the information in “Everything You Should Know…” delivers.

Stephen E Arnold, February 16, 2021

Oracle: Looking Like an AARP Magazine Cover Shot

February 9, 2021

Oracle used to be a game changing name in the tech industry, but now it has become an industry standard and, for lack of better terms, old. Oracle might be old, but the company continues to release reliable technology. They recently updated Oracle Database 21c to operate on Oracle Cloud. Channel Life comments on the upgrade consisting of over 200 improvements in the article: “Oracle Releases New Version Of Converged Database.”

One of the top new features for Oracle Database 21c is the availability of Oracle APEX Application Development. Oracle APEX combined with Oracle Cloud offers developers a browser-based, low-code cloud environment to create apps. Other new features include native JSON data type representation, immutable blockchain tables, AutoML for in-database machine learning, persistent memory support, in-database JavaScript, higher performance graph models, database in-memory automation, and sharding automation. Sharding automation is a nifty tool:

“Native Database Sharding delivers hyperscale performance and availability while enabling global enterprises to meet data sovereignty and data privacy regulations. Data shards share no hardware or software and can reside on-premises or in the cloud. To simplify the design and use of sharding, Database 21c includes a Sharding Advisor Tool that assesses a database schema plus its workload characteristics and then provides a sharded database design optimised for performance, scalability, and availability.  Backup and Recovery across shards is also automated.”
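For readers who have not met sharding, here is a minimal sketch of the general idea, not Oracle’s implementation. It assumes a made-up layout in which each region has its own shards (the data sovereignty angle) and a hash of the customer identifier picks a shard within the region. The shard names, the regions, and the `pick_shard` function are all hypothetical.

```python
import hashlib

# Hypothetical layout: each region keeps its own shards so data stays in-region.
SHARDS = {
    "eu": ["eu-shard-0", "eu-shard-1"],
    "us": ["us-shard-0", "us-shard-1", "us-shard-2"],
}


def pick_shard(region: str, customer_id: str) -> str:
    """Route a record to a shard: region first, then a stable hash of the key."""
    shards = SHARDS[region]
    # sha256 gives the same result on every run, unlike Python's built-in hash().
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


print(pick_shard("eu", "customer-1001"))
print(pick_shard("us", "customer-1001"))
```

Simple modulo hashing like this reshuffles most keys when a shard is added, which is one reason a design advisor of the sort described in the quote is useful.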

These updates are great refreshers for the Oracle Database 21c. The only problem with some of these features is that AWS added them a few years ago. Does Oracle stand a chance competing against AWS on a factor other than price?

Whitney Grace, February 9, 2021

Oracle: An Interesting Take on the Outfit Once Occupying Dolphin Way

December 30, 2020

The Sea World thing off 101 is history. The weird “aquatorium” has been replaced with glass structures which look like black oatmeal boxes on my grandmother’s pantry shelf. Now more insight into the Coddish (not codfish) style database company has been revealed in “When You Can’t Innovate, You Litigate: Oracle Gleefully Takes Credit For Attacks On Section 230 And Google.” The write up explains that Oracle has shifted from technology to litigation and included the catch phrase “When you cannot innovate, litigate.” I like the phrase.

This passage is particularly interesting:

For a while now, people in Silicon Valley have been well aware of Oracle’s reputation as the anti-innovation behemoth, especially following its attack on APIs, interfaces, and how software is developed with the case against Google’s reimplementation of the Java API.


The thing is, Oracle more or less admits that it’s doing this purely out of spite and the fact that it has failed to innovate and keep up with more nimble and innovative competitors. Oracle and Larry Ellison made some big bets early on that flopped. And rather than correct course and innovate, it has focused on what we’ve referred to as political entrepreneurship: lobbying and using the powers of government to shut down competitors, rather than innovate.

There are, however, several other facets of Oracle which can explain the company’s behavior; for instance:

  • The firm’s investment approach using special purpose entities offshore
  • The company’s policy of acquiring companies and allowing them to drift. (I am not sure if this was Oracle’s “invention” or its version of the OpenText approach to gaining revenue and prospects for upselling.)
  • The drift down systemic problem affecting HP, IBM, Intel, and SAP. Oracle is just responding in a “path of least resistance” manner.

Interesting write up, but there’s quite a bit of corporate activity beyond the “let’s litigate” mantra.

Stephen E Arnold, December 30, 2020

Alleged CCP Database: 1.9 Million Entries

December 14, 2020

DarkCyber noted the availability of a database of 1.9 million members of the Chinese Communist Party as of 2016. We think we can hear “The data are old,” “The data are a scam,” and “That was then, this is now” statements from those listed in the file. The information, which you will have to figure out for yourself, may be on the money or a bit of a spoof. Elaborate spoof, yes. It will help if you can read Chinese or have access to a system which can translate the ideographs into ASCII characters and normalize them. Spellings can be variable depending on the translator or the machine translation system one uses. For now, the file is available on Go File at this link.

Here’s a tiny snippet:

[Image: snippet of the Chinese database]

Are there uses of the data? Sure, how about:

  • Filtering the list for those individuals in Canada, the UK, and the US and mapping the names against university faculty
  • Filtering the list for graduate students in such countries as Australia, Canada, and France. While you are at it, why not do the same for graduate students in the US?
  • Filtering the list for individuals who are or have been part of a cultural or scientific exchange, particularly within driving or drone distance of a US national research laboratory; e.g., University of New Mexico or the University of Tennessee.

The data appear to be at least four years old and may turn out to be little more than a listing of individuals who purchased a SIM from a Chinese vendor in the last 48 months. On the other hand, some of the information may be a cyber confection. DarkCyber finds the circumstances of the data’s “availability,” its possible accuracy, and its availability as open source information interesting.

Stephen E Arnold, December 14, 2020

Checking Out Registered Foreign Agents

December 14, 2020

Navigate to https://datasette.io. The Web page explains a service which permits manipulation of structured data. The service seems quite useful. One of the demonstrations makes it possible to explore Datasette functionality by searching for registered foreign agents. This is an interesting demonstration, and some of the information returned is quite useful. You can locate the FARA Department of Justice data at this link.
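Datasette also exposes query results as JSON, so a demonstration like this one can be explored from a script as well as a browser. The sketch below only builds the query URL. It relies on two Datasette conventions (a `.json` suffix on the database path and the `sql` and `_shape=array` query parameters). The host, the database name, and the table name are placeholders, not the actual addresses of the FARA demonstration.

```python
from urllib.parse import urlencode


def datasette_query_url(base_url: str, database: str, sql: str) -> str:
    """Build a Datasette JSON API URL that runs a read-only SQL query."""
    # _shape=array asks Datasette to return a plain JSON list of row objects.
    params = urlencode({"sql": sql, "_shape": "array"})
    return f"{base_url.rstrip('/')}/{database}.json?{params}"


url = datasette_query_url(
    "https://example.com",  # placeholder host, not the real demo instance
    "fara",  # placeholder database name
    "select * from registrants limit 5",  # placeholder table name
)
print(url)
```

Passing the resulting URL to `urllib.request.urlopen` and decoding the response with `json.load` would return the rows, assuming the instance permits arbitrary SQL queries.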

Stephen E Arnold, December 14, 2020

Why Investigative Software Is Expensive

December 3, 2020

In a forthcoming interview, I explore industrial-strength policeware and intelware with a person who was Intelligence Officer of the Year. In that interview, which will appear in a few weeks, the question of cost of policeware and intelware is addressed. Systems like those from IBM’s i2, Palantir Technologies, Verint, and similar vendors are pricey. Not only is there a six or seven figure license fee, the client has to pay for training, often months of instruction. Plus, these i2-type systems require systems and engineering support. One tip-off to the fully loaded costs is the phrase “forward deployed engineer.” The implicit message is that these i2-type systems require an outside expert to keep the digital plumbing humming along. But who is responsible for the data? The user. If the user fumbles the data bundle, bad outputs are indeed possible.

What’s the big deal? Why not download Maltego? Why not use one of the $100 to $3,000 solutions from jazzy startups by former intelligence officers? These are “good enough”, some may assert. One facet of the cost of industrial strength systems available to qualified licensees is a little appreciated function: Dealing with data.

“Keep Data Consistency During Database Migration” does a good job of explaining what has to happen in a reliable, consistent way when one of the multiple data sources contributes “new” or “fresh” data to an intelware or policeware system. The number of companies providing middleware to perform these functions is growing. Why?

Most companies wanting to get into the knowledge extraction business have to deal with the issues identified in the article. Most organizations do not handle these tasks elegantly, rapidly, or accurately.

Injecting incorrect, stale, inaccurate data into a knowledge centric process like those in industrial strength policeware causes those systems to output unreliable results.
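One simple way to catch stale or mismatched data after a load is to fingerprint the same table in the source and in the target and compare the results. The sketch below is a generic illustration using in-memory SQLite. It is not the method of the cited article or of any policeware vendor, and the `persons` table, its columns, and the sample rows are invented for the example.

```python
import hashlib
import sqlite3


def table_fingerprint(conn: sqlite3.Connection, table: str) -> tuple:
    """Return (row_count, checksum) for a table, reading rows in key order.

    The table name is interpolated into the SQL, so pass only trusted names.
    """
    rows = conn.execute(
        f"SELECT id, name, updated_at FROM {table} ORDER BY id"
    ).fetchall()
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return len(rows), digest.hexdigest()


def make_db(rows):
    """Create a throwaway in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
    )
    conn.executemany("INSERT INTO persons VALUES (?, ?, ?)", rows)
    return conn


source = make_db([(1, "Alice", "2020-12-01"), (2, "Bob", "2020-12-02")])
target_ok = make_db([(1, "Alice", "2020-12-01"), (2, "Bob", "2020-12-02")])
target_stale = make_db([(1, "Alice", "2020-12-01"), (2, "Bob", "2020-11-15")])

consistent = table_fingerprint(source, "persons") == table_fingerprint(target_ok, "persons")
stale_detected = table_fingerprint(source, "persons") != table_fingerprint(target_stale, "persons")
print(consistent, stale_detected)
```

A check like this only says that something differs, not which row. Production middleware typically compares in chunks and has to cope with data that changes while the comparison runs, which is part of why the work is hard and the tooling is not cheap.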

What’s the consequence?

Investigators and analysts learn to ignore certain outputs.

Why? The outputs can be more serious than a flawed diagram whipped up by an MBA who worries only about the impression he or she makes on a group of prospects attending a Zoom meeting.

Data consistency is a big deal.

Stephen E Arnold, December 2, 2020
