Meta Zuck: AIR SC Sort of Sketched Out

January 25, 2022

I read Facebook’s (Meta’s) blog post called “Introducing the AI Research SuperCluster — Meta’s Cutting-Edge AI Supercomputer for AI Research.” The post states:

Today, Meta is announcing that we’ve designed and built the AI Research SuperCluster (RSC) — which we believe is among the fastest AI supercomputers running today and will be the fastest AI supercomputer in the world when it’s fully built out in mid-2022.

Then this statement:

Ultimately, the work done with RSC will pave the way toward building technologies for the next major computing platform — the metaverse, where AI-driven applications and products will play an important role.

So the AIR SC is sort of real. The applications for the AIR SC are sort of metaverse. That’s not here either in my opinion.

So what’s going on? Here are my thoughts:

  1. Facebook wants to stake out conceptual territory claims as AT&T did with its non 5G announcements about the under construction 5G capabilities.
  2. Facebook wants to show that its AIR SC is bigger, better, faster, and more super than anything from the Amazon, Google, or other quasi-monopolies who want systems that will dominate the super computer league table for now and possibly forever unless government regulators or user behavior changes the game plan.
  3. Facebook believes the Silicon Valley marketing mantra, “Fake it until you make it” with a possible change. I interpret the announcement to say, “Over promise and under deliver.” I admit I have become jaded with the antics of these corporate giants who have been able to operate without meaningful oversight or what some might call ethical guidelines for a couple of decades.

In the old days, companies in the Silicon Valley mode did vaporware. The tradition continues? Sure, why not? There’s even a TikTok style video to get the AIR SC message across.

Stephen E Arnold, January 25, 2022

Google Identifies Smart Software Trends

January 18, 2022

Straight away the marketing document “Google Research: Themes from 2021 and Beyond” is more than 8,000 words. Anyone familiar with Google’s outputs may have observed that Google prefers short, mostly ambiguous phraseology. Here’s an example from Google support:

Your account is disabled

If you’re redirected to this page, your Google Account has been disabled.

When a Google document is long, it must be important. Furthermore, when that Google document is allegedly authored by Dr. Jeff Dean, a long time Googler, you know it is important. Another clue is the list of contributors which includes 32 contributors helpfully alphabetized by the individual’s first name. Hey, those traditional bibliographic conventions are not useful. Chicago Manual of Style? Balderdash it seems.

Okay, long. Lots of authors. What are the trends? Based on my humanoid processes, it appears that the major points are:

TREND 1: Machine learning is cranking out “more capable, general purpose machine learning models.” The idea, it seems, is that the days of hand-crafting a collection of numerical recipes, assembling and testing training data, training the model, fixing issues in the model, and then applying the model are either history or going to be history soon. Why’s this important? Cheaper, faster, and allegedly better machine learning deployment. What happens if the model is off a bit or drifts? No worries. Machine learning methods which make use of a handful of human overseers will fix up the issues quickly, maybe in real time.

TREND 2: There are more efficiency improvements in the works. The idea is that more efficiency is better, faster, and logical. One can look at the achievements of smart software in autonomous automobiles to see the evidence of these efficiencies. Sure, there are minor issues because smart software is sometimes outputting a zero when a one is needed. What’s a highway fatality in the total number of safe miles driven? Efficiency also means it is smarter to obtain machine learning, ready to roll models and data sets from large efficient, high technology outfits. One source could be Google. No kidding? Google?

TREND 3: “Machine learning is becoming more personally and communally beneficial.” Yep, machine learning helps the community. Now is the “community” the individual who works on deep dives into Google’s approach to machine learning or a method that sails in a different direction? Is the community the advertisers who rely on Google to match in an intelligent and efficient manner the sales messages to users, human and system communities? Is the communally beneficial group the users of Google’s ad supported services? The main point is that Google and machine learning are doing good and will do better going forward. This is a theme Google management expresses each time it has an opportunity to address a concern about the company’s activities in a hearing in Washington, DC.

TREND 4: Machine learning is going to have “growing impact” on science, health, and sustainability. This is a very big trend. It implicitly asserts that smart software will improve “science.” In the midst of the Covid issue, humans appear to have stumbled. The trend is that humans won’t make such mistakes going forward; for example, Theranos-type exaggeration, CDC contradictory information, or Google and the allegations of collusion with Facebook. Smart software will make these examples shrink in number. That sounds good, very good.

TREND 5: A notable trend is that there will be a “deeper and broader understanding of machine learning.” Okay, who is going to understand? Google-certified machine learning professionals, advertising intermediaries, search engine optimization experts, consumers of free Google Web search, Google itself, or some other cohort? Will the use of off the shelf, pre packaged machine learning data sets and models make it more difficult to figure out what is behind the walls of a black box? Anyway, this trend sounds a suitable do-good, technology-will-improve-the-world note that appears to promise a bright, sunny day even though a weathered fisherperson says, “A storm is a-coming.”

The write up includes art, charts, graphs, and pictures. These are indeed Googley. Some are animated. Links to YouTube videos enliven the essay.

The content is interesting, but I noted several omissions:

  1. No reference to making decisions which do not allegedly contravene one or more regulations or just look like really dicey decisions. Example: “Executives Personally Signed Off on Facebook-Google Ad Collusion Plot, States Claim.”
  2. No reference to the use of machine learning to avoid what appear to be ill-conceived and possibly dumb personnel decisions within the Google smart software group. Example: “Google Fired a Leading AI Scientist but Now She’s Founded Her Own Firm.”
  3. No reference to anti trust issues. Example: “India Hits Google with Antitrust Investigation over Alleged Abuse in News Aggregation.”

Marketing information is often disconnected from the reality in which a company operates. Nevertheless, it is clear that the number of words, the effort invested in whizzy diagrams, and the over-wrought rhetoric are different from Google’s business-as-usual approach.

What’s up or what’s covered up? Perhaps I will learn in 2022 and beyond?

Stephen E Arnold, January 18, 2022

Business Intelligence: Popping Up a Level Pushes Search into the Background

January 17, 2022

I spotted a diagram in this Data Science Central article “Business Intelligence Analytics in One Picture.” The diagram takes business intelligence and describes it as an “umbrella term.” From my point of view, this popping up a conceptual label creates confusion. First, can anyone define “intelligence” as the word is used in computer sectors? Now how about “artificial intelligence,” “government intelligence,” or “business intelligence”? Each of these phrases is designed to sidestep the problem of explaining what functions are necessary to produce useful or higher value information.

Let’s take an example. Business intelligence suggests that information about a market, a competitor, a potential new hire, or a technology can be produced, obtained (fair means or foul means), or predicted (fancy math, synthetic data, etc.). The core idea is gaining an advantage. That is too crude for many professionals who are providers of business intelligence; for example, the mid tier consulting firms cranking out variations of General Eisenhower’s four square graph or a hyperbole cycle.

Business intelligence is a marketing confection. The graph identifies specific “components” of business intelligence. Some of the techniques necessary to obtain high value information are not included; for example, running a fake job posting designed to attract employees who currently work at the company one is subjecting to a business intelligence process, surveillance via mobile phones, sitting in a Starbucks watching and eavesdropping, or using analytic procedures to extract “secrets” from publicly available documents like patent applications, among others.

Business intelligence is not doing any of those things because they are [a] unethical, [b] illegal, [c] too expensive, or [d] difficult. The notion of “ethical behavior” is an interesting one. We have certain highly regarded companies taking actions which some in government agencies find improper. Nevertheless, the actions continue, not for a week or two but for decades. So maybe ethics applied to business intelligence is a non-starter. Nevertheless, certain research groups are quick to point out that unethical information gathering is not the dish served at conference luncheons.

Here are the elements or molecules of business intelligence:

  • Data mining
  • Data visualization
  • Data preparation
  • Data analytics
  • Performance metrics / benchmarking
  • Querying
  • Reporting
  • Statistical analysis
  • Visual analysis

Data mining, data analytics, performance metrics / benchmarking, and statistical analysis strike me as one thing: Numerical procedures.

Now the list looks like this:

  • Numerical procedures
  • Data visualization
  • Data preparation
  • Querying
  • Reporting
  • Visual analysis

Let’s concatenate data visualization and visual analysis into one function: Producing charts and graphs.

Now the list looks like this:

  • Producing charts and graphs
  • Data preparation
  • Numerical procedures
  • Querying
  • Reporting

Querying, in this simplification, has moved from one of nine functions to one of five functions.
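The two merges above can be replayed in a few lines of Python. This is only a sketch: the `merge_into` mapping and the `consolidate` helper are hypothetical names that encode the groupings described in the text, not anything taken from the Data Science Central diagram.

```python
# The nine components from the diagram, in the order listed above.
components = [
    "Data mining",
    "Data visualization",
    "Data preparation",
    "Data analytics",
    "Performance metrics / benchmarking",
    "Querying",
    "Reporting",
    "Statistical analysis",
    "Visual analysis",
]

# Hypothetical mapping that encodes the two merges described in the text.
merge_into = {
    "Data mining": "Numerical procedures",
    "Data analytics": "Numerical procedures",
    "Performance metrics / benchmarking": "Numerical procedures",
    "Statistical analysis": "Numerical procedures",
    "Data visualization": "Producing charts and graphs",
    "Visual analysis": "Producing charts and graphs",
}


def consolidate(items, mapping):
    """Replace each item with its merged label, keeping first-seen order."""
    merged = []
    for item in items:
        label = mapping.get(item, item)
        if label not in merged:
            merged.append(label)
    return merged


functions = consolidate(components, merge_into)
print(len(components), "->", len(functions))
print(functions)
```

Run as written, the nine labels collapse to five, and querying survives both merges untouched.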

What’s up with business intelligence whipping up disciplines? Is the goal to make business intelligence more important? Is it a buzzword exercise so consultants can preach doom and sell snake oil? Is it a desire to add holiday lights and ornaments to distract people from what business intelligence is?

My hunch is that business intelligence professionals don’t want to use the words spying, surveillance, intercepts, eavesdrop, or operate like a nation state’s intelligence agency professionals.

One approach is business intelligence which seems to mean good, mathy, and valuable. The spy approach is bad and could lead to a black mark on one’s Lifetime Report Card.

The fact is that one of the most important components of any intelligence operation is asking the right question. Without querying, masses of data, statistics software, and online experts with MBAs would not be able to find an online ad using Google.

Net net: The chart makes spying and surveillance into a math-centric operation. The chart fails to provide a hierarchy based on asking the right question. Will the diagram help sell business intelligence consulting and services? The scary answer is, “Absolutely.”

Stephen E Arnold, January 14, 2022

Microsoft: Putting Teeth on Edge

January 11, 2022

Usually a basic press release for a Microsoft update receives little discussion, but OS News recently posted a small quip, “Update For Windows 10 And 11 Blocks Default Browser Redirect, But There Is a Workaround,” and users left testy comments. The stinging, fighting words were:

“It seems that Microsoft has quietly backported the block, introduced a month ago in a Dev build of Windows 11, on tools like EdgeDeflector and browsers from being the true default browser in Windows 10, with the change being implemented in Windows 11 too. Starting from KB5008212, which was installed on all supported versions of Windows 10 yesterday with Patch Tuesday, it is no longer possible to select EdgeDeflector as the default MICROSOFT-EDGE protocol.”

Followed by this sarcastic line: “They spent engineering resources on this.”

Users were upset because it meant Microsoft blocked other Web browsers from becoming a system’s default. It is a corporate strategy to normalize anti-competitive restrictions, but there are users who defended Microsoft’s move. They stated that blocking other Web browsers protected vulnerable users, like the elderly, from accidentally downloading malware and adware.

The comments then turned into an argument between tech-savvy experts and the regular users who do not know jack about technology. The discussion ended with semi-agreement that users need protection from freeware that forcefully changes a system, but ultimately users have the choice on their system settings.

In the end, the comments shifted to why Microsoft wants Edge to be the system default: money and deflecting attention from its interesting approaches to security.

Whitney Grace, January 11, 2022

Windows 11: Loved and Wanted? Sure As Long As No One Thinks about MSFT Security Challenges

January 10, 2022

I hold the opinion that the release of Windows 11 was a red herring. How does one get the tech pundits, podcasters, and bloggers to write about something other than SolarWinds, Exchange, etc.? The answer from my point of view was to release the mostly odd Windows 10 refresh.

Few in my circle agreed with me. One of my team installed Windows 11 on one of our machines and exclaimed, “I’m feeling it.” Okay, I’m not. No Android app support, round corners, and like it, dude, you must use Google Chrome, err, I mean Credge.

I read “Only 0.21%, Almost No One Wants to Upgrade Windows 11.” Sure, the headline is confusing, but let’s look at the data. I believe everything backed by statistical procedures practiced by an art history major whose previous work experience includes taking orders at Five Guys.

The write up states:

According to the latest research by IT asset management company Lansweeper, although Windows 10 users can update Windows 11 for free, it is currently only 0.21%. Of PC users are running Windows 11.

I am not sure what this follow on construction means:

At present, Windows 11 is very good. Probably the operating system with the least proportion.

I think the idea is that people are not turning cartwheels over Windows 11. Wasn’t Windows 10 supposed to be the last version of Windows?

I am going to stick with my hypothesis that Windows 11 was pushed out the door, surprising Windows experts with allegedly “insider knowledge” about what Microsoft was going to do. The objective was to deflect attention from Microsoft’s significant security challenges.

Those challenges have been made a little more significant with Bleeping Computer’s report “Microsoft Code Sign Check Bypassed to Drop Zloader.”

Is it time for Windows 12, removing Paint, and charging extra for Notepad?


Stephen E Arnold, January 10, 2022

Perhaps Someone Wants to Work at Google?

January 7, 2022

I read another quantum supremacy rah rah story. What’s quantum supremacy? IBM and others want it whatever it may be. “Google’s Time Crystals Could Be the Greatest Scientific Achievement of Our Lifetimes” slithers away from the genome thing, whatever the Nobel committee found interesting, and dark horses like the NSO Group’s innovation for seizing an iPhone user’s mobile device just by sending the target a message.

None of these is in the running. What we have, according to The Next Web, is what may be:

the world’s first time crystal inside a quantum computer.

Now the quantum computer is definitely a Gartner go-to technology magnet. Google is happy with DeepMind’s modest financial burn rate to reign supreme. The Next Web outfit is doing its part. Two questions:

What’s a quantum computer? A demo, something that DARPA finds worthy of supporting, or a financial opportunity for clever physicists and assorted engineers eager to become the Seymour Crays of 2022?

What’s a time crystal? Frankly I have no clue. Like some hip phrases — synaptic plasticity, phubbing, and vibrating carbon nanohorns, for instance — time crystal is definitely evocative. The write up says:

Time crystals don’t give a damn what Newton or anyone else thinks. They’re lawbreakers and heart takers. They can, theoretically, maintain entropy even when they’re used in a process.

The write up includes a number of disclaimers, but the purpose of the time crystal strikes me as part of the Google big PR picture. Whether time crystals are a thing like yeeting alphabet boys or hyperedge replacement graph grammars, the intriguing linkage of Google, quantum computing, and zippy time crystals further cements the idea that Google is a hot bed of scientific research, development, and innovation.

My thought is that Google is better at getting article writers to make their desire to work at Google evident. Google has not quite mastered the Timnit Gebru problem, however.

And are the Google results reproducible? Yeah, sure.

Stephen E Arnold, January 7, 2022

A New Spin on Tech Recruitment

January 7, 2022

“Knock Knock! Who’s There? – An NSA VM” is an interesting essay for three reasons.

First, it contains a revealing statement about the NSO Group:

Significant time has passed and everyone went crazy last week with the beautiful NSO exploit VM published by Project Zero, so why not ride the wave and present a simple NSA BPF VM. It is still an interesting work and you have to admire the great engineering that goes behind this code. It’s not everyday that you can take a peek at code developed by a well funded state actor.

I noticed that the write up specifically identifies the NSO Group as a “state actor.” I think this means that NSO Group was working for a country, not the customers. This point is one that has not poked through the numerous write ups about the Israel-based company.

Second, the write up walks through a method associated with the National Security Agency. In terms of technical usefulness, one could debate whether the write up contains old news or new news. The information does make it clear that there are ideas for silent penetration of targeted systems. The targets are not specific mobile phones. It appears that the targets of the methods referenced and the sample code provided are systems higher in the food chain.

Third, the write up is actually a recruitment tool. This is not novel, but it is probably going to lead to more “look how smart and clever we are, come join us” blandishments in the near future. My hunch is that some individuals, eager to up their games, will emulate the approach.

Is this method of sharing information a positive or negative? That depends on whom one asks, doesn’t it?

Stephen E Arnold, January 7, 2022

Datasets: An Analysis Which Tap Dances around Some Consequences

December 22, 2021

I read “3 Big Problems with Datasets in AI and Machine Learning.” The arguments presented support the SAIL, Snorkel, and Google type approach to building datasets. I have addressed some of my thoughts about configuring once and letting fancy math do the heavy lifting going forward. This is probably not the intended purpose of the Venture Beat write up. My hunch is that pointing out other people’s problems frames the SAIL, Snorkel, and Google type approaches. No one asks, “What happens if the SAIL, Snorkel, and Google type approaches don’t work or have some interesting downstream consequences?” Why bother?

Here are the problems as presented by the cited article:

  1. The Training Dilemma. The write up says: “History is filled with examples of the consequences of deploying models trained using flawed datasets.” That’s correct. The challenge in creating and validating a training set for a discipline, topic, or “space” is that new content arrives using new lingo and even metaphors instead of words like “rock.” What informed people from the early days of Autonomy’s neuro-linguistic method know about building a dataset is that no one wants to spend money, time, and computing resources in endless Sisyphean work. That rock keeps rolling back down the hill. This is a deal breaker, so considerable effort has been expended figuring out how to cut corners, use good enough data, set loose shoes thresholds, and rely on normalization to smooth out the acne scars. Thus, we are in an era of using what’s available. Make it work or become a content creator on TikTok.
  2. Issues with Labeling. I don’t like it when the word “indexing” is replaced with words like labels, metatags, hashtags, and semantic sign posts. Give me a break. Automatic indexing is more consistent than human indexers who get tired and fall back on a quiver of terms because who wants to work too hard at what is, for many, a boring job? But the automatic systems are in the same “good enough” basket as smart training data set creation. The problem is words and humans. Software is clueless when it comes to snide remarks, cynicism, certain types of fake news and bogus research reports in peer reviewed journals, etc. Indexing using esoteric words means the Average Joe and Janet can’t find the content. Indexing with everyday words means that search results work great for pizza near me but not so well for beatles diet when I want food insects eat, not what kept George thin. The write up says: “Still other methods aim to replace real-world data with partially or entirely synthetic data — although the jury’s out on whether models trained on synthetic data can match the accuracy of their real-world-data counterparts.” Yep, let’s make up stuff.
  3. A Benchmarking Problem. The write up asserts: “SOTA benchmarking [also] does not encourage scientists to develop a nuanced understanding of the concrete challenges presented by their task in the real world, and instead can encourage tunnel vision on increasing scores. The requirement to achieve SOTA constrains the creation of novel algorithms or algorithms which can solve real-world problems.” Got that. My view is that validating data is a bridge too far for anyone except a graduate student working for a professor with grant money. But why benchmark when one can go snorkeling? The reality is that datasets are in most cases flawed but no one knows how flawed. Just use them and let the results light the path forward. Cheap and sounds good when couched in jargon.

What’s the fix? The fix is what I call the SAIL, Snorkel, and Google type solution. (Yep, Facebook digs in this sandbox too.)

My take is easily expressed, just not popular. Too bad.

  1. Do the work to create and validate a training set. Rely on subject matter experts to check outputs and when the outputs drift, hit the brakes, and recalibrate and retrain.
  2. Admit that outputs are likely to be incomplete, misleading, or just plain wrong. Knock off the good enough approach to information.
  3. Return to methods which require thresholds to be validated by user feedback and output validity. Letting cheap and fast methods decide which secondary school teacher gets fired strikes me as not too helpful.
  4. Make sure analyses of solutions don’t function as advertisements for the world’s largest online ad outfit.
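Items 1 and 3 in the list above can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the function names `agreement_rate` and `needs_retraining`, the 0.9 threshold, and the five sample labels are hypothetical and do not come from the Venture Beat write up or any particular system. The idea is simply that a subject matter expert reviews a sample of outputs and, when agreement drops below a threshold people have validated, the brakes go on.

```python
def agreement_rate(model_labels, expert_labels):
    """Fraction of sampled outputs where the model matches the expert."""
    if len(model_labels) != len(expert_labels):
        raise ValueError("each model output needs one expert judgment")
    if not model_labels:
        raise ValueError("no reviewed outputs to score")
    matches = sum(m == e for m, e in zip(model_labels, expert_labels))
    return matches / len(model_labels)


def needs_retraining(model_labels, expert_labels, threshold=0.9):
    """True when agreement falls below the (assumed) threshold."""
    return agreement_rate(model_labels, expert_labels) < threshold


# Illustrative data only: five outputs checked by a subject matter expert.
model = ["spam", "ok", "ok", "spam", "ok"]
expert = ["spam", "ok", "spam", "spam", "spam"]

rate = agreement_rate(model, expert)
print(rate, needs_retraining(model, expert))
```

The threshold is the part that matters. In this sketch it is a bare number, but the point of item 3 is that the number should be set and revisited by people who look at the outputs, not left to the software.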

Stephen E Arnold, December 22, 2021

Microsoft Has a Digital Death Star and Windows 11

December 21, 2021

If you are not familiar with Microsoft’s digital Death Star, you will want to watch the story in the December 26, 2021, Dark Cyber video news program. You can find it in the mini player at this link. More than a year after the SolarWinds’ security misstep became public, the Redmond giant can digitally slay the 1,000 malefactors responsible for some data exfiltration. Quick.

My hunch has been that Microsoft rolled out Windows 11 as part of a red herring campaign. The idea may have been that Windows 11 would capture the attention of “real” journalists, thus reducing the blow torch directed at the Microsoft enterprise software processes. It seems to have worked. No one I have spoken with knows much about the Death Star meme and quite a few people are excited about Windows 11.

ZDNet remains firmly in the camp of writing about Windows 11. Why not? Users who want to use a browser other than Edge or a specialized software to perform a specific PDF function find that some noodling is required. Windows 11 is supposed to be simpler better cheaper faster more wonderfuler, right?

“8 Harsh Realities of Being a Windows 11 User” presents a distinguished lecturer’s view of some Windows 11 foibles. Let’s take a quick look at three of the eight and then circle back to the year long wait for digital retribution against the 1,000 engineers who created the SolarWinds’ misstep and made the Softies look inept and sort of silly in the security department.

Reality 1. The Browser Lock In

Microsoft does not want a Windows 11 user to load up a non Microsoft browser. I find this amusing because Edge is not really Microsoft code. Microsoft pulled out what I call soft taco engineering; that is, the Chrome engine is wrapped in a tortilla crafted in the kitchens of Microsoft Café 34. I am a suspicious type; therefore, I think the browser lock in is designed to make darned sure the geek bloggers and the “real” journalists have something to Don Quixote.

Reality 5. Control Panel / Settings Craziness

Okay, where is the widget to have the weird File Explorer show me “details”? And what about Display controls? I have a couple of places to look now. That’s helpful. Exactly what is the difference between a bunch of icons grouped in one place under one jargonized name? I am not sure about the logic of this bit of silliness, but, hey, one has to do more than clean the microwave in the snack area or hunt for the meeting room on the campus. (Where did the alleged interpersonal abuses take place? Is there a Bing Map for that?)

Reality 8. What Runs Windows 11?

Now if there is a super sized red herring being dragged over the SolarWinds’ misstep it is this one: Will my PC run Windows 11? Lame? You bet, but we are in the distraction business, not in the useful software business. Subscribe and pay now for the greatness which may not run on your PC, you computer dolt. But why? Maybe SolarWinds’ stuff saying, “Look here, not there.”

You have to navigate to the distinguished lecturer’s cited post for Realities 2, 3, 4, 6, and 7. There are more doozies too.

Now the circle back: SolarWinds’ misstep is still with us and Microsoft. At least I can understand Windows 11 as a quick and dirty distraction. Can users?

Stephen E Arnold, December 21, 2021

Semantics Have Become an Architecture: Sounds Good but

December 17, 2021

Semantic Architecture Is A Big Data Cash Grab

A few years ago, big data was the hot topic term and in its wake a surge of techno babble followed. Many technology companies develop their own techno babble to peddle their wares, while some of the jargon does have legitimate reasons to exist. Epiexpress has the lowdown on one term that does have actual meaning: “What Is Semantic Architecture, And How To Build One?”

The semantic data layer is a system’s brain or hub, because most data can be found through a basic search. It overlays the more complex data in a system. Companies can leverage the semantic layer for business decisions and discover new insights. The semantic layer uses an ontology model and enterprise knowledge graph to organize data. Before building the architecture, one should consider the following:

“1. Defining and listing the organizational needs

When developing a semantic enterprise solution, properly-outlined use cases provide the critical questions that the semantic architecture will answer. It, in turn, gives a better knowledge of the stakeholders and users, defines the business value, and facilitates the definition of measurable success criteria.

2. Survey the relevant business data

Many enterprises possess a data architecture founded on data warehouses, relational databases, and an array of hybrid cloud systems and applications that aid analytics and data analysis abilities. In such enterprises, employing relevant unification processes and model mapping practices based on the enterprise’s use cases, staff skill-sets, and enterprise architecture capabilities will be an effective approach for data modeling and mapping from source systems.

3. Using semantic web standards for ensuring governance and interoperability

When implementing semantic architecture, it is important to use semantic technology such as graph management apps as middleware. Middleware acts as an organizational tool for proper metadata governance. Do not forget that users will need tools to interact with the data, such as enterprise search, chatbots, and data visualization tools.

Semantic babble?

Whitney Grace, December 17, 2021
