Googzilla Versus OpenAI: Moving Up to Pillow Fighting

May 17, 2024

dinosaur30a_thumbThis essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.

Mike Tyson is dressed in a Godzilla outfit. He looks like a short but quite capable Googzilla. He is wearing a Google hat. (I have one, but it is soiled. Bummer.) Googzilla is giving the stink eye to Sam AI-Man, who has followed health routines recommended by Huberman Lab and Anatoly, the fellow who hawks supplements after shaming gym brutes dressed as a norm core hero.

image

Sam AI-Man asks an important question. Googzilla seems to be baffled. But the cane underscores that he is getting old for a thunder lizard selling online advertising. Thanks, MSFT Copilot. How are the security initiatives coming along? Oh, too bad.

Now we have the first exhibition: Googzilla is taking on Sam AI-Man.

I read an analysis of this high-stakes battle in “ChatGPT 4o vs Gemini 1.5 Pro: It’s Not Even Close.” The article appeared in the delightfully named online publication “Beebom.” I am writing in Beyond Search, which is — quite frankly — a really boring name. But I am a dinobaby, and I am going to assume that Beebom has a much more tuned in owner operator.

The article illustrates a best practice in database comparison, just tweaked to provide some insights into how alike or different the Googzilla is from the AI-Man. There’s a math test. There is a follow the instructions query. There is an image test. A programming challenge. You get the idea. The article includes what a reader will need to run similar brain teasers to Googzilla and Sam AI-Man.

Who cares? Let’s get to the results.

The write up says:

It’s evidently clear that Gemini 1.5 Pro is far behind ChatGPT 4o. Even after improving the 1.5 Pro model for months while in preview, it can’t compete with the latest GPT-4o model by OpenAI. From commonsense reasoning to multimodal and coding tests, ChatGPT 4o performs intelligently and follows instructions attentively. Not to miss, OpenAI has made ChatGPT 4o free for everyone.

Welp. This statement is not going to make Googzilla happy. Anyone who plays Foosball with the beastie today will want to be alert that re-Fooses are not allowed. You lose when you what the ball out of the game.

But the sun has not set over the Googzilla computer lab. The write up opines:

The only thing going for Gemini 1.5 Pro is the massive context window with support for up to 1 million tokens. In addition, you can upload videos too which is an advantage. However, since the model is not very smart, I am not sure many would like to use it just for the larger context window.

I chuckled at the last line of the write up:

If Google has to compete with OpenAI, a substantial leap is required.

Several observations:

  1. Who knows the names of the “new” products Google rolled out?
  2. With numerous “new” products, has Google a grand vision or is it one of those high school stunts in which passengers in a luxury car jump out and run around the car shouting. Then the car drives off?
  3. Will Google’s management align its AI with its staff management methods in the context of the regulatory scrutiny?
  4. Where’s DeepMind in this somewhat confusing flood of “new” smart products?

Net net: Google is definitely showing the results of having its wizards work under Code Red’s flashing lights. More pillow fights ahead. (Can you list the “new” products announced at Google I/O? Don’t worry. Neither can I.)

Stephen E Arnold, May 17, 2024

Apple and a Recycled Carnival Act: Woo Woo New New!

May 13, 2024

dinosaur30a_thumbThis essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.

A long time ago, for a project related to a new product which was cratering, one person on my team suggested I read a book by James B. Twitchell. Carnival Culture: The Trashing of Taste in America provided a broad context, but the information in the analysis of taste was not going to save the enterprise software I was supposed to analyze. In general, I suggest that investment outfits with an interest in online information give me a call before writing checks to the tale-spinning entrepreneurs.

image

A small creative spark getting smashed in an industrial press. I like the eyes. The future of humans in Apple’s understanding of the American datasphere. Wow, look at those eyes. I can hear the squeals of pain, can’t you?

Dr. Twitchell did a good job, in my opinion, of making clear that some cultural actions are larger than a single promotion. Popular movies and people like P.T. Barnum (the circus guy) explain facets of America. These two examples are not just entertaining; they are making clear what revs the engines of the US of A.

I read “Hating Apple Goes Mainstream” and realized that Apple is doing the marketing for which it is famous. The roll out of the iPad had a high resolution, big money advertisement. If you are around young children, squishy plastic toys are often in small fingers. Squeeze the toy and the eyes bulge. In the image above, a child’s toy is smashed in what seems to me be the business end of a industrial press manufactured by MSE Technology Ltd in Turkey.

image

Thanks, MSFT Copilot. Glad you had time to do this art. I know you are busy on security or is it AI or is AI security or security AI? I get so confused.

The Apple iPad has been a bit of an odd duck. It is a good substitute for crappy Kindle-type readers. We have a couple, but they don’t get much use. Everything is a pain for me because the super duper Apple technology does not detect my fingers. I bought the gizmos so people could review the PowerPoint slides for one of my lectures at a conference. I also experimented with the iPad as a teleprompter. After a couple of tests, getting content on the device, controlling it, and fiddling so the darned thing knew I was poking the screen to cause an action — I put the devices on the shelf.

Forget the specific product, let’s look at the cited write ups comments about the Apple “carnival culture” advertisement. The write up states:

Apple has lost its presumption of good faith over the last five years with an ever-larger group of people, and now we’ve reached a tipping point. A year ago, I’m sure this awful ad would have gotten push back, but I’m also sure we’d heard more “it’s not that big of a deal” and “what Apple really meant to say was…” from the stalwart Apple apologists the company has been able to count on for decades. But it’s awfully quiet on the fan-boy front.

I think this means the attempt to sell sent weird messages about a company people once loved. What’s going on, in my opinion, is that Apple is explaining what technology is going to do to people who once used software to create words, images, and data exhaust will be secondary to cosmetics of technology.

In short, people and their tools will be replaced by a gizmo or gizmos that are similar to bright lights and circus posters. What do these artifacts tell us. My take on the Apple iPad M4 super duper creative juicer is, at this time:

  1. So what? I have an M2 Air, and it does what I hoped the two touch insensitive iPads would do.
  2. Why create a form factor that is likely to get crushed when I toss my laptop bad on a security screening belt? Apple’s products are, in my view, designed to be landfill residents.
  3. Apple knows in its subconscious corporate culture heat sink that smart software, smart services, and dumb users are the future. The wonky expensive high-resolution shouts, “We know you are going to be out of job. You will be like the yellow squishy toy.”

The message Apple is sending is that innovation has moved from utility to entertainment to the carnival sideshow. Put on your clown noses, people. Buy Apple.

Stephen E Arnold, May 13, 2024

Google Stomps into the Threat Intelligence Sector: AI and More

May 7, 2024

dinosaur30a_thumbThis essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.

Before commenting on Google’s threat services news. I want to remind you of the link to the list of Google initiatives which did not survive. You can find the list at Killed by Google. I want to mention this resource because Google’s product innovation and management methods are interesting to say the least. Operating in Code Red or Yellow Alert or whatever the Google crisis buzzword is, generating sustainable revenue beyond online advertising has proven to be a bit of a challenge. Google is more comfortable using such methods as [a] buying and trying to scale it, [b] imitating another firm’s innovation, and [c] dumping big money into secret projects in the hopes that what comes out will not result in the firm’s getting its “glass” kicked to the curb.

image

Google makes a big entrance at the RSA Conference. Thanks, MSFT Copilot. Have you considerate purchasing Google’s threat intelligence service?

With that as background, Google has introduced an “unmatched” cyber security service. The information was described at the RSA security conference and in a quite Googley blog post “Introducing Google Threat Intelligence: Actionable threat intelligence at Google Scale.” Please, note the operative word “scale.” If the service does not make money, Google will “not put wood behind” the effort. People won’t work on the project, and it will be left to dangle in the wind or just shot like Cricket, a now famous example of animal husbandry. (Google’s Cricket was the Google Appliance. Remember that? Take over the enterprise search market. Nope. Bang, hasta la vista.)

Google’s new service aims squarely at the comparatively well-established and now maturing cyber security market. I have to check to see who owns what. Venture firms and others with money have been buying promising cyber security firms. Google owned a piece of Recorded Future. Now Recorded Future is owned by a third party outfit called Insight. Darktrace has been or will be purchased by Thoma Bravo. Consolidation is underway. Thus, it makes sense to Google to enter the threat intelligence market, using its Mandiant unit as a springboard, one of those home diving boards, not the cliff in Acapulco diving platform.

The write up says:

we are announcing Google Threat Intelligence, a new offering that combines the unmatched depth of our Mandiant frontline expertise, the global reach of the VirusTotal community, and the breadth of visibility only Google can deliver, based on billions of signals across devices and emails. Google Threat Intelligence includes Gemini in Threat Intelligence, our AI-powered agent that provides conversational search across our vast repository of threat intelligence, enabling customers to gain insights and protect themselves from threats faster than ever before.

Google to its credit did not trot out the “quantum supremacy” lingo, but the marketers did assert that the service offers “unmatched visibility in threats.” I like the “unmatched.” Not supreme, just unmatched. The graphic below illustrates the elements of the unmatchedness:

image

Credit to the Google 2024

But where is artificial intelligence in the diagram? Don’t worry. The blog explains that Gemini (Google’s AI “system”) delivers

AI-driven operationalization

But the foundation of the new service is Gemini, which does not appear in the diagram. That does not matter, the Code Red crowd explains:

Gemini 1.5 Pro offers the world’s longest context window, with support for up to 1 million tokens. It can dramatically simplify the technical and labor-intensive process of reverse engineering malware — one of the most advanced malware-analysis techniques available to cybersecurity professionals. In fact, it was able to process the entire decompiled code of the malware file for WannaCry in a single pass, taking 34 seconds to deliver its analysis and identify the kill switch. We also offer a Gemini-driven entity extraction tool to automate data fusion and enrichment. It can automatically crawl the web for relevant open source intelligence (OSINT), and classify online industry threat reporting. It then converts this information to knowledge collections, with corresponding hunting and response packs pulled from motivations, targets, tactics, techniques, and procedures (TTPs), actors, toolkits, and Indicators of Compromise (IoCs). Google Threat Intelligence can distill more than a decade of threat reports to produce comprehensive, custom summaries in seconds.

I like the “indicators of compromise.”

Several observations:

  1. Will this service be another Google Appliance-type play for the enterprise market? It is too soon to tell, but with the pressure mounting from regulators, staff management issues, competitors, and savvy marketers in Redmond “indicators” of success will be known in the next six to 12 months
  2. Is this a business or just another item on a punch list? The answer to the question may be provided by what the established players in the threat intelligence market do and what actions Amazon and Microsoft take. Is a new round of big money acquisitions going to begin?
  3. Will enterprise customers “just buy Google”? Chief security officers have demonstrated that buying multiple security systems is a “safe” approach to a job which is difficult: Protecting their employers from deeply flawed software and years of ignoring online security.

Net net: In a maturing market, three factors may signal how the big, new Google service will develop. These are [a] price, [b] perceived efficacy, and [c] avoidance of a major issue like the SolarWinds’ matter. I am rooting for Googzilla, but I still wonder why Google shifted from Recorded Future to acquisitions and me-too methods. Oh, well. I am a dinobaby and cannot be expected to understand.

Stephen E Arnold, May 7, 2024

The Everything About AI Report

May 7, 2024

dinosaur30a_thumbThis essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.

I read the Stanford Artificial Intelligence Report. If you have have not seen the 500 page document, click here.  I spotted an interesting summary of the document. “Things Everyone Should Understand About the Stanford AI Index Report” is the work of Logan Thorneloe, an author previously unknown to me. I want to highlight three points I carried away from Mr. Thorneloe’s essay. These may make more sense after you have worked through the beefy Stanford document, which, due to its size, makes clear that Stanford wants to be linked to the the AI spaceship. (Does Stanford’s AI effort look like Mr. Musk’s or Mr. Bezos’ rocket? I am leaning toward the Bezos design.)

image

An amazed student absorbs information about the Stanford AI Index Report. Thanks, MSFT. Good enough.

The summary of the 500 page document makes clear that Stanford wants to track the progress of smart software, provide a policy document so that Stanford can obviously influence policy decisions made by people who are not AI experts, and then “highlight ethical considerations.” The assumption by Mr. Thorneloe and by the AI report itself is that Stanford is equipped to make ethical anything. The president of Stanford departed under a cloud for acting in an unethical manner. Plus some of the AI firms have a number of Stanford graduates on their AI teams. Are those teams responsible for depictions of inaccurate historical personages? Okay, that’s enough about ethics. My hunch is that Stanford wants to be perceived as a leader. Mr. Thorneloe seems to accept this idea as a-okay.

The second point for me in the summary is that Mr. Thorneloe goes along with the idea that the Stanford report is unbiased. Writing about AI is, in my opinion of course, inherently biased. That’s’ the reason there are AI cheerleaders and AI doomsayers. AI is probability. How the software gets smart is biased by [a] how the thresholds are rigged up when a smart system is built, [b] the humans who do the training of the system and then “fine tune” or “calibrate” the smart software to produce acceptable results, and [b] the information used to train the system. More recently, human developers have been creating wrappers which effectively prevent the smart software from generating pornography or other “improper” or “unacceptable” outputs. I think the “bias” angle needs some critical thinking. Stanford’s report wants to cover the AI waterfront as Stanford maps and presents the geography of AI.

The final point is the rundown of Mr. Thorneloe’s take-aways from the report. He presents ten. I think there may just be three. First, the AI work is very expensive. That leads to the conclusion that only certain firms can be in the AI game and expect to win and win big. To me, this means that Stanford wants the good old days of Silicon Valley to come back again. I am not sure that this approach to an important, yet immature technology, is a particularly good idea. One does not fix up problems with technology. Technology creates some problems, and like social media, what AI generates may have a dark side. With big money controlling the game, what’s that mean? That’s a tough question to answer. The US wants China and Russia to promise not to use AI in their nuclear weapons system. Yeah, that will work.

Another take-away which seems important is the assumption that workers will be more productive. This is an interesting assertion. I understand that one can use AI to eliminate call centers. However, has Stanford made a case that the benefits outweigh the drawbacks of AI? Mr. Thorneloe seems to be okay with the assumption underlying the good old consultant-type of magic.

The general take-away from the list of ten take-aways is that AI is fueled by “industry.” What happened the Stanford Artificial Intelligence Lab, synthetic data, and the high-confidence outputs? Nothing has happened. AI hallucinates. AI gets facts wrong. AI is a collection of technologies looking for problems to solve.

Net net: Mr. Thorneloe’s summary is useful. The Stanford report is useful. Some AI is useful. Writing 500 pages about a fast moving collection of technologies is interesting. I cannot wait for the 2024 edition. I assume “everyone” will understand AI PR.

Stephen E Arnold, May 7, 2024

Microsoft Security Messaging: Which Is What?

May 6, 2024

dinosaur30a_thumbThis essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.

I am a dinobaby. I am easily confused. I read two “real” news items and came away confused. The first story is “Microsoft Overhaul Treats Security As Top Priority after a Series of Failures.” The subtitle is interesting too because it links “security” to monetary compensation. That’s an incentive, but why isn’t security just part of work at an alleged monopoly’s products and services? I surmise the answer is, “Because security costs money, a lot of money.” That article asserts:

After a scathing report from the US Cyber Safety Review Board recently concluded that “Microsoft’s security culture was inadequate and requires an overhaul,” it’s doing just that by outlining a set of security principles and goals that are tied to compensation packages for Microsoft’s senior leadership team.

Okay. But security emerges from basic engineering decisions; for instance, does a developer spend time figuring out and resolving security when dependencies are unknown or documented only by a grousing user in a comment posted on a technical forum? Or, does the developer include a new feature and moves on to the next task, assuming that someone else or an automated process will make sure everything works without opening the door to the curious bad actor? I think that Microsoft assumes it deploys secure systems and that its customers have the responsibility to ensure their systems’ security.

image

The cyber racoons found the secure picnic basket was easily opened. The well-fed, previously content humans seem dismayed that their goodies were stolen. Thanks, MSFT Copilot. Definitely good enough.

The write up adds that Microsoft has three security principles and six security pillars. I won’t list these because the words chosen strike me like those produced by a lawyer, an MBA, and a large language model. Remember. I am a dinobaby. Six plus three is nine things. Some car executive said a long time ago, “Two objectives is no objective.” I would add nine generalizations are not a culture of security. Nine is like Microsoft Word features. No one can keep track of them because most users use Word to produce Words. The other stuff is usually confusing, in the way, or presented in a way that finding a specific feature is an exercise in frustration. Is Word secure? Sure, just download some nifty documents from a frisky Telegram group or the Dark Web.

The write up concludes with a weird statement. Let me quote it:

I reported last month that inside Microsoft there is concern that the recent security attacks could seriously undermine trust in the company. “Ultimately, Microsoft runs on trust and this trust must be earned and maintained,” says Bell. “As a global provider of software, infrastructure and cloud services, we feel a deep responsibility to do our part to keep the world safe and secure. Our promise is to continually improve and adapt to the evolving needs of cybersecurity. This is job #1 for us.”

First, there is the notion of trust. Perhaps Edge’s persistence and advertising in the start menu, SolarWinds, and the legions of Chinese and Russian bad actors undermine whatever trust exists. Most users are clueless about security issues baked into certain systems. They assume; they don’t trust. Cyber security professionals buy third party security solutions like shopping at a grocery store. Big companies’ senior executive don’t understand why the problem exists. Lawyers and accountants understand many things. Digital security is often not a core competency. “Let the cloud handle it,” sounds pretty good when the fourth IT manager or the third security officer quit this year.

Now the second write up. “Microsoft’s Responsible AI Chief Worries about the Open Web.” First, recall that Microsoft owns GitHub, a very convenient source for individuals looking to perform interesting tasks. Some are good tasks like snagging a script to perform a specific function for a church’s database. Other software does interesting things in order to help a user shore up security. Rapid 7 metasploit-framework is an interesting example. Almost anyone can find quite a bit of useful software on GitHub. When I lectured in a central European country’s main technical university, the students were familiar with GitHub. Oh, boy, were they.

In this second write up I learned that Microsoft has released a 39 page “report” which looks a lot like a PowerPoint presentation created by a blue-chip consulting firm. You can download the document at this link, at least you could as of May 6, 2024. “Security” appears 78 times in the document. There are “security reviews.” There is “cybersecurity development” and a reference to something called “Our Aether Security Engineering Guidance.” There is “red teaming” for biosecurity and cybersecurity. There is security in Azure AI. There are security reviews. There is the use of Copilot for security. There is something called PyRIT which “enables security professionals and machine learning engineers to proactively find risks in their generative applications.” There is partnering with MITRE for security guidance. And there are four footnotes to the document about security.

What strikes me is that security is definitely a popular concept in the document. But the principles and pillars apparently require AI context. As I worked through the PowerPoint, I formed the opinion that a committee worked with a small group of wordsmiths and crafted a rather elaborate word salad about going all in with Microsoft AI. Then the group added “security” the way my mother would chop up a red pepper and put it in a salad for color.

I want to offer several observations:

  1. Both documents suggest to me that Microsoft is now pushing “security” as Job One, a slogan used by the Ford Motor Co. (How are those Fords fairing in the reliability ratings?) Saying words and doing are two different things.
  2. The rhetoric of the two documents remind me of Gertrude’s statement, “The lady doth protest too much, methinks.” (Hamlet? Remember?)
  3. The US government, most large organizations, and many individuals “assume” that Microsoft has taken security seriously for decades. The jargon-and-blather PowerPoint make clear that Microsoft is trying to find a nice way to say, “We are saying we will do better already. Just listen, people.”

Net net: Bandying about the word trust or the word security puts everyone on notice that Microsoft knows it has a security problem. But the key point is that bad actors know it, exploit the security issues, and believe that Microsoft software and services will be a reliable source of opportunity of mischief. Ransomware? Absolutely. Exposed data? You bet your life. Free hacking tools? Let’s go. Does Microsoft have a security problem? The word form is incorrect. Does Microsoft have security problems? You know the answer. Aether.

Stephen E Arnold, May 6, 2024

From the Cyber Security Irony Department: We Market and We Suffer Breaches. Hire Us!

April 24, 2024

green-dino_thumb_thumb_thumbThis essay is the work of a dumb dinobaby. No smart software required.

Irony, according to You.com, means:

Irony is a rhetorical device used to express an intended meaning by using language that conveys the opposite meaning when taken literally. It involves a noticeable, often humorous, difference between what is said and the intended meaning. The term “irony” can be used to describe a situation in which something which was intended to have a particular outcome turns out to have been incorrect all along. Irony can take various forms, such as verbal irony, dramatic irony, and situational irony. The word “irony” comes from the Greek “eironeia,” meaning “feigned ignorance”

I am not sure I understand the definition, but let’s see if these two “communications” capture the smart software’s definition.

The first item is an email I received from the cyber security firm Palo Alto Networks. The name evokes the green swards of Stanford University, the wonky mall, and the softball games (co-ed, of course). Here’s the email solicitation I received on April 15, 2024:

image

The message is designed to ignite my enthusiasm because the program invites me to:

Join us to discover how you can harness next-generation, AI-powered security to:

  • Solve for tomorrow’s security operations challenges today
  • Enable cloud transformation and deployment
  • Secure hybrid workforces consistently and at scale
  • And much more.

I liked the much more. Most cyber outfits do road shows. Will I drive from outside Louisville, Kentucky, to Columbus, Ohio? I was thinking about it until I read:

Major Palo Alto Security Flaw Is Being Exploited via Python Zero-Day Backdoor.”

Maybe it is another Palo Alto outfit. When I worked in Foster City (home of the original born-dead mall), I think there was a Palo Alto Pizza. But my memory is fuzzy and Plastic Fantastic Land does blend together. Let’s look at the write up:

For weeks now, unidentified threat actors have been leveraging a critical zero-day vulnerability in Palo Alto Networks’ PAN-OS software, running arbitrary code on vulnerable firewalls, with root privilege. Multiple security researchers have flagged the campaign, including Palo Alto Networks’ own Unit 42, noting a single threat actor group has been abusing a vulnerability called command injection, since at least March 26 2024.

Yep, seems to be the same outfit wanting me to “solve for tomorrow’s security operations challenges today.” The only issue is that the exploit was discovered a couple of weeks ago. If the write up is accurate, the exploit remains unfixed.,

Perhaps this is an example of irony? However, I think it is a better example of the over-the-top yip yap about smart software and the efficacy of cyber security systems. Yes, I know it is a zero day, but it is a zero day only to Palo Alto. The bad actors who found the problem and exploited already know the company has a security issue.

I mentioned in articles about some intelware that the developers do one thing, the software project manager does another, and the marketers output what amounts to hoo hah, baloney, and Marketing 101 hyperbole.

Yep, ironic.

Stephen E Arnold, April 24, 2024

Google Cracks Infinity Which Overshadows Quantum Supremacy Maybe?

April 16, 2024

green-dino_thumb_thumb_thumbThis essay is the work of a dumb dinobaby. No smart software required.

The AI wars are in overdrive. Google’s high school rhetoric is in another dimension. Do you remember quantum supremacy? No, that’s okay, but it makes it clear that the Google is the leader in quantum computing. When will that come to the Pixel mobile device? Now Google’s wizards, infused with the juices of a rampant high school science club member (note the words rampant and member, please. They are intentional.)

An article in Analytics India (now my favorite cheerleading reference tool) uses this headline: “Google Demonstrates Method to Scale Language Model to Infinitely Long Inputs.” Imagine a demonstration of infinity using infinite inputs. I thought the smart software outfits were struggling to obtain enough content to train their models. Now Google’s wizards can handle “infinite” inputs. If one demonstrates infinity, how long will that take? Is one possible answer, “An infinite amount of time.”

Wow.

The write up says:

This modification to the Transformer attention layer supports continual pre-training and fine-tuning, facilitating the natural extension of existing LLMs to process infinitely long contexts.

Even more impressive is the diagram of the “infinite” method. I assure you that it won’t take an infinite amount of time to understand the diagram:

image

See, infinity may have contributed to Cantor’s mental issues, but the savvy Googlers have sidestepped that problem. Nifty.

But the write up suggests that “infinite” like many Google superlatives has some boundaries; for instance:

The approach scales naturally to handle million-length input sequences and outperforms baselines on long-context language modelling benchmarks and book summarization tasks. The 1B model, fine-tuned on up to 5K sequence length passkey instances, successfully solved the 1M length problem.

Google is trying very hard to match Microsoft’s marketing coup which caused the Google Red Alert. Even high schoolers can be frazzled by flashing lights, urgent management edicts, and the need to be perceived as a leader in something other than online advertising. The science club at Google will keep trying. Next up quantumly infinite. Yeah.

Stephen E Arnold, April 16, 2024

The Only Dataset Search Tool: What Does That Tell Us about Google?

April 11, 2024

green-dino_thumb_thumb_thumbThis essay is the work of a dumb dinobaby. No smart software required.

If you like semi-jazzy, academic write ups, you will revel in “Discovering Datasets on the Web Scale: Challenges and Recommendations for Google Dataset Search.” The write up appears in a publication associated with Jeffrey Epstein’s favorite university. It may be worth noting that MIT and Google have teamed to offer a free course in Artificial Intelligence. That is the next big thing which does hallucinate at times while creating considerable marketing angst among the techno-giants jousting to emerge as the go-to source of the technology.

Back to the write up. Google created a search tool to allow a user to locate datasets accessible via the Internet. There are more than 700 data brokers in the US. These outfits will sell data to most people who can pony up the cash. Examples range from six figure fees for the Twitter stream to a few hundred bucks for boat license holders in states without much water.

The write up says:

Our team at Google developed Dataset Search, which differs from existing dataset search tools because of its scope and openness: potentially any dataset on the web is in scope.

image

A very large, money oriented creature enjoins a worker to gather data. If someone asks, “Why?”, the monster says, “Make up something.” Thanks MSFT Copilot. How is your security today? Oh, that’s too bad.

The write up does the academic thing of citing articles which talk about data on the Web. There is even a table which organizes the types of data discovery tools. The categorization of general and specific is brilliant. Who would have thought there were two categories of a vertical search engine focused on Web-accessible data. I thought there was just one category; namely, gettable. The idea is that if the data are exposed, take them. Asking permission just costs time and money. The idea is that one can apologize and keep the data.

The article includes a Googley graphic. The French portal, the Italian “special” portal, and the Harvard “dataverse” are identified. Were there other Web accessible collections? My hunch is that Google’s spiders such down as one famous Googler said, “All” the world’s information. I will leave it to your imagination to fill in other sources for the dataset pages. (I want to point out that Google has some interesting technology related to converting data sets into normalized data structures. If you are curious about the patents, just write benkent2020 at yahoo dot com, and one of my researchers will send along a couple of US patent numbers. Impressive system and method.)

The section “Making Sense of Heterogeneous Datasets” is peculiar. First, the Googlers discovered the basic fact of data from different sources — The data structures vary. Think in terms  of grapes and deer droppings. Second, the data cannot be “trusted.” There is no fix to this issue for the team writing the paper. Third, the authors appear to be unaware of the patents I mentioned, particularly the useful example about gathering and normalizing data about digital cameras. The method applies to other types of processed data as well.

I want to jump to the “beyond metadata” idea. This is the mental equivalent of “popping” up a perceptual level. Metadata are quite important and useful. (Isn’t it odd that Google strips high value metadata from its search results; for example, time and data?) The authors of the paper work hard to explain that the Google approach to data set search adds value by grouping, sorting, and tagging with information not in any one data set. This is common sense, but the Googley spin on this is to build “trust.” Remember: This is an alleged monopolist engaged in online advertising and co-opting certain Web services.

Several observations:

  1. This is another of Google’s high-class PR moves. Hooking up with MIT and delivering razz-ma-tazz about identifying spiderable content collections in the name of greater good is part of the 2024 Code Red playbook it seems. From humble brag about smart software to crazy assertions like quantum supremacy, today’s Google is a remarkable entity
  2. The work on this “project” is divorced from time. I checked my file of Google-related information, and I found no information about the start date of a vertical search engine project focused on spidering and indexing data sets. My hunch is that it has been in the works for a while, although I can pinpoint 2006 as a year in which Google’s technology wizards began to talk about building master data sets. Why no time specifics?
  3. I found the absence of AI talk notable. Perhaps Google does not think a reader will ask, “What’s with the use of these data? I can’t use this tool, so why spend the time, effort, and money to index information from a country like France which is not one of Google’s biggest fans. (Paris was, however, the roll out choice for the answer to Microsoft and ChatGPT’s smart software announcement. Plus that presentation featured incorrect information as I recall.)

Net net: I think this write up with its quasi-academic blessing is a bit of advance information to use in the coming wave of litigation about Google’s use of content to train its AI systems. This is just a hunch, but there are too many weirdnesses in the academic write up to write off as intern work or careless research writing which is more difficult in the wake of the stochastic monkey dust up.

Stephen E Arnold, April 11, 2024

Has Google Aligned Its AI Messaging for the AI Circus?

April 10, 2024

green-dino_thumb_thumb_thumbThis essay is the work of a dumb dinobaby. No smart software required.

I followed the announcements at the Google shindig Cloud Next. My goodness, Google’s Code Red has produced quite a new announcements. However, I want to ask a simple question, “Has Google organized its AI acts under one tent?” You can wallow in the Google AI news because TechMeme on April 10, 2024, has a carnival midway of information.

I want to focus on one facet: The enterprise transformation underway. Google wants to cope with Microsoft’s pushing AI into the enterprise, into the Manhattan chatbot, and into the government.  One example of what Google envisions is what Google calls “genAI agents.” Explaining scripts with smarts requires a diagram. Here’s one, courtesy of Constellation Research:

image

Look at the diagram. The “customer”, which is the organization, is at the center of a Googley world: plumbing, models, and a “platform.” Surrounding this core with the customer at the center are scripts with smarts. These will do customer functions. This customer, of course, is the customer of the real customer, the organization. The genAI agents will do employee functions, creative functions, data functions, code functions, and security functions. The only missing function is the “paying Google function,” but that is baked into the genAI approach.

If one accepts the myriad announcements as the “as is” world of Google AI, the Cloud Next conference will have done its job. If you did not get the memo, you may see the Googley diagram as the work of enthusiastic marketers. The quantumly supreme lingo as more evidence that Code Red has been one output of the Code Red initiative.

I want to call attention, however, to the information in the allegedly accurate “Google DeepMind’s CEO Reportedly Thinks It’ll Be Tough to Catch Up with OpenAI’s Sora.” The write up states:

Google DeepMind CEO may think OpenAI’s text-to-video generator, Sora, has an edge. Demis Hassabis told a colleague it’d be hard for Google to draw level with Sora … The Information reported.  His comments come as Big Tech firms compete in an AI race to build rival products.

Am I to believe the genAI system can deliver what enterprises, government organizations, and non governmental entities want: Ways to cut costs and operate in a smarter way?

If I tell myself, “Believe Google’s Cloud Next statements?” Amazon, IBM, Microsoft, OpenAI, and others should fold their tents, put their animals back on the train, and head to another city in Kansas.

If I tell myself, “Google is not delivering and one cannot believe the company which sells ads and outputs weird images of ethnically interesting historical characters,” then the advertising company is a bit disjointed.

Several observations:

  1. The YouTube content processing issues are an indication that Google is making interesting decisions which may have significant legal consequences related to copyright
  2. The senior managers who are in direct opposition about their enthusiasm for Google’s AI capabilities need to get in the same book and preferably read from the same page
  3. The assertions appear to be marketing which is less effective than Microsoft’s at this time.

Net net: The circus has some tired acts. The Sundar and Prabhakar Show seemed a bit tired. The acts were better than those features on the Gong Show but not as scintillating as performances on the Masked Singer. But what about search? Oh, it’s great. And that circus train. Is it powered by steam?

Stephen E Arnold, April 9, 2024

x

x

x

x

Google AI Has a New Competitive Angle: AI Is a Bit of Problem for Everyone Except Us, Of Course

April 2, 2024

green-dino_thumb_thumb_thumbThis essay is the work of a dumb dinobaby. No smart software required.

Google has not recovered from the MSFT Davos PR coup. The online advertising company with a wonderful approach to management promptly did a road show in Paris which displayed incorrect data. Next the company declared a Code Red emergency (whatever that means in an ad outfit). Then the Googley folk reorganized by laterally arabesque-ing Dr. Jeff Dean somewhere and putting smart software in the hands of the DeepMind survivors. Okay, now we are into Phase 2 of the quantumly supreme company’s push into smart software.

image

An unknown person in Hyde Park at Speaker’s Corner is explaining to the enthralled passers by that “AI is like cryptocurrency.” Is there a face in the crowd that looks like the powerhouse behind FTX? Good enough, MSFT Copilot.

A good example of this PR tactic appears in “Google DeepMind Co-Founder Voices Concerns Over AI Hype: ‘We’re Talking About All Sorts Of Things That Are Just Not Real’.” Some additional color similar to that of sour grapes appears in “Google’s DeepMind CEO Says the Massive Funds Flowing into AI Bring with It Loads of Hype and a Fair Share of Grifting.”

The main idea in these write ups is that the Top Dog at DeepMind and possible candidate to take over the online ad outfit is not talking about ruing the life of a Go player or folding proteins. Nope. The new message, as I understand it, AI is just not that great. Here’s an example of the new PR push:

The fervor amongst investors for AI, Hassabis told the Financial Times, reminded him of “other hyped-up areas” like crypto. “Some of that has now spilled over into AI, which I think is a bit unfortunate,” Hassabis told the outlet. “And it clouds the science and the research, which is phenomenal.”

Yes, crypto. Digital currency is associated with stellar professionals like Sam Bankman-Fried and those engaged in illegal activities. (I will be talking about some of those illegal activities at the US National Cyber Crime Conference in a few weeks.)

So what’s the PR angle? Here’s my take on the message from the CEO in waiting:

  1. The message allows Google and its numerous supporters to say, “We think AI is like crypto but maybe worse.”
  2. Google can suggest, “Our AI is not so good, but that’s because we are working overtime to avoid the crypto-curse which is inherent in outfits engaged in shoving AI down your throat.”
  3. Googlers gardons la tête froide unlike the possibly criminal outfits cheerleading for the wonders of artificial intelligence.

Will the approach work? In my opinion, yes, it will add a joke to the Sundar and Prabhakar Comedy Act. No, I don’t think it will not alter the scurrying in the world of entrepreneurs, investment firms, and “real” Silicon Valley journalists, poohbahs, and pundits.

Stephen E Arnold, April 2, 2024

« Previous PageNext Page »

  • Archives

  • Recent Posts

  • Meta