The Google Explains the Future of the Google Cloud: Very Googley, Of Course

April 30, 2024

This essay is the work of a dumb dinobaby. No smart software required.

At its recent Next 24 conference, Google Cloud and associates shared their visions for the immediate future of AI. Through the event’s obscurely named Session Library, one can watch hundreds of sessions and access resources connected to many more. The idea (if you have not caught on to the Googley nomenclature) is to make available videos of the talks at the conference. To narrow the list, one can filter by session category, conference track, learning level, solution, industry, topic of interest, and whether video is available. Keep in mind that the words you (a normal human, I presume) may use to communicate your interest may not be the lingo Googzilla speaks. AI and Machine Learning feature prominently. Other key areas include data and databases, security, development and architecture, productivity, and revenue growth (naturally). There is even a considerable nod to diversity, equity, and inclusion (DEI). Okay, nod, nod.

Here are a few session titles from just the “AI and ML” track to illustrate the scope of this event and the available information:

  • A cybersecurity expert’s guide to securing AI products with Google SAIF
  • AI for banking: Streamline core banking services and personalize customer experiences
  • AI for manufacturing: Enhance productivity and build innovative new business models
  • AI for telecommunications: Transform customer interactions and network operations
  • AI in capital markets: The biggest bets in the industry
  • Accelerate software delivery with Gemini and Code Transformations
  • Revolutionizing healthcare with AI
  • Streamlining access to youth mental health services

It looks like there is something for everybody. We think the titles make reasonably clear the scope and bigness of Google’s aspirations. Nor would we expect less from a $2 trillion outfit based on advertising, would we? Run a query for Code Red or, in Google lingo, CodeRED, and you will be surprised that the state-of-emergency, “Microsoft is a PR king” mentality persists. (Is this the McKinsey way?) Well, not for those employed at McKinsey. Former McKinsey professionals have more latitude in their management methods; for example, emulating high school science club planning techniques. There are no sessions we could spot about Google’s competition. If one is big enough, there is no competition. One of Googzilla’s relatives made a mess of Tokyo real estate largely without lasting consequences.

Cynthia Murrell, April 30, 2024

Is Grandma Google Making Erratic Decisions?

April 24, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Clown Fish TV is unknown to me. The site published a write up which I found interesting and stuffed full of details I had not seen previously. The April 18, 2024, essay is “YouTubers Claim YouTube is Very Broken Right Now.” Let’s look at a handful of examples and see if these spark any thoughts in my dinobaby mind. As Vladimir Shmondenko says, “Let’s go.”

image

Grandma Googzilla has her view of herself. Nosce teipsum, right? Thanks, MSFT Copilot. How’s your security today?

Here’s a statement to consider:

Over the past 72 hours, YouTubers have been complaining on X about everything from delayed comments to a noticeable decline in revenue and even videos being removed by Google for nebulous reasons after being online for years.

Okay, sluggish functions from the video ad machine. I have noticed either slow-loading or dead video ads; that is, the ads take a long time (maybe a second or two to 10 seconds to show up) or nothing happens and a “Skip” button just appears. No ad to skip. I wonder, “Do the advertisers pay for a non-displayed ad followed by a skip?” I assume there is some fresh Google word salad available in the content cafeteria, but I have not spotted it. Those arrests have, however, caught my attention.

Another item from the essay:

In fact, many longtime YouTube content creators have announced their retirements from the platform over the past year, and I have to wonder if these algorithm changes aren’t a driving force behind that. There’s no guarantee that there will be room for the “you” in YouTube six months from now, let alone six years from now.

I am not sure I know many of the big-time content creators. I do know that the famous Mr. Beast has formed a relationship with the Amazon Twitch outfit. Is that megastar hedging his bets? I think he is. Those videos cost big bucks and could be on broadcast TV if there were a functioning broadcast television service in the US.

How about this statement:

On top of the algorithm shift, and on top of the monetization hit, Google is now reportedly removing old videos that violate their current year Terms of Service.

Shades of the 23andMe approach to Terms of Service. What struck me is that one of my high school history teachers (I think his name was Earl Skaggs) railed against Joseph Stalin’s changing Russian history and forcing textbooks to be revised to present Mr. Stalin’s interpretation of reality. Has Google management added changing history to their bag of tricks? I know that arresting employees is a useful management tool, but I have been relying on news reports. Maybe those arrests were “fake news.” Nothing surprises me where online information is in the mix.

I noted this remarkable statement in the Clown Fish TV essay:

Google was the glue that held all these websites together and let people get found. We’re seeing what a world looks like without Google. Because for many content creators and journalists, it’ll be practically worthless going forward.

I have selected a handful of items. The original article includes screenshots, quotes from people whom I assume are “experts” or whatever passes as an authority today, and Google algorithm questioning. But any of the Googlers with access to the algorithm can add a tweak or create a “wrapper” to perform a specific task. I am not sure too many Googlers know how to fiddle the plumbing anymore. Some of the “clever” code is now more than 25 years old. (People make fun of mainframes. Should more Kimmel humor be directed at 25-year-old Google software?)

Observations are indeed warranted:

  1. I hear Google criticism on podcasts; I read criticism of Google online. Some people are falling out of love with the Google.
  2. Google muffed the bunny with its transformer technology. By releasing software as open source, the outfit may have unwittingly demonstrated how out of touch its leadership team is and opened the door to some competitors able to move more quickly than Grandma Google. Microsoft. Davos. AI. Ah, yes.
  3. The Sundar & Prabhakar School of Strategic Thinking has allowed Google search to become an easy target. Metasearch outfits recycling poor old Bing results are praised for being better than Google. That’s quite an achievement and a verification that some high-school science club management methods don’t work as anticipated. I won’t mention arresting employees again. Oh, heck. I will. Google called the police on its own staff. Slick. Professional.

Net net: Clown Fish TV definitely has presented a useful image of Grandma Google and her video behaviors.

Stephen E Arnold, April 24, 2024

Google AI: Who Is on First? I Do Not Know. No, No, He Is on Third

April 23, 2024

This essay is the work of a dumb dinobaby. No smart software required.

A big reorg has rattled the Googlers. Not only are these wizards candidates for termination, the work groups are squished like the acrylic pour paintings thrilling YouTube crafters.

image

Image from Vizoli Art via YouTube at https://www.youtube.com/@VizoliArt

The image might be a representation of Google’s organization, but I am just a dinobaby without expertise in art or things Googley. Let me give you an example.

I read “Google Consolidates Its DeepMind and Research Teams Amid AI Push” (from the trust outfit itself, Thomson Reuters). The story presents the date as April 18, 2024. I learned:

The search engine giant had merged its research units Google Brain and DeepMind a year back to sharpen its focus on AI development and get ahead of rivals like Microsoft,  a partner of ChatGPT and Sora maker OpenAI.

And who moves? The trust outfit says:

Google will relocate its Responsible AI teams – which focuses on safe AI development – from Research to DeepMind so that they are closer to where AI models are built and scaled, the company said in a blog post.

Ars Technica, which publishes articles without self-identifying with trust, published “Google Merges the Android, Chrome, and Hardware Divisions.” That write up channels the acrylic pour approach to management, which Ars Technica describes this way:

Google Hardware SVP Rick Osterloh will lead the new “Platforms and Devices” division. Hiroshi Lockheimer, Google’s previous head of software platforms like Android and ChromeOS, will be headed to “some new projects” at Google.

Why? AI, of course.

But who runs this organizational mix up?

One answer appears in an odd little “real” news story from an outfit called Benzinga. “Google’s DeepMind to Lead Unified AI Charge as Company Seeks to Outpace Microsoft.” The write up asserts:

The reorganization will see all AI-related teams, including the development of the Gemini chatbot, consolidated under the DeepMind division led by Demis Hassabis. This consolidation encompasses research, model development, computing resources, and regulatory compliance teams…

I assume that the one big happy family of Googlers will sort out the intersections of AI, research, hardware, app software, smart software, lines of authority, P&L responsibility, and decision making. Based on my watching Google’s antics over the last 25 years, chaos seems to be part of the ethos of the company. One cannot forget that for all the AI razzle dazzle, Code Red, and reorganizational acrylic pouring, advertising accounts for about 60 percent of the firm’s financial footstool.

Will Google’s management team be able to answer the question, “Who is on first?” Will the result of the company’s acrylic pour approach to organizational structures yield a YouTube video like this one? The creator Left Brained Artist explains why acrylic paints crack, come apart, and generally look pretty darned terrible.

image

Will Google’s pouring units together result in a cracked result? Left Brained Artist’s suggestions may not apply to an online ad company trying to cope with difficult-to-predict competitors like the Zucker’s Meta or the Microsoft clump of AI stealth fighters: OpenAI, Mistral, et al.

Reviewing the information in these three write ups about Google, I will offer several of my unwanted and often irritating observations. Ready?

  1. Comparing the Microsoft AI re-organization to the Google AI re-organization, it seems to me that Microsoft has a more logical set up. Judging from the information to which I have access, Microsoft is closing deals for its AI technology with government entities and selected software companies. Microsoft is doing practical engineering drawings; Google is dumping acrylic paint, hoping it will be pretty and make sense.
  2. Google seems to be struggling from a management point of view. We have sit ins, we have police hauling off Googlers, and we have layoffs. We have re-organizations. We have numerous signals that the blue chip consulting approach to an online advertising outfit is a bit unpredictable. Hey, just sell ads and use AI to help you do it without creating 1960s-style college sophomore sit ins.
  3. Get organized. Make an attempt to answer the question, “Who is on first?”

As Abbott and Costello explained:

Costello: Well, all I’m trying to find out is what’s the guy’s name on first base?

Abbott: Oh, no, no. What is on second base?

Costello: I’m not asking you who’s on second.

Abbott: Who’s on first.

Exactly. Just sell online ads.

Stephen E Arnold, April 23, 2024

Google Gem: Arresting People Management

April 18, 2024

This essay is the work of a dumb dinobaby. No smart software required.

I have worked for some well-managed outfits: Halliburton, Booz Allen, Ziff Communications, and others in my 55-year career. The idea that employees at Halliburton Nuclear (my assignment) would occupy the offices of a senior officer like Eugene Saltarelli was inconceivable. (Mr. Saltarelli sported a facial scar. When asked about the disfigurement, he would stare at the interlocutor and ask, “What scar?” Do you want to “take over” his office?) Another of my superiors at a firm in New York had a special method of shaping employee behavior. This professional did nothing to suppress rumors that two of his wives drowned during “storms” after falling off his sail boat. Did I entertain taking over his many-windowed office in Manhattan? Answer: Are you sure you internalized the anecdote?

image

Another Google management gem glitters in the public spot light.

But at the Google, life seems to be different, maybe a little more frisky absent psychological behavior controls. I read “Nine Google Workers Get Arrested After Sit-In Protest over $1.2B Cloud Deal with Israel.” The main idea seems to be that someone at Google sold cloud services to the Israeli government. Employees apparently viewed the contract as bad, wrong, stupid, or some combination of attributes. The fix involved a 1960s-style sit in. After a period of time elapsed, someone at Google called the police. The employee-protesters were arrested.

I recall hearing years ago that Google faced a similar push back about a contract with the US government. To be honest, Google has generated so many human resource moments, I have a tough time recalling each. A few are Mt. Everests of excellence; for example, the termination of Dr. Timnit Gebru. This Googler had the nerve to question the bias of Google’s smart software. She departed. I assume she enjoyed the images of biased signers of documents related to America’s independence and multi-ethnic soldiers in the World War II German army. Bias? Google thinks not I guess.

The protest occurs as the Google tries to cope with increased market pressure and the tough-to-control costs of smart software. The quick fix is to nuke or RIF employees. “Google Lays Off Workers As Part of Pretty Large-Scale Restructuring” reports by citing Business Insider:

Ruth Porat, Google’s chief financial officer, sent an email to employees announcing that the company would create “growth hubs” in India, Mexico and Ireland. The unspecified number of layoffs will affect teams in the company’s finance department, including its treasury, business services and revenue cash operations units

That looks like off-shoring to me. The idea was a cookie cutter solution spun up by blue chip consulting companies 20, maybe 30 years ago. On paper, the math is more enticing than a new Land Rover and about as reliable. A state-side worker costs X fully loaded with G&A, benefits, etc. An off-shore worker costs X minus Y. If the delta means cost savings, go for it. What’s not to like?
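For readers who want the consulting arithmetic spelled out, here is a minimal sketch. Every number is made up for illustration, and the `hidden_costs` term is my own assumption (travel, rework, coordination, and similar items), not something taken from Google or the cited reports.

```python
def offshore_delta(stateside_cost, discount, hidden_costs=0):
    """Per-worker saving: X minus ((X minus Y) plus any hidden costs)."""
    offshore_cost = stateside_cost - discount  # "X minus Y" from the paragraph above
    return stateside_cost - (offshore_cost + hidden_costs)

# Hypothetical figures: X = 200,000 fully loaded, Y = 120,000.
paper_saving = offshore_delta(200_000, 120_000)
# Same worker, but with an assumed 90,000 in costs the spreadsheet left out.
real_saving = offshore_delta(200_000, 120_000, hidden_costs=90_000)

print(paper_saving, real_saving)
```

On paper the delta is simply Y. The spreadsheet looks less enticing once the assumed hidden costs are subtracted.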

According to a source cited in the New York Post:

“As we’ve said, we’re responsibly investing in our company’s biggest priorities and the significant opportunities ahead… To best position us for these opportunities, throughout the second half of 2023 and into 2024, a number of our teams made changes to become more efficient and work better, remove layers and align their resources to their biggest product priorities.”

Yep, align. That senior management team has a way with words.

Will those who are in fear of their jobs join in the increasingly routine Google employee protests? Will disgruntled staff sandbag products and code? Will those who are terminated write tell-alls about their experiences at an outfit operating under Code Red for more than a year?

Several observations:

  1. Microsoft’s quite effective push of its AI products and services continues. In certain key markets like New York City and the US government, Google is on the defensive. Hint: Microsoft has the advantage, and the Google is struggling to catch up.
  2. Google’s management of its personnel seems to create the wrong type of news. Example: Staff arrests. Is that part of Peter Drucker’s management advice?
  3. The Google leadership team appears to lack the ability to do its job in a quiet, effective, positive, and measured way.

Net net: The online ad money machine keeps running. But if the investigations into Google’s business practices get traction, Google will have additional challenges to face. The Sundar & Prabhakar Comedy team should make a TikTok-type, how-to video about human resource management. I would prefer a short video about the origin story for the online advertising method which allowed Google to become a fascinating outfit.

Stephen E Arnold, April 18, 2024

Will Google Fix Up On-the-Blink Israeli Intelligence Capability?

April 18, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Voyager Labs’ “value” may be slipping. The poster child for unwanted specialized software publicity (NSO Group) finds itself the focal point of some legal eagles. The specialized software systems that monitor, detect, and alert seemed, quite frankly, to be distracted before and during the October 2023 attack. What’s happening to Israel’s advanced intelligence capabilities with its secret units, mustered out wizards creating intelligence solutions, and doing the Madison Avenue thing at conferences? What’s happening is that the hyperbole seems to be a bit more advanced than some of the systems themselves.

image

Government leaders and military intelligence professionals listen raptly as the young wizard explains how the online advertising company can shore up a country’s intelligence capabilities. Thanks, MidJourney. You are good enough, and the modified free MSFT Copilot is not.

What’s the fix? Let me share one wild idea with you: Let Google do it. Time (once the stablemate of the AI-road kill Sports Illustrated) published this write up with this title:

Exclusive: Google Contract Shows Deal With Israel Defense Ministry

The write up says:

Google provides cloud computing services to the Israeli Ministry of Defense, and the tech giant has negotiated deepening its partnership during Israel’s war in Gaza, a company document viewed by TIME shows. The Israeli Ministry of Defense, according to the document, has its own “landing zone” into Google Cloud—a secure entry point to Google-provided computing infrastructure, which would allow the ministry to store and process data, and access AI services. [The wonky capitalization is part of the style manual I assume. Nice, shouting with capital letters.]

The article then includes this paragraph:

Google recently described its work for the Israeli government as largely for civilian purposes. “We have been very clear that the Nimbus contract is for workloads running on our commercial platform by Israeli government ministries such as finance, healthcare, transportation, and education,” a Google spokesperson told TIME for a story published on April 8. “Our work is not directed at highly sensitive or classified military workloads relevant to weapons or intelligence services.”

Does this mean that Google shaped or weaponized information about the work with Israel? Probably not: The intent strikes me as similar to the “Senator, thank you for the question” lingo offered at some US government hearings. That’s just the truth poorly understood by those who are not Googley.

I am not sure if the Time story has its “real” news lens in focus, but let’s look at this interesting statement:

The news comes after recent reports in the Israeli media have alleged the country’s military, controlled by the Ministry of Defense, is using an AI-powered system to select targets for air-strikes on Gaza. Such an AI system would likely require cloud computing infrastructure to function. The Google contract seen by TIME does not specify for what military applications, if any, the Ministry of Defense uses Google Cloud, and there is no evidence Google Cloud technology is being used for targeting purposes. But Google employees who spoke with TIME said the company has little ability to monitor what customers, especially sovereign nations like Israel, are doing on its cloud infrastructure.

The online story included an allegedly “real” photograph of a bunch of people who were allegedly unhappy with the Google deal with Israel. Google does have a cohort of wizards who seem to enjoy protesting Google’s work with a nation state. Are Google’s managers okay with this type of activity? Seems like it.

Net net: I think the core issue is that some of the Israeli intelligence capability is sputtering. Will Google fix it up? Sure, if one believes the intelware brochures and PowerPoints on display at specialized intelligence conferences, why not perceive Google as just what the country needs after the attack and amidst increasing tensions with other nation states not too far from Tel Aviv? Belief is good. Madison Avenue thinking is good. Cloud services are good. Failure is not just bad; it could mean zero warning for another action against Israel. Do brochures about intelware stop bullets and missiles?

Stephen E Arnold, April 18, 2024

Data Thirst? Guess Who Can Help?

April 17, 2024

As large language models approach the limit of freely available data on the Internet, companies are eyeing sources supposedly protected by copyrights and user agreements. PCMag reports, “Google Let OpenAI Scrape YouTube Data Because Google Was Doing It Too.” It seems Google would rather double down on violations than be hypocritical. Writer Emily Price tells us:

“OpenAI made headlines recently after its CTO couldn’t say definitively whether the company had trained its Sora video generator on YouTube data, but it looks like most of the tech giants—OpenAI, Google, and Meta—have dabbled in potentially unauthorized data scraping, or at least seriously considered it. As the New York Times reports, OpenAI transcribed more than a million hours of YouTube videos using its Whisper technology in order to train its GPT-4 AI model. But Google, which owns YouTube, did the same, potentially violating its creators’ copyrights, so it didn’t go after OpenAI. In an interview with Bloomberg this week, YouTube CEO Neal Mohan said the company’s terms of service ‘does not allow for things like transcripts or video bits to be downloaded, and that is a clear violation of our terms of service.’ But when pressed on whether YouTube data was scraped by OpenAI, Mohan was evasive. ‘I have seen reports that it may or may not have been used. I have no information myself,’ he said.”

How silly to think the CEO would have any information. Besides stealing from YouTube content creators, companies are exploring other ways to pierce untapped sources of data. According to the Times article cited above, Meta considered buying Simon & Schuster to unlock all its published works. We are sure authors would have been thrilled. Meta executives also considered scraping any protected data it could find and hoping no one would notice. If caught, we suspect they would consider any fees a small price to pay.

The same article notes Google changed its terms of service so it could train its AI on Google Maps reviews and public Google Docs. See, the company can play by the rules, as long as it remembers to change them first. Preferably, as it did here, over a holiday weekend.

Cynthia Murrell, April 17, 2024

Google Cracks Infinity Which Overshadows Quantum Supremacy Maybe?

April 16, 2024

This essay is the work of a dumb dinobaby. No smart software required.

The AI wars are in overdrive. Google’s high school rhetoric is in another dimension. Do you remember quantum supremacy? No, that’s okay, but it makes it clear that the Google is the leader in quantum computing. When will that come to the Pixel mobile device? Now Google’s wizards, infused with the juices of a rampant high school science club member, have cracked infinity. (Note the words rampant and member, please. They are intentional.)

An article in Analytics India (now my favorite cheerleading reference tool) uses this headline: “Google Demonstrates Method to Scale Language Model to Infinitely Long Inputs.” Imagine a demonstration of infinity using infinite inputs. I thought the smart software outfits were struggling to obtain enough content to train their models. Now Google’s wizards can handle “infinite” inputs. If one demonstrates infinity, how long will that take? Is one possible answer, “An infinite amount of time”?

Wow.

The write up says:

This modification to the Transformer attention layer supports continual pre-training and fine-tuning, facilitating the natural extension of existing LLMs to process infinitely long contexts.

Even more impressive is the diagram of the “infinite” method. I assure you that it won’t take an infinite amount of time to understand the diagram:

image

See, infinity may have contributed to Cantor’s mental issues, but the savvy Googlers have sidestepped that problem. Nifty.

But the write up suggests that “infinite” like many Google superlatives has some boundaries; for instance:

The approach scales naturally to handle million-length input sequences and outperforms baselines on long-context language modelling benchmarks and book summarization tasks. The 1B model, fine-tuned on up to 5K sequence length passkey instances, successfully solved the 1M length problem.
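The quoted passages do not explain how an attention layer copes with inputs of arbitrary length. One approach known from linear-attention research is to fold each segment into a fixed-size associative memory instead of keeping every past token. The sketch below shows only that general idea. The class name, the method names, and the ELU-plus-one feature map are my assumptions, and nothing here is Google's code or a claim about what the cited write up describes.

```python
import math

def elu_plus_one(x):
    """sigma(x) = ELU(x) + 1, which keeps every feature strictly positive."""
    return x + 1.0 if x > 0 else math.exp(x)

class CompressiveMemory:
    """Hypothetical fixed-size memory: a d_k x d_v matrix plus a d_k normalizer."""

    def __init__(self, d_k, d_v):
        self.M = [[0.0] * d_v for _ in range(d_k)]
        self.z = [0.0] * d_k

    def update(self, keys, values):
        """Fold one segment in: M += sigma(K)^T V and z += column sums of sigma(K)."""
        for k, v in zip(keys, values):
            sk = [elu_plus_one(x) for x in k]
            for i, ski in enumerate(sk):
                self.z[i] += ski
                for j, vj in enumerate(v):
                    self.M[i][j] += ski * vj

    def retrieve(self, query):
        """Read back a weighted average of stored values: sigma(q) M / (sigma(q) . z)."""
        sq = [elu_plus_one(x) for x in query]
        denom = sum(a * b for a, b in zip(sq, self.z)) or 1.0
        d_v = len(self.M[0])
        return [sum(sq[i] * self.M[i][j] for i in range(len(sq))) / denom
                for j in range(d_v)]

# Feed in 1,000 one-token segments; the memory stays 2 x 2 throughout.
memory = CompressiveMemory(d_k=2, d_v=2)
for segment in range(1000):
    memory.update(keys=[[1.0, 0.0]], values=[[float(segment), 1.0]])
```

The memory never grows, which is where the “infinite” label comes from. It is also lossy: everything read back is a blend of what was folded in, so “infinite” input does not mean perfect recall.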

Google is trying very hard to match Microsoft’s marketing coup which caused the Google Red Alert. Even high schoolers can be frazzled by flashing lights, urgent management edicts, and the need to be perceived as a leader in something other than online advertising. The science club at Google will keep trying. Next up quantumly infinite. Yeah.

Stephen E Arnold, April 16, 2024

Are Experts Misunderstanding Google Indexing?

April 12, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Google is not perfect. More and more people are learning that the mystics of Mountain View are working hard every day to deliver revenue. In order to produce more money and profit, one must use Rust to become twice as wonderful as a programmer who labors to make C++ sit up, bark, and roll over. This dispersal of the cloud of unknowing obfuscating the magic of the Google can be helpful. What’s puzzling to me is that what Google does catches people by surprise. For example, consider the “real” news presented in “Google Books Is Indexing AI-Generated Garbage.” The main idea strikes me as:

But one unintended outcome of Google Books indexing AI-generated text is its possible future inclusion in Google Ngram viewer. Google Ngram viewer is a search tool that charts the frequencies of words or phrases over the years in published books scanned by Google dating back to 1500 and up to 2019, the most recent update to the Google Books corpora. Google said that none of the AI-generated books I flagged are currently informing Ngram viewer results.
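For anyone who has not used an n-gram viewer, the counting behind such a chart can be sketched in a few lines. This is a toy illustration, not Google's implementation. The function name, the whitespace tokenizer, and the three “books” are my assumptions.

```python
from collections import Counter

def ngram_frequency_by_year(books, phrase):
    """Relative frequency of `phrase` per publication year.

    `books` is an iterable of (year, text) pairs. Frequency is the number of
    matching n-grams divided by all n-grams of that length in that year.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    hits, totals = Counter(), Counter()
    for year, text in books:
        tokens = text.lower().split()
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        totals[year] += len(grams)
        hits[year] += sum(1 for g in grams if g == target)
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}

# Hypothetical corpus: one old book and two machine-flavored recent ones.
books = [
    (1900, "the whale swam"),
    (2019, "as an ai language model i cannot"),
    (2019, "the ai language model wrote this"),
]
frequencies = ngram_frequency_by_year(books, "language model")
print(frequencies)
```

If machine-written books are counted like any others, the phrases they overuse would rise in the recent years of the chart. That is the concern the quoted passage raises.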

image

Thanks, Microsoft Copilot. I enjoyed learning that security is a team activity. Good enough again.

Indexing lousy content has been the core function of Google’s Web search system for decades. Search engine optimization generates information almost guaranteed to drag down how higher-value content is handled. If the flagship provides the navigation system to other ships in the fleet, won’t those vessels crash into bridges?

Remediating Google’s approach to indexing requires several basic steps. (I have in various ways shared these ideas with the estimable Google over the years. Guess what? No one cared or understood, and if a Googler did understand, that person did not want to increase overhead costs.) So what are these steps? I shall share them:

  1. Establish an editorial policy for content. Yep, this means that a system and method or systems and methods are needed to determine what content gets indexed.
  2. Explain the editorial policy and what a person or entity must do to get content processed and indexed by the Google, YouTube, Gemini, or whatever the mystics in Mountain View conjure into existence.
  3. Include metadata with each content object so one knows the index date, the content object creation date, and similar information.
  4. Operate in a consistent, professional manner over time. The “gee, we just killed that” is not part of the process. Sorry, mystics.
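Steps 1 and 3 can be sketched as a small record plus a gate. The field names, the `source_reviewed` flag, and the policy rule are hypothetical. Nothing below reflects how Google stores or filters content.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class ContentObject:
    url: str
    created: date          # content object creation date
    indexed: date          # date the object entered the index
    source_reviewed: bool  # hypothetical flag set by an editorial process

def passes_editorial_policy(obj):
    """Toy policy: index only reviewed sources whose dates are consistent."""
    return obj.source_reviewed and obj.created <= obj.indexed

report = ContentObject("https://example.com/report", date(2024, 4, 1), date(2024, 4, 12), True)
fake_book = ContentObject("https://example.com/ai-book", date(2024, 4, 1), date(2024, 4, 12), False)

print(passes_editorial_policy(report), passes_editorial_policy(fake_book))
```

The data structure is the easy part. The cost sits in whatever human or automated process sets a flag like `source_reviewed`, which is the overhead mentioned above.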

Let me offer several observations:

  1. Google, like any alleged monopoly, faces significant management challenges. Moving information within such an enterprise is difficult. For an organization with a Foosball culture, the task may be a bit outside the wheelhouse of most young people and individuals who are engineers, not presidents of fraternities or sororities.
  2. The organization is under stress. First, the pressure is financial because controlling the cost of the plumbing is a reasonably difficult undertaking. Second, there is technical pressure. Google itself made clear that it was in Red Alert mode and keeps adding flashing lights with each and every misstep the firm’s wizards make. These range from contentious relationships with mere governments to individual staff members who grumble via internal emails, angry public utterances by Googlers, and behavior observed at conferences. Body language does speak sometimes.
  3. The approach to smart software is remarkable. Individuals in the UK pontificate. The Mountain View crowd reassures and smiles — a lot. (Personally I find those big, happy looks a bit tiresome, but that’s a dinobaby for you.)

Net net: The write up does not address the issue that Google happily exploits. The company lacks the mental rigor setting and applying editorial policies requires. SEO is good enough to index. Therefore, fake books are certainly A-OK for now.

Stephen E Arnold, April 12, 2024

The Only Dataset Search Tool: What Does That Tell Us about Google?

April 11, 2024

This essay is the work of a dumb dinobaby. No smart software required.

If you like semi-jazzy, academic write ups, you will revel in “Discovering Datasets on the Web Scale: Challenges and Recommendations for Google Dataset Search.” The write up appears in a publication associated with Jeffrey Epstein’s favorite university. It may be worth noting that MIT and Google have teamed to offer a free course in Artificial Intelligence. That is the next big thing which does hallucinate at times while creating considerable marketing angst among the techno-giants jousting to emerge as the go-to source of the technology.

Back to the write up. Google created a search tool to allow a user to locate datasets accessible via the Internet. There are more than 700 data brokers in the US. These outfits will sell data to most people who can pony up the cash. Examples range from six figure fees for the Twitter stream to a few hundred bucks for boat license holders in states without much water.

The write up says:

Our team at Google developed Dataset Search, which differs from existing dataset search tools because of its scope and openness: potentially any dataset on the web is in scope.

image

A very large, money oriented creature enjoins a worker to gather data. If someone asks, “Why?”, the monster says, “Make up something.” Thanks MSFT Copilot. How is your security today? Oh, that’s too bad.

The write up does the academic thing of citing articles which talk about data on the Web. There is even a table which organizes the types of data discovery tools. The categorization of general and specific is brilliant. Who would have thought there were two categories of a vertical search engine focused on Web-accessible data? I thought there was just one category; namely, gettable. The idea is that if the data are exposed, take them. Asking permission just costs time and money. The idea is that one can apologize and keep the data.

The article includes a Googley graphic. The French portal, the Italian “special” portal, and the Harvard “dataverse” are identified. Were there other Web accessible collections? My hunch is that Google’s spiders suck down, as one famous Googler said, “all” the world’s information. I will leave it to your imagination to fill in other sources for the dataset pages. (I want to point out that Google has some interesting technology related to converting data sets into normalized data structures. If you are curious about the patents, just write benkent2020 at yahoo dot com, and one of my researchers will send along a couple of US patent numbers. Impressive system and method.)

The section “Making Sense of Heterogeneous Datasets” is peculiar. First, the Googlers discovered the basic fact of data from different sources: The data structures vary. Think in terms of grapes and deer droppings. Second, the data cannot be “trusted.” There is no fix to this issue for the team writing the paper. Third, the authors appear to be unaware of the patents I mentioned, particularly the useful example about gathering and normalizing data about digital cameras. The method applies to other types of processed data as well.

I want to jump to the “beyond metadata” idea. This is the mental equivalent of “popping” up a perceptual level. Metadata are quite important and useful. (Isn’t it odd that Google strips high value metadata from its search results; for example, time and date?) The authors of the paper work hard to explain that the Google approach to data set search adds value by grouping, sorting, and tagging with information not in any one data set. This is common sense, but the Googley spin on this is to build “trust.” Remember: This is an alleged monopolist engaged in online advertising and co-opting certain Web services.

Several observations:

  1. This is another of Google’s high-class PR moves. Hooking up with MIT and delivering razz-ma-tazz about identifying spiderable content collections in the name of greater good is part of the 2024 Code Red playbook it seems. From humble brag about smart software to crazy assertions like quantum supremacy, today’s Google is a remarkable entity.
  2. The work on this “project” is divorced from time. I checked my file of Google-related information, and I found no information about the start date of a vertical search engine project focused on spidering and indexing data sets. My hunch is that it has been in the works for a while, although I can pinpoint 2006 as a year in which Google’s technology wizards began to talk about building master data sets. Why no time specifics?
  3. I found the absence of AI talk notable. Perhaps Google does not think a reader will ask, “What’s with the use of these data? I can’t use this tool, so why spend the time, effort, and money to index information from a country like France, which is not one of Google’s biggest fans?” (Paris was, however, the roll out choice for the answer to Microsoft and ChatGPT’s smart software announcement. Plus that presentation featured incorrect information as I recall.)

Net net: I think this write up with its quasi-academic blessing is a bit of advance information to use in the coming wave of litigation about Google’s use of content to train its AI systems. This is just a hunch, but there are too many weirdnesses in the academic write up to write off as intern work or careless research writing which is more difficult in the wake of the stochastic monkey dust up.

Stephen E Arnold, April 11, 2024

Google: The DMA Makes Us Harm Small Business

April 11, 2024

This essay is the work of a dumb dinobaby. No smart software required.

I cannot estimate the number of hours Googlers invested in crafting the short essay “New Competition Rules Come with Trade-Offs.” I find it a work of art. Maybe not the equal of Dante’s La Divina Commedia, but it is darned close.

image

A deity, possibly associated with the quantumly supreme, reassures a human worried about life. Words are reality, at least to some fretful souls. Thanks MSFT Copilot. Good enough.

The essay pivots on unarticulated and assumed “truths.” Particularly charming are these:

  1. “We introduced these types of Google Search features to help consumers”
  2. “These businesses now have to connect with customers via a handful of intermediaries that typically charge large commissions…”
  3. “We’ve always been focused on improving Google Search….”

The first statement implies that Google’s efforts have been the “help.” Interesting: I find Google search often singularly unhelpful, returning results for malware, biased information, and Google itself.

The second statement indicates that “intermediaries” benefit. Isn’t Google an intermediary? Isn’t Google an alleged monopolist in online advertising?

The third statement is particularly quantumly supreme. Note the word “always.” John Milton uses such verbal efflorescence when describing God. Yes, “always” and improving. I am tremulous.

Consider this lyrical passage and the elegant logic of:

We’ll continue to be transparent about our DMA compliance obligations and the effects of overly rigid product mandates. In our view, the best approach would ensure consumers can continue to choose what services they want to use, rather than requiring us to redesign Search for the benefit of a handful of companies.

Transparent invokes an image of squeaky clean glass in a modern, aluminum-framed window, scientifically sealed to prevent its unauthorized opening or repair by anyone other than a specially trained transparency provider. I like the use of the adjective “rigid” because it implies a sturdiness which may cause the transparent window to break when inclement weather (blasts of hot and cold air from oratorical emissions) stresses the see-through structures. Then there is the adult-father-knows-best reference in “In our view, the best approach.” Very parental. Does this suggest the EU is childish?

Net net: Has anyone compiled the Modern Book of Google Myths?

Stephen E Arnold, April 11, 2024
