Why Be Like ClearView AI? Google Fabs Data the Way TSMC Makes Chips
April 8, 2022
Machine learning requires data. Lots of data. Datasets can set AI trainers back millions of dollars, and even that does not guarantee a collection free of problems like bias and privacy issues. Researchers at MIT have developed another way, at least when it comes to image identification. The World Economic Forum reports, “These AI Tools Are Teaching Themselves to Improve How they Classify Images.” Of course, one must start somewhere, so a generative model is first trained on some actual data. From there, it generates synthetic data that, we’re told, is almost indistinguishable from the real thing. Writer Adam Zewe cites the paper’s lead author Ali Jahanian as he emphasizes:
“But generative models are even more useful because they learn how to transform the underlying data on which they are trained, he says. If the model is trained on images of cars, it can ‘imagine’ how a car would look in different situations — situations it did not see during training — and then output images that show the car in unique poses, colors, or sizes. Having multiple views of the same image is important for a technique called contrastive learning, where a machine-learning model is shown many unlabeled images to learn which pairs are similar or different. The researchers connected a pretrained generative model to a contrastive learning model in a way that allowed the two models to work together automatically. The contrastive learner could tell the generative model to produce different views of an object, and then learn to identify that object from multiple angles, Jahanian explains. ‘This was like connecting two building blocks. Because the generative model can give us different views of the same thing, it can help the contrastive method to learn better representations,’ he says.”
Ah, algorithmic teamwork. Another advantage of this method is the nearly infinite samples the model can generate, since more samples (usually) make for a better trained AI. Jahanian also notes once a generative model has created a repository of synthetic data, that resource can be posted online for others to use. The team also hopes to use their technique to generate corner cases, which often cannot be learned from real data sets and are especially troublesome when it comes to potentially dangerous uses like self-driving cars. If this hope is realized, it could be a huge boon.
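For readers who want to see the moving parts, here is a minimal sketch of the contrastive idea in the quoted passage. It is not the MIT team's code. The `toy_generator` function is a hypothetical stand-in for a pretrained generative model (it just adds noise to a latent vector), the names and parameters are my assumptions, and the loss is the widely used InfoNCE form, which the article does not name. Nothing is trained here; the sketch only computes the loss that a contrastive learner would try to minimize.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """Average InfoNCE loss: anchors[i] should match positives[i],
    and every other positives[j] is treated as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # log-sum-exp with the max subtracted for numerical stability
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

def toy_generator(latent, rng, jitter=0.05):
    """Hypothetical stand-in for a generative model: a 'view' of an
    object is its latent vector plus a little random noise."""
    return [x + rng.gauss(0.0, jitter) for x in latent]

rng = random.Random(0)
# Four imaginary objects, each described by an 8-number latent vector.
latents = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(4)]
# Two independently generated views of every object.
views_a = [toy_generator(z, rng) for z in latents]
views_b = [toy_generator(z, rng) for z in latents]

# Correctly paired views should score a lower loss than mismatched ones.
matched_loss = info_nce(views_a, views_b)
shuffled_loss = info_nce(views_a, views_b[1:] + views_b[:1])
print(matched_loss < shuffled_loss)
```

The point of the comparison at the end: when two views really do come from the same object, the loss is small; pair each view with the wrong object and the loss jumps. A real system would use that signal to train an image encoder.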
This all sounds great, but what if—just a minor if—the model is off base? And, once this tech moves out of the laboratory, how would we know? The researchers acknowledge a couple of other limitations. For one, their generative models occasionally reveal source data, which negates the privacy advantage. Furthermore, any biases in the limited datasets used for the initial training will be amplified unless the model is “properly audited.” It seems like transparency, which somehow remains elusive in commercial AI applications, would be crucial. Perhaps the researchers have an idea how to solve that riddle.
Funding for the project was supplied, in part, by the MIT-IBM Watson AI Lab, the United States Air Force Research Laboratory, and the United States Air Force Artificial Intelligence Accelerator.
Cynthia Murrell, April 8, 2022
If True, This Google Story Is Like a Stuck 45 RPM Disc
April 8, 2022
I don’t know if the information in “DeepMind Accused of Mishandling Sexual Misconduct Allegations” is spot on. The source is supposed to be one of those unimpeachable bastions of high brow journalism. (I won’t hold the Endeca search implementation up as evidence of making interesting decisions.) You will have to pay to read the source article unless you have access at the local news stand to a fungible copy of the orange thing.
The main idea in the write up is like the old hit Rag Mop stuck in a groove. You know, Rag Mop, Rag Mop, R A G G M O P P, Rag Mop? An ear worm with piranha teeth. Not a Candiru, but nasty nevertheless.
I noted this assertion:
A former DeepMind employee has accused the artificial intelligence group’s leadership of mishandling multiple allegations of sexual misconduct and harassment, raising concerns over how grievances are dealt with at the Google-acquired company.
Juicy details are not included. The approach parallels the lack of color related to the attempted suicide by a Xoogler. This particular female hooked up with Google’s Icarus, burned a family, and suicidally unlatched her safety belt. Gravity, not the Thomas Pynchon type of rainbow, presented itself.
I spotted some hint of Google’s management tactics; to wit:
Julia [this is a fake name to protect the individual making the assertion of wonky behavior] has argued that there are major flaws in how grievances such as hers are handled at DeepMind. Alleged failures include extended delays in workplace investigations and insufficient safeguarding of sexual assault victims.
Are these characteristics of a Silicon Valley type company channeling the decision making of adolescent high school science club members?
The orange newspaper slipped in some thought provoking comments; for instance:
She was also emailed a six-page confessional document by the researcher, written in the third person, on August 18 2019. The document detailed suicidal tendencies, allusions to raping unconscious women, and sex addiction indicated by reference to a string of affairs with sex workers during work hours, and with colleagues on and off DeepMind premises. Another document sent to her on September 19 2019 included graphic and degrading sexual depictions of her.
I like the use of email by an alleged Google DeepMind individual. I wonder if this particular wizard understands the concept of legal discovery?
The write up includes some details about Google DeepMind’s administrative procedures and the alacrity with which some issues are addressed. If I understand the source article, we’re not talking millisecond response time. Weeks seems to be the basic unit of time.
One may want to keep in mind that one of DeepMind’s founders moved on in the time period about which the Julia persona encountered some science club analyses of outlier work behavior.
Same repetitive phrases. Here’s an example my tin ear caught:
DeepMind said it was unable to comment on that latter case but added: “Any incident of sexual assault or harassment is abhorrent and it’s unacceptable that anyone at DeepMind or in the world should experience it.”
R A G G M O P P, Rag Mop. Do doo doo, dah dee ah dah Rag Mop.
Stephen E Arnold, April 8, 2022
Google YouTube Search Working the Way Alphabet Wants?
April 8, 2022
The online news service Mashable may be in gear for April Fools’ Day a bit late. The story “YouTube Added 1,500 Free Movies, But Good Luck Finding Them” makes clear that Google YouTube search doesn’t work.
The write up reports:
YouTube has also made browsing its free titles much more annoying than it needed to be. The platform won’t just show you all its free titles and let you scroll through them to find your next binge watch. It certainly won’t let you filter them, so you can’t narrow your search to all of YouTube’s free action movies, or free romantic comedies. Rather, YouTube’s algorithm selects a few hundred ad-supported titles to show you in its “free to watch movies” section, hiding the rest.
The Mashable take is definitely not Googley. A new age, Silicon Valley like information service should be able to make sense of Google YouTube’s brilliant approach. A casual user will have access to some content selected by smart software. The desire for a way to browse a comprehensive result set is irrelevant. The Googley person will recognize:
- Paying for Google’s TV service delivers a better experience. Presumably that experience includes a listing of available content. On second thought, I am kidding myself. Smart software does not understand exceptions unless the system was trained to implement fine grained user classification.
- There are Google Dorks available to make quick work of narrowing Google result sets. Not familiar with Google Dorks? Well, certain individuals in Russia are and possibly a bright 12 year old near your home has this expertise.
- The results you see represent “all the world’s information.” The fact that you have knowledge which indicates a partial result set makes one point and only one point: You take what you get.
- Oh, those contractors and interns are enhancing the search experience again whilst doing no evil.
I hope this explains why Mashable does not understand the brilliant method Google uses to remain in close contact with its humanoid users.
Stephen E Arnold, April 8, 2022
How about Those Commercial and US Government RFPs?
April 7, 2022
I am not familiar with an author whose name I will not put in my blog because of the Google-type systems’ stop word lists. The article is called “Your Competitor Wrote the RFP You Are Bidding On.” Some of the people who have worked on commercial bids are familiar with the process: Read the request for proposal, write the proposal, and win the proposal. Simple, eh? I have more familiarity with the swamp lands in Washington, DC. I have had an opportunity to observe how the sausage is made with regard to requests for information, requests for proposals, the proposals themselves, the decision mechanisms used to award a project, and the formal objections filed by bidders who did not win a contract. My observations are based on more than 50 years of work for government entities as well as some commercial projects for big outfits like the pre-Judge Greene AT&T.
The write up states:
As a vendor, your job is to determine whether you want this. It’s costly to bid and often more costly to win. Spending absurd amounts of time across your org doing RFP submissions is rarely quantified from an ROI stand-point. If you’re in a type of business where RFP bids are involved from time to time, do your best to understand if it’s worth it. A typical RFP bid could take many hundreds of hours, from start to finish, especially if you progress past initial phases. Not only does this have easily quantifiable real costs, but the process also has runaway opportunity costs involving the product team, engineering, sales, legal and marketing.
I liked this observation:
The thing is, nobody really needs 80% of the [expletive deleted] that’s in an RFP, and they will never hold you to that. The implementation will take 3x longer than promised and your champions will no longer be with the company by the time you’re rolled-out anyway. By winning large RFP contracts, you will get buried, but not by the requirements you said Yes to. You’ll get buried trying to implement and retain this customer. Every week will be a new urgent requirement that was never covered in the RFP.
I want to point out that the wordage and “wouldn’t it be nice” aspects of an RFP are often included as a result of inputs from consultants to the firm or the government agency. If these consultants have special skills, these will often be inserted into the RFP for the purpose of blocking competitors. There are other reasons too.
I look forward to more posts from so [expletive deleted] agile.
Stephen E Arnold, April 7, 2022
Consultants and Conflicts of Interest: Fast Action
April 7, 2022
My recollection is that a Northwestern graduate named Edwin Booz cooked up big chunks of modern consulting. Was this a year ago? Maybe three years? Nope. Mr. Booz helped Sears become a high-value resource in 1914. Eddie had a master’s in psychology, not business. Think about that. What modern consulting has become began in the climate wonderland of Chicago. You remember. The city with big shoulders.
Flash forward to 2022. “Citing ProPublica’s Reporting on McKinsey, Senators Propose Bill Addressing Contractors’ Conflicts of Interest” stated, after patting itself vigorously on its / thems back:
Yet the consultancy [McKinsey], which is known for maintaining a veil of secrecy around its client list, never disclosed to the FDA that other McKinsey consulting teams were simultaneously working for some of the country’s largest pharmaceutical companies. McKinsey’s commercial clients at the time included companies, such as Purdue Pharma and Johnson & Johnson, that were responsible for manufacturing and distributing the opioids that decimated communities nationwide. In some instances, McKinsey consultants working for drug makers even helped their clients ward off more robust FDA oversight.
McKinsey is one of the heirs to Eddie’s insight that clueless outfits would pay big money for reports written in summary format with lots of bullet points, horizons, and snappy aphorisms. BCG, another blue chip consulting firm, must be credited for taking General Eisenhower’s quadrant diagram and pioneering the era of easy to understand graphics and simple words like “dog” and “star” and “cash cow.”
From pop psychology to snazzy charts, the blue chip consulting business has been roaring along for more than a century. Now the opioid thing combined with the blue chip consulting firm “we’re special” thing may result in meaningful regulation.
Note I wrote “may.” Does anyone believe that government agencies can regulate the firms upon which the very same government agencies depend for advice, guidance, and a reason to have meetings?
Get real.
Here’s the wrap up to the article:
Jessica Tillipman, an assistant dean and government procurement law expert at George Washington University Law School, called the legislation a welcome development. As government contractors have merged in recent decades, the industry has grown more concentrated, increasing the risk of conflicts of interest, and the federal contracting industry, Tillipman said, could use clearer guidance on disclosure requirements tied to the private-sector work of government contractors. “Any attempt to address these growing problems is a good thing,” Tillipman said, “and important to ensuring that we reduce these risks in the government procurement system.”
What? Fix procurement? Let’s see. I estimate that another century will pass before draft regulations emerge from joint meetings between an executive branch agency and Congress. That time estimate may be too optimistic.
Think of the consultants needed to work on the issues related to regulating consultants. Think of the meetings. Think of the revolving door opportunities. Think of the inputs from law firms and accounting firms which must be obtained.
Think of the meetings. Psychology, not business acumen, fuels consulting as it did from the git go. What did that unusual poet say in “Chicago”? This sticks in my mind:
And they tell me you are crooked and I answer: yes…
Proud of it too.
Stephen E Arnold, April 7, 2022
Google: Who Makes the Tweaks? Smart Software or Humanoids?
April 7, 2022
I read “Google Tweaks Search and News Results to Direct People to Trusted Sources.” The main idea is that Google wants to do good. Instead of letting people read any old news, the Google “will offer information literacy tips and highlight widely cited source.” That was quick. Google News became available in 2002. Let’s see. My math is not too good, but that sure looks like more than a week ago.
How are the tweaks implemented? That’s a good question. The write up reports:
Since last June, the company has applied labels to results for “rapidly evolving topics,” which include things like breaking news and viral videos that are spreading quickly. It may suggest checking back later for more details as they become clearer. Starting in the US (in English) today, the labels will include some information literacy tips.
Right. Google and it. Are the changes implemented by Snorkelized software that learns on the fly what news is not Google quality? Or, will actual Googlers peruse news and decide what’s okay and what needs to be designated l’ordure?
My bet is on one thing. Google’s many protestations that its algorithms do the heavy lifting are a useful way to put on a ghillie suit and disappear from the censorship, editing, and down checking of the inferior information.
If my assumption is incorrect, I can protest and look for my pen. I am 77 and prone to forgetfulness. Google has digital ghillies. Lucky outfit.
Stephen E Arnold, April 7, 2022
Google: Pesky Memories of the Past
April 7, 2022
We suppose some people will never understand or accept Googley ways of working. Namely European regulators. Similarly, Google may never accept the EU has any authority over its business practices. TechCrunch reports, “Google Sued in Europe for $2.4BN in Damages Over Shopping Antitrust Case.” Writer Natasha Lomas reveals:
“Google is being sued in Europe on competition grounds by price comparison service PriceRunner which is seeking at least €2.1 billion (~$2.4 billion) in damages. The lawsuit accuses Google of continuing to breach a 2017 European Commission antitrust enforcement order against Google Shopping. As well as fining Google what was — at the time — a record-breaking antitrust penalty (€2.42 billion), the EU’s competition division ordered the search giant to cease illegal behaviors, after finding Google giving prominent placement to its own shopping comparison service while simultaneously demoting rivals in organic search results.”
But cease those behaviors it did not, though it made a gesture or two in that direction. Meanwhile, according to Sky News, Google tried to sidestep the ruling with fake comparison sites packed with ads for their clients’ products running alongside the Google Shopping box. Very creative. The platform also continues to run product search ads alongside general search results. Apparently, PriceRunner decided five years of flouting the enforcement order was enough. The write-up continues:
“PriceRunner’s lawsuit alleges Google has continued to violate competition law in relation to product search, as well as seeking compensation for historical infringements that have allowed Google to reap revenue at rivals’ expense. To back up its allegations, the search comparison company points to a study conducted by accountancy company, Grant Thornton, which it says found prices for offers shown in Google’s own comparison shopping service can be 16-37% higher for popular categories like clothes and shoes, and between 12-14% higher for other types of products vs rival price comparison services.”
Many of our readers will not be surprised to learn Google search continues to dominate in Europe. It maintains a greater than 90% market share in most of the European Economic Area and in the U.K. Nevertheless, PriceRunner is prepared to fight for many years, if necessary, with help from litigation funder Nivalion. We shall see whether the suit gets anywhere, but either way we suspect Google will continue with business as usual.
Cynthia Murrell, April 7, 2022
Do The Google AI Claims Grow Like a Pinocchio Body Part?
April 6, 2022
“Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance” is a variant of the Google quantum supremacy announcement. Bigger, better, faster, more powerful, able to leap problems with a single tap on the Enter key. The graphic in the Google AI Blog post does grow. Didn’t Carlo Collodi cook up a dummy? The chief feature, other than teaching some how not to lie, was that the marketing was handled by Walt Disney. Like IBM’s humorous announcement that a mainframe could defeat a quantum computer’s ability to crack encryption, a claim pointed at something not invented yet is interesting. Are those marketing people at Google and IBM mentally enervated by swigs of Five Hour Energy?
Like a certain fictional character’s nose and the anigif in the blog post, the claims continue to grow.
I looked at this graphic closely. I noted a few omissions; for example:
- A mechanism to report the incidence of outliers or exceptions between the baseline system and the state of the system after iterating over a period of a month
- Any reference to bias identification and amelioration. This is Dr. Timnit Gebru territory, and this landscape is one that Google appears to ignore, at least in public. In private negotiations and legal chambers, maybe the Google addresses the baked in biases? Maybe not?
- Any reference to the handling of images, content, videos that are related to sexual harassment; for instance, allegations about personnel issues at Google and DeepMind themselves?
- Data about the accuracy of the outputs? Are we in 95 percentile territory or close enough for horse shoes and ad matching?
The write up uses a number of buzzwords, some Google jargon, and quite a few links to other Google documents and experts at Microsoft and NVidia. I am convinced. I believe everything I read on the Internet and Google’s blogs.
Three observations:
First, what’s at stake in my opinion is dominance if possible of off the shelf smart methods. Consolidation is the name of the game, and Google wants to beat out Amazon, Microsoft, assorted China backed outfits, and any other challengers who want to go a different direction. Not every company wants to SAIL down a certain flow of methods.
Second, Google is — bless its single revenue stream — embracing Madison Avenue techniques to convince people that it is the Big Dog in smart methods: New, improved, money back guarantee, and free trial sell toothpaste. Why not Google AI?
Third, Google — despite the alleged monopoly position — is struggling with the what’s next? Legal hassles, management practices, competition from nuisance companies like Amazon, competition for technical talent, hard to control costs — These are real issues at the Alphabet Google YouTube construct.
At the end of a Silicon Valley day, some in Mountain View see Google as a one trick pony. It seems far fetched, but it looks as if Steve Ballmer may have been spot on with that one-trick pony metaphor. And there is Pinocchio’s nose.
Stephen E Arnold, April 6, 2022
PR or Reality? Only the Cyber Firms Know the Answer
April 6, 2022
Cyber crimes are on the rise. Businesses and individuals are the targets of malware bad actors. IT Online details how cyber security firms handle attacks: “What Happens Inside A Cybercrime War-Room?” As a major business player in Africa, South Africa fends off many types of cyber attacks: coin miner modules, viruses downloaded with bad software, self-spreading crypto mining malware, and ransomware.
The good news about catching cyber criminals is that white hat experts know how their counterparts work and can use technology like automation and machine learning against them. Carlo Bolzonello is the country manager for Trellix’s South Africa branch. He said that cyber crime organizations are run like regular businesses, except their job is to locate and target vulnerable IT environments. Once the bad business has the victim in its crosshairs, the bad actors exploit it for money or other assets for exploration or resale.
Bolzonello continued to explain that while it is important to understand how the enemy works, it is key that organizations have a security operations center armed with various tools that can pull information about possible threats into one dashboard:
“That single dashboard can show where a threat has emerged, and where it has spread to, so that action can be taken, immediately. It can reveal whether ransomware has gained access via a “recruitment” email sent to executives, whether a “living off the land” binary has taken hold via a download of an illicit copy of a movie, or whether a coin miner module has inserted itself via pirated software. Having this information to hand helps the SOC design and implement a quick and effective response, to stop the attack spreading further, and to prevent it costing money for people and businesses.”
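To make the “single dashboard” idea concrete, here is a minimal sketch of the kind of roll-up the quote describes: alerts from several tools are merged so each threat shows where it first appeared and where it has spread. This is an illustration only, not Trellix’s product or API. The alert fields (`time`, `source`, `threat`, `host`), the function name, and the sample data are all assumptions made for the example.

```python
def build_dashboard(alerts):
    """Group alerts by threat so each entry shows where the threat was
    first seen and every host it has reached since.

    Each alert is a dict with 'time', 'source', 'threat', and 'host' keys.
    """
    view = {}
    # Process alerts in time order so the first host seen is the entry point.
    for alert in sorted(alerts, key=lambda a: a["time"]):
        entry = view.setdefault(
            alert["threat"],
            {"first_seen_on": alert["host"], "hosts": [], "sources": set()},
        )
        if alert["host"] not in entry["hosts"]:
            entry["hosts"].append(alert["host"])
        entry["sources"].add(alert["source"])
    return view

# Hypothetical alerts from three different tools; 'time' is minutes elapsed.
alerts = [
    {"time": 30, "source": "endpoint", "threat": "coin_miner", "host": "laptop-7"},
    {"time": 10, "source": "email_gateway", "threat": "ransomware", "host": "exec-mail"},
    {"time": 45, "source": "network", "threat": "ransomware", "host": "file-server"},
    {"time": 50, "source": "endpoint", "threat": "ransomware", "host": "file-server"},
]

dashboard = build_dashboard(alerts)
for threat, info in dashboard.items():
    print(threat, "started on", info["first_seen_on"], "and reached", info["hosts"])
```

The design choice worth noting is the sort by time before grouping. Without it, the “where did this start” answer depends on which tool happened to report first, not on what actually happened first.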
Having a centralized dashboard allows organizations to respond more quickly and keep their enemies in check. Black hat cyber organizations actually might have a reverse of a security operations center that allows them to locate vulnerabilities. PR or reality? A bit of both perhaps?
Whitney Grace, April 6, 2022
Twitter and a Loophole? Unfathomable
April 6, 2022
Twitter knows Russia is pushing false narratives about the war in Ukraine. That is why it now refuses to amplify tweets from Russian state-affiliated media outlets like RT or Sputnik. However, the platform is not doing enough to restrain the other hundred-some Russian government accounts, according to the BBC News piece, “How Kremlin Accounts Manipulate Twitter.” Reporter James Clayton cites QUT Digital Media Research Centre’s Tim Graham as he writes:
“Intrigued by this spider web of Russian government accounts, Mr Graham – who specializes in analyzing co-ordinated activity on social media – decided to investigate further. He analyzed 75 Russian government Twitter profiles which, in total, have more than 7 million followers. The accounts have received 30 million likes, been retweeted 36 million times and been replied to 4 million times. He looked at how many times each Twitter account retweeted one of the other 74 profiles within an hour. He discovered that the Kremlin’s network of Twitter accounts work together to retweet and drive up traffic. This practice is sometimes called ‘astroturfing’ – when the owner of several accounts uses the profiles they control to retweet content and amplify reach. ‘It’s a coordinated retweet network,’ Mr Graham says. ‘If these accounts weren’t retweeting stuff at the same time, the network would just be a bunch of disconnected dots. … They are using this as an engine to drive their preferred narrative onto Twitter, and they’re getting away with it,’ he says. Coordinated activity, using multiple accounts, is against Twitter’s rules.”
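The counting step in the quote is simple enough to sketch. What follows is one reading of the method, not Mr Graham’s actual code: for a fixed set of tracked accounts, count how often one account retweets another within an hour of the original post. The record layout, the function name, and the sample accounts are assumptions for illustration, and no real Twitter API call is involved.

```python
from collections import Counter
from datetime import datetime, timedelta

def fast_retweet_edges(retweets, accounts, window=timedelta(hours=1)):
    """Count retweets between tracked accounts that happened within
    `window` of the original tweet.

    Each record is (retweeter, original_author, posted_at, retweeted_at).
    Returns a Counter keyed by (retweeter, original_author).
    """
    edges = Counter()
    for retweeter, author, posted_at, retweeted_at in retweets:
        if retweeter == author:
            continue  # self-retweets say nothing about coordination
        if retweeter not in accounts or author not in accounts:
            continue  # only links inside the tracked network count
        if timedelta(0) <= retweeted_at - posted_at <= window:
            edges[(retweeter, author)] += 1
    return edges

t0 = datetime(2022, 3, 1, 12, 0)
sample = [
    ("acct_a", "acct_b", t0, t0 + timedelta(minutes=5)),
    ("acct_a", "acct_b", t0, t0 + timedelta(minutes=50)),
    ("acct_c", "acct_b", t0, t0 + timedelta(hours=3)),    # too slow to count
    ("acct_x", "acct_b", t0, t0 + timedelta(minutes=1)),  # untracked account
]

edges = fast_retweet_edges(sample, {"acct_a", "acct_b", "acct_c"})
for (retweeter, author), count in edges.items():
    print(retweeter, "retweeted", author, count, "times within the hour")
```

Each counted pair is a line between two dots. Accounts that retweet one another quickly and often end up densely connected, which is the “coordinated retweet network” in the quote; drop the fast retweets and the dots fall apart.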
Twitter is openly more lenient on tweets by government officials under what it calls “public interest exceptions.” Even so, we are told there are supposed to be no exceptions on coordinated behavior. The BBC received no response from Twitter officials when it asked them about Graham’s findings. Clayton generously notes it can be difficult to prove content is false amid the chaos of war, and the platform has been removing claims as they are proven false. He also notes Facebook and other social media platforms have a similar Russia problem. The article allows Twitter may eventually ban Kremlin accounts entirely, as it banned Donald Trump in January 2021. Perhaps.
Cynthia Murrell, April 6, 2022