Google: Struggles with Curation

April 21, 2022

Should Google outsource Play store content curation to Amazon’s Mechanical Turk or Fiverr?

Sadly, one cannot assume that because an app is available through Google Play it is safe. Engadget reports, “Google Pulls Apps that May Have Harvested Data from Millions of Android Devices.” Writer S. Dent reveals:

“Google has pulled dozens of apps used by millions of users after finding that they covertly harvested data, The Wall Street Journal has reported. Researchers found weather apps, highway radar apps, QR scanners, prayer apps and others containing code that could harvest a user’s precise location, email, phone numbers and more. It was made by Measurement Systems, a company that’s reportedly linked to a Virginia defense contractor that does cyber-intelligence and more for US national-security agencies. It has denied the allegations.”

Naturally. We find it interesting that, according to the report, the firm was after data mainly from the Middle East, Central and Eastern Europe and Asia. The write-up continues:

“The code was discovered by researchers Serge Egelman from UC Berkeley and the University of Calgary’s Joel Reardon, who disclosed their findings to federal regulators and Google. It can ‘without a doubt be described as malware,’ Egelman told the WSJ. Measurement Systems reportedly paid developers to add their software development kits (SDKs) to apps. The developers would not only be paid, but receive detailed information about their user base. The SDK was present on apps downloaded to at least 60 million mobile devices. One app developer said it was told that the code was collecting data on behalf of ISPs along with financial service and energy companies.”

So how did these apps slip through the vetting process? Maybe the app review methods are flawed, not applied rigorously, not applied consistently. Or perhaps they are simply a bit of PR hogwash? We don’t know but the question is intriguing. Google has removed the apps from the Play store but of course they still lurk on millions of devices. In its email to the Wall Street Journal, Measurement Systems not only insists its apps are innocent, but it also asserts it is “not aware” of any connection between it and US defense contractors.

And what about the quantumly supreme Google smart software?

Cynthia Murrell, April 21, 2022

Googley Fact-Checking Efforts

April 14, 2022

Perhaps feeling the pressure to do something about the spread of falsehoods online, “Google Rolls Out Fact-Checking Features to Help Spot Misinformation” on developing news stories, reports Silicon Republic. The company’s product manager Nidhi Hebbar highlighted several of these features in a recent blog post. One is the search platform’s new resource page that offers suggestions for evaluating information. Then there is a new label within Google Search that identifies stories frequently cited by real news outfits. We also learn about the company’s Fact Check Explorer, which answers user queries on various topics with fact checks from “reputable publishers.” We are told Google is also going out of its way to support fact-checkers. Writer Leigh McGowran explains:

“Google has also partnered with a number of fact-checking organisations globally to bolster efforts to deal with misinformation. This includes a collaboration with the International Fact Checking Network (IFCN) at the non-profit Poynter Institute. This partnership is designed to provide training and resources to fact checkers and industry experts around the world, and Google said the IFCN will create a new programme to help collaboration, support fact checkers against harassment and host training workshops. Google is also working with the collaborative network LatamChequea to train 500 new fact checkers in Argentina, Colombia, Mexico and Peru.”

The problem of misinformation online has only grown since it became a hot topic in the mid-teens. The write-up continues:

“Events such as the Covid-19 pandemic and the US Capitol riots in January 2021 flung online misinformation into the sphere of public debate, with many online platforms taking action on misleading or inaccurate info, whether posted deliberately or otherwise. Misinformation has come to the fore again with the Russian invasion of Ukraine, as people have reported seeing misleading, manipulated or false information about the conflict on social media platforms such as Facebook, Twitter, TikTok and Telegram.”

Will Google’s resources help stem the tide?

Cynthia Murrell, April 14, 2022

AI Helps Out Lawyers

April 11, 2022

Artificial intelligence algorithms have negatively affected as many industries as they have assisted. Law is one of the industries that has benefitted from AI, explains Medium in “How Artificial Intelligence Is Helping Solve The Needs Of Small Law Practitioners.” In the past, small law firms were limited in the number of cases they could handle. AI algorithms now allow small law practices to compete with the larger firms in all areas of law. How is this possible?

“The latest revolution in legal research technology ‘puts a lawyer’s skill and expertise in the driver’s seat…’ New artificial intelligence tools give lawyers instant access to vast amounts of information and analysis online, but also the ability to turn that into actionable insights. They can be reminded to check specific precedents and the latest rulings, or be directed to examine where an argument might be incomplete. That leaves the lawyers themselves to do what only they can: think, reason, develop creative arguments and negotiation strategies, provide personal service, and respond to a client’s changing needs.”

Lawyers used to rely on printed reference materials from databases and professional publications. They were limited by the number of hours in a day, the number of people on staff, and access to the newest and best resources. That changed when computers entered the game and analytical insights were delivered by automated technology. As technology has advanced, lawyers can cross reference multiple resources and improve legal decision making.

While lawyers are benefitting from the new AI, if they do not keep up they are quickly left behind. Lawyers must be aware of current events, how their digital tools change, and how to keep advancing the algorithms so they can continue to practice. That is not much different from the past, except it is moving at a faster rate.

Whitney Grace, April 11, 2022

Why Be Like ClearView AI? Google Fabs Data the Way TSMC Makes Chips

April 8, 2022

Machine learning requires data. Lots of data. Datasets can set AI trainers back millions of dollars, and even that does not guarantee a collection free of problems like bias and privacy issues. Researchers at MIT have developed another way, at least when it comes to image identification. The World Economic Forum reports, “These AI Tools Are Teaching Themselves to Improve How they Classify Images.” Of course, one must start somewhere, so a generative model is first trained on some actual data. From there, it generates synthetic data that, we’re told, is almost indistinguishable from the real thing. Writer Adam Zewe cites the paper’s lead author Ali Jahanian as he emphasizes:

“But generative models are even more useful because they learn how to transform the underlying data on which they are trained, he says. If the model is trained on images of cars, it can ‘imagine’ how a car would look in different situations — situations it did not see during training — and then output images that show the car in unique poses, colors, or sizes. Having multiple views of the same image is important for a technique called contrastive learning, where a machine-learning model is shown many unlabeled images to learn which pairs are similar or different. The researchers connected a pretrained generative model to a contrastive learning model in a way that allowed the two models to work together automatically. The contrastive learner could tell the generative model to produce different views of an object, and then learn to identify that object from multiple angles, Jahanian explains. ‘This was like connecting two building blocks. Because the generative model can give us different views of the same thing, it can help the contrastive method to learn better representations,’ he says.”

Ah, algorithmic teamwork. Another advantage of this method is the nearly infinite samples the model can generate, since more samples (usually) make for a better trained AI. Jahanian also notes that once a generative model has created a repository of synthetic data, that resource can be posted online for others to use. The team also hopes to use their technique to generate corner cases, which often cannot be learned from real data sets and are especially troublesome when it comes to potentially dangerous uses like self-driving cars. If this hope is realized, it could be a huge boon.

This all sounds great, but what if—just a minor if—the model is off base? And, once this tech moves out of the laboratory, how would we know? The researchers acknowledge a couple other limitations. For one, their generative models occasionally reveal source data, which negates the privacy advantage. Furthermore, any biases in the limited datasets used for the initial training will be amplified unless the model is “properly audited.” It seems like transparency, which somehow remains elusive in commercial AI applications, would be crucial. Perhaps the researchers have an idea how to solve that riddle.

Funding for the project was supplied, in part, by the MIT-IBM Watson AI Lab, the United States Air Force Research Laboratory, and the United States Air Force Artificial Intelligence Accelerator.

Cynthia Murrell, April 8, 2022

How about Those Commercial and US Government RFPs?

April 7, 2022

I am not familiar with an author whose name I will not put in my blog because of the Google-type systems’ stop word lists. The article is called “Your Competitor Wrote the RFP You Are Bidding On.” Some of the people who have worked on commercial bids are familiar with the process: Read the request for proposal, write the proposal, and win the proposal. Simple, eh? I have more familiarity with the swamp lands in Washington, DC. I have had an opportunity to observe how the sausage is made with regard to requests for information, requests for proposals, the proposals themselves, the decision mechanisms used to award a project, and the formal objections filed by bidders who did not win a contract. My observations are based on more than 50 years of work for government entities as well as some commercial projects for big outfits like the pre-Judge Green AT&T.

The write up states:

As a vendor, your job is to determine whether you want this. It’s costly to bid and often more costly to win. Spending absurd amounts of time across your org doing RFP submissions is rarely quantified from an ROI stand-point. If you’re in a type of business where RFP bids are involved from time to time, do your best to understand if it’s worth it. A typical RFP bid could take many hundreds of hours, from start to finish, especially if you progress past initial phases. Not only does this have easily quantifiable real costs, but the process also has runaway opportunity costs involving the product team, engineering, sales, legal and marketing.

I liked this observation:

The thing is, nobody really needs 80% of the [expletive deleted] that’s in an RFP, and they will never hold you to that. The implementation will take 3x longer than promised and your champions will no longer be with the company by the time you’re rolled-out anyway. By winning large RFP contracts, you will get buried, but not by the requirements you said Yes to. You’ll get buried trying to implement and retain this customer. Every week will be a new urgent requirement that was never covered in the RFP.

I want to point out that the wordage and “wouldn’t it be nice” aspects of an RFP are often included as a result of inputs from consultants to the firm or the government agency. If these consultants have special skills, these will often be inserted into the RFP for the purpose of blocking competitors. There are other reasons too.

I look forward to more posts from so [expletive deleted] agile.

Stephen E Arnold, April 7, 2022

Google: Who Makes the Tweaks? Smart Software or Humanoids?

April 7, 2022

I read “Google Tweaks Search and News Results to Direct People to Trusted Sources.” The main idea is that Google wants to do good. Instead of letting people read any old news, the Google “will offer information literacy tips and highlight widely cited source.” That was quick. Google News became available in 2002. Let’s see. My math is not too good, but that sure looks like more than a week ago.

How are the tweaks implemented? That’s a good question. The write up reports:

Since last June, the company has applied labels to results for “rapidly evolving topics,” which include things like breaking news and viral videos that are spreading quickly. It may suggest checking back later for more details as they become clearer. Starting in the US (in English) today, the labels will include some information literacy tips.

Right. Google and it. Are the changes implemented by Snorkelized software that learns on the fly what news is not Google quality? Or will actual Googlers peruse news and decide what’s okay and what needs to be designated l’ordure?

My bet is on one thing. Google’s many protestations that its algorithms do the heavy lifting are a useful way to put on a ghillie suit and disappear from the censorship, editing, and down checking of the inferior information.

If my assumption is incorrect, I can protest and look for my pen. I am 77 and prone to forgetfulness. Google has digital ghillies. Lucky outfit.

Stephen E Arnold, April 7, 2022

Let the Smart Software Do It!

April 6, 2022

Eventually we will produce so much data it will be impossible for mere humans to manage it; AI will simply have to take over soon. This sums up the position of new Dynatrace CEO Rick McConnell as characterized in Diginomica’s piece, “In Pursuit of General Intelligence—Dynatrace and the Death of the Dashboard.” Here’s a section heading that tickled our fancy: “The [bleeding] edge will make complexity more complex.” You don’t say? Writer Martin Banks describes McConnell’s perspective:

“Without a strong mixture of AI and operational management, the ability to generate any value out of the exploding growth of data will be difficult to maintain. Indeed, control may degrade enough to start reducing the value that can be created. For example, he sees potential growth in edge-related applications and consequent new growth in the data it will inevitably generate. This points to an underlying truth – that the ability for business users to move up the levels of abstraction, to stop seeing the data and instead see the questions and possible answers data represents – read words and sentences rather than see characters from an alphabet – will become essential for fast and effective business management. It will also play an increasingly important role in the management and development of the applications that will get used, especially as they grow to incorporate the edge into what will have to be a holistic soup-to-nuts business management solution. … The goal here is to completely automate out the need for manual intervention and interaction in tasks such as operations remediation.”

The write-up shares some notes about how Dynatrace approaches such automation. Banks also supplies example situations in which only the immediacy of AI will do, from a shopping cart that drops a user’s items to downtime in a large financial system. We see the logic behind these assertions, but there is one complication the article does not address: the already thorny and opaque problem of biased machine learning systems. It seems to us that without human oversight, that issue will only get worse.

Cynthia Murrell, April 6, 2022

System Glitches: A Glimpse of Our Future?

April 4, 2022

I read “Nearly All Businesses Hit by IT Downtime Last Year – Here’s What’s to Blame.” The write up reports:

More than three-quarters (75%) of businesses experienced downtime in 2021, up 25% compared to the previous year, new research has claimed. Cybersecurity firm Acronis polled more than 6,200 IT users and IT managers from small businesses and enterprises in 22 countries, finding that downtime stemmed from multiple sources, with system crashes (52%) being the most prevalent cause. Human error (42%) was also a major issue, followed by cyber attacks (36%) and insider attacks (20%).

Interesting. A cyber security company reports these data. The cyber security industry sector should know. Many of the smart systems have demonstrated that those systems are somewhat slow when it comes to safeguarding licensees.

What’s the cause of the issue?

There are “crashes.” But what’s a crash. Human error. Humans make mistakes and most of the software systems with which I am familiar are dumb: Blackmagic ATEM software which “forgets” that users drag and drop. Users don’t intuitively know to put an image one place and then put that image another so that the original image is summarily replaced. Windows Defender lights up when we test software from an outfit named Chris. Excel happily exports to PowerPoint but loses the format of the table when it is pasted. There are USB keys and Secure Digital cards which just stop working. Go figure. There are enterprise search systems which cannot display a document saved by a colleague before lunch. Where is it? Yeah, good question. In the indexing queue maybe? Oh, well, perhaps tomorrow the colleague will get the requested feedback?

My takeaway from the write up is that the wild and crazy, helter skelter approach to software and some hardware has created weaknesses, flaws, and dependencies no one knows about. When something goes south, the Easter egg hunt begins. A dead Android device elicits button pushing and the hope that the gizmo shows some signs of life. Mostly not in my experience.

Let’s assume the research is correct. The increase noted in the write up means that software and systems will continue to degrade. What’s the fix? Many things, from making a government bureaucracy more effective to having an airline depart on time, seem headed on a downward path.

My take is that we are getting a glimpse of the future. Reality is very different from the perfectly functioning demo and the slick assertions in a PowerPoint deck.

Stephen E Arnold, April 4, 2022

Facebook: Fooled by Ranking?

April 1, 2022

I sure hope the information in “A Facebook Bug Led to Increased Views of Harmful Content Over Six Months.” The subtitle is interesting too. “The social network touts downranking as a way to thwart problematic content, but what happens when that system breaks?”

The write up explains:

Instead of suppressing posts from repeat misinformation offenders that were reviewed by the company’s network of outside fact-checkers, the News Feed was instead giving the posts distribution, spiking views by as much as 30 percent globally.

Now let’s think about time. The article reports:

In 2018, CEO Mark Zuckerberg explained that downranking fights the impulse people have to inherently engage with “more sensationalist and provocative” content. “Our research suggests that no matter where we draw the lines for what is allowed, as a piece of content gets close to that line, people will engage with it more on average — even when they tell us afterwards they don’t like the content,” he wrote in a Facebook post at the time.

Why did this happen?

The answer may be that assumptions about the functionality of online systems must be verified by those who know the mechanisms used. Then the functions must be checked on a periodic basis. The practice of slipstreaming changes may introduce malfunctions, which no one catches because no one is rewarded for slowing down the operation.

Based on my work for assorted reports and monographs, there are several other causes of a disconnect between what a high technology outfit says and what its systems actually do. Let me highlight what I call the Big Three:

  1. Explaining something that might be is different from delivering the reality of the system. Management wants to believe that code works, and not too many people want to be the person who says, “Yeah, this is what the system is actually doing?” Institutional momentum can crush certain types of behavior.
  2. The dependencies within complex software systems are not understood, particularly by recently hired outside experts, new hires, or — heaven help us — interns who are told to do X without meaningful checks, reviews, and fixes.
  3. An organization’s implicit policies keep feedback contained so the revenue continues to flow. Who gets promoted for screwing up ad sales? As a result, news releases, public statements, and sworn testimony operate in an adjacent but separate conceptual space from the mechanisms that generate live systems.

It has been my experience that when major problems are pointed out, reactions range from “What do you mean?” to a chuckled comment, “That’s just the way software works.”

What intrigues me is the larger question, “Is the revelation that Facebook smart software does not work as the company believed it did the baseline for the company’s systems?” On the other hand, the information could be an ill-considered April Fool’s joke.

My hunch is that the article is not humor. Much of Facebook’s and Silicon Valley’s behavior does not tickle my funny bone. My prediction is that some US regulators and possibly Margrethe Vestager will take this information under advisement.

Stephen E Arnold, April 1, 2022

The Artificial Intelligence Balloon: Leaking a Bit, Eh?

March 30, 2022

I noted “Enterprise AI Needs to Deliver Real Value As Adoption Slows.” I am not able to define “real value,” but let’s not quibble. The write up reports that a survey from a publisher / conference organizer / Silicon Valley luminary has identified what might be a leaking hyperbole balloon.

I noted:

The latest annual AI Adoption in the Enterprise survey from O’Reilly finds that over the last two years the number of organizations with AI applications in production has remained steady at 26 percent. However, many enterprises still lack AI governance. Among respondents with AI products in production, the number of those whose organizations have a governance plan in place to oversee how projects are created, measured, and observed (49 percent) is roughly the same as those that don’t (51 percent).

But AI is the next big thing. Innovation will soar. Employees will be wallowing in extra time to do “human things.” Money will flow.

These statements are indeed true for Amazon, Facebook, Google, and a handful of other outfits. But for Bob’s Trucking Company or a small accounting firm in the Rust Belt, well, not so much it seems.

The reason may be nestled in this comment in the article:

“For years, AI has been the focus of the technology world,” says Mike Loukides, vice president of content strategy at O’Reilly and the report’s author. “Now that the hype has died down, it’s time for AI to prove that it can deliver real value, whether that’s cost savings, increased productivity for businesses, or building applications that can generate real value to human lives. This will no doubt require practitioners to develop better ways to collaborate between AI systems and humans, and more sophisticated methods for training AI models that can get around the biases and stereotypes that plague human decision-making.”

What’s the fix? Remediation of algorithmic biases, a shift to NFT innovation, or online gambling?

Those are questions for the little people. The largely unregulated giants are happy to do the smart software thing. Big value is well understood by these firms’ management teams.

Stephen E Arnold, March 30, 2022
