Free Resource on AI for Physical Simulations

September 27, 2021

The academics at the Thuerey Group have made a useful book on artificial intelligence operations and smart software applications available online. The Physics-Based Deep Learning Book is a comprehensive yet practical introduction to machine learning for physical simulations. Included are code examples presented via Jupyter notebooks. The book’s introduction includes this passage:

“People who are unfamiliar with DL methods often associate neural networks with black boxes, and see the training processes as something that is beyond the grasp of human understanding. However, these viewpoints typically stem from relying on hearsay and not dealing with the topic enough. Rather, the situation is a very common one in science: we are facing a new class of methods, and ‘all the gritty details’ are not yet fully worked out. However, this is pretty common for scientific advances. … Thus, it is important to be aware of the fact that – in a way – there is nothing magical or otherworldly to deep learning methods. They’re simply another set of numerical tools. That being said, they’re clearly fairly new, and right now definitely the most powerful set of tools we have for non-linear problems. Just because all the details aren’t fully worked out and nicely written up, that shouldn’t stop us from including these powerful methods in our numerical toolbox.”

This virtual tome would be a good place to start doing just that. Interested readers may want to begin studying it right away or bookmark it for later. Also see the Thuerey Group’s other publications for more information on numerical methods for deep-learning physics simulations.

Cynthia Murrell, September 27, 2021

Hard Working Coders Love Code That Writes Itself

September 14, 2021

Programmers are excited about AI software that writes new code. The BBC investigates the software in “Why Coders Love the AI That Could Put Them Out of a Job.” GitHub revealed Copilot in June 2021. As users type code, Copilot suggests how to finish it. Copilot is very intuitive, and its suggestions are on par with what coders want.

Copilot has made waves in the coding community:

“It is based on an artificial intelligence called GPT-3, released last summer by OpenAI, a San Francisco-based AI lab, co-founded by Elon Musk. This GPT (which stands for generative pre-training) engine does a “very simple but very large thing – predicting the next letter in a text,” explains Grzegorz Jakacki, Warsaw-based founder of Codility, which makes a popular hiring test.

OpenAI trained the AI on texts already available online such as books, Wikipedia and hundreds of thousands of web pages, a diet that was “somewhat curated but in all possible human languages,” he says. And “spookily, it wasn’t taught the rules of any particular language,” adds Mr Jakacki. The result was plausible passages of text.”
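
The “predicting the next letter” idea can be made concrete with a toy sketch. Real GPT models use large neural networks trained on enormous corpora, but a simple bigram counter (an assumption for illustration, not how GPT works internally) shows the shape of the task:

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each character, which characters follow it in the text."""
    following = defaultdict(Counter)
    for current, nxt in zip(text, text[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, char):
    """Return the most frequent successor of `char` seen in training, or None."""
    if char not in following:
        return None
    return following[char].most_common(1)[0][0]

# Tiny training corpus; a real model would see billions of characters.
model = train_bigrams("the theory of the thing")
# In this sample, 't' is always followed by 'h', so that is the prediction.
```

Scaled up from characters to code tokens and from counts to a neural network, this is the mechanism Copilot builds on.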

Despite its impressive suggestions, Copilot, like any new AI, makes mistakes, and anything it proposes needs to be reviewed by human programmers. Instead of worrying about losing their jobs, coders are happy because Copilot helps them: it edits their code and provides instantaneous feedback as they write.

One problem with Copilot is that its auto-generated code may duplicate code someone has already written. That raises questions about how much of its output is original, how much comes from its training data, and how such copying could be detected. At the moment, Copilot writes only short code passages, not full software. AI is a long way from surpassing human intelligence, but it can imitate basic behaviors.

Whitney Grace, September 14, 2021

When AI Goes Off the Rails: Who Gets Harmed?

September 13, 2021

One of the worst things about modern job hunting is the application process. Hiring systems require potential applicants to upload their resume, then retype the same information into specified fields. It is a harrowing process that would annoy anyone. Even worse, most resumes are rejected thanks to resume-scanning software. The Verge details how bad automation harms job seekers in the story, “Automated Hiring Software Is Mistakenly Rejecting Millions of Viable Job Candidates.”

Automated resume-scanning software rejects viable candidates. These mistaken rejections have created a pool of qualified workers who are locked out of the job market. Seventy-five percent of US employers use resume software, and it is one of the biggest factors harming job applicants. The many problems with resume software appear to stem from how it is programmed to “evaluate” candidates:

“For example, some systems automatically reject candidates with gaps of longer than six months in their employment history, without ever asking the cause of this absence. It might be due to a pregnancy, because they were caring for an ill family member, or simply because of difficulty finding a job in a recession. More specific examples cited by one of the study’s author, Joseph Fuller, in an interview with The Wall Street Journal include hospitals who only accepted candidates with experience in “computer programming” on their CV, when all they needed were workers to enter patient data into a computer. Or, a company that rejected applicants for a retail clerk position if they didn’t list “floor-buffing” as one of their skills, even when candidates’ resumes matched every other desired criteria.”
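
Rules like these can be sketched as a crude filter. The thresholds and required keywords below are assumptions modeled on the examples in the quote, not any vendor’s actual logic:

```python
from datetime import date

# Hypothetical rigid screening rules of the kind described above:
# reject any employment gap over six months, require exact keyword matches.
REQUIRED_KEYWORDS = {"floor-buffing"}
MAX_GAP_DAYS = 183  # roughly six months

def screen(resume):
    """Return True if the resume passes this (deliberately rigid) filter."""
    # Rule 1: any gap between consecutive jobs longer than six months rejects,
    # with no way to explain a pregnancy, caregiving, or a recession.
    jobs = sorted(resume["jobs"], key=lambda j: j["start"])
    for prev, nxt in zip(jobs, jobs[1:]):
        if (nxt["start"] - prev["end"]).days > MAX_GAP_DAYS:
            return False
    # Rule 2: every required keyword must appear verbatim in the skills list.
    return REQUIRED_KEYWORDS <= set(resume["skills"])

candidate = {
    "jobs": [
        {"start": date(2018, 1, 1), "end": date(2019, 6, 1)},
        {"start": date(2020, 3, 1), "end": date(2021, 8, 1)},  # nine-month gap
    ],
    "skills": ["floor-buffing", "cash handling"],
}
# This candidate lists every required skill but is rejected for the gap alone.
```

The point of the sketch is how little judgment is involved: one hard-coded threshold discards an otherwise matching applicant.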

Employers use rigid criteria to filter job applicants. Resume software was supposed to make hiring easier for employers inundated with hundreds of resumes, an average of 250 applicants per job. Automation in hiring is not slowing down, and the industry is projected to be worth $3.1 billion by 2025.

How will off-the-rails AI apps be avoided or ameliorated? My hunch is that they cannot.

Whitney Grace, September 13, 2021

More AI Bias? Seems Possible

September 10, 2021

Freddie Mac and Fannie Mae are stuck in the past—the mid-1990s, to be specific, when the Classic FICO loan-approval software was developed. Since those two quasi-government groups basically set the rules for the mortgage industry, their reluctance to change is bad news for many would-be home buyers and their families. The Markup examines “The Secret Bias Hidden in Mortgage-Approval Algorithms.” Reporters Emmanuel Martinez and Lauren Kirchner reveal what their organization’s research has uncovered:

“An investigation by The Markup has found that lenders in 2019 were more likely to deny home loans to people of color than to white people with similar financial characteristics — even when we controlled for newly available financial factors the mortgage industry for years has said would explain racial disparities in lending. Holding 17 different factors steady in a complex statistical analysis of more than two million conventional mortgage applications for home purchases, we found that lenders were 40 percent more likely to turn down Latino applicants for loans, 50 percent more likely to deny Asian/Pacific Islander applicants, and 70 percent more likely to deny Native American applicants than similar White applicants. Lenders were 80 percent more likely to reject Black applicants than similar White applicants. These are national rates. In every case, the prospective borrowers of color looked almost exactly the same on paper as the White applicants, except for their race.”
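
Disparities of the “X percent more likely” kind can be expressed as odds ratios. A toy sketch with invented counts (not The Markup’s data, whose analysis also controlled for 17 factors) shows the basic arithmetic:

```python
def odds_ratio(denied_a, approved_a, denied_b, approved_b):
    """Odds of denial for group A relative to group B, from a 2x2 table."""
    return (denied_a / approved_a) / (denied_b / approved_b)

# Illustrative, made-up counts:
# group A: 180 denials, 820 approvals; group B: 100 denials, 900 approvals.
ratio = odds_ratio(180, 820, 100, 900)
# ratio comes out near 2, i.e. group A is roughly twice as likely to be denied.
```

The Markup’s actual method was a regression holding the 17 factors steady simultaneously; the ratio above is the single-number summary such an analysis produces for each group.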

Algorithmic bias is a known and devastating problem in several crucial arenas, but recent years have seen efforts to mitigate it with better data sets and tweaked machine-learning processes. Advocates as well as professionals in the mortgage and housing industries have been entreating Fannie and Freddie to update their algorithm since 2014. Several viable alternatives have been developed but the Federal Housing Finance Agency, which oversees those entities, continues to drag its heels. No big deal, insists the mortgage industry—bias is just an illusion caused by incomplete data, representatives wheedle. The Markup’s research indicates otherwise. We learn:

“The industry had criticized previous similar analyses for not including financial factors they said would explain disparities in lending rates but were not public at the time: debts as a percentage of income, how much of the property’s assessed worth the person is asking to borrow, and the applicant’s credit score. The first two are now public in the Home Mortgage Disclosure Act data. Including these financial data points in our analysis not only failed to eliminate racial disparities in loan denials, it highlighted new, devastating ones.”

For example, researchers found that high-earning Black applicants with less debt get rejected more often than white applicants with similar income but more debt. See the article for more industry excuses and the authors’ responses, as well as some specifics on the mechanisms of systemic racism and how location affects results. There are laws on the books that should make such discrimination a thing of the past, but they are difficult to enforce. An outdated algorithm shrouded in secrecy makes it even more so. The Federal Housing Finance Agency has been studying its AI’s bias and considering alternatives for five years now. When will it finally make a change? Families are waiting.

Cynthia Murrell, September 10, 2021

Smart Software: Boiling Down to a Binary Decision?

September 9, 2021

I read a write up which contained a nuance which is pretty much a zero or a one; that is, a binary decision. The article is “Amid a Pandemic, a Health Care Algorithm Shows Promise and Peril.” Okay, good news and bad news. The subtitle introduces the transparency issue:

A machine learning-based score designed to aid triage decisions is gaining in popularity — but lacking in transparency.

The good news? A zippy name: The Deterioration Index. I like it.

The idea is that some proprietary smart software includes explicit black boxes. The vendor identifies the basics of the method, but does not disclose the “componentized” or “containerized” features. The analogy I use in my lectures is that no one pays attention to a resistor; it just does its job. Move on.

The write up explains:

The use of algorithms to support clinical decision making isn’t new. But historically, these tools have been put into use only after a rigorous peer review of the raw data and statistical analyses used to develop them. Epic’s Deterioration Index, on the other hand, remains proprietary despite its widespread deployment. Although physicians are provided with a list of the variables used to calculate the index and a rough estimate of each variable’s impact on the score, we aren’t allowed under the hood to evaluate the raw data and calculations.
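
As a rough illustration of what “a list of the variables and a rough estimate of each variable’s impact” amounts to, here is a toy weighted-sum score. The variables and weights are invented for illustration; Epic’s real calculation is precisely the black box at issue:

```python
# Invented variables and weights; higher score means higher assumed risk.
WEIGHTS = {
    "heart_rate": 0.03,
    "respiratory_rate": 0.10,
    "oxygen_saturation": -0.08,  # higher saturation lowers the score
    "age": 0.02,
}

def deterioration_score(vitals):
    """Linear combination of observed variables, one term per variable."""
    return sum(WEIGHTS[name] * value for name, value in vitals.items())

patient = {
    "heart_rate": 110,
    "respiratory_rate": 24,
    "oxygen_saturation": 91,
    "age": 67,
}
score = deterioration_score(patient)
```

Even in this transparent toy, note how much hinges on the weights. Physicians being shown the variable names but not the weights, thresholds, or training data is the transparency gap the article describes.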

From my point of view this is now becoming a standard smart software practice. In fact, when I think of “black boxes” I conjure an image of Stanford University and the University of Washington professors, graduate students, and Google-AI types which share these outfits’ DNA. Keep the mushrooms in the cave, not out in the sun’s brilliance. I could be wrong, of course, but I think this write up touches upon what may be a matter that some want to forget.

And what is this marginalized issue?

I call it the Timnit Gebru syndrome. A tiny issue buried deep in a data set or method assumed to be A-Okay may not be. What’s the fix? An ostrich-type reaction, a chuckle from someone with droit de seigneur? Moving forward because regulators and newly-minted government initiatives designed to examine bias in AI are moving with pre-Internet speed?

I think this article provides an interest case example about zeros and ones. Where’s the judgment? In a black box? Embedded and out of reach.

Stephen E Arnold, September 9, 2021

Has TikTok Set Off Another Alarm in Washington, DC?

September 9, 2021

Perhaps TikTok was hoping the recent change to its privacy policy would slip under the radar. The Daily Dot reports that “Senators are ‘Alarmed’ at What TikTok Might Be Doing with your Biometric Data.” The video-sharing platform’s new policy specifies it now “may collect biometric identifiers and biometric information,” like “faceprints and voiceprints.” Why are we not surprised? Two US senators expressed alarm at the new policy which, they emphasize, affects nearly 130 million users while revealing few details. Writer Andrew Wyrich reports,

“That change has sparked Sen. Amy Klobuchar (D-Minn.) and Sen. John Thune (R-S.D.) to ask TikTok for more information on how the app plans to use that data they said they’d begin collecting. Klobuchar and Thune wrote a letter to TikTok earlier this month, which they made public this week. In it, they ask the company to define what constitutes a ‘faceprint’ and a ‘voiceprint’ and how exactly that collected data will be used. They also asked whether that data would be shared with third parties and how long the data will be held by TikTok. … Klobuchar and Thune also asked the company to tell them whether it was collecting biometric data on users under 18 years old; whether it will ‘make any inferences about its users based on faceprints and voiceprints;’ and whether the company would use machine learning to determine a user’s age, gender, race, or ethnicity based on the collected faceprints or voiceprints.”

Our guess is yes to all three, though we are unsure whether the company will admit as much. Nevertheless, the legislators make it clear they expect answers to these questions as well as a list of all entities that will have access to the data. We recommend you do not hold your breath, Senators.

Cynthia Murrell, September 9, 2021

FR Is Going Far

September 6, 2021

Law enforcement officials are using facial recognition software, and the array of cameras that cover much of the world, to identify bad actors. The New York Times reports on how the technology was used to track down a couple accused of arson: “A Fire In Minnesota. An Arrest In Mexico. Cameras Everywhere.”

Mena Yousif is an Iranian refugee and Jose Felan is a felon. The couple were frustrated with the state of the American law enforcement system and government, especially after George Floyd’s death. They set fire to buildings, including schools, stores, and gas stations, causing damage to over 1,500 of them. The ATF posted videos of the pair online, asking for any leads toward their arrests, and received tips as Felan and Yousif traveled across the US toward the Mexican border. They were on the run for two weeks before being identified outside a motel in Texas.

Mexican authorities used a comprehensive facial recognition system, deployed in 2019 and designed by Dahua Technology, to help find Felan and Yousif. Dahua is a Chinese company, one of the largest video surveillance firms in the world, and is partially owned by the Chinese government. The US Defense and Commerce departments blacklisted Dahua over China’s treatment of Uighur Muslims and amid the trade war. Dahua denies the allegations and states that it cannot control how its technology is used. In the end, though, it was not facial recognition that caught Yousif and Felan; authorities acted on a tip.

China is marketing surveillance technology to other countries, particularly in South America, Asia, and Africa, as a means to minimize crime and promote order. The technology is far from perfect, yet the US uses it despite the problems:

“In the United States, facial recognition technology is widely used by law enforcement officials, though poorly regulated. During a congressional hearing in July, lawmakers expressed surprise that 20 federal agencies were using it without having fully assessed the risks of misuse or bias — some algorithms have been found to work less accurately on women and people of color, and it has led to mistaken arrests. The technology can be a powerful and effective crime-solving tool, though, placing it, for now, at a tipping point. At the start of the hearing, Representative Sheila Jackson Lee, Democrat of Texas, highlighted the challenge for Congress — or anyone — in determining the benefits and downsides to using facial recognition: It’s not clear how well it works or how widely it’s used. As Ms. Jackson Lee said, “Information on how law enforcement agencies have adopted facial recognition technology remains underreported or nonexistent.”

Many governments around the world, including the US, seem poised to increase their use of facial recognition and tracking technology in the name of law and order. What is interesting is that China has been a pacesetter.

Whitney Grace, September 6, 2021

MIT Creates a Prediction Tool for Tech Improvements

August 25, 2021

Leave it to researchers at MIT to find a way to predict the future, at least when it comes to rates of improvement for some 1,757 technologies. Fast Company tells us, “MIT Built a Google Search to Spot the Most Important Tech Innovations of the Future.” The team behind the search tool, dubbed simply technology rates, specializes in studying innovation. Former decades-long Ford engineer and designer Christopher Magee is now a professor of practice at the university, and he put together a team of graduate students to do just that. One, Anuraag Singh, used to work at Honda’s R&D lab determining which technologies that company should invest in long-term. The researchers’ experience led them to this conclusion: The key to rapid but accurate predictions was to create AI that examines relationships within the U.S. patent system. Reporter Mark Wilson explains:

“Like scientific research papers, patents routinely reference other patents. The AI—developed and validated by Giorgio Triulzi, assistant professor at Universidad de los Andes—can build a whole networked web of these patent relationships, seeing not just which are influential within their own fields, but also which are pulling from completely disparate fields. As Magee explains, semiconductor patents alone don’t explain their improvement over time, because the most successful fields dip into other research topics. In the case of semiconductors, improvements in lasers actually improved chips in the past, and the team believes new plasma research will in the future. Similarly, patents in software are now being cited by all sorts of patents in other industries, because so much of the world is operated digitally.”
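
A minimal sketch of the citation-network idea, with invented patents and fields, shows how same-field and cross-field citations can be tallied. This is an illustration of the concept, not the MIT team’s actual model:

```python
from collections import defaultdict

# Toy data: each patent is tagged with a field, and CITES maps a patent
# to the patents it references. All names here are invented.
FIELD = {"P1": "semiconductors", "P2": "lasers",
         "P3": "software", "P4": "semiconductors"}
CITES = {"P1": ["P2", "P4"], "P3": ["P1"], "P4": ["P2"]}

def influence(graph):
    """Count how often each patent is cited, split by same- vs cross-field."""
    counts = defaultdict(lambda: {"same_field": 0, "cross_field": 0})
    for src, targets in graph.items():
        for dst in targets:
            kind = "same_field" if FIELD[src] == FIELD[dst] else "cross_field"
            counts[dst][kind] += 1
    return dict(counts)

scores = influence(CITES)
# The laser patent "P2" is cited only from other fields, the kind of
# cross-domain pull the researchers say drives improvement rates.
```

Following such edges transitively is what makes the relationships “explode” outward and turns the tally into a genuinely hard graph problem.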

Magee notes the further one traces these referential patents, the more the relationships “explode” outward. It is a complex task that calls for an AI solution. The article continues:

“The team created the patent search tool because it was the most practical means to look up over 1,000 different technologies—which makes it a superb tool for R&D teams and other future-casting groups in the public or private sector. Any other sort of graphic interface or list would be unwieldy. When you type your technology into the search box, it will pull up its innovation rate, along with the top 10 most-cited recent patents about it.”

Wilson reminds us to think of annual improvement rates like compound interest on money— gains build on gains. What to make of those innovation rates in practical terms, though, can be a bit of a mystery and depends on the field being investigated. Enterprising fellows that they are, Magee and Singh have launched a commercial enterprise, Technext, to make sense of the data. They are off to a strong start—their first client is the U.S. Air Force.
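
The compound-interest analogy is easy to state in code. The 30 percent annual rate below is an assumed figure for illustration, not one of the tool’s published rates:

```python
def improvement_after(rate, years):
    """Capability multiplier after `years` at a constant annual rate,
    compounding the same way interest does."""
    return (1 + rate) ** years

# A technology improving 30% per year nearly quadruples in five years,
# because each year's gain builds on the last.
factor = improvement_after(0.30, 5)
```

This is why small differences in annual rate matter so much to R&D planners: over a decade, the gap between compounding rates dwarfs the rates themselves.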

One question: Would this tool have predicted that MIT would accept money from Jeffrey Epstein and then have professional staff who attempted to sidestep scrutiny?

Yes or no, please. My thought is that MIT’s software and its institutional actions may be easier to puff up than to deliver on certain core values. Just a hunch.

Cynthia Murrell, August 25, 2021

Federated AI: A Garden of Eden. Will There Be a Snake or Two?

August 23, 2021

I read “Eden AI Launches Platform to Unify ML APIs.” I had two immediate reactions. The first was content marketing, and the second was that there was a dark side to the Garden of Eden, wasn’t there?

Eden is a company pulling a meta-play or leveling up. The idea is that one can pop up higher, pull disparate items together, and create a new product or service.

This works for outfits ranging from a plumbing supply company serving smaller towns to an outfit like the Bezos bulldozer. Why not apply this model to the rock solid world of machine learning application programming interfaces?

The write up states:

… using Eden AI, a company could feed a document in Chinese into Google Cloud Platform’s optical character recognition service to extract its contents. Then it could have an IBM Watson model translate the extracted Chinese characters into English words and queue up an Amazon Web Services API to analyze for keywords. Eden AI makes money by charging providers a commission on the revenues generated by its platform.
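
The chaining described is, at bottom, function composition across vendors. A sketch with stand-in stubs (the step functions below are placeholders, not real Google, IBM, or AWS SDK calls):

```python
# Stand-in provider steps; a real aggregator would wrap vendor APIs here.
def ocr_stub(document):
    return "你好 世界"       # pretend OCR output: extracted Chinese text

def translate_stub(text):
    return "hello world"     # pretend machine translation to English

def keywords_stub(text):
    return sorted(set(text.split()))  # pretend keyword extraction

PIPELINE = [ocr_stub, translate_stub, keywords_stub]

def run_pipeline(document, steps=PIPELINE):
    """Feed each step's output into the next, as an aggregator layer would."""
    result = document
    for step in steps:
        result = step(result)
    return result

keywords = run_pipeline(b"scanned-document-bytes")
```

The sketch also makes the failure modes visible: every hop adds network latency, and a breaking change in any one provider’s API breaks the whole chain, which is exactly the maintenance burden raised below.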

Latency? Apparently no problem. The cost of maintaining the meta-code as the APIs change? Apparently no problem. Competition from outfits like Microsoft, which, whether the technology works or not, wants to maintain its role as the go-to place for advanced whatevers? No problem.

Stephen E Arnold, August 23, 2021

Quote to Note: Smart Software and Bad Data

August 16, 2021

AI or artificial intelligence is a big deal, particularly to those who are betting big bucks on the technology transforming everything. “Pepperdata CEO Says AI Ambitions Outpace Data Management Reality” is more pragmatic. In an interview with a Silicon Valley type news service, Pepperdata CEO Ash Munshi says:

I’m spending more money, but I’m not understanding anything better.

The key word in my opinion is “understanding.” Knowing is different from saying one knows.

Pepperdata’s big dog adds:

At the end of the day, data provides you insights. Those insights give you the ability to create a gut instinct, and that gut instinct is the fundamental thing that you use to make decisions.

If I understand this statement, smart software makes it easier to make a subjective decision.

This view strikes me as raising an important point: Smart software boils down to guessing to make it easier for the human to use “gut instinct.” Progress?

Stephen E Arnold, August 16, 2021
