Word Problems Are Tricky for AI Language Models
October 27, 2022
If you have trouble with word problems, rest assured you are in good company. Machine-learning researchers have only recently made significant progress teaching algorithms to solve them. IEEE Spectrum reports, “AI Language Models Are Struggling to ‘Get’ Math.” Writer Dan Garisto states:
“Until recently, language models regularly failed to solve even simple word problems, such as ‘Alice has five more balls than Bob, who has two balls after he gives four to Charlie. How many balls does Alice have?’ ‘When we say computers are very good at math, they’re very good at things that are quite specific,’ says Guy Gur-Ari, a machine-learning expert at Google. Computers are good at arithmetic—plugging numbers in and calculating is child’s play. But outside of formal structures, computers struggle. Solving word problems, or ‘quantitative reasoning,’ is deceptively tricky because it requires a robustness and rigor that many other problems don’t.”
Researchers threw a couple of datasets with thousands of math problems at their language models. The students still failed spectacularly. After some tutoring, however, Google’s Minerva emerged as a star pupil, achieving 78% accuracy. (Yes, the grading curve is considerable.) We learn:
“Minerva uses Google’s own language model, Pathways Language Model (PaLM), which is fine-tuned on scientific papers from the arXiv online preprint server and other sources with formatted math. Two other strategies helped Minerva. In ‘chain-of-thought prompting,’ Minerva was required to break down larger problems into more palatable chunks. The model also used majority voting—instead of being asked for one answer, it was asked to solve the problem 100 times. Of those answers, Minerva picked the most common answer.”
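Minerva’s actual code is not public in the article, but the majority-voting step is easy to picture. Here is a minimal sketch, assuming a hypothetical solve() stand-in for the model; the toy noisy_solver and its numbers are invented for illustration:

```python
import random
from collections import Counter

def majority_vote(solve, problem, samples=100):
    # Ask for many independent solutions and keep the most common
    # final answer (the majority-voting idea the article describes).
    answers = [solve(problem) for _ in range(samples)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / samples

# Hypothetical stand-in for a language model: right only 40% of the
# time, but its errors scatter across many different wrong answers.
def noisy_solver(problem):
    return 7 if random.random() < 0.4 else random.choice([5, 6, 8, 9, 11])

print(majority_vote(noisy_solver, "Alice's balls", samples=100))
```

Because wrong answers tend to disagree with one another while correct answers agree, the most common answer is usually better than any single attempt.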
Not a practical approach for your average college student during an exam. Researchers are still not sure how much Minerva and her classmates understand about the answers they are giving, especially since the more problems they solve the fewer they get right. Garisto notes language models “can have strange, messy reasoning and still arrive at the right answer.” That is why human students are required to show their work, so perhaps this is not so different. More study is required, on the part of both researchers and their algorithms.
Cynthia Murrell, October 27, 2022
Eureka! The Google Phone Call Feature
October 27, 2022
Despite no longer being the preferred method of communication, phone calls are still a vital part of society. One thing that has always plagued them is poor reception. The Verge shares how Google is innovating once more, but on old-school phone calls: “Google Is Working On ‘Clear Calling’ For Android Phone Calls.”
The new Android 13 release includes a “clear calling” feature said to reduce background noise during calls. Twitter user Mishaal Rahman told people how to enable it. Clear Calling is supposed to work on most mobile networks, but it is not available for Wi-Fi calling, and it does not send information from your phone calls to Google.
Google has experimented with noise cancellation technology before:
“Google has been flexing its noise-canceling muscles (and custom six-core audio chips) for a while. First, and most impressively, by using AI to suppress background noises like the crackling of snack bags, keyboard clicks, and dogs barking in Google Meet. More recently with the $199 Pixel Buds Pro — the company’s first earbuds with active noise cancellation.”
Noise cancellation technology can always stand improvement, especially to drown out all the noise generated by today’s media. What’s next? More data capture? More fines from non-Googley government busybodies? More trimming of staff?
Whitney Grace, October 27, 2022
Measuring How Badly Social Media Amplifies Misinformation
October 26, 2022
In its ongoing examination of misinformation online, the New York Times tells us about the Integrity Institute’s quest to measure just how much social media contributes to the problem in “How Social Media Amplifies Misinformation More than Information.” Reporter Steven Lee Meyers writes:
“It is well known that social media amplifies misinformation and other harmful content. The Integrity Institute, an advocacy group, is now trying to measure exactly how much — and on Thursday [October 13] it began publishing results that it plans to update each week through the midterm elections on Nov. 8. The institute’s initial report, posted online, found that a ‘well-crafted lie’ will get more engagements than typical, truthful content and that some features of social media sites and their algorithms contribute to the spread of misinformation.”
In its ongoing investigation, the researchers compare the circulation of posts flagged as false by the International Fact-Checking Network to that of other posts from the same accounts. We learn:
“Twitter, the analysis showed, has what the institute called the greatest misinformation amplification factor, in large part because of its feature allowing people to share, or ‘retweet,’ posts easily. It was followed by TikTok, the Chinese-owned video site, which uses machine-learning models to predict engagement and make recommendations to users. … Facebook, according to the sample that the institute has studied so far, had the most instances of misinformation but amplified such claims to a lesser degree, in part because sharing posts requires more steps. But some of its newer features are more prone to amplify misinformation, the institute found.”
Facebook’s video content spread lies faster than the rest of the platform, we learn, because its features lean more heavily on recommendation algorithms. Instagram showed the lowest amplification rate, while the team did not yet have enough data on YouTube to draw a conclusion. It will be interesting to see how these amplification rates do or do not change as the midterms approach. The Integrity Institute shares its findings here.
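The amplification measurement itself reduces to a simple ratio. A hedged sketch of the comparison, with hypothetical field names and invented engagement counts standing in for the institute’s data:

```python
# Invented sample records: engagement on posts flagged false by
# fact-checkers versus other posts from the same accounts.
posts = [
    {"account": "a1", "flagged": True,  "engagements": 4200},
    {"account": "a1", "flagged": False, "engagements": 900},
    {"account": "a2", "flagged": True,  "engagements": 1500},
    {"account": "a2", "flagged": False, "engagements": 1000},
]

flagged = [p["engagements"] for p in posts if p["flagged"]]
typical = [p["engagements"] for p in posts if not p["flagged"]]

# Amplification factor: average engagement on flagged misinformation
# relative to average engagement on the accounts' other posts.
factor = (sum(flagged) / len(flagged)) / (sum(typical) / len(typical))
print(f"misinformation amplification factor: {factor:.1f}x")  # 3.0x here
```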
Cynthia Murrell, October 26, 2022
Learning Is Supposed to Be Easy. Says Who?
October 26, 2022
I am not sure what a GenZ is. I do know that if I provide cash and change for a bill at a drug store or local grocery store, the person running the cash register looks like a deer in headlights. I have a premonition that if I had my Digital Infrared Thermometer, I could watch the person’s temperature rise. Many of these young people struggle to make change. My wife had a 50-cent piece and gave it to the cashier at the garden center along with some bills. The GenZ or GenX or whatever young person called the manager and asked, “What is this coin?”
I read “Intelligent.com Survey Shows 87 Percent of College Students Think Classes Are Too Difficult, But Most Fail to Study Regularly.” I know little about the sponsor of the research, the sampling methodology, or the statistical procedures used to calculate the data. Caution is advised when “real news” trots out data. Let’s assume that the information is close enough for horseshoes. After all, this is the statistical yardstick for mathematical excellence in use at synthetic data companies, Google-type outfits, and many artificial intelligence experts hot for cheap training data. Yep, close enough is good enough. I should create a T-shirt with this silkscreened on the front. But that’s work, which I don’t do.
The findings reported in the article include some gems which appear to bolster my perception that quite a few GenZ etc. cohort members are not particularly skilled in some facets of information manipulation. I would wager that their TikTok skills are excellent. Other knowledge-based functions may lag. Let’s look at these numbers:
65 percent of respondents say they put a lot of effort into their studies. However, research findings also show that one-third of students who claim to put a lot of effort into their schoolwork spend less than 5 hours a week studying.
This is the academic equivalent of a young MBA saying, “I will have the two-pager ready tomorrow morning.” The perception of task completion is sufficient for these young millionaires-to-be. Doing the work is irrelevant because the individual thinks the work will be done. When reminded, the excuses fly. I want to remind you that some high-tech companies trot out the well-worn “the dog ate my homework” excuse when testifying.
And this finding:
Thirty-one percent of respondents spend 1-5 hours, and 37 percent spend 6-10 hours studying for classes each week. Comparatively, 8 percent of students spend 15-20 hours, and 5 percent spend more than 20 hours studying.
I have been working on Hopf fibrations for a couple of years. Sorry, I am not at the finish line yet. Those in the sample equate studying with a few hours a week. Nope, that time commitment sits on a flawed timeline, not the real-world timeline for learning and becoming proficient in a subject.
I loved this finding:
Twenty-eight percent of students have asked a professor to change their grade, while 31 percent admit they cheated to get better grades. Almost 50 percent of college students believe a pass or fail system should replace the current academic grading system.
Wow.
Net net: No wonder young people struggle with making change and thinking clearly. Bring back the dinobabies, even though there are some dull normals in that set of cohorts as well. But when one learns by watching TikToks, what can one expect in the currency recognition department? Answer: Not much.
Stephen E Arnold, October 26, 2022
The Robots: Fun and Friendly Colleagues?
October 26, 2022
Robot coworkers make us uncomfortable, apparently. Who knew? ScienceDaily reports, “Robots in Workplace Contribute to Burnout, Job Insecurity.” The good news, we are told, is that simple self-affirmation exercises can help humans get past such fears. The write-up cites research from the American Psychological Association, stating:
“Working with industrial robots was linked to greater reports of burnout and workplace incivility in an experiment with 118 engineers employed by an Indian auto manufacturing company. An online experiment with 400 participants found that self-affirmation exercises, where people are encouraged to think positively about themselves and their uniquely human characteristics, may help lessen workplace robot fears. Participants wrote about characteristics or values that were important to them, such as friends and family, a sense of humor or athletics. ‘Most people are overestimating the capabilities of robots and underestimating their own capabilities,’ [lead researcher Kai Chi] Yam said.”
Yam suspects ominous media coverage about robots replacing workers is at least partially to blame for the concern. Yeah, that tracks. The write-up continues:
“Fears about job insecurity from robots are common. The researchers analyzed data about the prevalence of robots in 185 U.S. metropolitan areas along with the overall use of popular job recruiting sites in those areas (LinkedIn, Indeed, etc.). Areas with the most prevalent rates of robots also had the highest rates of job recruiting site searches, even though unemployment rates weren’t higher in those areas.”
Researchers suggest this difference may be because workers in those areas are afraid of being replaced by robots at any moment, though they allow that other factors could be at play. So just remember—if you become anxious that a robot is after your job, remind yourself what a capable little human you are. Technology is our friend, even if it makes us a bit nervous.
Cynthia Murrell, October 26, 2022
A Data Taboo: Poisoned Information But We Do Not Discuss It Unless… Lawyers
October 25, 2022
In a conference call yesterday (October 24, 2022), I mentioned one of my laws of online information; specifically, digital information can be poisoned. The venom can be administered by a numerically adept MBA or a junior college math major taking shortcuts because data validation is hard work. The person on the call was mildly surprised because the notion of open source and closed source “facts” intentionally weaponized is an uncomfortable subject. I think the person with whom I was speaking blinked twice when I pointed out what should be obvious to most individuals in the intelware business. Here’s the pointy end of reality:
Most experts and many of the content processing systems assume that data are good enough. Plus, with lots of data any irregularities are crunched down by steamrolling mathematical processes.
The problem is that articles like “Biotech Firm Enochian Says Co-Founder Fabricated Data” make it clear that MBA math as well as experts hired to review data can be caught with their digital clothing in a pile. These folks are, in effect, sitting naked in a room with people who want to make money. Nakedness from being dead wrong can lead to some career turbulence; for example, prison.
The write up reports:
Enochian BioSciences Inc. has sued co-founder Serhat Gumrukcu for contractual fraud, alleging that it paid him and his husband $25 million based on scientific data that Mr. Gumrukcu altered and fabricated.
The article does not explain precisely how the data were “fabricated.” However, someone with Excel skills or access to an article like “Top 3 Python Packages to Generate Synthetic Data” and Fiverr.com or a similar gig-work site can get some data generated at low cost. Who will know? Most MBAs’ math and statistics classes focus on meeting targets in order to get a bonus or amp up a “service” fee for clicking a mouse. Experts who can figure out fiddled data sets take the time only if they are motivated by professional jealousy or cold cash. Who blew the whistle on Theranos? A data analyst? Nope. A “real” journalist who interviewed people who thought something was goofy in the data.
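How easy is “fabricated”? A toy sketch using numpy (my illustration, not evidence about the Enochian matter): pick the trend you want reviewers to see, add plausible-looking noise, and a naive line fit dutifully recovers the planted result.

```python
import numpy as np

rng = np.random.default_rng(seed=42)
doses = np.linspace(1, 10, 30)                  # hypothetical dose levels
desired_effect = 2.5 * doses + 4.0              # the story to be told
noise = rng.normal(0.0, 1.5, size=doses.size)   # plausible-looking scatter
fabricated_response = desired_effect + noise

# A reviewer fitting a line recovers the planted slope almost exactly.
slope, intercept = np.polyfit(doses, fabricated_response, 1)
print(f"fitted slope: {slope:.2f} (planted: 2.50)")
```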
My point is that it is trivially easy to whip up data to support a run at tenure or a pitch to a group of MBAs desperate to fund the next big thing as the big tech house of cards wobbles in the winds of change.
Several observations:
- The threat of bad or fiddled data is rising. My team is checking a smart output by hand because we simply cannot trust what a slick, new intelware system outputs. Yep, trust is in short supply among my research team.
- Data from assorted open and closed sources are accepted as is, without individual inspection. The attitude is that the law of big numbers, the sheer volume of data, or the magic of cross correlation will minimize errors. Sure, these processes will, but what if the data are weaponized and crafted to avoid detection? The answer is to check each item. How’s that for a cost center?
- Uninformed individuals (yep, I am including some data scientists, MBAs, and hawkers of data from app users) don’t know how to identify weaponized data or what to do when such data are identified.
Does this suggest that a problem exists? If yes, what’s the fix?
[a] Ignore the problem
[b] Trust Google-like outfits who seek to be the source for synthetic data
[c] Rely on MBAs
[d] Rely on jealous colleagues in the statistics department with limited tenure opportunities
[e] Blink.
Pick one.
Stephen E Arnold, October 25, 2022
Exabeam: A Remarkable Claim
October 25, 2022
I read “Exabeam New Scale SIEM Enables Security Teams to Detect the Undetectable.” I find the idea expressed in the headline interesting. A commercial firm can spot something that cannot be seen; that is, detect the undetectable. The write up states as a rock-solid factoid:
Claimed to be an industry first, Exabeam New-Scale SIEM allows security teams to search query responses across petabytes of hot, warm and cold data in seconds. Organizations can use the service to process logs with limitless scale at sustained speeds of more than 1 million events per second. Key to Exabeam’s offering is the ability to understand normal behavior to detect and prioritize anomalies. Exabeam New-Scale SIEM offers more than 1,800 pre-built correlation rules and more than 1,100 anomaly detection rules that leverage in excess of 750 behavior analytics detection models, which baseline normal behavior.
The write up continues with a blizzard of buzzwords; to wit:
The full list of new Exabeam products includes Security Log Management — cloud-scale log management to ingest, parse, store and search log data with powerful dashboarding and correlation. Exabeam SIEM offers cloud-native SIEM at hyperscale with modern search and powerful correlation, reporting, dashboarding and case management, and Exabeam Fusion provides New-Scale SIEM powered by modern, scalable security log management, powerful behavioral analytics and automated TDIR, according to the company. Exabeam Security Analytics provides automated threat detection powered by user and entity behavior analytics with correlation and threat intelligence. Exabeam Security Investigation is powered by user and entity behavior analytics, correlation rules and threat intelligence, supported by alerting, incident management, automated triage and response workflows.
Now this is not detecting the undetectable. The approach relies on fast data processing, anomaly detection methods, and pre-formed rules.
By definition, a pre-formed rule is likely to have a tough time detecting the undetectable. Bad actors exploit tried-and-true security weaknesses, rely on very tough-to-detect behaviors like a former employee selling a bad actor information about a target’s system, and use new exploits cooked up, in the case of NSO Group, in a small mobile phone shop or in a college class in Iran.
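To make the limitation concrete, here is a minimal sketch of the baseline-then-flag idea, assuming invented login counts and an arbitrary three-sigma threshold; real SIEM rules are far richer, but the blind spot is the same:

```python
import statistics

# Hypothetical history of logins per hour used to learn "normal."
baseline = [12, 15, 11, 14, 13, 12, 16, 14]
mu = statistics.mean(baseline)
sigma = statistics.stdev(baseline)

def is_anomalous(observed, z_threshold=3.0):
    # Flag observations that depart from the learned baseline.
    return abs(observed - mu) / sigma > z_threshold

print(is_anomalous(90))  # True: a noisy burst stands out
print(is_anomalous(13))  # False: an insider using valid credentials
                         # stays inside the baseline and raises no flag
```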
What is notable in the write up is:
The use of SIEM without explaining that the acronym represents “security information and event management.” The bound phrase “security information” means the data marking an exploit or attack, and “event management” means what the cyber security professionals do when the attack succeeds. The entire process is reactive; that is, only after something bad has been identified can action be taken. No awareness means the attack can move forward and continue. The idea of “early warning” means one thing; detecting the undetectable is quite another.
Who is responsible for this detect the undetectable? My view is that it is an art history major now working in marketing.
Detecting the undetectable. More like detecting sloganized marketing about a very serious threat to organizations hungry for dashboarding.
Stephen E Arnold, October 25, 2022
Startup Vectara Takes Search Up Just a Notch
October 25, 2022
Is this the enterprise search innovation we have been waiting for? A team of ex-Googlers have used what they learned about large language models (LLMs), natural language processing (NLP), and transformer techniques to launch a new startup. We learn about their approach in VentureBeat’s article, “Vectara’s AI-Based Neural Search-as-a-Service Challenges Keyword-Based Searches.” The platform combines LLMs, NLP, data integration pipelines, and vector techniques into a neural network. The approach can be used for various purposes, we learn, but the company is leading with search. Journalist Sean Michael Kerner writes:
“[Cofounder Amr] Awadallah explained that when a user issues a query, Vectara uses its neural network to convert that query from the language space, meaning the vocabulary and the grammar, into the vector space, which is numbers and math. Vectara indexes all the data that an organization wants to search in a vector database, which will find the vector that has closest proximity to a user query. Feeding the vector database is a large data pipeline that ingests different data types. For example, the data pipeline knows how to handle standard Word documents, as well as PDF files, and is able to understand the structure. The Vectara platform also provides results with an approach known as cross-attentional ranking that takes into account both the meaning of the query and the returned results to get even better results.”
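The query flow Awadallah describes boils down to embed-and-compare. A minimal sketch, where embed() is a hypothetical stand-in for a trained transformer encoder (a hash-seeded toy keeps the sketch runnable):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical encoder: a real system uses a trained language
    # model; a hash-seeded random unit vector stands in here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=64)
    return v / np.linalg.norm(v)

documents = ["quarterly sales report", "employee handbook", "API reference"]
index = np.stack([embed(d) for d in documents])  # the "vector database"

query = embed("quarterly sales report")
scores = index @ query                # cosine similarity on unit vectors
print(documents[int(np.argmax(scores))])
```

A production system would follow this nearest-neighbor retrieval with the cross-attentional re-ranking step the article mentions.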
We are reminded a transformer puts each word into context for studious algorithms, relating it to other words in the surrounding text. But what about things like chemical structures, engineering diagrams, embedded strings in images? It seems we must wait longer for a way to easily search for such non-linguistic, non-keyword items. Perhaps Vectara will find a way to deliver that someday, but next it plans to work on a recommendation engine and a tool to discover related topics. The startup, based in Silicon Valley, launched in 2020 under the “stealth” name Zir AI. Recent seed funding of $20 million has enabled the firm to put on its public face and put out this inaugural product. There is a free plan, but one must contact the company for any further pricing details.
Cynthia Murrell, October 25, 2022
The In Person Office Means You Do Synergy
October 25, 2022
What is the point of requiring workers to come into the office part-time? Slack’s chief executive Stewart Butterfield knows what it is not. BBC News reports, “Office Time Is Not for Video Calls, Says Tech Boss.” Writer Zoe Kleinman tells us how the messaging-app company makes the most of in-person time:
“Ongoing renovations are gearing Slack workspace more towards that of a social club, [Butterfield] says, because he wants people to come to work to collaborate and build relationships face-to-face. ‘The best thing we can do is create a comfortable environment for people to come together and actually enjoy themselves,’ he says. He accepts that some people will choose to work full time in the office because they either cannot or do not want to work from home, and also thinks that young people starting their careers generally prefer to be in the office with their peers. ‘It’s hard to imagine starting your career fresh out of university, and not going to the office, and not being able to meet all these people in person,’ he says. ‘But I think the majority of knowledge workers, over time, will settle into some sort of pattern of regular intervals of getting together.'”
But make no mistake—Butterfield is not suggesting an abundance of meetings. In fact, he thinks 20-30 percent of meetings should have been an email. He likes Jeff Bezos’ technique for making the most of meetings that do occur by prefacing them with a written brief. Another practice he suggests is to share information asynchronously. (Through an app like Slack, perhaps?) Fewer meetings will almost certainly help entice workers on-site, where they can get to know each other as more than a grid of disembodied faces. We can see how that might enhance collaboration. But what if synergy means something like the Uber interactions? Yeah.
Cynthia Murrell, October 25, 2022
Open Source Is the Answer. Maybe Not?
October 24, 2022
In my last three lectures, I have amplified and explained what I call the open source frenzy and the concomitant blind spots. One senior law enforcement professional told me after a talk in September 2022, “We’re pushing forward with open source.” To be fair, that’s been the position of many government professionals with whom I have spoken this year. Open source delivers high-value software. Open source provides useful information with metatags. These data can be cross correlated to provide useful insight for investigators. Open source has even made it easier for those following Mr. Putin’s special action to get better information than those in war-fighting hot spots.
Open source is the answer.
If you want a reminder about the slippery parts of open source information, navigate to “Thousands of GitHub Repositories Deliver Fake PoC Exploits with Malware.” The write up reports:
According to the technical paper from the researchers at Leiden Institute of Advanced Computer Science, the possibility of getting infected with malware instead of obtaining a PoC could be as high as 10.3%, excluding proven fakes and prankware.
Not a big deal, right?
Wrong. These data, even if the percentage is adrift, point to a vulnerability caused by the open source cheerleaders.
The write up does a good job of providing examples, which will be incomprehensible to most people. However, the main point of the write up is that open source repositories for software can be swizzled. The software, libraries, executables, and other bits and bobs can put some additional functions in the objects. If that takes place, the vulnerability rides along until called upon to perform an unexpected and possibly difficult-to-identify action.
Cyber security is primarily reactive. Embedded malware can be proactive, particularly if it uses a previously unknown code flaw.
The interesting part of the write up is, in my opinion, this passage:
The researchers have reported all the malicious repositories they discovered to GitHub, but it will take some time until all of them are reviewed and removed, so many still remain available to the public. As Soufian [a Darktrace expert] explained, their study aims not just to serve as a one-time cleaning action on GitHub but to act as a trigger to develop an automated solution that could be used to flag malicious instructions in the uploaded code.
The idea of unknown or zero-day flaws is apparently not on the radar. What’s this mean in practical terms? A “good enough” set of actions to deal with known issues is not going to be good enough.
This seems to set the stage for a remedial action that does not address the workflows and verification for open source. More significantly, should the focus be on code only?
The answer is, “No.” Think about injecting Fibonacci sequences into certain quantum computer operations. Can injection of crafted numerical strings into automated content processing systems throw a wrench into the works?
The answer to this question is, “Yes.”
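A toy illustration of the point (my construction, not an example from the lectures): a small fraction of crafted values can sit just inside a naive three-sigma validation gate and still bend the aggregate that downstream processes consume.

```python
import statistics

# Hypothetical clean feed: values cycling between 97 and 103.
clean = [100.0 + (i % 7) - 3 for i in range(200)]
mu = statistics.mean(clean)
sigma = statistics.stdev(clean)

# Poison: ten records nudged just inside the detection boundary.
poison = [mu + 2.9 * sigma] * 10
feed = clean + poison

survivors = [x for x in feed if abs(x - mu) / sigma <= 3.0]
print(len(survivors) == len(feed))    # True: the filter removes nothing
print(statistics.mean(feed) - mu)     # yet the mean has quietly drifted
```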
Stephen E Arnold, October 24, 2022