FOGINT: Security Tools Over Promise & Under Deliver
November 22, 2024
While the United States and the rest of the world have been obsessed with the fallout of the former’s presidential election, bad actors planned terrorist plots. I24 News reports that after a soccer/football match in Amsterdam, there was a preplanned attack on Israeli fans: “Evidence From WhatsApp, Telegram Groups Shows Amsterdam Pogrom Was Organized.”
The Daily Telegraph located screenshots from WhatsApp and Telegram that displayed messages calling for a “Jew Hunt” after the game. The message writers were identified as Pro-Palestinian supporters. The bad actors also called Jews “cancer dogs,” a vile slur in Dutch, and told co-conspirators to bring fireworks to the planned attack. Dutch citizens and other observers were underwhelmed by the response of the Netherlands’ law enforcement. Even King Willem-Alexander noted that his country failed to protect the Jewish community when he spoke with Israeli President Isaac Herzog:
“Dutch king Willem-Alexander reportedly said to Israel’s President Isaac Herzog in a phone call on Friday morning that the ‘we failed the Jewish community of the Netherlands during World War II, and last night we failed again.’”
This is an unfortunate example of the failure of cyber security tools that monitor social media. If this was a preplanned attack and the Daily Telegraph located the messages, then a cyber security company should have as well. These police ware and intelware systems failed to alert authorities. Is this another confirmation that cyber security and threat intelligence tools over promise and under deliver? Well, T-Mobile is compromised again and there is that minor lapse in Israel in October 2023.
Whitney Grace, November 22, 2024
More Googley Human Resource Goodness
November 22, 2024
This essay is the work of a dumb dinobaby. No smart software required.
The New York Post reported that a Googler has departed. “Google News Executive Shailesh Prakash Resigns As Tensions with Publishers Mount: Report” states:
Shailesh Prakash had served as a vice president and general manager for Google News. A source confirmed that he is no longer with the company… The circumstances behind Prakash’s resignation were not immediately clear. Google declined to comment.
Google tapped a professional who allegedly rode in the Bezos bulldozer when the world’s second or third richest man acquired the Washington Post. (How has that been going? Yeah.)
Thanks, MidJourney. Good enough.
Google has been cheerfully indexing content and selling advertising for decades. After a number of years of talking and allegedly providing some support to outfits collecting, massaging, and making “real” news available, the Google is facing some headwinds.
The article reports:
The Big Tech giant rankled online publishers last May after it introduced a feature called “AI Overviews” – which places an auto-generated summary at the top of its search results while burying links to other sites. News Media Alliance, a nonprofit that represents more than 2,200 publishers, including The Post, said the feature would be “catastrophic to our traffic” and has called on the feds to intervene.
News flash from rural Kentucky: The good old days of newspaper publishing are unlikely to make a comeback. What’s the evidence for this statement? Video and outfits like Telegram and WhatsApp deliver content to cohorts who don’t think too much about a print anything.
The article pointed out:
Last month, The Post exclusively reported on emails that revealed how Google leveraged its access to the Office of the US Trade Representative as it sought to undermine overseas regulations — including Canada’s Online News Act, which required Google to pay for the right to display news content.
You can read that report “Google Emails with US Trade Reps Reveal Cozy Ties As Tech Giant Pushed to Hijack Policy” if you have time.
Let’s think about why a member of Google leadership like Shailesh Prakash would bail out. Among the options are:
- He wanted to spend more time with his family
- Another outfit wanted to hire him to manage something in the world of publishing
- He failed in making publishers happy.
The larger question is, “Why would Google think that one fellow could make a multi-decade problem go away?” The fact that I can ask this question reveals how Google’s consulting infused leaders think about an entire business sector. It also provides some insight into the confidence of a professional like Mr. Prakash.
What flees sinking ships? Certainly not the lawyers that Google will throw at this “problem.” Google has money and that may be enough to buy time and perhaps prevail. If there aren’t any publishers grousing, the problem gets resolved. Efficient.
Stephen E Arnold, November 22, 2024
Point-and-Click Coding: An eGame Boom Booster
November 22, 2024
TheNextWeb explains “How AI Can Help You Make a Computer Game Without Knowing Anything About Coding.” That’s great—unless one is a coder who makes one’s living on computer games. Writer Daniel Zhou Hao begins with a story about one promising young fellow:
“Take Kyo, an eight-year-old boy in Singapore who developed a simple platform game in just two hours, attracting over 500,000 players. Using nothing but simple instructions in English, Kyo brought his vision to life leveraging the coding app Cursor and also Claude, a general purpose AI. Although his dad is a coder, Kyo didn’t get any help from him to design the game and has no formal coding education himself. He went on to build another game, an animation app, a drawing app and a chatbot, taking about two hours for each. This shows how AI is dramatically lowering the barrier to software development, bridging the gap between creativity and technical skill. Among the range of apps and platforms dedicated to this purpose, others include Google’s AlphaCode 2 and Replit’s Ghostwriter.”
The write-up does not completely leave experienced coders out of the discussion. Hao notes tools like Tabnine and GitHub Copilot act as auto-complete assistance, while Sourcery and DeepCode take the tedium out of code cleanup. For the 70-ish percent of companies that have adopted one or more of these tools, he tells us, the benefits include time savings and more reliable code. Does this mean developers will shift to “higher value tasks,” like creative collaboration and system design, as Hao insists? Or will it just mean firms will lighten their payrolls?
As for building one’s own game, the article lists seven steps. They are akin to basic advice for developing a product, but with an AI-specific twist. For those who want to know how to make one’s AI game addictive, contact benkent2020 at yahoo dot com.
Cynthia Murrell, November 22, 2024
China Smart, US Dumb: LLMs Bad, MoEs Good
November 21, 2024
Okay, an “MoE” is an alternative to LLMs. An “MoE” is a mixture of experts. An LLM is a one-trick pony starting to wheeze.
Google, Apple, Amazon, GitHub, OpenAI, Facebook, and other organizations are at the top of the list when people think about AI innovations. We forget about other countries and universities experimenting with the technology. Tencent is a China-based technology conglomerate located in Shenzhen, and it’s the world’s largest video game company when equity investments are considered. Tencent is also the developer of Hunyuan-Large, the world’s largest MoE.
According to Tencent, LLMs (large language models) are things of the past. LLMs served their purpose to advance AI technology, but Tencent realized that it was necessary to optimize resource consumption while simultaneously maintaining high performance. That’s when the company turned to what it sees as the next evolution of LLMs: MoE, or mixture of experts, models.
Cornell University’s open-access science archive posted this paper on the MoE: “Hunyuan-Large: An Open-Source MoE Model With 52 Billion Activated Parameters By Tencent” and the abstract explains it is a doozy of a model:
“In this paper, we introduce Hunyuan-Large, which is currently the largest open-source Transformer-based mixture of experts model, with a total of 389 billion parameters and 52 billion activation parameters, capable of handling up to 256K tokens. We conduct a thorough evaluation of Hunyuan-Large’s superior performance across various benchmarks including language understanding and generation, logical reasoning, mathematical problem-solving, coding, long-context, and aggregated tasks, where it outperforms LLama3.1-70B and exhibits comparable performance when compared to the significantly larger LLama3.1-405B model. Key practice of Hunyuan-Large include large-scale synthetic data that is orders larger than in previous literature, a mixed expert routing strategy, a key-value cache compression technique, and an expert-specific learning rate strategy. Additionally, we also investigate the scaling laws and learning rate schedule of mixture of experts models, providing valuable insights and guidance for future model development and optimization. The code and checkpoints of Hunyuan-Large are released to facilitate future innovations and applications.”
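To make the “activated parameters” idea concrete, here is a minimal sketch of top-k expert routing in Python. It is not Tencent’s implementation: the gate scores, the four toy experts, and the names `route_top_k` and `moe_forward` are all invented for illustration. The only figures taken from the abstract are the 389 billion total and 52 billion activated parameters, which work out to roughly 13 percent of the model doing the work for any given token.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))

# Hypothetical experts: in a real model each would be a neural sub-network.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
gate_scores = [0.1, 2.0, -1.0, 2.0]  # made-up router output for one token

chosen = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)

# Share of parameters active per token, using the abstract's numbers.
active_fraction = 52 / 389
```

Only two of the four toy experts run for the input; the other two sit idle, which is the resource saving the MoE approach is after.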
Tencent has released Hunyuan-Large as an open source project, so other AI developers can use the technology! The well-known companies will definitely be experimenting with Hunyuan-Large. Is there an ulterior motive? Sure. Money, prestige, and power are at stake in the AI global game.
Whitney Grace, November 21, 2024
Management Brilliance: Microsoft Suggests to Customers, “You Did It!”
November 21, 2024
No smart software. Just a dumb dinobaby. Oh, the art? Yeah, MidJourney.
I read an amusing write up called “Microsoft Says Unexpected Windows Server 2025 Automatic Upgrades Were Due to Faulty Third-Party Tools.” I love a management action which points the finger at “you”: partners, customers, and anyone other than the raucous Redmond-ians.
Good enough, MidJourney. Good enough.
The write up says that Microsoft says:
“Some devices upgraded automatically to Windows Server 2025 (KB5044284). This was observed in environments that use third-party products to manage the update of clients and servers,” Microsoft explained. “Please verify whether third-party update software in your environment is configured not to deploy feature updates. This scenario has been mitigated.”
The article then provides a translation of Microsoftese:
In other words, it’s not Microsoft – it’s you. The company also added the update had the “DeploymentAction=OptionalInstallation” tag, which patch management tools should read as being an optional, rather than recommended update.
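Here is a rough sketch of the check the article implies a patch management tool should make. Only the `DeploymentAction=OptionalInstallation` value and the KB5044284 identifier come from the write up. The dictionary layout, the `is_feature_update` field, the placeholder `KB0000001` entry with its `Installation` value, and the `should_auto_deploy` function are hypothetical and do not describe any real product or Microsoft API.

```python
def should_auto_deploy(update, allow_feature_updates=False):
    """Decide whether a tool should push an update without asking.

    `update` is a hypothetical dict of metadata fields; real tools
    read this information from their own catalog formats.
    """
    # Optional updates should never be installed automatically.
    if update.get("DeploymentAction", "") == "OptionalInstallation":
        return False
    # Feature updates (for example, a new OS version) need an explicit opt-in.
    if update.get("is_feature_update", False) and not allow_feature_updates:
        return False
    return True

updates = [
    # The update named in the write up, tagged as optional.
    {"kb": "KB5044284", "DeploymentAction": "OptionalInstallation",
     "is_feature_update": True},
    # Placeholder entry standing in for a routine security patch.
    {"kb": "KB0000001", "DeploymentAction": "Installation",
     "is_feature_update": False},
]

approved = [u["kb"] for u in updates if should_auto_deploy(u)]
```

In this sketch the optional feature update is filtered out twice over, once by its tag and once by the feature-update opt-in, so a tool that ignores the tag can still be stopped by its own configuration.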
Several observations:
- Pointing fingers works in some circumstances. Kindergarten type interactions feature the tactic.
- The problems of updates seem to be standard operating procedure.
- Bad actors love these types of reports because anecdotes about glitches and flaws say, “Come on in, folks.”
Is this a management strategy or an indicator of other issues?
Stephen E Arnold, November 21, 2024
Does Smart Software Forget?
November 21, 2024
A recent paper challenges the big dogs of AI, asking, “Does Your LLM Truly Unlearn? An Embarrassingly Simple Approach to Recover Unlearned Knowledge.” The study was performed by a team of researchers from Penn State, Harvard, and Amazon and published on research platform arXiv. True or false, it is a nifty poke in the eye for the likes of OpenAI, Google, Meta, and Microsoft, who may have overlooked the obvious. The abstract explains:
“Large language models (LLMs) have shown remarkable proficiency in generating text, benefiting from extensive training on vast textual corpora. However, LLMs may also acquire unwanted behaviors from the diverse and sensitive nature of their training data, which can include copyrighted and private content. Machine unlearning has been introduced as a viable solution to remove the influence of such problematic content without the need for costly and time-consuming retraining. This process aims to erase specific knowledge from LLMs while preserving as much model utility as possible.”
But AI firms may be fooling themselves about this method. We learn:
“Despite the effectiveness of current unlearning methods, little attention has been given to whether existing unlearning methods for LLMs truly achieve forgetting or merely hide the knowledge, which current unlearning benchmarks fail to detect. This paper reveals that applying quantization to models that have undergone unlearning can restore the ‘forgotten’ information.”
Oops. The team found as much as 83% of data thought forgotten was still there, lurking in the shadows. The paper offers an explanation for the problem and suggestions to mitigate it. The abstract concludes:
“Altogether, our study underscores a major failure in existing unlearning methods for LLMs, strongly advocating for more comprehensive and robust strategies to ensure authentic unlearning without compromising model utility.”
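One way to picture how quantization could bring “forgotten” material back is the toy sketch below. It is not the paper’s code. It assumes, purely for illustration, that unlearning nudges a model’s weights by small amounts; coarse rounding then maps the original and the “unlearned” weights to identical values. The weights, the step size, and the `quantize` helper are invented.

```python
def quantize(weights, step):
    """Uniform quantization: round each weight to the nearest multiple of step."""
    return [round(w / step) * step for w in weights]

# Made-up weights before and after a hypothetical unlearning pass.
original = [0.52, -1.31, 0.08, 2.47]
unlearned = [0.53, -1.30, 0.07, 2.46]  # small nudges meant to "forget"

step = 0.25  # a deliberately coarse grid, standing in for low-bit precision
q_original = quantize(original, step)
q_unlearned = quantize(unlearned, step)

# At full precision the two models differ; after quantization they do not.
differs_before = original != unlearned
identical_after = q_original == q_unlearned
```

If the nudges are smaller than the quantization grid, the quantized “unlearned” model is the original model again, which is the kind of failure the researchers warn about.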
See the paper for all the technical details. Will the big tech firms take the researchers’ advice and improve their products? Or will they continue letting their investors and marketing departments lead them by the nose?
Cynthia Murrell, November 21, 2024
Short Snort: How to Find Undocumented APIs
November 20, 2024
This essay is the work of a dumb dinobaby. No smart software required.
The essay / how to “All the Data Can Be Yours” does a very good job of providing a hacker road map. The information in the write up includes:
- Tips for finding undocumented APIs in GitHub
- Spotting “fetch” requests
- WordPress default APIs
- Information in robots.txt files
- Using the Google
- Examining JavaScripts
- Poking into mobile apps
- Some helpful resources and tools.
Each of these items includes details; for example, specific search strings and “how to make a taco” type of instructions. Assembling this write up took quite a bit of work.
Those engaged in cyber security (white, gray, and black hat types) will find the write up quite interesting.
I want to point out that I am not criticizing the information per se. I do want to remind those with a desire to share their expertise of three behaviors:
- Some computer science and programming classes in interesting countries use this type of information to provide students with what I would call hands on instruction
- Some governments, not necessarily aligned with US interests, provide the tips to the employees and contractors of certain government agencies to test and then extend the functionalities of the techniques presented in the write up
- Certain information might be more effectively distributed in other communication channels.
Stephen E Arnold, November 20, 2024
Europe Wants Its Own Search System: Filtering, Trees, and More
November 20, 2024
This essay is the work of a dumb dinobaby. No smart software required.
I am not going to recount the history of search companies and government entities building an alternative to Google. One can toss in Bing, but Google is the Big Dog. Yandex is useful for Russian content. But there is a void even though Swisscows.com is providing anonymity (allegedly) and no tracking (allegedly).
Now a new European solution may become available. If you remember Pertimm, you probably know that Qwant absorbed some of that earlier search system’s goodness. And there is Ecosia, a search system which plants trees. The union of these two systems will be an alternative to Google. I think Exalead.com tried this before, but who remembers European search history in rural Kentucky?
“Two Upstart Search Engines Are Teaming Up to Take on Google” reports:
The for-profit joint venture, dubbed European Search Perspective and located in Paris, could allow the small companies and any others that decide to join up to reduce their reliance on Google and Bing and serve results that are better tailored to their companies’ missions and Europeans’ tastes.
A possible name or temporary handle for the new search system is EUSP or European Search Perspective. What’s interesting is that the plumbing will be provided by a service provider named OVH. Four years ago, OVHcloud became a strategic partner of … wait for it … Google. Apparently that deal does not prohibit OVH from providing services to a European alternative to Google.
Also, you may recall that Eric Schmidt, former adult in the room at Google, suggested that Qwant kept him awake at night. Yes, Qwant has been a threat to Google for 13 years. How has that worked out? The original Qwant was interesting with a novel way of showing results from different types of sources. Now Qwant is actually okay. The problem with any search system, including Bing, is that maintaining an index containing new content and refreshing or updating previously indexed content is a big, expensive job. Toss in some AI goodness and the cash burns furiously.
“Google” is now the word for search whether it works or does not. Perhaps regulatory actions will alter the fact that in Denmark, 99 percent of user queries flow to Google. Yep, Denmark. But one can’t go wrong with a ballpark figure: about 95 percent of search queries outside of China and a handful of other countries go to Google.
How will the new team tackle the Google? I hope in a way that delivers more progress than Cogito. Remember that? Okay, no problem.
PS. Is a 13-year-old company an upstart? Sigh.
Stephen E Arnold, November 20, 2024
FOGINT: Kenya Throttles Telegram to Protect KCSE Exam Integrity
November 20, 2024
Secondary school students in Kenya need to do well on their all-encompassing final exam if they hope to go to college. Several Telegram services have emerged to assist students through this crucial juncture—by helping them cheat on the test. Authorities caught on to the practice and have restricted Telegram usage during this year’s November exams. As a result, reports Kenyans.co.ke, “NetBlocks Confirms Rising User Frustrations with Telegram Slowdown in Kenya.” Since Telegram is Kenya’s fifth most downloaded social-media platform, that is a lot of unhappy users. Writer Rene Otinga tells us:
“According to an internet observatory, NetBlocks, Telegram was restricted in Kenya with their data showing the app as being down across various internet providers. Users across the country have reported receiving several error messages while trying to interact with the app, including a ‘Connecting’ error when trying to access the Telegram desktop. However, a letter shared online from the Communications Authority of Kenya (CAK) also confirmed the temporary suspension of Telegram services to quell the perpetuation of criminal activities.”
Apparently, the restriction worked. We learn:
“On Friday, Education Principal Secretary Belio Kipsang said only 11 incidents of attempted sneaking of mobile phones were reported across the country. While monitoring examinations in Kiambu County, the PS said this was the fewest number of cheating cases the ministry had experienced in recent times.”
That is good news for honest students in Kenya. But for Telegram, this may be just the beginning of its regulatory challenges. Otinga notes:
“Governments are wary of the app, which they suspect is being used to spread disinformation, spread extremism, and in Kenya, promote examination cheating. European countries are particularly critical of the app, with the likes of Belarus, Russia, Ukraine, Germany, Norway, and Spain restricting or banning the messaging app altogether.”
Encryption can hide a multitude of sins. But when regulators are paying attention, it might not be enough to keep one out of hot water.
Cynthia Murrell, November 20, 2024
Entity Extraction: Not As Simple As Some Vendors Say
November 19, 2024
No smart software. Just a dumb dinobaby. Oh, the art? Yeah, MidJourney.
Most of the systems incorporating entity extraction have been trained to recognize the names of simple entities and mostly based on the use of capitalization. An “entity” can be a person’s name, the name of an organization, or a location like Niagara Falls, near Buffalo, New York. The river “Niagara” when bound to “Falls” means a geologic feature. The “Buffalo” is not a Bubalina; it is a delightful city with even more pleasing weather.
The same entity extraction process has to work for specialized software used by law enforcement, intelligence agencies, and legal professionals. Compared to entity extraction for consumer-facing applications like Google’s Web search or Apple Maps, the specialized software vendors have to contend with:
- Gang slang in English and other languages; for example, “bumble bee.” This is not an insect; it is a nickname for the Latin Kings.
- Organizations operating in Lao PDR whose names are converted to English words like Zhao Wei’s Kings Romans Casino. Mr. Wei has allegedly been involved in gambling activities in a poorly regulated region in the Golden Triangle.
- Individuals who use aliases like maestrolive, james44123, or ahmed2004. There are either “real” people behind the handles or they are sock puppets (fake identities).
Why do these variations create a challenge? In order to locate a business, the content processing system has to identify the entity the user seeks. For an investigator, chopping through a thicket of language and idiosyncratic personas is the difference between making progress and hitting a dead end. Automated entity extraction systems can work using smart software, a carefully crafted and constantly updated controlled vocabulary list, or a hybrid system.
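The hybrid approach can be sketched in a few lines of Python. This is an illustration, not any vendor’s method: the `CONTROLLED_VOCAB` entries are drawn from the examples above, the type labels and the `extract_entities` function are assumptions, and the capitalization rule is deliberately naive.

```python
import re

# Hypothetical controlled vocabulary: lowercase surface form -> (canonical name, type).
CONTROLLED_VOCAB = {
    "bumble bee": ("Latin Kings", "gang nickname"),
    "kings romans casino": ("Kings Romans Casino", "organization"),
    "maestrolive": ("maestrolive", "alias"),
    "james44123": ("james44123", "alias"),
    "ahmed2004": ("ahmed2004", "alias"),
}

# Naive rule: one or more consecutive capitalized words.
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*")

def extract_entities(text):
    """Combine a vocabulary lookup with a capitalization heuristic."""
    found = []
    lowered = text.lower()
    # Pass 1: controlled vocabulary catches slang and lowercase handles.
    for term, (canonical, kind) in CONTROLLED_VOCAB.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.append((canonical, kind))
    known = {canonical for canonical, _ in found}
    # Pass 2: capitalization catches conventional proper names.
    for match in CAPITALIZED_RUN.finditer(text):
        candidate = match.group(0)
        if candidate not in known:
            found.append((candidate, "capitalized phrase"))
    return found

sample = "maestrolive met the bumble bee crew near Niagara Falls."
entities = extract_entities(sample)
```

A capitalization-only system would return just “Niagara Falls” for the sample sentence. The vocabulary pass is what surfaces the gang nickname and the alias, and it is also the part that needs constant updating.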
Let’s take an example which confronts a person looking for information about the Ku Group. This is a financial services firm responsible for the Kucoin. The Ku Group is interesting because it has been found guilty in the US for certain financial activities in the State of New York and by the US Securities & Exchange Commission.