Is the AskJeeves Approach the Next Big Thing Again?

March 14, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Way back when I worked in Silicon Valley, or Plastic Fantastic as one 1980s wag put it, AskJeeves burst upon the Web search scene. The idea was that a user would ask a question and the helpful digital butler would fetch the answer. This worked for questions like “What’s the temperature in San Mateo?” The system did not work for the types of queries with which my little group of full-time equivalents peppered assorted online services.

image

A young wizard confronts a knowledge problem. Thanks, MSFT Copilot. How’s that security today? Okay, I understand. Good enough.

The mechanism involved software and people. The software processed the query and matched up the answer with the outputs in a knowledge base. The humans wrote rules. If there was no rule and no knowledge, the butler fell on his nose. It was the digital equivalent of nifty marketing applied to a system about as aware as the man servant in Kazuo Ishiguro’s The Remains of the Day.

I thought about AskJeeves as a tangent notion as I worked through “LLMs Are Not Enough… Why Chatbots Need Knowledge Representation.” The essay is an exploration of options intended to reduce the computational cost, power sucking, and blind spots in large language models. Progress is being made and will be made. A good example is this passage from the essay which sparked my thinking about representing knowledge. This is a direct quote:

In theory, there’s a much better way to answer these kinds of questions.

  1. Use an LLM to extract knowledge about any topics we think a user might be interested in (food, geography, etc.) and store it in a database, knowledge graph, or some other kind of knowledge representation. This is still slow and expensive, but it only needs to be done once rather than every time someone wants to ask a question.
  2. When someone asks a question, convert it into a database SQL query (or in the case of a knowledge graph, a graph query). This doesn’t necessarily need a big expensive LLM, a smaller one should do fine.
  3. Run the user’s query against the database to get the results. There are already efficient algorithms for this, no LLM required.
  4. Optionally, have an LLM present the results to the user in a nice understandable way.
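The four steps in the quoted passage can be sketched in a few lines of Python. This is a minimal illustration and not the essay author's implementation: the facts are hard-coded placeholders standing in for what an LLM might extract in step 1, the keyword table `RULES` is a hypothetical stand-in for the "smaller" model in step 2, and every function name is invented for this example. Only the standard-library `sqlite3` module is used.

```python
import sqlite3

# Step 1 (done once): facts an LLM might have extracted.
# These values are made-up placeholders, not real data.
EXTRACTED_FACTS = [
    ("San Mateo", "temperature_f", "62"),
    ("San Mateo", "state", "California"),
    ("Houston", "state", "Texas"),
]

# Hypothetical keyword-to-attribute rules standing in for a small model.
RULES = {
    "temperature": "temperature_f",
    "state": "state",
}

NO_ANSWER = "Sorry, no rule and no knowledge."


def build_knowledge_base(facts):
    """Store the extracted facts in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (entity TEXT, attribute TEXT, value TEXT)")
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", facts)
    return conn


def question_to_query(question, known_entities):
    """Step 2: turn a question into a parameterized SQL query, or None."""
    text = question.lower()
    attribute = next((attr for key, attr in RULES.items() if key in text), None)
    entity = next((e for e in known_entities if e.lower() in text), None)
    if attribute is None or entity is None:
        return None
    sql = "SELECT value FROM facts WHERE entity = ? AND attribute = ?"
    return sql, (entity, attribute)


def answer(conn, question):
    """Steps 3 and 4: run the query and present the result as text."""
    entities = [row[0] for row in conn.execute("SELECT DISTINCT entity FROM facts")]
    query = question_to_query(question, entities)
    if query is None:
        return NO_ANSWER
    sql, params = query
    row = conn.execute(sql, params).fetchone()
    if row is None:
        return NO_ANSWER
    return f"{params[1]} for {params[0]}: {row[0]}"


if __name__ == "__main__":
    kb = build_knowledge_base(EXTRACTED_FACTS)
    print(answer(kb, "What's the temperature in San Mateo?"))
    print(answer(kb, "What is knowledge?"))
```

The second question has no matching rule and no matching entity, so the sketch falls back to the canned apology. That is the same failure mode the AskJeeves butler had when a query arrived without a rule or an entry in the knowledge base.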

Like AskJeeves, the idea is a good one. Create a system to take a user’s query and match it to information answering the question. The AskJeeves approach embodied what I called rules. As long as one has the rules, the answers can be plugged in to a database. A query arrives, looks for the answer, and presents it. Bingo. Happy user with relevant information catapults AskJeeves to the top of a world filled with less elegant solutions.

The question becomes, “Can one represent knowledge in such a way that the information is current, usable, and ‘accurate’ (assuming one can define accurate)?” Knowledge, however, is a slippery fish. Small, well-defined domains chock full of specialist knowledge should be easier to represent. Well, IBM Watson and its adventure in Houston suggest that the method is okay, but it did not work. Larger scale systems like an old-fashioned Web search engine just used “signals” to produce lists which presumably presented answers. “Clever,” correct? (Sorry, that’s an IBM Almaden bit of humor. I apologize for the inside baseball moment.)

What’s interesting is that youthful laborers in the world of information retrieval are finding themselves arm wrestling with some tough but elusive problems. What is knowledge? The answer, “It depends,” does not provide much help. Where does knowledge originate? The answer, “No one knows for sure,” does not advance the ball downfield either.

Grabbing epistemology by the shoulders and shaking it until an answer comes forth is a tough job. What’s interesting is that those working with large language models are finding themselves caught in a room of mirrors intact and broken. Here’s what TheTemples.org has to say about this imaginary idea:

The myth represented in this Hall tells of the divinity that enters the world of forms fragmenting itself, like a mirror, into countless pieces. Each piece keeps its peculiarity of reflecting the absolute, although it cannot contain the whole any longer.

I have no doubt that a start up with venture funding will solve this problem even though a set cannot contain itself. Get coding now.

Stephen E Arnold, March 14, 2024

AI Limits: The Wind Cannot Hear the Shouting. Sorry.

March 14, 2024

This essay is the work of a dumb dinobaby. No smart software required.

One of my teachers had a quote on the classroom wall. It was, I think, from a British novelist. Here’s what I recall:

Decide on what you think is right and stick to it.

I never understood the statement. In school, I was there to learn. How could I decide whether what I was reading was correct? Making a decision about what I thought was stupid because I was uninformed. The notion of “stick” is interesting and also a little crazy. My family was going to move to Brazil, and I knew that sticking to what I did in the Midwest in the 1950s would have to change. For one thing, we had electricity. The town to which we were relocating had electricity a few hours each day. Change was necessary. Even as a young sprout, I knew that trying to prevent something required more than talk, writing a Letter to the Editor, or getting a petition signed.

I thought about this crazy quote as soon as I read “AI Bioweapons? Scientists Agree to Policies to Reduce Risk of Human Disaster.” The fear mongering note of the write up’s title intrigued me. Artificial intelligence is in what I would call morph mode. What this means is that getting a fix on what is new and impactful in the field of artificial intelligence is difficult. An electrical engineering publication reported that experts are not sure if what is going on is good or bad.

image

Shouting into the wind does not work for farmers nor AI scientists. Thanks, MSFT Copilot. Busy with security again?

The “AI Bioweapons” essay is leaning into the bad side of the AI parade. The point of the write up is that “over 100 scientists” want to “prevent the creation of AI bioweapons.” The article states:

The agreement, crafted following a 2023 University of Washington summit and published on Friday, doesn’t ban or condemn AI use. Rather, it argues that researchers shouldn’t develop dangerous bioweapons using AI. Such an ask might seem like common sense, but the agreement details guiding principles that could help prevent an accidental DNA disaster.

That sounds good, but is it like the quote about “decide on what you think is right and stick to it”? In a dynamic environment, change appears to accelerate. Toss in technology and the potential for big wins (either financial, professional, or political), and the likelihood of slowing down the rate of change is reduced.

To add some zip to the AI stew, much of the technology required to do some AI fiddling around is available as open source software or low-cost applications and APIs.

I think it is interesting that 100 scientists want to prevent something. The hitch in the git-along is that other countries have scientists who have access to AI research, tools, software, and systems. The 100 scientists may feel as though they are reminding people that doom is (maybe?) just around the corner or in a ruined building in an abandoned town on Route 66.

Here are a few observations about why individuals rally around a cause, which is widely perceived by some of those in the money game as the next big thing:

  1. The shouters’ perception of their importance makes it an imperative to speak out about danger.
  2. Getting a group of important, smart people to climb on a bandwagon makes the organizers perceive themselves as doing something important and demonstrating their “get it done” mindset.
  3. Publicity is good. It is very good when a speaking engagement, a grant, or consulting gig produces a little extra fame and money, preferably in a combo.

Net net: The wind does not listen to those shouting into it.

Stephen E Arnold, March 14, 2024

AI Deepfakes: Buckle Up. We Are in for a Wild Drifting Event

March 14, 2024

This essay is the work of a dumb dinobaby. No smart software required.

AI deepfakes are testing the uncanny valley but technology is catching up to make them as good as the real thing. In case you’ve been living under a rock, deepfakes are images, video, and sound clips generated by AI algorithms to mimic real people and places. For example, someone could create a deepfake video of Joe Biden and Donald Trump in a sumo wrestling match. While the idea of the two presidential candidates duking it out on a sumo mat is absurd, technology is that advanced.

Gizmodo reports the frustrating news that “The AI Deepfakes Problem Is Going To Get Unstoppably Worse”. Bad actors are already using deepfakes to wreak havoc on the world. Federal regulators outlawed robocalls and OpenAI and Google released watermarks on AI-generated images. These aren’t doing anything to curb bad actors.

image

Which is real? Which is fake? Thanks, MSFT Copilot, the objects almost appear identical. Close enough like some security features. Close enough means good enough, right?

New laws need to be adopted and new technology developed to prevent this new age of misinformation. There should be an endless number of warnings on deepfake videos and soundbites, and service providers should employ them too. It is going to take a horrifying event to make AI deepfake detection more prevalent:

“Deepfake detection technology also needs to get a lot better and become much more widespread. Currently, deepfake detection is not 100% accurate for anything, according to Copyleaks CEO Alon Yamin. His company has one of the better tools for detecting AI-generated text, but detecting AI speech and video is another challenge altogether. Deepfake detection is lagging generative AI, and it needs to ramp up, fast.”

Wired Magazine missed an opportunity to make clear that the wizards at Google can sell data and advertising, but the sneaker-wearing marvels cannot manage deepfake adult pictures. Heck, Google cannot manage YouTube videos teaching people how to create deepfakes. My goodness, what happens if one uploads ASCII art of a problematic item to Gemini? One of my team tells me that the Sundar & Prabhakar guard rails don’t work too well in some situations.

Not every deepfake will be as clumsy as the one in which the “to be maybe” future queen of England finds herself ensnared. One can ask Taylor Swift, I assume.

Whitney Grace, March 14, 2024

Can Your Job Be Orchestrated? Yes? Okay, It Will Be Smartified

March 13, 2024

This essay is the work of a dumb dinobaby. No smart software required.

My work career over the last 60 years has been filled with luck. I have been in the right place at the right time. I have been in companies which have been acquired, I have been reassigned, and I have been exposed to opportunities which just seemed to appear. Unlike today’s young college graduate, I never thought once about being able to get a “job.” I just bumbled along. In an interview for something called Singularity, the interviewer asked me, “What’s been the key to your success?” I answered, “Luck.” (Please, keep in mind that the interviewer assumed I was a success, but he had no idea that I did not want to be a success. I just wanted to do interesting work.)

image

Thanks, MSFT Copilot. Will smart software do your server security? Ho ho ho.

Would I be able to get a job today if I were 20 years old? Believe it or not, I told my son in one of our conversations about smart software: “Probably not.” I thought about this comment when I read today (March 13, 2024) the essay “Devin AI Can Write Complete Source Code.” The main idea of the article is that artificial intelligence, properly trained and appropriately resourced, can do what only humans could do in 1966 (when I graduated with a BA degree from a so-so university in flyover country). The write up states:

Devin is a Generative AI Coding Assistant developed by Cognition that can write and deploy codes of up to hundreds of lines with just a single prompt.  Although there are some similar tools for the same purpose such as Microsoft’s Copilot, Devin is quite the advancement as it not only generates the source code for software or website but it debugs the end-to-end before the final execution.

Let’s assume the write up is mostly accurate. It does not matter. Smart software will be shaped to deliver what I call orchestrated solutions either today, tomorrow or next month. Jobs already nuked by smartification are customer service reps, boilerplate writing jobs (hello, McKinsey), and translation. Some footloose and fancy free gig workers without AI skills may face dilemmas about whether to pursue begging, YouTubing the van life, or doing some spelunking in the Chemical Abstracts database for molecular recipes in a Walmart restroom.

The trajectory of applied AI is reasonably clear to me. Once “programming” gets swept into the Prada bag of AI, what other professions will be smartified? Once again, the likely path is lit by dim but visible Alibaba solar lights for the garden:

  1. Legal tasks which are repetitive. Even though the cases are different, the work flow is something an average law school graduate can master and learn to loathe.
  2. Forensic accounting. Accountants are essentially Ground Hog Day people, because every tax cycle is the same old same old.
  3. Routine one-day surgeries. Sorry, dermatologists, cataract shops, and kidney stone crunchers. Robots will do the job and not screw up the DRG codes too much.
  4. Marketers. I know marketing requires creative thinking. Okay, but based on the Super Bowl ads this year, I think some clients will be willing to give smart software a whirl. Too bad about filming a horse galloping along the beach in Half Moon Bay though. Oh, well.

That’s enough of the professionals who will be affected by orchestrated work flows surfing on smartified software.

Why am I bothering to write down what seems painfully obvious to my research team?

I just wanted another reason to say, “I am glad I am old.” What many young college graduates will discover is that, despite my “luck” over the course of my work career, smartified software will not only kill some types of work but also remove the surprise in a serendipitous life journey.

To reiterate my point: I am glad I am old and understand efficiency, smartification, and the value of having been lucky.

Stephen E Arnold, March 13, 2024

AI Bubble Gum Cards

March 13, 2024

This essay is the work of a dumb dinobaby. No smart software required.

A publication for electrical engineers has created a new mechanism for making AI into a collectible. Navigate to “The AI apocalypse: A Scorecard.” Scroll down to the part of the post which looks like the gems from the 1950s:

image

The idea is to pick 22 experts and gather their big ideas about AI’s potential to destroy humanity. Here’s one example of an IEEE bubble gum card:

image

© by the estimable IEEE.

The information on the cards is eclectic. It is clear that some people think smart software will kill me and you. Others are not worried.

My thought is that IEEE should expand upon this concept; for example, here are some bubble gum card ideas:

  • Do the NFT play? These might be easier to sell than IEEE memberships and subscriptions to the magazine
  • Offer actual, fungible packs of trading cards with throw-back bubble gum
  • Create an AI movie about AI experts with opposing ideas doing battle in a video game type world. Zap. You lose, you doubter.

But the old-fashioned approach to selling trading cards to grade school kids won’t work. First, there are very few corner stores near schools in many cities. Second, a special interest group will agitate to block the sale of cards about AI because the inclusion of chewing gum will damage children’s teeth. And, third, kids today want TikToks, at least until the service is banned by a fast-acting group of elected officials.

I think the IEEE will go in a different direction; for example, micro USBs with AI images and source code on them. Or, the IEEE should just advance to the 21st century and start producing short-form AI videos.

The IEEE does have an opportunity. AI collectibles.

Stephen E Arnold, March 13, 2024

Want Clicks: Do Sad, Really, Really Sorrowful

March 13, 2024

This essay is the work of a dumb dinobaby. No smart software required.

The US is a hotbed of negative news. It’s what drives the media and perpetuates the culture of fear that (arguably) has plagued the country since colonial times. US citizens and now the rest of the world are so addicted to bad news that a research team got the brilliant idea to study what words people click. Nieman Lab wrote about the study in, “Negative Words In News Headlines Generate More Clicks-But Sad Words Are More Effective Than Angry Or Scary Ones.”

image

Thanks, MSFT Copilot. One of Redmond’s security professionals I surmise?

Negative words are prevalent in headlines because they sell clicks. The journal Nature Human Behaviour published a study called “Negativity Drives Online News Consumption.” The study analyzed the effect of negative and emotional words on news consumption, and the research team discovered that negativity increased clickability. These findings also confirm the well-documented behavior of humans seeking negativity in all information-seeking.

It coincides with humanity’s instinct to be vigilant of any danger and avoid it. While humans instinctually gravitate towards negative headlines, certain negative words are more popular than others. Humans apparently are driven to click on sad-related synonyms, avoid anything resembling joy or fear, and angry words don’t have any effect. It all goes back to survival:

“And if we are to believe “Bad is stronger than good” derives from evolutionary psychology — that it arose as a useful heuristic to detect threats in our environment — why would fear-related words reduce likelihood to click? (The authors hypothesize that fear and anger might be more important in generating sharing behavior — which is public-facing — than clicks, which are private.)

In any event, this study puts some hard numbers to what, in most newsrooms, has been more of an editorial hunch: Readers are more drawn to negativity than to positivity. But thankfully, the effect size is small — and I’d wager that it’d be even smaller for any outlet that decided to lean too far in one direction or the other.”

It could also be the result of a strict diet of danger-filled media.

Whitney Grace, March 13, 2024

AI to AI Program for March 12, 2024, Now Available

March 12, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Erik Arnold, with some assistance from Stephen E Arnold (the father), has produced another installment of “AI to AI: Smart Software for Government Use Cases.” The program presents news and analysis about the use of artificial intelligence (smart software) in government agencies.

image

The ad-free program features Erik S. Arnold, Managing Director of Govwizely, a Washington, DC consulting and engineering services firm. Arnold has extensive experience working on technology projects for the US Congress, the Capitol Police, the Department of Commerce, and the White House. Stephen E Arnold, an adviser to Govwizely, also participates in the program. The current episode explores five topics in a father-and-son exploration of important, yet rarely discussed subjects. These include the analysis of law enforcement body camera video by smart software, the appointment of an AI information czar by the US Department of Justice, copyright issues faced by UK artificial intelligence projects, the role of the US Marines in the Department of Defense’s smart software projects, and the potential use of artificial intelligence in the US Patent Office.

The video is available on YouTube at https://youtu.be/nsKki5P3PkA. The Apple audio podcast is at this link.

Stephen E Arnold, March 12, 2024

AI Hermeneutics: The Fire Fights of Interpretation Flame

March 12, 2024

This essay is the work of a dumb dinobaby. No smart software required.

My hunch is that not too many of the thumb-typing, TikTok generation know what hermeneutics means. Furthermore, like most of their parents, these future masters of the phone-iverse don’t care. “Let software think for me” would make a nifty T shirt slogan at a technology conference.

This morning (March 12, 2024) I read three quite different write ups. Let me highlight each and then link the content of those documents to the problem of interpretation of religious texts.

image

Thanks, MSFT Copilot. I am confident your security team is up to this task.

The first write up is a news story called “Elon Musk’s AI to Open Source Grok This Week.” The main point for me is that Mr. Musk will put the label “open source” on his Grok artificial intelligence software. The write up includes an interesting quote; to wit:

Musk further adds that the whole idea of him founding OpenAI was about open sourcing AI. He highlighted his discussion with Larry Page, the former CEO of Google, who was Musk’s friend then. “I sat in his house and talked about AI safety, and Larry did not care about AI safety at all.”

The implication is that Mr. Musk does care about safety. Okay, let’s accept that.

The second story is an ArXiv paper called “Stealing Part of a Production Language Model.” The authors are nine Googlers, two ETH wizards, one University of Washington professor, one OpenAI researcher, and one McGill University smart software luminary. In short, the big outfits are making clear that closed or open, software is rising to the task of revealing some of the inner workings of these “next big things.” The paper states:

We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI’s ChatGPT or Google’s PaLM-2…. For under $20 USD, our attack extracts the entire projection matrix of OpenAI’s ada and babbage language models.

The third item is “How Do Neural Networks Learn? A Mathematical Formula Explains How They Detect Relevant Patterns.” The main idea of this write up is that software can perform an X-ray type analysis of a black box and present some useful data about the inner workings of numerical recipes about which many AI “experts” feign total ignorance.

Several observations:

  1. Open source software is available to download largely without encumbrances. Good actors and bad actors can use this software and its components to let users put on a happy face or bedevil the world’s cyber security experts. Either way, smart software is out of the bag.
  2. In the event that someone or some organization has secrets buried in its software, those secrets can be exposed. Once the secret is known, the good actors and the bad actors can surf on that information.
  3. The notion of an attack surface for smart software now includes the numerical recipes and the model itself. Toss in the notion of data poisoning, and the notion of vulnerability must be recast from a specific attack to a much larger type of exploitation.

Net net: I assume the many committees, NGOs, and government entities discussing AI have considered these points and incorporated these articles into informed policies. In the meantime, the AI parade continues to attract participants. Who has time to fool around with the hermeneutics of smart software?

Stephen E Arnold, March 12, 2024

Microsoft and Security: A Rerun with the Same Worn-Out Script

March 12, 2024

This essay is the work of a dumb dinobaby. No smart software required.

The Marvel cinematic universe has spawned two dozen sequels. Microsoft’s security circus features are moving up fast in the reprise business. Unfortunately there is no super hero who comes to the rescue of the giant American firm. The villains in these big screen stunners are a bit like those in the James Bond films. Microsoft seems to prefer to wrestle with the allegedly Russian cozy bear or at least convert a cartoon animal into the personification of evil.

image

Thanks, MSFT, you have nailed security theater and reruns of the same tired story.

What’s interesting about these security blockbusters is that each follows a Hollywood style “you’ve seen this before nudge nudge” approach to the entertainment. The sequence is a belated announcement that Microsoft security has been breached. The evil bad actors have stolen data, corrupted software, and by brute force foiled the norm cores in Microsoft World. Then come announcements about fixes that the Microsoft customer must implement, along with admonitions to keep that MSFT software updated and warnings about using “old” computers, etc. etc.

“Russian Hackers Accessed Microsoft Source Code” is the equivalent of a New York Times film review. The write up reports:

In January, Microsoft disclosed that Russian hackers had breached the company’s systems and managed to read emails belonging to senior executives. Now, the company has revealed that the breach was worse than initially understood and that the Russian hackers accessed Microsoft source code. Friday’s revelation — made in a blog post and a filing with the Securities and Exchange Commission — is the latest in a string of breaches affecting the company that have raised major questions in Washington about Microsoft’s security posture.

Well, that’s harsh. No mention of the estimable alleged monopoly’s releasing the information on March 7, 2024. I am capturing my thoughts on March 8, 2024. But with college basketball moving toward tournament time, who cares? I am not really sure any more. And Washington? Does the name evoke a person, a committee, a committee consisting of the heads of security committees, someone in the White House, an “expert” at the suddenly famous National Bureau of Standards, or absolutely no one?

The write up asserts:

The company is concerned, however, that “Midnight Blizzard is attempting to use secrets of different types it has found,” including in emails between customers and Microsoft. “As we discover them in our exfiltrated email, we have been and are reaching out to these customers to assist them in taking mitigating measures,” the company said in its blog post. The company describes the incident as an example of “what has become more broadly an unprecedented global threat landscape, especially in terms of sophisticated nation-state attacks.” In response, the company has said it is increasing the resources and attention devoted to securing its systems.

Microsoft is “reaching out.” I can reach for a donut, but I do not grasp it and gobble it down. “Reach” is not the same as fixing the problems Microsoft caused.

Several observations:

  1. Microsoft is an alleged monopoly, and it is allowing its digital trains to set fire to the fields, homes, and businesses which have to use its tracks. Isn’t it time for purposeful action from the US government agencies with direct responsibility for cyber security and appropriate business conduct?
  2. Can Microsoft remediate its problems? My answer is, “No.” Vulnerabilities are engineered in because no one has the time, energy, or interest to chase down problems and fix them. There is an ageing programmer named Steve Gibson. His approach to software is the exact opposite of Microsoft’s. Mr. Gibson will never be a trillion dollar operation, but his software works. Perhaps Microsoft should consider adopting some of Mr. Gibson’s methods.
  3. Customers have to take a close look at the security breaches endlessly reported by cyber security companies. Some outfits’ software is on the list most of the time. Other companies’ software is an infrequent visitor to these breach parties. Is it time for customers to be looking for an alternative to what Microsoft provides?

Net net: A new security release will be coming to the computer near you. Don’t fail to miss it.

Stephen E Arnold, March 12, 2024


Another Small Victory for OpenAI Against Authors

March 12, 2024

This essay is the work of a dumb dinobaby. No smart software required.

For those following the fight between human content creators and AI firms, score one for the algorithm engineers. TorrentFreak reports, “Court Dismisses Authors’ Copyright Infringement Claims Against OpenAI.” At issue is generative AI’s practice of feeding on humans’ work, without compensation, in order to mimic it. Multiple suits have been filed by record labels, writers, and visual artists. Reporter Ernesto Van der Sar writes:

“Several of the lawsuits filed by book authors include a piracy component. The cases allege that tech companies, including Meta and OpenAI, used the controversial Books3 dataset to train their models. The Books3 dataset was created by AI researcher Shawn Presser in 2020, who scraped the library of ‘pirate’ site Bibliotik. The general vision was that the plaintext collection of more than 195,000 books, which is nearly 37GB in size, could help AI enthusiasts build better models. The vision wasn’t wrong; large text archives are great training material for Large Language Models, but many authors disapprove of their works being used in this manner, without permission or compensation.”

image

A large group of rights holders have a football team. Those big folks are chasing the small but feisty opponent down the field. Which team will score? Thanks, MSFT Copilot. Keep up the good enough work.

Is that so unreasonable? Maybe not, but existing copyright law did not foresee this situation. We learn:

“After reviewing input from both sides, California District Judge Araceli Martínez-Olguín ruled on the matter. In her order, she largely sides with OpenAI. The vicarious copyright infringement claim fails because the court doesn’t agree that all output produced by OpenAI’s models can be seen as a derivative work. To survive, the infringement claim has to be more concrete.”

The plaintiffs are not out of moves, however. They can still file an amended complaint. But unless updated legislation is passed in the meantime, they may just be rebuffed again. So all they need is for Congress to act quickly to protect artists from tech firms. Any day now.

Cynthia Murrell, March 12, 2024
