Common Sense from an AI-Centric Outfit: How Refreshing
July 11, 2024
This essay is the work of a dumb dinobaby. No smart software required.
In the wild and wonderful world of smart software, common sense is often tucked beneath a stack of PowerPoint decks and vaporized by jargon-spouting experts in artificial intelligence. I want to highlight “Interview: Nvidia on AI Workloads and Their Impacts on Data Storage.” An Nvidia poohbah named Charlie Boyle output some information that is often ignored by quite a few of those riding the AI pony to the pot of gold at the end of the AI rainbow.
The King Arthur of senior executives is confident that in his domain he is the master of his information. By the way, this person has an MBA, a law degree, and a CPA certification. His name is Sir Walter Mitty of Dorksford, near Swindon. Thanks, MSFT Copilot. Good enough.
Here’s the pivotal statement in the interview:
… a big part of AI for enterprise is understanding the data you have.
Yes, the dwellers in carpetland typically operate with some King Arthur type myths galloping around the castle walls; specifically:
Myth 1: We have excellent data
Myth 2: We have a great deal of data and more arriving every minute our systems are online
Myth 3: Our data are available in just a few formats. Processing the information is going to be pretty easy.
Myth 4: Our IT team can handle most of the data work. We may not need any outside assistance for our AI project.
Will companies map these myths to their reality? Nope.
The Nvidia expert points out:
…there’s a ton of ready-made AI applications that you just need to add your data to.
“Ready made”: Just like a Betty Crocker cake mix my grandmother thought tasted fake, not as good as homemade. Granny’s comment could be applied to some of the AI tests my team has tracked; for example, the Big Apple’s chatbot outputting comments which violated city laws or the exciting McDonald’s smart ordering system. Sure, I like bacon on my on-again, off-again soft serve frozen dessert. Doesn’t everyone?
The Nvidia expert offers this comment about storage:
If it’s a large model you’re training from scratch you need very fast storage because a lot of the way AI training works is they all hit the same file at the same time because everything’s done in parallel. That requires very fast storage, very fast retrieval.
Is that a problem? Nope. Just crank up the cloud options. No big deal, except it is. There are costs and time to consider. But otherwise this is no big deal.
The article contains one gem and then wanders into marketing “don’t worry” territory.
From my point of view, the data issue is the big deal. Bad, stale, incomplete information and information in oddball formats — these exist in organizations now. Forty percent or more of the mass of data may never have been accessed. Other data are backups which contain versions of files with errors, copyright protected data, and Boy Scout trip plans. (Yep, non work information on “work” systems.)
Net net: The data issue is an important one to consider before getting into the “let’s deploy a customer support smart chatbot” phase. Will carpetland dwellers focus on the first step? Not too often. That’s why some AI projects get lost or just succumb to rising, uncontrollable costs. Moving data? No problem. Bad data? No problem. Useful AI system? Hmmm. How much does storage cost anyway? Oh, not much.
Stephen E Arnold, July 11, 2024
Mastercard and Customer Information: A Lone Ranger?
October 26, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
In my lectures, I often include a pointer to sites selling personal data. Earlier this month, I explained that the clever founder of Frank Financial acquired email information about high school students from two off-the-radar data brokers. These data were mixed with “real” high school student email addresses to provide a frothy soup of more than a million email addresses. These looked okay. The synthetic information was “good enough” to cause JPMorgan Chase to output a bundle of money to the alleged entrepreneur.
A fisherman chasing a slippery eel named Trust. Thanks, MidJourney. You do have a knack for recycling Godzilla art, don’t you?
I thought about JPMorgan Chase when I read “Mastercard Should Stop Selling Our Data.” The article makes clear that Mastercard sells its customers’ (users’?) data. Mastercard is a financial institution. JPMC is a financial institution. One sells information; the other gets snookered by data. I assume that’s the yin and yang of doing business in the US.
The larger question is, “Are financial institutions operating in a manner harmful to themselves (JPMC) and harmful to others (the Mastercard customers, or users, whose personal data are sold)?” My hunch is that today I am living in an “anything goes” environment. Would the Great Gatsby be even greater today? Why not own Long Island and its railroad? That sounds like a plan similar to those of high fliers, doesn’t it?
The cited article has a bias. The Electronic Frontier Foundation is allegedly looking out for me. I suppose that’s a good thing. The article aims to convince me; for example:
the company’s position as a global payments technology company affords it “access to enormous amounts of information derived from the financial lives of millions, and its monetization strategies tell a broader story of the data economy that’s gone too far.” Knowing where you shop, just by itself, can reveal a lot about who you are. Mastercard takes this a step further, as U.S. PIRG reported, by analyzing the amount and frequency of transactions, plus the location, date, and time to create categories of cardholders and make inferences about what type of shopper you may be. In some cases, this means predicting who’s a “big spender” or which cardholders Mastercard thinks will be “high-value”—predictions used to target certain people and encourage them to spend more money.
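To make the quoted description concrete, here is a toy sketch (in Python with pandas) of the kind of inference the EFF describes: grouping transactions by cardholder and labeling the result. The column names, thresholds, and segmentation rule are invented for illustration; they are not Mastercard’s actual pipeline.

```python
# Toy illustration only: NOT Mastercard's method, just a sketch of how
# amount, frequency, and location can be turned into shopper categories.
import pandas as pd

transactions = pd.DataFrame({
    "cardholder_id": [1, 1, 1, 2, 2, 3],
    "amount":        [420.0, 310.0, 95.0, 12.5, 8.0, 60.0],
    "merchant_zip":  ["10001", "10001", "94105", "40205", "40205", "60601"],
})

profile = transactions.groupby("cardholder_id").agg(
    total_spend=("amount", "sum"),                   # how much
    txn_count=("amount", "size"),                    # how often
    distinct_locations=("merchant_zip", "nunique"),  # where
)

# Crude segmentation: an arbitrary spend cutoff marks the "big spender".
profile["segment"] = profile["total_spend"].apply(
    lambda s: "big spender" if s > 500 else "everyday"
)
print(profile)
```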
Are outfits like Chase Visa selling their customer (user) data? (Yep, that is the same JPMC whose eagle-eyed acquisitions team could not identify synthetic data and which enables some Amazon credit card activities.) Also, what about men-in-the-middle like Amazon? The data from its much-loved online shopping, book store, and content brokering service might be valuable to some parties, I surmise. How much would information about an Amazon customer who purchased item X (a 3D printer) and Kindle books about firearm-related topics be worth?
The EFF article uses a word which gives me the willies: Trust. For a time, when I was working in different government agencies, the phrase “trust but verify” was in wide use. Am I able to trust the EFF and its interpretation of findings from a unit of the Public Interest Network? Am I able to trust a report about data brokering? Am I able to trust an outfit like JPMC?
My thought is that if JPMC itself can be fooled by a 31-year-old and a specious online app, “trust” is not the word I can associate with any entity’s action in today’s business environment.
This dinobaby is definitely glad to be old.
Stephen E Arnold, October 26, 2023
Why Some Outputs from Smart Software Are Wonky
July 26, 2021
Some models work like a champ. Utility rate models are reasonably reliable. When it is hot, use of electricity goes up. Rates are then “adjusted.” Perfect. Other models are less solid; for example, Bayesian systems which are not checked every hour or large neural nets which are “assumed” to be honking along like a well-ordered flight of geese. Why do I offer such Negative Ned observations? Experience, for one thing, and the nifty little concepts tossed out by Ben Kuhn, a Twitter persona. You can locate this string of observations at this link. Well, you could as of July 26, 2021, at 6:30 am US Eastern time. Here’s a selection of what are apparently the highlights of Mr. Kuhn’s conversation with “a former roommate.” That’s provenance enough for me.
Item One:
Most big number theory results are apparently 50-100 page papers where deeply understanding them is ~as hard as a semester-long course. Because of this, ~nobody has time to understand all the results they use—instead they “black-box” many of them without deeply understanding.
Could this be true? How could newly minted “become an expert with our $40 online course” professionals, who use models packaged in downloadable and easy-to-plug-in modules, be unfamiliar with the inner workings of said bundles of brilliance? Impossible? Really?
Item Two:
A lot of number theory is figuring out how to stitch together many different such black boxes to get some new big result. Roommate described this as “flailing around” but also highly effective and endorsed my analogy to copy-pasting code from many different Stack Overflow answers.
Oh, come on. Flailing around. Do developers flail, or do they “trust” the outfits who pretend to know how some multi-layered systems work? Fiddling with assumptions, thresholds, and (close your ears) the data themselves is never, ever a way to work around a glitch.
Item Three:
Roommate told a story of using a technique to calculate a number and having a high-powered prof go “wow, I didn’t know you could actually do that”
No kidding? That’s impossible in general, and that expression would never be uttered at Amazon-, Facebook-, and Google-type operations, would it?
Will Mr. Kuhn be banned for heresy? [Keep in mind how Wikipedia defines the term: “any belief or theory that is strongly at variance with established beliefs or customs, in particular the accepted beliefs of a church or religious organization.”] In an earlier era, repeating such an idea even once would warrant a close encounter with an Iron Maiden or a pile of firewood. Probably not today. Someone might emit a slightly critical tweet, however.
Stephen E Arnold, July 26, 2021
The Ultimate Private Public Partnership?
October 7, 2020
It looks as though the line between the US government and Silicon Valley is being blurred into oblivion. That is the message we get as we delve into Unlimited Hangout’s report, “New Pentagon-Google Partnership Suggests AI Will Soon Be Used to Diagnose Covid-19.” Writer Whitney Webb begins by examining evidence that a joint project between the Pentagon’s young Defense Innovation Unit (DIU) and Google Cloud is poised to expand from predicting cancer cases to also forecasting the spread of COVID-19. See the involved write-up for that evidence, but we are more interested in Webb’s further conclusion—that the US military and intelligence agencies and big tech companies like Google, Amazon, Microsoft, and others are nigh inseparable. Many of their decision makers are the same, their projects do as much for companies’ bottom lines as for the public good, and they are swimming in the same pools of (citizens’) data. We learn:
“NSCAI [National Security Commission on Artificial Intelligence] unites the US intelligence community and the military, which is already collaborating on AI initiatives via the Joint Artificial Intelligence Center and Silicon Valley companies. Notably, many of those Silicon Valley companies—like Google, for instance—are not only contractors to US intelligence, the military, or both but were initially created with funding from the CIA’s In-Q-Tel, which also has a considerable presence on the NSCAI. Thus, while the line between Silicon Valley and the US national-security state has always been murky, now that line is essentially nonexistent as entities like the NSCAI, DIB [Defense Innovation Board], and DIU, among several others, clearly show. Whereas China, as Robert Work noted, has the ‘civil-military fusion’ model at its disposal, the NSCAI and the US government respond to that model by further fusing the US technology industry with the national-security state.”
Recent moves in this arena involve healthcare-related projects. They are billed as helping citizens stay healthy, and that is a welcome benefit, but there is much more to it. The key asset here, of course, is all that tasty data—real-world medical information that can be used to train and refine valuable AI algorithms. Webb writes:
“Thus, the implementation of the Predictive Health program is expected to amass troves upon troves of medical data that offer both the DIU and its partners in Silicon Valley the ‘rare opportunity’ for training new, improved AI models that can then be marketed commercially.”
Do we really want private companies generating profit from public data?
Cynthia Murrell, October 7, 2020
Google: Human Data Generators
July 29, 2020
DarkCyber spotted this interesting article, which may or may not be true. But it is fascinating. The story is “Google Working on Smart Tattoos That Turn Skin into Living Touchpad.” The write up states:
Google is working on smart tattoos that, when applied to skin, will transform the human body into a living touchpad via embedded sensors. Part of Google Research, the wearable project is called “SkinMarks” that uses rub-on tattoos. The project is an effort to create the next generation of wearable technology devices…
DarkCyber believes that the research project makes it clear that Google is indeed intent on collecting personal data. Where will the tattoo be applied? On the forehead, Central American street gang fashion?
Russian prisoner style with appropriate Google iconography?
A tasteful tramp stamp approach?
The possibilities are plentiful if the report is accurate.
Stephen E Arnold, July 29, 2020
Ontotext: GraphDB Update Arrives
January 31, 2020
Semantic knowledge firm Ontotext has put out an update to its graph database, The Register announces in, “It’s Just Semantics: Bulgarian Software Dev Ontotext Squeezes Out GraphDB 9.1.” Some believe graph databases are The Answer to a persistent issue. The article explains:
“The aim of applying graph database technology to enterprise data is to try to overcome the age-old problem of accessing latent organizational knowledge; something knowledge management software once tried to address. It’s a growing thing: Industry analyst Gartner said in November the application of graph databases will ‘grow at 100 per cent annually over the next few years’. GraphDB is ranked at eighth position on DB-Engines’ list of most popular graph DBMS, where it rubs shoulders with the likes of tech giants such as Microsoft, with its Azure Cosmos DB, and Amazon’s Neptune. ‘GraphDB is very good at text analytics because any natural language is very ambiguous: a project name could be a common English word, for example. But when you understand the context and how entities are connected, you can use these graph models to disambiguate the meaning,’ [GraphDB product manager Vassil] Momtchev said.”
The primary feature of this update is support for the Shapes Constraint Language, or SHACL, which the World Wide Web Consortium recommends for validating data graphs against a set of conditions. This support lets the application validate data against the schema whenever new data is loaded to the database instead of having to manually run queries to check. A second enhancement allows users to track changes in current or past database transactions. Finally, the database now supports network authentication protocol Kerberos, eliminating the need to store passwords on client computers.
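For readers who want to see what a SHACL check actually looks like, here is a minimal sketch. It uses the open source rdflib and pyshacl Python libraries rather than GraphDB itself, and the shape, namespaces, and sample data are invented for the example; per the article, GraphDB 9.1 runs this kind of validation automatically when data is loaded.

```python
# Minimal SHACL validation sketch using rdflib + pyshacl (not GraphDB).
from rdflib import Graph
from pyshacl import validate

# A shape: every ex:Project must have at least one string-valued ex:name.
shapes = Graph().parse(data="""
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .

ex:ProjectShape a sh:NodeShape ;
    sh:targetClass ex:Project ;
    sh:property [
        sh:path ex:name ;
        sh:minCount 1 ;
        sh:datatype xsd:string ;
    ] .
""", format="turtle")

# Incoming data that violates the shape: the project has no ex:name.
data = Graph().parse(data="""
@prefix ex: <http://example.org/> .
ex:apollo a ex:Project .
""", format="turtle")

conforms, _, report = validate(data, shacl_graph=shapes)
print(conforms)   # False: this load would fail validation
print(report)     # human-readable explanation of the violation
```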
Cynthia Murrell, January 31, 2020
Data Are a Problem? And the Solution Is?
January 8, 2020
I attended a conference about managing data last year. I sat in six sessions and listened as enthusiastic people explained that in order to tap the value of data, one has to have a process. Okay? A process is good.
Then in each of the sessions, the speakers explained the problem and outlined that knowing about the data and then putting it in a system is the way to derive value.
Neither Pros Nor Cons: Just Consulting Talk
This morning I read an article called “The Pros and Cons of Data Integration Architectures.” The write up concludes with this statement:
Much of the data owned and stored by businesses and government departments alike is constrained by the silos it’s stuck in, many of which have been built over the years as organizations grow. When you consider the consolidation of both legacy and new IT systems, the number of these data silos only increases. What’s more, the impact of this is significant. It has been widely reported that up to 80 per cent of a data scientist’s time is spent on collecting, labeling, cleaning and organizing data in order to get it into a usable form for analysis.
Now this is mostly true. However, the 80 percent figure is not backed up. An IDG expert whipped up some percentages about data and time, and these, I suspect, have become part of the received wisdom of those struggling with silos for decades. Most of a data scientist’s time is frittered away in meetings, struggling with budgets and other resources, and figuring out what data are “good” and what to do with the data identified by person or machine as “bad.”
The source of this statement is MarkLogic, a privately held company founded in 2001 and a magnet for $173 million from funding sources. That works out to an 18-year-young “start up” if DarkCyber adopts a Silicon Valley T shirt.
A modern silo is made of metal and impervious to some pests and most types of weather.
One question the write up raises is, “After 18 years, why hasn’t the methodology of MarkLogic swept the checker board?” But the same question can be asked of other providers’ solutions, open source solutions, and the home grown solutions creaking in some government agencies in Europe and elsewhere.
Several reasons:
- The technical solution offered by MarkLogic-type companies can “work”; however, proprietary considerations linked with the issues inherent in “silos” have caused data management solutions to become consultantized; that is, process becomes the task, not delivering on the promise of data, either dark or sunlit.
- Customers realize that the cost of dealing with the secrecy, legal, and technical problems of disparate, digital plastic trash bags of bits cannot be justified. Like odd duck knickknacks one of my failed publishers shoved into his lumber room, ignoring data is often a good solution.
- Individuals tasked with organizing data begin with gusto and quickly morph into bureaucrats who treasure meetings with consultants and companies pitching magic software and expensive wizards able to make the code mostly work.
DarkCyber recognizes that with boundaries like budgets, timetables, and measurable objectives, federation can deliver some zip.
Silos: A Moment of Reflection
The article uses the word “silo” five times. That’s the same frequency of its use in the presentations to which I listened in mid December 2019.
So you want to break down this missile silo which is hardened and protected by autonomous weapons? That’s what happens when a data scientist pokes around a pharma company’s lab notebook for a high potential new drug.
Let’s pause a moment to consider what a silo is. A silo is a tower or a pit used to store corn, wheat, or some other grain. Dust in silos can be exciting. Tip: Don’t light a match in a silo on a dry, hot day in a state where farms still operate. A silo can also be a structure used to house a ballistic missile, but one has to be a child of the Cold War to appreciate this connotation.
As applied to data, it seems that a silo is a storage device containing data. Unlike a silo used to house maize or a nuclear capable missile, the data silo contains information of value. How much value? No one knows. Are the data in a digital silo explosive? Who knows? Maybe some people should not know? Who wants to flick a Bic and poke around?
Federating Data: Easy, Hard, or Poorly Understood Until One Tries It at Scale?
March 8, 2019
I read two articles this morning.
One article explained that there’s a new way to deal with data federation. Always optimistic, I took a look at “Data-Driven Decision-Making Made Possible using a Modern Data Stack.” The revolution is to load data and then aggregate. The old way is to transform, aggregate, and model. Here’s a diagram from DAS42. A larger version is available at this link.
Hard to read. Yep, New Millennial colors. Is this a breakthrough?
I don’t know.
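For what it is worth, here is a minimal sketch of the two orderings, using Python’s built-in SQLite module as a stand-in warehouse. The table, columns, and numbers are invented; the point is only the order of operations.

```python
# Sketch: "load then aggregate" (the new way) vs. "transform/aggregate
# before loading" (the old way). SQLite stands in for the warehouse.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (region TEXT, amount REAL)")

raw_rows = [("east", 100.0), ("east", 250.0), ("west", 75.0)]

# New way: land the raw rows first, exactly as they arrive...
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", raw_rows)

# ...then aggregate and model inside the store, after the fact.
for region, total in conn.execute(
    "SELECT region, SUM(amount) FROM raw_orders GROUP BY region"
):
    print(region, total)

# Old way: aggregate in application code (or a staging job) before
# anything reaches the warehouse at all.
pre_aggregated = {}
for region, amount in raw_rows:
    pre_aggregated[region] = pre_aggregated.get(region, 0.0) + amount
print(pre_aggregated)
```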
When I read “2 Reasons a Federated Database Isn’t Such a Slam-Dunk,” it seems that the solution outlined by DAS42 and the approach favored by the InfoWorld expert are not in sync.
There are two reasons. Count ‘em.
One: performance
Two: security.
Yeah, okay.
Some may suggest that there are a handful of other challenges. These range from deciding how to index audio, video, and images to figuring out what to do with different languages in the content to determining what data are “good” for the task at hand and what data are less “useful.” Date, time, and geocode metadata are needed, but that introduces the not so easy to solve indexing problem.
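A tiny sketch of why the metadata piece is “not so easy”: three hypothetical sources emit the same timestamp field in three different shapes, and a naive normalizer has to guess. The field names and formats below are invented.

```python
# Sketch: normalizing date/time metadata from federated sources.
from datetime import datetime
from typing import Optional

records = [
    {"source": "crm",     "created": "2019-03-08"},
    {"source": "web_log", "created": "08/03/2019 14:22"},  # day-first or month-first?
    {"source": "sensor",  "created": "1552053720"},        # epoch seconds
]

def normalize(created: str) -> Optional[datetime]:
    """Try a few known formats; give up (return None) rather than guess badly."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y %H:%M"):
        try:
            return datetime.strptime(created, fmt)
        except ValueError:
            pass
    if created.isdigit():                 # assume epoch seconds
        return datetime.fromtimestamp(int(created))
    return None

for r in records:
    print(r["source"], normalize(r["created"]))
```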
So where are we with the “federation thing”?
Exactly the same place we were years ago…start ups and experts notwithstanding. But then one has to wrangle a lot of data. That’s cost, gentle reader. Big money.
Stephen E Arnold, March 8, 2019
Fragmented Data: Still a Problem?
January 28, 2019
Digital transitions are a major shift for organizations. The shift includes new technology and better ways to serve clients, but it also includes massive amounts of data. All organizations with a successful digital implementation rely on data. Too much data, however, can hinder organizations’ performance. The IT Pro Portal explains how something called mass data fragmentation has become a major issue in the article, “What Is Mass Data Fragmentation, And Why Are IT Leaders So Worried About It?”
The biggest question is: what exactly is mass data fragmentation? I learned:
“We believe one of the major culprits is a phenomenon called mass data fragmentation. This is essentially just a technical way of saying, ’data that is siloed, scattered and copied all over the place’ leading to an incomplete view of the data and an inability to extract real value from it. Most of the data in question is what’s called secondary data: data sets used for backups, archives, object stores, file shares, test and development, and analytics. Secondary data makes up the vast majority of an organization’s data (approximately 80 per cent).”
The article compares the secondary data to an iceberg: most of it is hidden beneath the surface. The poor visibility leads to compliance and vulnerability risks; in other words, security issues that put the entire organization at risk. Most organizations, however, view their secondary data as a storage bill, a compliance risk (at least that much is good), and a giant headache.
When organizations were surveyed about the amount of secondary data they have, it was discovered that they had multiple copies of the same data spread over cloud and on-premise locations. IT teams are expected to manage the secondary data across all the locations, but without the right tools and technology the task is unending, unmanageable, and the root of more problems.
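One crude way to put a number on that duplication is to fingerprint files by content hash and count the copies; a minimal sketch follows. The paths are placeholders, and a real survey would also have to walk cloud buckets, backup catalogs, and object stores.

```python
# Sketch: find duplicate copies of the same content across storage locations.
import hashlib
from collections import defaultdict
from pathlib import Path

def fingerprint(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

locations = [Path("/srv/backups"), Path("/mnt/file_share")]  # placeholder mounts
copies = defaultdict(list)

for root in locations:
    if not root.exists():
        continue
    for f in root.rglob("*"):
        if f.is_file():
            copies[fingerprint(f)].append(f)

for digest, paths in copies.items():
    if len(paths) > 1:
        print(f"{len(paths)} copies of {digest[:12]}...:")
        for p in paths:
            print("   ", p)
```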
If organizations managed their mass data fragmentation efficiently, it would increase their bottom line, reduce costs, and reduce security risks. When there are more access points to sensitive data and those points are not secured, the risk of hacking and stolen information increases.
Whitney Grace, January 28, 2019
Amazon Intelligence Gets a New Data Stream
June 28, 2018
I read “Amazon’s New Blue Crew.” The idea is that Amazon can disintermediate FedEx, UPS (the outfit with the double parking brown trucks), and the US Postal Service.
On the surface, the idea makes sense. Push down delivery to small outfits. Subsidize them indirectly and directly. Reduce costs and eliminate intermediaries not directly linked to Amazon.
FedEx, UPS, and the USPS are not the most nimble outfits around. I used to get FedEx envelopes every day or two. I haven’t seen one of those for months. Shipping via UPS is a hassle. I fill out forms and have to manage odd slips of paper with arcane codes on them. The US Postal Service works well for letters, but I have noticed some returns for “addresses not found.” One was an address in the city in which I live. I put the letter in the recipient’s mailbox. That worked.
The write up reports:
The new program lets anyone run their own package delivery fleet of up to 40 vehicles with up to 100 employees. Amazon works with the entrepreneurs — referred to as “Delivery Service Partners” — and pays them to deliver packages while providing discounts on vehicles, uniforms, fuel, insurance, and more. They operate their own businesses and hire their own employees, though Amazon requires them to offer health care, paid time off, and competitive wages. Amazon said entrepreneurs can get started with as low as $10,000 and earn up to $300,000 annually in profit.
Now what’s the connection to Amazon streaming data services and the company’s intelligence efforts? Several hypotheses come to mind:
- Amazon obtains fine grained detail about purchases and delivery locations. These are data which can no longer be captured by a non-Amazon delivery service system
- The data can be cross correlated; for example, purchasers of a Kindle title can be matched with the delivery of a particular product, such as hydrogen peroxide
- Amazon’s delivery data make it possible to capture metadata about delivery time, whether a person accepted the package or it was left at the door, and other location details such as a blocked entrance, for instance.
A few people dropping off packages is not particularly useful. Scale up the service across Amazon operations in the continental states or a broader swath of territory, and the delivery service becomes a useful source of high value information.
FedEx and UPS are ripe for disruption. But so is the streaming intelligence sector. This ostensibly common sense delivery play is worth monitoring.
Stephen E Arnold, June 28, 2018