More about NAMER, the Bitext Smart Entity Technology

January 14, 2025

A dinobaby product! We used some smart software to fix up the grammar. The system mostly worked. Surprised? We were.

We spotted more information about the Madrid, Spain-based Bitext technology firm. The company posted “Integrating Bitext NAMER with LLMs” in late December 2024. At about the same time, government authorities arrested a person known as “Broken Tooth.” In 2021, an alert for this individual was posted. His “real” name is Wan Kuok-koi, and he has been in and out of trouble for a number of years. He is alleged to be part of a criminal organization and active in a number of illegal behaviors; for example, money laundering and human trafficking. The online service Irrawaddy reported that Broken Tooth is “the face of Chinese investment in Myanmar.”

Broken Tooth (né Wan Kuok-koi, born in Macau) is one example of the importance of identifying entity names and relating them to individuals and the organizations with which they are affiliated. A failure to identify entities correctly can mean the difference between resolving an alleged criminal activity and a get-out-of-jail-free card. This is the specific problem that Bitext’s NAMER system addresses. Bitext says that large language models are designed for text generation, not entity classification. Furthermore, LLMs impose cost and computational demands that can be a problem for organizations working within tight budget constraints. Plus, processing certain data in a cloud increases privacy and security risks.

Bitext’s solution provides an alternative way to achieve fine-grained entity identification, extraction, and tagging. It combines classical natural language processing solutions with large language models. Classical NLP tools, often deployable locally, complement LLMs to enhance NER performance.

NAMER excels at:

  1. Identifying generic names and classifying them as people, places, or organizations.
  2. Resolving aliases and pseudonyms.
  3. Differentiating similar names tied to unrelated entities.

Bitext supports over 20 languages, with additional options available on request. How does the hybrid approach function? There are two effective methods for integrating Bitext NAMER with LLMs like GPT or Llama. The first is pre-processing input: entities are annotated before the text is passed to the LLM, which is ideal for connecting entities to knowledge graphs in large systems. The second is to configure the LLM to call NAMER dynamically.
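To make the pre-processing idea concrete, here is a minimal Python sketch. It does not use Bitext's API, which is not described in the post. The `ALIASES` table, the `<ent>` tag format, and the function names are all assumptions invented for illustration. The point is only to show entities being marked up before the text reaches an LLM.

```python
import re

# Hypothetical alias table standing in for an entity tagger's output.
# This is NOT Bitext's API; the names and structure are illustrative only.
ALIASES = {
    "Broken Tooth": ("Wan Kuok-koi", "PERSON"),
    "Wan Kuok-koi": ("Wan Kuok-koi", "PERSON"),
}


def annotate(text: str) -> str:
    """Wrap each known mention in an inline tag with its type and canonical name."""
    # Try longer names first so a long mention wins over a shorter overlapping one.
    names = sorted(ALIASES, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))

    def tag(match: re.Match) -> str:
        canonical, label = ALIASES[match.group(0)]
        return f'<ent type="{label}" id="{canonical}">{match.group(0)}</ent>'

    return pattern.sub(tag, text)


def build_prompt(text: str) -> str:
    """Prefix a short instruction, then hand the annotated text to the LLM."""
    return "Entities are pre-tagged with <ent> markers.\n\n" + annotate(text)


prompt = build_prompt("Authorities arrested Broken Tooth in December.")
```

Because the alias and the legal name share one `id`, a downstream system could link both mentions to the same knowledge graph node.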

The output of the Bitext system can generate tagged entity lists and metadata for content libraries or dictionary applications. The NAMER output can integrate directly into existing controlled vocabularies, indexes, or knowledge graphs. Also, NAMER makes it possible to maintain separate files of entities for on-demand access by analysts, investigators, or other text analytics software.

By grouping name variants, Bitext NAMER streamlines search queries, enhancing document retrieval and linking entities to knowledge graphs. This creates a tailored “semantic layer” that enriches organizational systems with precision and efficiency.
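A small Python sketch shows how grouped name variants could streamline a search query. The `VARIANT_GROUPS` list and the `expand_query` helper are hypothetical; a real deployment would load the groups from the tagger's output rather than hard-code them.

```python
# Hypothetical variant groups; illustrative only, not NAMER output.
VARIANT_GROUPS = [
    {"Wan Kuok-koi", "Broken Tooth"},
]


def expand_query(term: str) -> str:
    """Return an OR query that covers every known variant of the term."""
    wanted = term.casefold()
    for group in VARIANT_GROUPS:
        if any(wanted == variant.casefold() for variant in group):
            return " OR ".join(f'"{variant}"' for variant in sorted(group))
    # Unknown terms pass through as a single quoted phrase.
    return f'"{term}"'


query = expand_query("broken tooth")
```

An analyst who types the nickname retrieves documents that use only the legal name, and the reverse.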

For more information about the unique NAMER system, contact Bitext via the firm’s Web site at www.bitext.com.

Stephen E Arnold, January 14, 2025

Geolocation Data: Available for a Price

December 30, 2024

According to a report from 404 Media, a firm called Fog Data Science is helping law enforcement compile lists of places visited by suspects. Ars Technica reveals, “Location Data Firm Helps Police Find Out When Suspects Visited their Doctor.” Jon Brodkin writes:

“Fog Data Science, which says it ‘harness[es] the power of data to safeguard national security and provide law enforcement with actionable intelligence,’ has a ‘Project Intake Form’ that asks police for locations where potential suspects and their mobile devices might be found. The form, obtained by 404 Media, instructs police officers to list locations of friends’ and families’ houses, associates’ homes and offices, and the offices of a person’s doctor or lawyer. Fog Data has a trove of location data derived from smartphones’ geolocation signals, which would already include doctors’ offices and many other types of locations even before police ask for information on a specific person. Details provided by police on the intake form seem likely to help Fog Data conduct more effective searches of its database to find out when suspects visited particular places. The form also asks police to identify the person of interest’s name and/or known aliases and their ‘link to criminal activity.’ ‘Known locations a POI [Person of Interest] may visit are valuable, even without dates/times,’ the form says. It asks for street addresses or geographic coordinates.”

See the article for an image of the form. It is apparently used to narrow down data points and establish suspects’ routine movements. It could also be used to, say, prosecute abortions, Brodkin notes.

Back in 2022, the Electronic Frontier Foundation warned of Fog Data’s geolocation data hoard. Its report detailed which law enforcement agencies were known to purchase Fog’s intel at the time. But where was Fog getting this data? From Venntel, the EFF found, which is the subject of a Federal Trade Commission action. The agency charges Venntel with “unlawfully tracking and selling sensitive location data from users, including selling data about consumers’ visits to health-related locations and places of worship.” The FTC’s order would prohibit Venntel, and parent company Gravy Analytics, from selling sensitive location data. It would also require they establish a “sensitive data location program.” We are not sure what that would entail. And we might never know: the decision may not be finalized until after the president-elect is sworn in.

Cynthia Murrell, December 30, 2024

Bold Allegation: Colombia, the US, and Pegasus

December 27, 2024

The United States assists its allies, but why did the Biden Administration pony up $11 million for hacking software? DropSiteNews investigates the software, its huge price tag, and why the US bought it in: “The U.S. Bought Pegasus For Colombia With $11 Million In Cash. Now Colombians Are Asking Why.” Colombians are just as curious as Americans about why the US coughed up $11 million in cash for the Israeli hacking software.

The Colombian ambassador to the US, Daniel García-Peña, confirmed that Washington DC assisted his country in buying the software so the Colombian government could track drug cartels. The software was purchased and used throughout 2021-2022. Pegasus usage stopped in 2022, and it was never used against politicians, such as former Colombian president Ivan Duque. The Biden Administration remained in control of the Pegasus software and ensured that the Colombian government only provided spying targets.

It’s understandable why Colombia’s citizens were antsy about Pegasus:

“García-Peña’s revelations come two months after Colombian President Gustavo Petro delivered a televised speech in which he revealed some of the details of the all-cash, $11-million purchase, including that it has been split across two installments, flown from Bogotá and deposited into the Tel Aviv bank account belonging to NSO Group, the company that owns Pegasus. Soon after the speech, Colombia’s attorney general opened an investigation into the purchase and use of Pegasus. In October, Petro accused the director of the NSO Group of money laundering, due to the tremendous amount of cash he transported on the flights.

The timeline of the purchase and use of Pegasus overlaps with a particularly turbulent time in Colombia. A social movement had begun protesting against Duque, while in the countryside, Colombia’s security forces were killing or arresting major guerrilla and cartel leaders. At the time, Petro, the first left-wing president in the country’s recent history, was campaigning for the presidency.”

Pegasus is powerful hacking software, and Colombians were suspicious about how their government acquired it. Journalists were especially curious about where the influx of cash came from. They slowly discovered it came from the United States, with the intent to spy on drug cartels. Colombia is a tumultuous nation with crime worse than the Wild West. Pegasus hopefully caught the worst of the bad actors.

Whitney Grace, December 27, 2024

FOGINT: Intelware Tension Ticks Up

December 24, 2024

Observations from the FOGINT research team.

On Friday, December 20, 2024, NSO Group, the Pegasus specialized software outfit, found itself losing a court squabble with Facebook (Meta and WhatsApp). The Reuters news story, pushed out at 9:15 pm Eastern time, was headlined “US Judge Finds Israel’s NSO Group Liable for Hacking in WhatsApp Lawsuit.” In case you don’t have the judgment at hand, you can find the United States District Court, Northern District of California document at this link.

The main idea behind the case is that the NSO Group’s specialized software was pressed into duty for the purpose of obtaining information about WhatsApp users. The mechanism was to exploit “a bug in the messaging app to install spy software allowing unauthorized surveillance.” NSO Group’s fancy legal two step did not work.

The NSO Group has become the poster child for the “compromise the mobile phone and obtain data” approach. The Pegasus system exfiltrates data and, when properly configured, can capture information from a mobile device. Furthermore, the company’s hassles about its customers’ use of the Pegasus tool unwittingly created a surge in software and specialized services performing identical or similar tasks.

The FOGINT team has identified firms which have found different ways of compromising mobile devices. NSO Group, therefore, has been an innovator, and its approach to compromising devices has [a] focused attention on Israel’s technical competence in this specialized software niche and [b] rightly or wrongly illustrated that the technology can act with extreme prejudice when used by some clients to solve what they perceive as “problems.”

There are several larger consequences which the FOGINT team has identified:

  1. Specialized software is more prevalent because the revelations about Pegasus have encouraged entrepreneurs and technologists to develop more effective surveillance methods
  2. Unique delivery methods have been crafted. These range from in-app malware to more sophisticated multi-stage malware installed as a consequence of a user’s carelessness
  3. Powerful surveillance tools can now be installed in a way that does not require the user to click, email, or interact. The malware simply dials up a mobile and bingo! the device is compromised.

How will this judgment affect the specialized software industry? In FOGINT’s view, the decision will further stimulate competition and the flow of novel surveillance techniques. One consequence also may be that law enforcement and intelligence professionals will encounter headwinds when similar specialized software is required for certain investigations. FOGINT’s view is that NSO Group’s go-go approach to sales created a problem for the company and for specialized software. Some technologies should remain “secret,” which is now becoming an old-fashioned viewpoint. Marketing is not always a benefit.

Stephen E Arnold, December 24, 2024

Entity Extraction: Not As Simple As Some Vendors Say

November 19, 2024

No smart software. Just a dumb dinobaby. Oh, the art? Yeah, MidJourney.

Most of the systems incorporating entity extraction have been trained to recognize the names of simple entities and mostly based on the use of capitalization. An “entity” can be a person’s name, the name of an organization, or a location like Niagara Falls, near Buffalo, New York. The river “Niagara” when bound to “Falls” means a geologic feature. The “Buffalo” is not a Bubalina; it is a delightful city with even more pleasing weather.

The same entity extraction process has to work for specialized software used by law enforcement, intelligence agencies, and legal professionals. Compared to entity extraction for consumer-facing applications like Google’s Web search or Apple Maps, the specialized software vendors have to contend with:

  • Gang slang in English and other languages; for example, “bumble bee.” This is not an insect; it is a nickname for the Latin Kings.
  • Organizations operating in Lao PDR whose names have been converted to English words, like Zhao Wei’s Kings Romans Casino. Mr. Wei has allegedly been involved in gambling activities in a poorly regulated region in the Golden Triangle.
  • Individuals who use aliases like maestrolive, james44123, or ahmed2004. There are either “real” people behind the handles or they are sock puppets (fake identities).

Why do these variations create a challenge? In order to locate a business, the content processing system has to identify the entity the user seeks. For an investigator, chopping through a thicket of language and idiosyncratic personas is the difference between making progress and hitting a dead end. Automated entity extraction systems can work using smart software, a carefully crafted and constantly updated controlled vocabulary list, or a hybrid system.
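Here is a rough Python sketch of the hybrid idea, offered as an illustration and not as any vendor's method. It pairs a tiny controlled vocabulary (the `VOCAB` entries are taken from the examples above) with a naive capitalization rule. The names, the `UNKNOWN` label, and the regular expression are assumptions.

```python
import re

# Illustrative controlled vocabulary: surface form -> (entity, type).
# A production list would be curated and constantly updated.
VOCAB = {
    "bumble bee": ("Latin Kings", "ORG"),
    "kings romans casino": ("Kings Romans Casino", "ORG"),
}

# Naive capitalization heuristic: a run of two or more capitalized words.
CAPITALIZED = re.compile(r"\b(?:[A-Z][a-z]+\s)+[A-Z][a-z]+\b")


def extract(text):
    """Return (entity, type) pairs: vocabulary hits first, then capitalized runs."""
    found = []
    lowered = text.lower()
    for surface, (entity, label) in VOCAB.items():
        if surface in lowered:
            found.append((entity, label))
    known = {entity for entity, _ in found}
    for match in CAPITALIZED.finditer(text):
        # Skip spans the vocabulary already resolved.
        if match.group(0) not in known:
            found.append((match.group(0), "UNKNOWN"))
    return found


hits = extract("He met the bumble bee crew near Niagara Falls.")
```

The capitalization rule alone never sees the lowercase slang “bumble bee.” The vocabulary catches it, while the rule still flags “Niagara Falls” for a human or a smarter component to classify.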

Let’s take an example which confronts a person looking for information about the Ku Group. This is a financial services firm responsible for the Kucoin. The Ku Group is interesting because it has been found guilty in the US for certain financial activities in the State of New York and by the US Securities & Exchange Commission. 

Read more

Apple and NSO Group: Enough PR Already

October 16, 2024

Just a humanoid processing information related to online services and information access.

No, no court battle. Bummer.

Technology companies either like or dislike others in their industry. Apple and the spyware firm NSO Group don’t play well together, but they recently agreed on something. Cyberscoop reports that, “NSO Group Indicates Rare Agreement With Apple Over Dismissal Of Lawsuit.” Apple and NSO Group agreed it is prudent to drop a lawsuit that accused the latter of targeting the former’s users.

The NSO Group was more open about why the lawsuit should be dismissed, but Apple is keeping quiet. The lawsuit was filed three years ago, but it’s not as useful anymore because the spyware market has grown. There’s more information here:

“NSO Group, by contrast, said that while it agreed with Apple that there were “significant obstacles” in the court case, the real issue was that a district court in California wasn’t the right venue for “adjudicating claims that a foreign technology company licensed lawful-intercept technology to foreign governments, which then used the technology to monitor foreign criminals and terrorists in foreign countries for those countries’ own national security and other sovereign interests.”

The filing states that Apple has done little to prosecute its claims, and as such the judge should dismiss the lawsuit “with prejudice,” meaning that it couldn’t be refiled later. But if the judge dismisses it without prejudice, then NSO Group said it would like to be reimbursed for its court costs as the work it has done on the case couldn’t be recycled.”

Whitney Grace, October 16, 2024

Surveillance Watch Maps the Surveillance App Ecosystem

October 1, 2024

Here is an interesting resource: Surveillance Watch compiles information about surveillance tech firms, organizations that fund them, and the regions in which they are said to operate. The lists, compiled from contributions by visitors to the site, are not comprehensive. But they are full of useful information. The About page states:

“Surveillance technology and spyware are being used to target and suppress journalists, dissidents, and human rights advocates everywhere. Surveillance Watch is an interactive map that documents the hidden connections within the opaque surveillance industry. Founded by privacy advocates, most of whom were personally harmed by surveillance tech, our mission is to shed light on the companies profiting from this exploitation with significant risk to our lives. By mapping out the intricate web of surveillance companies, their subsidiaries, partners, and financial backers, we hope to expose the enablers fueling this industry’s extensive rights violations, ensuring they cannot evade accountability for being complicit in this abuse. Surveillance Watch is a community-driven initiative, and we rely on submissions from individuals passionate about protecting privacy and human rights.”

Yes, the site makes it easy to contribute information to its roundup. Anonymously, if one desires. The site’s information is divided into three alphabetical lists: Surveilling Entities, Known Targets, and Funding Organizations. As an example, here is what the service says about safeXai (formerly Banjo):

“safeXai is the entity that has quietly resumed the operations of Banjo, a digital surveillance company whose founder, Damien Patton, was a former Ku Klux Klan member who’d participated in a 1990 drive-by shooting of a synagogue near Nashville, Tennessee. Banjo developed real-time surveillance technology that monitored social media, traffic cameras, satellites, and other sources to detect and report on events as they unfolded. In Utah, Banjo’s technology was used by law enforcement agencies.”

We notice there are no substantive links which could have been included, like ones to footage of the safeXai surveillance video service or the firm’s remarkable body of patents. In our view, these patents represent an X-ray look at what most firms call artificial intelligence.

A few other names we recognize are IBM, Palantir, and Pegasus owner NSO Group. See the site for many more. The Known Targets page lists countries that, when clicked, list surveilling entities known or believed to be operating there. Entries on the Funding Organizations page include a brief description of each organization with a clickable list of surveillance apps it is known or believed to fund at the bottom. It is not clear how the site vets its entries, but the submission form does include boxes for supporting URL(s) and any files to upload. It also asks whether one consents to be contacted for more information.

Cynthia Murrell, October 1, 2024

Intellexa: Ill Intent or Israeli Marketing Failure?

September 19, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Most online experts are not familiar with the specialized software sector. Most of the companies in this intelware niche try to maintain a low profile. Publicity in general media, trade magazines, or TikTok is not desired. However, a couple of Israel-anchored vendors have embraced the Madison Avenue way. Indications of unwanted publicity surface in sources rarely given much attention by the poohbahs who follow more clickable topics like Mr. Musk’s getting into doo doo in Brazil or Mr. Zuck’s antics in Australia and the UK.

You know your marketing and PR firm has created an issue which allows management to ask, “Should we switch to a new marketing and PR firm?” Will the executives make a switch or go for a crisis management outfit instead? Thanks, MSFT Copilot. Interesting omission of the word “a”, but that’s okay. Your team is working on security and a couple of other pressing issues. Grammar is the least of some Softies’ worries.

The Malta Times (yep, it is an island with an interesting history and a number of business districts which house agents and lawyers who do fascinating work) reported on March 6, 2024, that:

The Maltese government has initiated the process of the deprivation of the Maltese citizenship of a person who appeared on a US sanctions list on Tuesday (March 5, 2024).

The individual, according to the write up, was “Ex-Israeli intelligence officer and current CEO of cyber spyware firm Intellexa.” The write up points out:

Tal Dilian was added to the United States Office of Foreign Assets Control Specially Designated Nationals List on Tuesday (March 5, 2024) in connection with sanctions by the US Treasury on members of the Intellexa Spyware Consortium.

The Malta Times noted:

According to the [U.S.] State Department, “Dilian is the founder of the Intellexa Consortium and is the architect behind its spyware tools. The consortium is a complex international web of decentralized companies controlled either fully or partially by Dilian, including through Sara Aleksandra Fayssal Hamou.” “Hamou is a corporate off-shoring specialist who has provided managerial services to the Intellexa Consortium, including renting office space in Greece on behalf of Intellexa S.A. Hamou holds a leadership role at Intellexa S.A., Intellexa Limited, and Thalestris Limited,” said the State Department.

I saw a news release from the US Department of the Treasury titled “Treasury Sanctions Enablers of the Intellexa Commercial Spyware Consortium.” That statement said:

Today, the Department of the Treasury’s Office of Foreign Assets Control (OFAC) sanctioned five individuals and one entity associated with the Intellexa Consortium for their role in developing, operating, and distributing commercial spyware technology that presents a significant threat to the national security of the United States. These designations complement concerted U.S. government actions against commercial spyware vendors, including previous sanctions against individuals and entities associated with the Intellexa Consortium; the Department of Commerce’s addition of commercial spyware vendors to the Entity List; and the Department of State’s visa ban policy targeting those who misuse or profit from the misuse of commercial spyware, subsequently exercised on thirteen individuals.

The sanctioned individuals and entity are:

  • Felix Bitzios (Bitzios), beneficial owner of an Intellexa Consortium company
  • Andrea Nicola Constantino Hermes Gambazzi, the beneficial owner of Thalestris Limited which holds distribution rights to the Predator spyware and has been involved in processing transactions on behalf of other entities within the Intellexa Consortium.
  • Merom Harpaz, a manager of Intellexa S.A.
  • Panagiota Karaoli, the director of multiple Intellexa Consortium entities that are controlled by or are a subsidiary of Thalestris Limited.
  • Artemis Artemiou (Artemiou), the general manager and member of the board of Cytrox Holdings Zartkoruen Mukodo Reszvenytarsasag (Cytrox Holdings), a member of the Intellexa Consortium
  • Aliada Group Inc,  a British Virgin Islands-based company and member of the Intellexa Consortium

Chatter about Intellexa’s specialized software has been making noise since

In 2021, the firm used this headline on its Web site to catch attention, not of law enforcement and intelligence agencies, but other entities:

More than intelligence gathering networks — Intellexa’s innovative insight platform

And statements like

Create insights, win the digital race

The lingo is important because it is marketing oriented. Plus, in 2021, the firm’s positioning emphasized Tal Dilian’s technology. (Some of the features reminded me of NSO Group’s Pegasus with a dash of other Israeli-developed specialized software systems.)

How has the marketing worked out? Since Mr. Dilian became involved with a failing specialized software developer called Cytrox in Cyprus, Intellexa matured into an “alliance.” The reinvigorated outfit operated from Athens, Greece. By 2021, Intellexa was attracting attention from several governments related to officials whose devices had been enhanced with the cleverly named Predator software.

That marketing expertise has put Intellexa and its “affiliates” in the spotlight. From a PR point of view, mission accomplished. The problem appears to be that one PR and marketing success has created a sticky wicket for the company. An unintended consequence is that the specialized software vendors find themselves becoming increasingly well known. From my point of view, the failure to keep certain specialized software capabilities secret has been a surprising trend.

My hypothesis is that because the systems and methods for obtaining information for legal purposes have become more widely known, more people are thinking about how they too could obtain information from an entity. One may criticize what government entities do, but these entities (in theory) are operating within a formal structure. Use of specialized software, therefore, operates within a structure which has rules, regulations, norms for conduct, and similar knobs and dials. When the capabilities are available to anyone via a Telegram download, certain types of risk go up. That’s why I am not in favor of specialized software companies practicing the Israel-developed NSO Group and Intellexa style of marketing.

But the mobile surveillance cat is out of the bag. And I have been around long enough to know what happens when cats are turned loose. They market, make noise, and make more cats. And some technology can make a mobile device behave in unexpected ways or go bang.

Stephen E Arnold, September 19, 2024

The Fixed Network Lawful Interception Business is Booming

September 11, 2024

It is not just bad actors who profit from an increase in cybercrime. Makers of software designed to catch them are cashing in, too. The Market Research Report 224 blog shares “Fixed Network Lawful Interception Market Region Insights.” Lawful interception is the process by which law enforcement agencies, after obtaining the proper warrants of course, surveil circuit and packet-mode communications. The report shares findings from a study by Data Bridge Market Research on this growing sector. Between 2021 and 2028, this market is expected to grow by nearly 20% annually and hit an estimated value of $5,340 million. We learn:

“Increase in cybercrimes in the era of digitalization is a crucial factor accelerating the market growth, also increase in number of criminal activities, significant increase in interception warrants, rising surge in volume of data traffic and security threats, rise in the popularity of social media communications, rising deployment of 5G networks in all developed and developing economies, increasing number of interception warrants and rising government of both emerging and developed nations are progressively adopting lawful interception for decrypting and monitoring digital and analog information, which in turn increases the product demand and rising virtualization of advanced data centers to enhance security in virtual networks enabling vendors to offer cloud-based interception solutions are the major factors among others boosting the fixed network lawful interception market.”

Furthermore, the pace of these developments will likely increase over the next few years. The write-up specifies key industry players, a list we found particularly useful:

“The major players covered in fixed network lawful interception market report are Utimaco GmbH, VOCAL TECHNOLOGIES, AQSACOM, Inc, Verint, BAE Systems., Cisco Systems, Telefonaktiebolaget LM Ericsson, Atos SE, SS8 Networks, Inc, Trovicor, Matison is a subsidiary of Sedam IT Ltd, Shoghi Communications Ltd, Comint Systems and Solutions Pvt Ltd – Corp Office, Signalogic, IPS S.p.A, ZephyrTel, EVE compliancy solutions and Squire Technologies Ltd among other domestic and global players.”

See the press release for notes on Data Bridge’s methodology. It promises 350 pages of information, complete with tables and charts, for those who purchase a license. Formed in 2014, Data Bridge is based in Haryana, India.
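For readers who want to sanity check the growth projection cited above, this back-of-the-envelope Python sketch backs out the implied starting market size. It assumes the rate is exactly 20% (the report says “nearly”), that 2021 is the base year, and that $5,340 million is the 2028 value. None of these assumptions is confirmed by the report.

```python
def implied_base(final_value: float, annual_rate: float, years: int) -> float:
    """Back out the starting value from a compound annual growth projection."""
    return final_value / (1 + annual_rate) ** years


# Assumed inputs: $5,340 million in 2028, 20% per year, seven years of growth.
base_2021 = implied_base(5340.0, 0.20, 2028 - 2021)  # millions of US dollars
```

Under those assumptions the starting figure lands a little under $1.5 billion. A rate slightly below 20% would imply a somewhat larger starting market.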

Cynthia Murrell, September 11, 2024

Preligens Is Safran.ai

September 9, 2024

Preligens, a French AI and specialized software company, is now part of Safran Electronics & Defense, which is a unit of the Safran Group. I spotted a report in Aerotime: “Safran Accelerates AI Development with $243M Purchase of French-Firm Preligens,” published on September 2, 2024. The report quotes principals to the deal as saying:

“Joining Safran marks a new stage in Preligens’ development. We’re proud to be helping create a world-class AI center of expertise for one of the flagships of French industry. The many synergies with Safran will enable us to develop new AI product lines and accelerate our international expansion, which is excellent news for our business and our people,” Jean-Yves Courtois, CEO of Preligens, said.  The CEO of Safran Electronics & Defense, Franck Saudo, said that he was “delighted” to welcome Preligens to the company.

The acquisition does not just make Mr. Saudo happy. The French military, a number of European customers, and the backers of Preligens are thrilled as well. In my lectures about specialized software companies, I like to call attention to this firm. It illustrates that technology innovation is not located in one country. Furthermore, it underscores the strong educational system in France. When I first learned about Preligens, one rumor I heard was that one of the US government entities wanted to “invest” in the company. For a variety of reasons, the deal went no place faster than a bus speeding toward La Madeleine. If you spot me at a conference, you can ask about French technology firms and US government processes. I have some first hand knowledge starting with “American fries in a Congressional lunch facility.”

Preligens is important for three reasons:

  1. The firm developed an AI platform; that is, the “smart software” is not an afterthought, which contrasts sharply with the spray paint approach to AI upon which some specialized software companies have been relying
  2. The smart software outputs identification data; for example, a processed image can show an aircraft. The Preligens system identifies the aircraft by type
  3. The user of the Preligens system can use time analyses of imagery to draw conclusions. Here’s a hypothetical because the actual example is not appropriate for a free blog written by a dinobaby. Imagine a service van driving in front of an embassy in Paris. The van makes a pass every three hours for two consecutive days. The Preligens system can “notice” this and alert an operator.
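The van hypothetical in the third item can be sketched in a few lines of Python. This is not how Preligens works; the firm's methods are not public in this post. The sketch assumes some upstream component has already produced timestamped sightings of the same vehicle, and the `is_regular` function, its thresholds, and the sample data are invented for illustration.

```python
from datetime import datetime, timedelta


def is_regular(sightings, min_count=4, tolerance=timedelta(minutes=15)):
    """Return True if the sightings recur at a roughly constant interval."""
    times = sorted(sightings)
    if len(times) < min_count:
        return False
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Use the middle gap (an approximate median) as the reference interval.
    reference = sorted(gaps)[len(gaps) // 2]
    return all(abs(gap - reference) <= tolerance for gap in gaps)


# Hypothetical data: one pass every three hours for two consecutive days.
start = datetime(2024, 9, 1, 6, 0)
van_passes = [start + timedelta(hours=3 * i) for i in range(16)]
alert = is_regular(van_passes)
```

A delivery van that shows up at irregular times would not trip the check; the same van on a strict three-hour loop would, and an operator could be alerted.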

I will continue to monitor the system which will be doing business with selected entities under the name Safran.ai.

Stephen E Arnold, September 9, 2024
