Featured

Entity Extraction: Not As Simple As Some Vendors Say

No smart software. Just a dumb dinobaby. Oh, the art? Yeah, MidJourney.

Most systems that incorporate entity extraction have been trained to recognize the names of simple entities, relying mostly on capitalization. An “entity” can be a person’s name, the name of an organization, or a location like Niagara Falls, near Buffalo, New York. The river name “Niagara,” when bound to “Falls,” denotes a geologic feature. The “Buffalo” is not a Bubalina; it is a delightful city with even more pleasing weather.
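To make the capitalization point concrete, here is a minimal sketch of a capitalization-only extractor. The function name and the regular expression are illustrative assumptions, not any vendor's method.

```python
import re

# Assumed heuristic: a candidate entity is a run of adjacent capitalized words.
CAPITALIZED_RUN = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")


def extract_capitalized(text: str) -> list[str]:
    """Return runs of adjacent capitalized words as candidate entities."""
    return CAPITALIZED_RUN.findall(text)


if __name__ == "__main__":
    # "Niagara Falls" is bound into one candidate, but the sentence-initial
    # "She" is also picked up, which is a false positive.
    print(extract_capitalized("She visited Niagara Falls near Buffalo, New York."))
    # Lowercase slang and handles produce no candidates at all.
    print(extract_capitalized("the bumble bee crew met maestrolive"))
```

The sketch binds “Niagara” to “Falls,” yet it also flags any capitalized sentence opener and returns nothing for lowercase nicknames or handles.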

The same entity extraction process has to work for specialized software used by law enforcement, intelligence agencies, and legal professionals. Unlike vendors of consumer-facing applications like Google’s Web search or Apple Maps, the specialized software vendors have to contend with:

  • Gang slang in English and other languages; for example, “bumble bee.” This is not an insect; it is a nickname for the Latin Kings.
  • Organizations operating in Lao PDR whose names have been converted to English words, like Zhao Wei’s Kings Romans Casino. Mr. Wei has allegedly been involved in gambling activities in a poorly regulated region in the Golden Triangle.
  • Individuals who use aliases like maestrolive, james44123, or ahmed2004. Behind the handles are either “real” people or sock puppets (fake identities).

Why do these variations create a challenge? In order to locate a business, the content processing system has to identify the entity the user seeks. For an investigator, chopping through a thicket of language and idiosyncratic personas is the difference between making progress and hitting a dead end. Automated entity extraction systems can work using smart software, carefully crafted and constantly updated controlled vocabulary lists, or a hybrid system.
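A minimal sketch of the hybrid idea, assuming a tiny hand-built controlled vocabulary plus one pattern rule for handles. The vocabulary entries, the type labels, and the handle pattern are hypothetical and exist only for illustration.

```python
import re

# Hypothetical controlled vocabulary: surface form -> (canonical entity, type).
# A real list would be far larger and constantly updated.
VOCABULARY = {
    "bumble bee": ("Latin Kings", "gang"),
    "kings romans casino": ("Kings Romans Casino", "organization"),
    "ku group": ("Ku Group", "organization"),
}

# Assumed handle shape: lowercase letters followed by at least two digits.
HANDLE = re.compile(r"\b[a-z]+\d{2,}\b")


def extract_entities(text: str) -> list[tuple[str, str]]:
    """Combine a vocabulary lookup with a pattern rule for aliases."""
    lowered = text.lower()
    found = []
    # Vocabulary pass: case-insensitive, whole-phrase matches only.
    for term, (canonical, kind) in VOCABULARY.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.append((canonical, kind))
    # Pattern pass: flag tokens that look like online handles.
    for handle in HANDLE.findall(text):
        found.append((handle, "alias"))
    return found


if __name__ == "__main__":
    sample = "james44123 told ahmed2004 the bumble bee crew uses Kings Romans Casino"
    for entity in extract_entities(sample):
        print(entity)
```

The vocabulary pass catches the slang that capitalization misses, but only for terms someone has already added to the list. The pattern rule finds james44123 and ahmed2004 yet would skip a digit-free handle like maestrolive, which is why the list needs constant upkeep.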


Let’s take an example that confronts a person looking for information about the Ku Group. This is a financial services firm responsible for KuCoin. The Ku Group is interesting because it has been found guilty in the US of certain financial activities, both in the State of New York and by the US Securities & Exchange Commission.


Interviews

DarkCyber, March 29, 2022: An Interview with Chris Westphal, DataWalk

Chris Westphal is the Chief Analytics Officer of DataWalk, a firm providing an investigative and analysis tool to commercial and government organizations. The 12-minute interview covers DataWalk’s unique capabilities, its data and information resources, and the firm’s workflow functionality. The video can be viewed on YouTube at this location.

Stephen E Arnold, March 29, 2022

Latest News

Oh, Oh! Silicon Valley Hype Minimizes Risk. Who Knew?

This is an official dinobaby post. No smart software involved in this blog post. I read “Silicon Valley Stifled the AI Doom Movement in 2024.” I must admit I...

January 10, 2025

The Brain Rot Thing: The 78 Wax Record Is Stuck Again

This is an official dinobaby post. I read again about brain rot. I get it. Young kids play with a mobile phone. They get into social media. They watch TikTok. The...

January 10, 2025

Social Media Change: Stop the Decay! Ouch! Stop!

This is an official dinobaby post. No smart software involved in this blog post. I learned a new term: Platform Decay. I associated the phrase with Tooth Decay....

January 10, 2025

Meta and Zuck Make Free Speech News

Techmeme makes clear that Meta and its charming leader are important and “real” news. I checked the splash page of the online news service and learned: Zuckerberg...

January 9, 2025

GitHub Identifies a Sooty Pot and Does Not Offer a Fix

This is an official dinobaby post. No smart software involved in this blog post. GitLab’s Sabrina Farmer is a sharp-thinking person. Her “Three Software Development...

January 9, 2025

AI Outfit Pitches Anti Human Message

AI startup Artisan thought it could capture attention by telling companies to get rid of human workers and use its software instead. It was right. Gizmodo reports,...

January 9, 2025

Be Secure Like a Journalist

This is an official dinobaby post. If you want to be secure like a journalist, Freedom.press has a how-to for you. The write up “The 2025 Journalist’s Digital...

January 9, 2025

FOGINT: The French Method for Communicating with Telegram Works

A post from the FOGINT team. Direct action by French authorities has had a visible impact on Telegram. The FOGINT team noted a report in Arab News which provides...

January 8, 2025

FOGINT: Russia Reveals How Important Telegram Is to Its Propaganda Program

This is an official dinobaby post. No smart software involved in this blog post. Telegram, that messaging service, is important to Russia’s European propaganda efforts....

January 8, 2025

Identifying Misinformation: A Task Not Yet Mastered

This is an official dinobaby post. No smart software involved in this blog post. On New Year’s Eve the US Department of Treasury issued a news release about Russian...

January 8, 2025

