Featured

Entity Extraction: Not As Simple As Some Vendors Say

No smart software. Just a dumb dinobaby. Oh, the art? Yeah, MidJourney.

Most systems that incorporate entity extraction have been trained to recognize the names of simple entities, relying mostly on capitalization. An “entity” can be a person’s name, the name of an organization, or a location like Niagara Falls, near Buffalo, New York. The river name “Niagara,” when bound to “Falls,” denotes a geologic feature. This “Buffalo” is not a Bubalina; it is a delightful city with even more pleasing weather.
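To make the capitalization point concrete, here is a minimal sketch of a capitalization-only extractor. It is an illustration, not any vendor's method; the pattern and the function name `capitalized_candidates` are assumptions made up for this example.

```python
import re

# Assumed heuristic: a run of one or more capitalized words is a candidate
# entity, e.g. "Niagara Falls" or "New York".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def capitalized_candidates(text: str) -> list[str]:
    """Return runs of capitalized words found in the text."""
    return CAPITALIZED_RUN.findall(text)


sample = "We drove to Niagara Falls, near Buffalo, New York."
print(capitalized_candidates(sample))
# The sentence-initial "We" is returned alongside the place names, and
# nothing in this heuristic can tell a city from an animal.
```

The heuristic finds “Niagara Falls” and “New York,” but it also flags the first word of the sentence and returns nothing at all for lowercase slang or handles such as ahmed2004.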

The same entity extraction process has to work for specialized software used by law enforcement, intelligence agencies, and legal professionals. Compared to entity extraction for consumer-facing applications like Google’s Web search or Apple Maps, the specialized software vendors have to contend with:

  • Gang slang in English and other languages; for example, “bumble bee.” This is not an insect; it is a nickname for the Latin Kings.
  • Organizations operating in Lao PDR whose names have been converted to English words, like Zhao Wei’s Kings Romans Casino. Mr. Zhao has allegedly been involved in gambling activities in a poorly regulated region in the Golden Triangle.
  • Individuals who use aliases like maestrolive, james44123, or ahmed2004. Either there are “real” people behind the handles, or the handles are sock puppets (fake identities).

Why do these variations create a challenge? In order to locate a business, the content processing system has to identify the entity the user seeks. For an investigator, chopping through a thicket of language and idiosyncratic personas is the difference between making progress and hitting a dead end. Automated entity extraction systems can work using smart software, a carefully crafted and constantly updated controlled vocabulary list, or a hybrid system.
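A rough sketch of the controlled vocabulary approach, combined with one pattern rule for handles, might look like the following. Everything here is hypothetical: the `VOCABULARY` entries are taken from the examples in this post, the `HANDLE` pattern is an assumed shape for aliases, and the trained-model half of a hybrid system is not shown.

```python
import re

# Hypothetical controlled vocabulary: lowercase surface form mapped to
# (canonical entity, entity type). A real list would be far larger and
# would need constant updating.
VOCABULARY = {
    "bumble bee": ("Latin Kings", "gang"),
    "kings romans casino": ("Kings Romans Casino", "organization"),
    "ku group": ("Ku Group", "organization"),
}

# Assumed alias shape: lowercase letters followed by two or more digits,
# e.g. "james44123". Handles without digits are not caught by this rule.
HANDLE = re.compile(r"\b[a-z]+\d{2,}\b")


def extract(text: str) -> list[tuple[str, str]]:
    """Return (entity, type) pairs from vocabulary lookup plus the handle rule."""
    lowered = text.lower()
    found = []
    for surface, (canonical, kind) in VOCABULARY.items():
        # Whole-phrase, case-insensitive match against the vocabulary entry.
        if re.search(r"\b" + re.escape(surface) + r"\b", lowered):
            found.append((canonical, kind))
    for handle in HANDLE.findall(text):
        found.append((handle, "alias"))
    return found


print(extract("The bumble bee crew pays james44123 at Kings Romans Casino."))
```

The lookup resolves “bumble bee” to the Latin Kings only because someone put that entry in the list. A handle like maestrolive slips past the digit-based rule, which is the kind of gap that keeps the vocabulary maintenance work from ever finishing.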

Let’s take an example which confronts a person looking for information about the Ku Group. This is a financial services firm responsible for KuCoin. The Ku Group is interesting because it has been found guilty in the US of certain financial activities, in actions in the State of New York and by the US Securities & Exchange Commission.

Read more »

Interviews

DarkCyber, March 29, 2022: An Interview with Chris Westphal, DataWalk

Chris Westphal is the Chief Analytics Officer of DataWalk, a firm providing an investigative and analysis tool to commercial and government organizations. The 12-minute interview covers DataWalk’s unique capabilities, its data and information resources, and the firm’s workflow functionality. The video can be viewed on YouTube at this location.

Stephen E Arnold, March 29, 2022

Latest News

FOGINT: Telegram Gets Some Lipstick to Put on a Very Dangerous Pig

Information from the FOGINT research team. We noted the New York Times article “Under Pressure, Telegram Turns a Profit for the First Time.” The write up reported... Read more »

December 23, 2024 | Comment

AI Makes Stuff Up and Lies. This Is New Information?

The blog post is the work of a dinobaby, not AI. I spotted “Alignment Faking in Large Language Models.” My initial reaction was, “This is new information?”... Read more »

December 23, 2024 | Comment

Microsoft Grouses and Barks, Then Regrouses and Rebarks about the Google

This blog post is the work of an authentic dinobaby. No smart software was used. I spotted a reference to Windows Central, a very supportive yet “independent”... Read more »

December 23, 2024 | Comment

Telegram: Pressure Mounting on the TON Entities

Observations from the Telegram research team. Two apparently unrelated news items provide tantalizing hints about what Telegram will do in 2025. The phrase “cornered... Read more »

December 23, 2024 | Comment

Thales CortAIx (Get It?) and Smart Drones

Countries are investing in AI to amp up their militaries, including naval forces. Aviation Defense Universe explores how one tech company is shaping the future of... Read more »

December 23, 2024 | Comment

Google AI Videos: Grab Your Popcorn and Kick Back

This blog post is the work of an authentic dinobaby. No smart software was used. Google has an artificial intelligence inferiority complex. In January 2023, it... Read more »

December 20, 2024 | Comment

Another Horse Ridge or Just Horse Feathers from the Management Icon Intel?

This write up emerged from the dinobaby’s own mind. No AI was used because this dinobaby is too stupid to make it work. If you are an Intel trivia buff, you will... Read more »

December 20, 2024 | Comment

IBM Courts Insurance Companies: Interesting Move from the Watson Folks

This blog post flowed from the sluggish and infertile mind of a real live dinobaby. If there is art, smart software of some type was probably involved. This smart... Read more »

December 20, 2024 | Comment

The Hay Day of Search Has a Ground Hog Moment

This blog post is the work of an authentic dinobaby. No smart software was used. I think it was 2002 or 2003 that I started writing the first of three editions... Read more »

December 19, 2024 | Comment

More Data about What Is Obvious to People Interacting with Teens

This blog post is the work of an authentic dinobaby. No smart software was used. Here’s another one of those surveys which provide some data about a very obvious... Read more »

December 19, 2024 | Comment
