Featured
Entity Extraction: Not As Simple As Some Vendors Say
No smart software. Just a dumb dinobaby. Oh, the art? Yeah, MidJourney.
Most of the systems incorporating entity extraction have been trained to recognize the names of simple entities, relying mostly on capitalization. An “entity” can be a person’s name, the name of an organization, or a location like Niagara Falls, near Buffalo, New York. The river name “Niagara,” when bound to “Falls,” refers to a geologic feature. The “Buffalo” is not a Bubalina; it is a delightful city with even more pleasing weather.
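To make the capitalization point concrete, here is a minimal sketch in Python of a purely capitalization-based extractor. The function name and the regular expression are illustrative assumptions, not any vendor's method. It finds “Niagara Falls” and “New York,” but it also flags an ordinary sentence-initial word and misses lowercase slang and handles entirely.

```python
import re

# Assumption for illustration: an "entity" is any run of capitalized words,
# such as "Niagara Falls" or "New York".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def extract_capitalized(text: str) -> list[str]:
    """Return every run of capitalized words found in the text."""
    return CAPITALIZED_RUN.findall(text)


if __name__ == "__main__":
    # "We" is a false positive: capitalized only because it starts the sentence.
    print(extract_capitalized("We drove to Niagara Falls, near Buffalo, New York."))
    # Lowercase slang and handles produce no matches at all.
    print(extract_capitalized("the bumble bee crew met maestrolive"))
```

The first call shows the false positive; the second shows why slang and aliases slip past this approach.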
The same entity extraction process has to work for specialized software used by law enforcement, intelligence agencies, and legal professionals. Compared to entity extraction for consumer-facing applications like Google’s Web search or Apple Maps, the specialized software vendors have to contend with:
- Gang slang in English and other languages; for example, “bumble bee.” This is not an insect; it is a nickname for the Latin Kings.
- Organizations operating in Lao PDR whose names are converted to English words, like Zhao Wei’s Kings Romans Casino. Mr. Wei has allegedly been involved in gambling activities in a poorly regulated region in the Golden Triangle.
- Individuals who use aliases like maestrolive, james44123, or ahmed2004. Either there are “real” people behind the handles, or the handles are sock puppets (fake identities).
Why do these variations create a challenge? To locate a business, the content processing system has to identify the entity the user seeks. For an investigator, chopping through a thicket of language and idiosyncratic personas is the difference between making progress and hitting a dead end. Automated entity extraction systems can work using smart software, a carefully crafted and constantly updated controlled vocabulary list, or a hybrid system.
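One way to picture the hybrid approach is a controlled vocabulary lookup combined with a simple pattern rule. The sketch below is in Python and rests on stated assumptions: the vocabulary entries come from the examples in this article, while the labels, the `Entity` type, and the handle pattern (lowercase letters followed by digits) are hypothetical. A production list would be far larger and would need constant updating.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str    # the span as it appears in the input
    label: str   # hypothetical label, chosen for this sketch
    source: str  # "vocabulary" or "pattern"


# Hypothetical controlled vocabulary built from the examples in the article.
CONTROLLED_VOCABULARY = {
    "bumble bee": "GANG_ALIAS",
    "kings romans casino": "ORGANIZATION",
    "ku group": "ORGANIZATION",
}

# Assumed pattern rule: handles such as "james44123" or "ahmed2004".
# A handle without digits (e.g. "maestrolive") would need a vocabulary entry.
HANDLE = re.compile(r"\b[a-z]+\d+\b")


def extract_entities(text: str) -> list[Entity]:
    """Combine vocabulary lookups and the handle pattern, in text order."""
    found: list[tuple[int, Entity]] = []
    for term, label in CONTROLLED_VOCABULARY.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append((match.start(), Entity(match.group(), label, "vocabulary")))
    for match in HANDLE.finditer(text):
        found.append((match.start(), Entity(match.group(), "HANDLE", "pattern")))
    return [entity for _, entity in sorted(found, key=lambda pair: pair[0])]


if __name__ == "__main__":
    sample = "The bumble bee crew paid james44123 at Kings Romans Casino."
    for entity in extract_entities(sample):
        print(entity)
```

The vocabulary catches terms that capitalization cannot signal, and the pattern catches handles no list could enumerate in advance. Neither half is sufficient alone, which is the argument for the hybrid.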
Let’s take an example that confronts a person looking for information about the Ku Group. This is a financial services firm responsible for the Kucoin. The Ku Group is interesting because it has been found guilty in the US of certain financial activities, in matters involving the State of New York and the US Securities & Exchange Commission.
Interviews
DarkCyber, March 29, 2022: An Interview with Chris Westphal, DataWalk
Chris Westphal is the Chief Analytics Officer of DataWalk, a firm providing an investigative and analysis tool to commercial and government organizations. The 12-minute interview covers DataWalk’s unique capabilities, its data and information resources, and the firm’s workflow functionality. The video can be viewed on YouTube at this location.
Stephen E Arnold, March 29, 2022
Latest News
Russian Drug Trade Likes That Cryptocurrency
No smart software involved. Just a dinobaby’s work. High tech innovation meets traditional thuggery in Russia’s expanding drug trade. The Global Initiative Against...

Marketing Milestone 2024: Whither VM?
When a vendor jacks up prices tenfold, customers tend to look elsewhere. If VMware’s new leadership thought its clients had no other options, it was mistaken....

Code Graveyards: Welcome, Bad Actors
Did you know that the siloes housing nuclear missiles are still run on systems from the 1950s-1960s? These systems use analog computers and code more ancient than...

FOGINT: What Do the Most Recent Telegram Function Enhancements Portend for 2025?
This is a report from the FOGINT research team. For a company without a permanent office with staff who show up every day, Telegram has been busy in December 2024....

Google, the Modern Samurai, Becomes a Ronin. Banzai!
Written by a dinobaby, not an over-achieving, unexplainable AI system. I read “Google to Fight Japan’s Claims That It Harms Rivals in Search.” This paywalled...

Paywalls: New Angles for Bad Actors
Information literacy is more important now than ever, especially as people become more polarized in their views. This is due to multiple factors such as the news...

A Better Database of SEC Filings?
DocDelta is a new database that says it is, “revolutionizing investment research by harnessing the power of AI to decode complex financial documents at scale.”...

A New Year Alert: Americans Cannot Read
The United States is a large country with a self-contained nature. Because of its monolith status, the United States is very isolated. The rest of the world views...

WhatsApp: Chasing More Money
Meta aims to make WhatsApp indispensable to businesses around the world. The app is currently responsible for just a fraction of the company’s revenue, but...

The US and Math: Not So Hot
In recent decades, the US educational system has increasingly emphasized teaching to the test over niceties like critical thinking and deep understanding. How is...