No Fooling: Copyright Enforcer Does Indexing Too

April 1, 2020

The Associated Press is one of the oldest, most respected, and widely read news services in the world. As more than half the world reads Associated Press, it makes one wonder how the news services organizes and distributes its content. Synaptica has more details in the article, “Synaptica Insights: Veronika Zielinska, The Associated Press.”

Veronika Zielinska has a background in computational linguistics and natural language. She was interested in how automated tagging, taxonomies, and statistical engines apply rules to content. She joined Associated Press’s Information Management team in 2005, then moving up to the Metadata Technology team. Her current responsibilities are to develop the Metadata Services platform, fine tuning search quality and relevancy for content distribution platforms, scheme design, data transformations, analytics and business intelligence programs, and developing content enrichment methods.

Zielinska offers information on how the Associated Press builds a taxonomy:

“We looked at all the content that AP produced and scoped our taxonomy to cover all possible topics, events, places, organizations, people, and companies that our news production covered. News can be about anything – it’s broad, but we also took into account there are certain areas where AP produces more content than others. We have verticals that have huge news coverage – this can be government, politics, sports, entertainment and emerging areas like health, environment, nature, and education. Looking at the content and knowing what the news is about helps us to develop the taxonomy framework. We took this content base and divided the entire news domain into smaller domains. Each person on the team was responsible for their three or four taxonomy domains. They became subject and theme matter experts.”

The value of Associated Press’s taxonomies comes from the entire content package that includes everything from photos, articles, and videos centered around descriptive metadata that makes it agreeable and findable.

While the Associated Press is a non-profit news service, they do offer a platform called AP Metadata Services that is used by other news services. The Associated Press frequently updates its taxonomy with new terms when they enter the media. The AP taxonomy team works with the AP Editorial team to identify new terms and topics. The biggest challenges Zielinska faces are maintenance and writing in a manner that the natural language processing algorithms can understand it.

As for the future, Zielinska fears news services losing their budgets, local news not getting as much coverage, and the spread of misinformation. The biggest problem is that automated technologies can take the misinformation and disseminate it. She advises, “Managers can help by creating standardized vocabularies for fact checking across media types, for example, so that deep fakes and other misleading media can be identified consistently across various outlets.”

Whitney Grace, April 1, 2020


Comments are closed.

  • Archives

  • Recent Posts

  • Meta