Big Data and Search Solving Massive Language Processing Headaches

December 4, 2017

Written language can be a massive headache for anyone who needs strong search. Juggling several human languages complicates things further when you need to harness a massive amount of data. Thankfully, natural language processing is the answer, as software architect Federico Tomassetti wrote in his essay, “A Guide to Natural Language Processing.”

According to the story:

…the relationship between elements can be used to understand the importance of each individual element. TextRank actually uses a more complex formula than the original PageRank algorithm, because a link can be only present or not, while textual connections might be partially present. For instance, you might calculate that two sentences containing different words with the same stem (e.g., cat and cats both have cat as their stem) are only partially related.

The original paper describes a generic approach, rather than a specific method. In fact, it also describes two applications: keyword extraction and summarization. The key differences are:

  • the units you choose as a foundation of the relationship
  • the way you calculate the connection and its strength
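To make the quoted idea concrete, here is a minimal sketch of TextRank-style sentence ranking in Python. It is an illustration under stated assumptions, not code from the essay or the original paper: the `crude_stem` helper is a toy stand-in for a real stemmer, the `similarity` weighting is modeled on the overlap measure commonly associated with TextRank summarization, and the function names, damping factor, and iteration count are all choices made for this example. The units are sentences, and the connection strength is the number of shared stems, which is what lets two sentences be "partially" related.

```python
import math
import re


def crude_stem(word):
    """Toy stemmer (an assumption for illustration): strips one trailing 's'."""
    word = word.lower()
    return word[:-1] if len(word) > 3 and word.endswith("s") else word


def stems(sentence):
    """Return the set of stems found in a sentence."""
    return {crude_stem(w) for w in re.findall(r"[A-Za-z]+", sentence)}


def similarity(a, b):
    """Edge weight: shared stems, normalized by the log of each sentence's size."""
    sa, sb = stems(a), stems(b)
    if len(sa) < 2 or len(sb) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(sa & sb) / (math.log(len(sa)) + math.log(len(sb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Weighted PageRank over a sentence graph; returns one score per sentence."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Each sentence collects score from its neighbors in proportion
        # to the weight of the edge that connects them.
        scores = [
            (1 - damping)
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


sentences = [
    "The cat sat on the mat.",
    "Cats like to sit on mats.",
    "Stock prices fell sharply today.",
]
scores = textrank(sentences)
for sentence, score in zip(sentences, scores):
    print(f"{score:.2f}  {sentence}")
```

In this toy run the first two sentences share the stems "cat", "on", and "mat", so they reinforce each other, while the third has no connections and falls to the baseline score. A real system would swap in a proper stemmer and filter out stop words such as "on" before counting overlap.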

Natural language processing is a tricky concept to wrap your head around, but it is becoming something people have to reckon with. Millions of dollars are being funneled into perfecting the technology. Those who lead the pack here will undoubtedly earn a place at the international tech table, and may even take it over. This is a big deal.

Patrick Roland, December 4, 2017
