Juicing Up RAG: The RAG Bop Bop
December 26, 2024
Can improved information retrieval techniques lead to more relevant data for AI models? One startup is using a pair of existing technologies to attempt just that. MarkTechPost invites us to “Meet CircleMind: An AI Startup that is Transforming Retrieval Augmented Generation with Knowledge Graphs and PageRank.” Writer Shobha Kakkar begins by defining Retrieval Augmented Generation (RAG). For those unfamiliar, it basically combines information retrieval with language generation. Traditionally, RAG systems retrieve with either keyword searches or dense vector embeddings. Both rank results by similarity to the query rather than by authority, so a lot of irrelevant and unauthoritative data gets raked in with the juicy bits. The write-up explains how this new method refines the process:
“CircleMind’s approach revolves around two key technologies: Knowledge Graphs and the PageRank Algorithm. Knowledge graphs are structured networks of interconnected entities—think people, places, organizations—designed to represent the relationships between various concepts. They help machines not just identify words but understand their connections, thereby elevating how context is both interpreted and applied during the generation of responses. This richer representation of relationships helps CircleMind retrieve data that is more nuanced and contextually accurate. However, understanding relationships is only part of the solution. CircleMind also leverages the PageRank algorithm, a technique developed by Google’s founders in the late 1990s that measures the importance of nodes within a graph based on the quantity and quality of incoming links. Applied to a knowledge graph, PageRank can prioritize nodes that are more authoritative and well-connected. In CircleMind’s context, this ensures that the retrieved information is not only relevant but also carries a measure of authority and trustworthiness. By combining these two techniques, CircleMind enhances both the quality and reliability of the information retrieved, providing more contextually appropriate data for LLMs to generate responses.”
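To make the quoted idea concrete, here is a minimal Python sketch of PageRank run over a toy knowledge graph, with the resulting scores blended into a retriever's relevance scores. It is not CircleMind's implementation, which the write-up does not detail. The entities, the edge list, the relevance numbers, the `rerank` helper, and its `weight` parameter are all made up for illustration.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                # Each node splits its rank equally among the nodes it links to.
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def rerank(candidates, rank, weight=0.5):
    """Blend retriever relevance with graph authority (hypothetical scoring rule)."""
    top = max(rank.values())
    scored = [
        (entity, (1 - weight) * relevance + weight * rank.get(entity, 0.0) / top)
        for entity, relevance in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy knowledge graph: an edge means "this entity links to that one".
edges = [
    ("Stanford", "Larry Page"),
    ("Stanford", "Sergey Brin"),
    ("Larry Page", "Google"),
    ("Sergey Brin", "Google"),
    ("Google", "PageRank"),
    ("PageRank", "Google"),
    ("Obscure Blog", "PageRank"),
]
rank = pagerank(edges)

# Relevance scores as a keyword or embedding retriever might return them (made up).
candidates = [("Obscure Blog", 0.9), ("Google", 0.8)]
reranked = rerank(candidates, rank)
```

In this toy graph, "Obscure Blog" has no incoming links, so its authority score is low. Once the two signals are blended, the well-connected "Google" node moves ahead of it despite the slightly lower relevance score. A production system would also have to settle how entities are extracted, how edges are weighted, and how much authority should count against relevance.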
CircleMind notes its approach is still in its early stages, and expects it to take some time to iron out all the kinks. Scaling it up will require clearing hurdles of speed and computational costs. Meanwhile, a few early users are getting a taste of the beta version now. Based in San Francisco, the young startup was launched in 2024.
Cynthia Murrell, December 26, 2024