GAR: Graphical Agentic Retrieval

MODULAR . AGENTIC . EFFICIENT

A collaboration with ST Engineering, GAR is a state-of-the-art Retrieval-Augmented Generation (RAG) system designed to bridge the gap between data and meaningful insights, delivering swift and accurate responses customised to a company’s finance-specific inquiries.

Introducing GAR: Graphical Agentic Retrieval

Gathering financial data is tedious, and access to accurate, context-aware information is more important than ever. GAR improves the way users interact with information by retrieving relevant documents and data from a structured knowledge base and generating precise, fact-based responses in real time.

Team members

Langarkande Rishita Sanjeev (ESD), Loy Pek Yong (CSD), Aravind Gopinath Nair (CSD), Sim Reynard Simano (CSD), Lim Yongqing (CSD)

Instructors:

  • Cyrille Pierre Joseph Jegourel

Writing Instructors:

  • Susan Wong

What is GAR?

GAR is a system based on the popular AI framework Retrieval-Augmented Generation (RAG). Unlike a conventional RAG system, GAR combines multiple domain-specific Knowledge Bases, a Knowledge Graph, and an agentic controller to retrieve information.

Summary of GAR Architecture

[Figure: summary of the GAR architecture]

What makes GAR better?

Knowledge Bases
Knowledge Graph
Agentic

A single Knowledge Base in our project is akin to a generic, standalone Retrieval-Augmented Generation (RAG) system. We modularized the RAG system so that multiple Knowledge Bases can be created, each acting as a domain expert that answers queries in its own field.

We split the information into sector-specific Knowledge Bases, rather than storing everything in a single one, to reduce the time needed to search the vector database. Smaller vector databases are expected to retrieve the information required by the agent more efficiently and accurately than one large vector database, which could have lower accuracy and take longer to search.
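As a minimal sketch of this idea, the snippet below routes a query to the most relevant of several small Knowledge Bases and searches only that one. Everything in it is an illustrative assumption rather than GAR's actual code: the `KnowledgeBase` class, the `route` function, the sample data, and the bag-of-words `embed` function, which stands in for a real embedding model.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class KnowledgeBase:
    """Hypothetical domain-specific store with its own small index."""
    def __init__(self, name, description, chunks):
        self.name = name
        self.description = description
        self.index = [(chunk, embed(chunk)) for chunk in chunks]

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.index, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]

def route(query, knowledge_bases):
    """Pick the Knowledge Base whose description best matches the query."""
    q = embed(query)
    return max(knowledge_bases, key=lambda kb: cosine(q, embed(kb.description)))

bases = [
    KnowledgeBase("revenue", "revenue sales income growth",
                  ["Revenue grew in the third quarter.", "Sales income fell in Europe."]),
    KnowledgeBase("debt", "debt loans interest borrowing",
                  ["Long-term debt was refinanced.", "Interest on loans increased."]),
]
query = "How did interest on loans change?"
chosen = route(query, bases)
top = chosen.search(query)
print(chosen.name, top)  # debt ['Interest on loans increased.']
```

Only the chosen Knowledge Base is searched, so the cost of each lookup depends on the size of one domain index rather than on the whole corpus.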

Given the wide array of powerful tools available, the following are the components we employed in developing the Knowledge Base:

  • PyMuPDF (PDF Parser)
  • dsRAG (Chunking)
  • OpenAI Embeddings (Embedding)
  • OpenAI GPT-4o mini (Large Language Model)
  • ChromaDB (Vector database)
  • Cohere Rerank (Re-Ranker)
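To show how these stages might fit together, here is a minimal sketch of a chunk, retrieve, and re-rank flow. It does not call any of the libraries listed above. Each function is a simple stand-in chosen for illustration: `chunk` is a fixed word-window splitter, `first_stage` uses token overlap in place of embedding search in a vector database, and `rerank` counts matching word pairs in place of a trained re-ranker. All names and sample texts are assumptions.

```python
import re

def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def chunk(text, size=8, overlap=2):
    """Split text into overlapping word windows (assumes size > overlap)."""
    words = text.split()
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]

def first_stage(query, chunks, k=3):
    """Cheap, recall-oriented search: Jaccard overlap of token sets."""
    q = set(tokens(query))
    def score(candidate):
        t = set(tokens(candidate))
        return len(q & t) / len(q | t) if q | t else 0.0
    return sorted(chunks, key=score, reverse=True)[:k]

def rerank(query, candidates, k=1):
    """Precision-oriented pass: count adjacent query word pairs kept in order."""
    q = tokens(query)
    pairs = set(zip(q, q[1:]))
    def score(candidate):
        t = tokens(candidate)
        return len(pairs & set(zip(t, t[1:])))
    return sorted(candidates, key=score, reverse=True)[:k]

print(chunk("a b c d e f g h i j", size=4, overlap=1))
# ['a b c d', 'd e f g', 'g h i j']

chunks = [
    "Profit, net of margin calls, fell.",
    "Net profit margin improved during the year.",
    "Staff headcount was unchanged.",
]
query = "net profit margin"
candidates = first_stage(query, chunks, k=2)
best = rerank(query, candidates, k=1)
print(best)  # ['Net profit margin improved during the year.']
```

In this toy example the first stage ranks the shorter, less relevant chunk highest, and the second pass corrects the order. That is the usual motivation for placing a re-ranker after a fast first-stage search.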

The Knowledge Graph is a complementary system to the Knowledge Base. It modifies the original query by using the relationships captured in the graph structure to find related sections, which then serve as background knowledge for the subsequent query to the Knowledge Base. The Knowledge Graph is built on Neo4j’s graph database system, which stores data as nodes and relationships and maintains good query performance even as the database grows.

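Neo4j is queried with Cypher, a declarative graph query language that plays a role similar to that of Structured Query Language (SQL) in relational databases. Data is queried and manipulated by matching explicit patterns of nodes and relationships, as opposed to the similarity search used by a vector database.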
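The sketch below illustrates the query-modification idea with a small in-memory graph instead of a live Neo4j instance. The section titles, the `RELATED_TO` relationship, the `Section` label, and the function names are all hypothetical. The Cypher string shows how a comparable lookup might be written under that assumed schema and is not executed here.

```python
from collections import defaultdict

# Hypothetical Cypher for the same neighbour lookup (assumed schema, not run here).
RELATED_SECTIONS_CYPHER = """
MATCH (s:Section {title: $title})-[:RELATED_TO]-(r:Section)
RETURN r.title AS title
"""

# Illustrative edges between document sections.
EDGES = [
    ("Revenue", "Segment Reporting"),
    ("Revenue", "Operating Income"),
    ("Operating Income", "Operating Expenses"),
]

def build_graph(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def expand_query(query, graph):
    """Append the neighbours of any section named in the query as extra context."""
    mentioned = [title for title in graph if title.lower() in query.lower()]
    related = sorted({r for title in mentioned for r in graph[title]} - set(mentioned))
    if not related:
        return query
    return f"{query} (related sections: {', '.join(related)})"

graph = build_graph(EDGES)
expanded = expand_query("What drove revenue growth?", graph)
print(expanded)
# What drove revenue growth? (related sections: Operating Income, Segment Reporting)
```

The expanded query can then be passed to the Knowledge Base, so the retriever also considers sections that are related to the topic but not named in the user's wording.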

The Knowledge Graph provides useful features such as:

  • Relationship-based data modeling
  • Atomicity, Consistency, Isolation, and Durability (ACID) compliance
  • Strong data integrity and consistent data
  • Suitable for applications such as financial systems, e-commerce, and customer relationship management (CRM)
  • Data visualization

Agentic refers to the capacity to act independently and achieve outcomes through self-directed actions and informed decision-making, and GAR is designed to do exactly that.

GAR employs a variety of methods to dynamically create different tools and determine the next best action. Leveraging the LangChain library in combination with an OpenAI LLM, GAR autonomously sets up the tool creation process, allowing it to adapt to diverse scenarios. GAR evaluates a task, uses the available tools, and creates new ones when necessary to enhance its decision-making. The system keeps this efficient by optimizing tool selection and using the ReAct reasoning framework to generate the next best action.
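The ReAct pattern alternates between a reasoning step, a tool call, and an observation until the agent decides it can answer. The sketch below shows only that control loop, under clearly stated assumptions: `scripted_policy` is a hard-coded stand-in for the LLM, the tools and figures are invented for illustration, and neither LangChain nor dynamic tool creation is involved.

```python
# Invented figures used only for this illustration.
FIGURES = {"revenue_2022": 200.0, "revenue_2023": 230.0}

def lookup(key):
    """Tool: fetch a stored figure by name."""
    return FIGURES[key]

def percent_change(values):
    """Tool: percentage change from the first value to the second."""
    old, new = values
    return round((new - old) / old * 100, 1)

TOOLS = {"lookup": lookup, "percent_change": percent_change}

def scripted_policy(question, observations):
    """Stand-in for the LLM: returns (thought, action, argument) for the next step."""
    if len(observations) == 0:
        return "I need last year's revenue.", "lookup", "revenue_2022"
    if len(observations) == 1:
        return "Now I need this year's revenue.", "lookup", "revenue_2023"
    if len(observations) == 2:
        return "I can compute the growth.", "percent_change", (observations[0], observations[1])
    return "I have the answer.", "finish", f"Revenue grew {observations[2]}%."

def react(question, policy, tools, max_steps=5):
    """Run the thought -> action -> observation loop until the policy finishes."""
    observations, trace = [], []
    for _ in range(max_steps):
        thought, action, argument = policy(question, observations)
        trace.append((thought, action))
        if action == "finish":
            return argument, trace
        observations.append(tools[action](argument))
    raise RuntimeError("no answer within the step budget")

answer, trace = react("How much did revenue grow?", scripted_policy, TOOLS)
print(answer)  # Revenue grew 15.0%.
```

In a real agent the policy would be an LLM prompted with the question, the tool descriptions, and the observations so far. The step budget guards against loops that never reach a final answer.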

Benchmark

FinanceBench is a comprehensive benchmark developed by Patronus AI to evaluate the performance of large language models (LLMs) in financial question answering. It comprises over 10,000 questions related to publicly traded companies, each accompanied by corresponding answers and evidence strings. The questions are designed to be clear-cut and straightforward, serving as a minimum performance standard for LLMs in financial contexts.

Using FinanceBench, GAR answered 93 of 102 questions, achieving an evaluation score of 91% and outperforming other similar systems.
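The reported score is consistent with the proportion of questions answered. The short check below assumes the score is a plain ratio of answered questions to total questions, which the text does not state explicitly. The function name is illustrative.

```python
def proportion_answered(answered, total):
    """Fraction of benchmark questions answered."""
    if total <= 0:
        raise ValueError("total must be positive")
    return answered / total

score = proportion_answered(93, 102)
print(f"{score:.1%}")  # 91.2%
```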

Acknowledgements

Our team would like to thank our Capstone instructors, Dr Cyrille Jegourel and Dr Susan Wong, for their valuable advice, which was pivotal to GAR’s success.

The team would also like to thank our ST Engineering mentors, Mr Ween Jiann Lee and Chew Kay Thiam Dennis, for their guidance and support.

Contact the Capstone Office :

+65 6499 4076

8 Somapah Road Singapore 487372
