INTELLIPAL, Hybrid RAG System

Hybrid AI chatbot delivering reliable protocol access anytime, with or without connectivity.

In collaboration with: HTX

Introducing INTELLIPAL, a Hybrid RAG System

INTELLIPAL is a RAG-based intelligence application that empowers officers operating on the ground by decentralizing information access, with or without internet connectivity. By augmenting slow legacy systems with a unified retrieval interface, it reduces reliance on supervisor consultations and keyword search for routine protocol verification. This shift to a fast, natural-language application minimizes operational friction and accelerates frontline decision-making, transforming traditional information-search workflows into a more agile and effective digital system.

Team members

Lee Jun Hui Ryan (ISTD), Cai Junjie (ISTD), Gay Shin Lee (ISTD), Ho Xiaoyang (ISTD), Luvena Liethanti (ESD), Shah Pankti Amish (ISTD), Sun Sitong (ISTD)

Instructors:

  • Fredy Tantri

Writing Instructors:

  • Bernard Tan

  • Susan Wong

Imagine yourself as a rookie officer...

Let's Identify the Problem Statement

Ground Response Force (GRF) officers currently lack a reliable way to access critical Standard Operating Procedures (SOPs) and legal reference materials while deployed in the field.

The existing knowledge base is hindered by keyword-dependent search mechanisms and a reliance on stable internet connectivity, which is often unavailable in operational “blackspots”.

Introducing...

Take a Closer Look at Our Product

System Architecture

1. Indexing Pipeline
Large Language Models (LLMs) can produce outdated or hallucinated outputs due to overreliance on parametric memory; Retrieval-Augmented Generation (RAG) mitigates this by grounding responses in retrieved, domain-specific documents, improving trustworthiness and traceability. Documents are split into semantically coherent chunks and embedded for effective retrieval, while a KNN-clustering-based partial loading approach reduces RAM usage for resource-constrained mobile environments.
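The indexing steps above can be sketched as follows. This is a minimal, illustrative Python version, not the production pipeline: the toy hash-based embedding stands in for a real sentence-transformer model, and the k-means routine stands in for the KNN-clustering step that lets the device load only one cluster of vectors into RAM at a time. All function names and parameters here are hypothetical.

```python
# Sketch of the indexing pipeline: chunk -> embed -> cluster.
# The embedding is a deterministic toy (NOT semantic); a real system
# would call an embedding model here.
import math
import random
import zlib

def chunk(text, size=40, overlap=10):
    """Split text into overlapping character chunks of at most `size`."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(piece, dim=8):
    """Toy unit-norm embedding seeded by a stable checksum of the chunk."""
    rng = random.Random(zlib.crc32(piece.encode("utf-8")))
    v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def kmeans(vectors, k=2, iters=10):
    """Minimal k-means; returns centroids and a cluster id per vector.
    At query time, only the partition nearest the query's centroid
    would be loaded from disk, reducing peak RAM usage."""
    centroids = [list(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        for i, v in enumerate(vectors):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        for c in range(k):
            members = [vectors[i] for i in range(len(vectors)) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign
```

At query time, the query vector is compared against the k centroids first, and only the chunks assigned to the winning cluster are brought into memory for full similarity search.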
2. Edge Device
Our edge system uses a C++ interoperability layer to connect mobile environments with a low-latency llama.cpp engine, combining a unified schema and C++-compatible vector database to enable optimized dot-product similarity search for millisecond-scale retrieval. It employs hybrid offline-online orchestration (Online inference API + local inference engine) for reliability in all environments, and is built on a model-agnostic, database-agnostic architecture for scalable, future-proof integration of new LLMs and vector stores.
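The two core behaviours described here, dot-product similarity search and hybrid offline-online orchestration, can be sketched in a few lines. This is a simplified Python illustration of the logic (the actual system runs through a C++ interoperability layer and llama.cpp); `query_online` and `query_local` are hypothetical stand-ins for the online inference API and the local engine.

```python
# Brute-force dot-product retrieval over unit-norm vectors
# (equivalent to cosine similarity), plus a hybrid orchestrator
# that prefers the online API and falls back to local inference.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def top_k(query_vec, index, k=3):
    """index: list of (chunk_id, vector) pairs, vectors assumed unit-norm.
    Returns the ids of the k most similar chunks."""
    scored = sorted(index, key=lambda item: dot(query_vec, item[1]), reverse=True)
    return [cid for cid, _ in scored[:k]]

def answer(question, query_online, query_local):
    """Hybrid orchestration: try the online inference API first;
    on any failure (no connectivity, timeout), fall back to the
    local on-device engine so the officer always gets an answer."""
    try:
        return query_online(question)
    except Exception:
        return query_local(question)
```

Because the two backends sit behind the same interface, swapping in a different LLM or vector store only changes the callables passed in, which is the model-agnostic, database-agnostic property the architecture aims for.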
3. UI/UX
This project applied a human-centred design (HCD) process grounded in Don Norman’s principles of discoverability, feedback, and affordance. It began with interviews with SPF personnel to identify friction points in SOP retrieval, followed by iterative design and A/B testing to evaluate search affordances, result hierarchy, and navigation CTAs against task-completion metrics. Insights informed low-fidelity wireframes for early validation, before progressing to a high-fidelity prototype aligned with SGDS v2 design tokens, WCAG 2.1 AA accessibility standards, and Samsung Galaxy S20 One UI constraints.

Product Evaluation

  • 96% ↓ — Time to Reach Target Content

  • 6.3 s → 0.9 s — Time to Retrieve Confidence Score

  • 36× — System Throughput Increase

Our evaluation shows that, after several rounds of technical improvement, the system is fast, reliable, and ready for frontline use. We rigorously tested the prototype against four metrics: Faithfulness, Answer Relevance, Context Precision, and Context Recall. The results show that the search pipeline is highly accurate, consistently retrieving the right documents without fabricating information.
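For the two retrieval metrics, a minimal sketch of their usual definitions is shown below. This assumes the project follows the standard formulation (fraction of retrieved chunks that are relevant, and fraction of relevant chunks that were retrieved); the source does not spell out the exact variant used.

```python
# Standard set-based definitions of the two retrieval metrics,
# assuming relevance labels per chunk id are available.

def context_precision(retrieved, relevant):
    """Fraction of retrieved chunks that are actually relevant."""
    if not retrieved:
        return 0.0
    return sum(1 for c in retrieved if c in relevant) / len(retrieved)

def context_recall(retrieved, relevant):
    """Fraction of relevant chunks that were retrieved."""
    if not relevant:
        return 0.0
    return sum(1 for c in relevant if c in retrieved) / len(relevant)
```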

We learned that while shrinking the model and compressing the retrieved text speeds things up, it can hurt the factual correctness of the final answers. To solve this, we found that 4-bit or 8-bit quantized models combined with strict noise filtering provide the best balance between speed and accuracy. Across four major updates, we massively improved generation speed, jumping from 2 to 73 tokens per second, and dropped the initial wait time to just 1.8 seconds. Combined with faster database search methods such as partial loading, the final system delivers instant, trustworthy guidance to officers directly on their devices.

Impacts

Economic Impact

It optimizes manpower by reducing administrative bottlenecks.

Social Impact

Our system increases public safety through faster response times.

Environmental Impact

It pushes users toward a truly paperless, digital-first frontline.

User Feedback

Acknowledgements

This Capstone project would not have been possible without the invaluable guidance, support, and collaboration of several key individuals and institutions.

We extend our sincere gratitude to our SUTD Capstone Mentors, Dr. Fredy Tantri, Prof. Kenny Choo, and Geraldine Quek for their unwavering support, rigorous feedback, and technical direction, which were instrumental in shaping the project’s architecture and methodology.  

Our deepest appreciation goes to our industry partner, HTX. We extend special thanks to Shisheng Huang, Calista Choy, Prasanth Karthikeyan, and Justin Yeo for providing us with the critical operational insights, access to Ground Response Force (GRF) personnel, and the real-world problem statement that grounded this research. Their commitment to innovation and willingness to collaborate with our team was essential.
