Use Case Financial
RAG for CHATBOT
Our client is a major player in the savings sector, both in France and internationally. They process a large volume of contracts, term sheets, and brochures for various clients. These documents usually contain financial information such as payment calendars, prices, and regulatory compliance details.
Manually searching for the required information takes a lot of time, which reduces productivity.
They would like to develop a chatbot that allows employees to interact with these documents and obtain the required information in a fraction of the time, along with the source from which the response is derived.
Challenges
The task is to extract key information from multi-page documents that mix text, tables, and several data types. This requires handling long contexts, applying domain-specific knowledge, and identifying the one section of primary interest among others with similar but slightly different content.
Complex document structure
Multi-page documents include both text and tables. The relevant information is presented in various data types (text, numbers, and dates).
Contextual data extraction
Extracting the required information often involves understanding long contexts and domain-specific knowledge.
Identifying relevant sections
Many sections share similar information with slight variations, and only one section is of primary interest.
Solution
The proposed solution loads and processes the documents into a vector database to enable efficient information retrieval. For a given query, a similarity search identifies the top-k most relevant text chunks. A Retrieval-Augmented Generation (RAG) system then generates the final response from those chunks, using LangChain, Faiss, and the OpenAI API for the latest GPT models.
To make the system user-friendly, Streamlit is used to build a web application where users can upload documents and interact with the chatbot. Once a document is uploaded, the user can ask questions, and the system returns a response based on the most relevant information extracted from the document.
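The retrieval flow described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: it uses only the Python standard library, with a toy bag-of-words vector standing in for a real embedding model and a brute-force cosine ranking standing in for a Faiss index. The names `Chunk`, `split_into_chunks`, `embed`, `top_k`, and `build_prompt`, the file name `term_sheet.pdf`, and the sample page texts are all hypothetical. In the stack named above, LangChain, Faiss, and the OpenAI API would presumably fill these roles, with the assembled prompt sent to a GPT model.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # file name, kept so the answer can cite where it came from
    page: int    # 1-based page number


def split_into_chunks(pages, source, max_words=40):
    """Split each page into word-bounded chunks, keeping source and page number."""
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        words = page_text.split()
        for start in range(0, len(words), max_words):
            text = " ".join(words[start:start + max_words])
            chunks.append(Chunk(text, source, page_number))
    return chunks


def embed(text):
    """Toy bag-of-words vector (assumption: a real system would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute force, no index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:k]


def build_prompt(query, retrieved):
    """Assemble the prompt for the generation step, tagging each chunk with its source."""
    context = "\n".join(f"[{c.source}, p.{c.page}] {c.text}" for c in retrieved)
    return (
        "Answer using only the context below and cite the source in brackets.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Made-up sample pages, for illustration only.
pages = [
    "The product pays a coupon of 4 percent per year.",
    "Payment dates are 15 March and 15 September of each year.",
    "The issuer complies with applicable regulatory disclosure requirements.",
]
question = "When are the payment dates?"
chunks = split_into_chunks(pages, source="term_sheet.pdf")
hits = top_k(question, chunks, k=2)
prompt = build_prompt(question, hits)
print(prompt)
```

Keeping the file name and page number on every chunk is what lets the chatbot return the source alongside the answer. Swapping the toy `embed` and the sort in `top_k` for a real embedding model and a vector index changes the retrieval quality and speed, but not the overall shape of the flow.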
Tech stack
LangChain, Faiss, OpenAI API, Streamlit
Results
In just five weeks, a demo web application was developed and deployed on an AWS instance. The app successfully uploads and processes PDF/Text documents, handling various formats.
83% correct responses
59 Q-A pairs for evaluation
