
How to Chat with Long Documents Using ChatGPT: Three Methods

Oct 24, 2023

With the advent of ChatGPT, utilizing AI has become a common practice for many.

Generative AI finds applications in various aspects of life.

On a daily basis, ChatGPT assists business professionals, students, teachers, researchers, and just about anyone else.

However, ChatGPT is not without limitations.

The material we want ChatGPT to process and comprehend is often too long to fit within a single prompt.

The limit on prompt length is a significant drawback of ChatGPT.

Nevertheless, there are various solutions to bypass this limitation, and in this article, we will demonstrate three possible approaches.

1. Prompt Splitting

The technique of prompt splitting is one of the methods of what's known as prompt engineering.

It involves first instructing ChatGPT that the text will arrive in several parts and that it should only acknowledge each part as it is received. Once ChatGPT confirms receipt of all parts, we can proceed with further instructions related to the text.

An example of splitting a text into smaller parts:

💡 The total length of the content that I want to send you is too large to send in only one piece. For sending you that content, I will follow this rule:

[START PART 1/2]
first part of the split text
Do not answer yet. This is just one part of the text I want to send you. Just receive and acknowledge as "Part 1/2 received" and wait for the next part.

[START PART 2/2]
second part of the split text

Several free websites offer automated text splitting for ChatGPT prompts.

They allow you to paste the text you're interested in or upload a file, and then divide the content into multiple prompts based on the length of your text.

Next, you copy these prompts and paste them into ChatGPT one after another.
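The same splitting can be scripted. Below is a minimal, hypothetical sketch; the chunk size and the wrapper wording are arbitrary choices for illustration, not a fixed standard:

```python
def split_into_prompts(text: str, max_chars: int = 3000) -> list[str]:
    """Split `text` into chunks and wrap each in a part-marker template."""
    # Cut the raw text into pieces no longer than max_chars each.
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    total = len(chunks)
    prompts = []
    for n, chunk in enumerate(chunks, start=1):
        if n < total:
            instruction = (f'Do not answer yet. Just acknowledge as '
                           f'"Part {n}/{total} received" and wait for the next part.')
        else:
            instruction = "All parts sent. You may now follow my next instruction."
        prompts.append(f"[START PART {n}/{total}]\n{chunk}\n{instruction}")
    return prompts

# A 7,000-character text at 3,000 characters per chunk yields three prompts.
for prompt in split_into_prompts("A" * 7000):
    print(prompt[:40])
```

Each returned string can then be pasted into ChatGPT in order, just as the manual method describes.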

2. Using Vector Database and Langchain

The information we request from ChatGPT does not always pertain to the entire document.

Frequently, the information may concern only a snippet of the document we want to process.

When we want to analyze a legal document or a research paper using ChatGPT, we might inquire about a specific piece of information. For example, the answer to the question "When was ChatGPT launched?" can be found in a single sentence within the document: "ChatGPT was launched as a prototype on November 30, 2022."

So, we don't need to feed ChatGPT the entire document when just that one chunk of the document will suffice.

How to Chat with Files?

The solution that automates this process combines three key components:

  • Embeddings
  • Vector Database
  • Prompt Chaining

Let's go through each of them to better explain the mechanism of chatting with files.

Embeddings

An embedding is a numerical representation of text, whether a word, a sentence, or a whole document.

In other words, to embed content means to transform it into its vectorized representation.
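As a toy illustration of the idea, we can map text to a vector by counting occurrences of vocabulary words. This is deliberately simplistic; real embedding models, such as OpenAI's text-embedding endpoints, learn dense vectors with hundreds or thousands of dimensions that capture meaning, not just word counts:

```python
from collections import Counter

def toy_embedding(text: str, vocab: list[str]) -> list[float]:
    """A deliberately naive 'embedding': one dimension per vocabulary word,
    holding that word's count in the text. Only illustrates the principle
    that text becomes a list of numbers."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]

vocab = ["chatgpt", "launched", "prototype", "cat"]
vec = toy_embedding("ChatGPT was launched as a prototype", vocab)
print(vec)  # [1.0, 1.0, 1.0, 0.0]
```

Once text is a vector, "similar meaning" can be approximated as "nearby vectors", which is exactly what the vector database exploits.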

Vector Database

A vector database is a space where we can store vectors created through embeddings. To store them in an organized manner, we use special indexes. By utilizing algorithms, we can search a given index for vectors (words) that are similar. This is achieved by sending a vectorized query to our index, which then returns the most similar text fragments.
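A minimal sketch of what such an index does, using cosine similarity over hand-made vectors. Real vector databases like FAISS use optimized approximate-nearest-neighbor structures rather than a linear scan, but the principle is the same:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity of two vectors: 1.0 means same direction, 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# A tiny "index": text fragments stored alongside made-up example vectors.
index = [
    ("ChatGPT was launched as a prototype on November 30, 2022.", [0.9, 0.1, 0.0]),
    ("The weather in Paris is mild in spring.", [0.0, 0.2, 0.9]),
]

def search(query_vector: list[float], index, top_k: int = 1) -> list[str]:
    """Return the top_k stored fragments most similar to the query vector."""
    scored = sorted(index,
                    key=lambda item: cosine_similarity(query_vector, item[1]),
                    reverse=True)
    return [text for text, _ in scored[:top_k]]

# A query vector close to the first fragment's vector retrieves that fragment:
print(search([0.8, 0.2, 0.1], index))
```

In practice the query vector comes from embedding the user's question with the same model used to embed the documents.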

Prompt Chaining

Chaining prompts allows us to cleverly combine our prompt with the text fragments returned from the vector database and send this combined prompt to a model like ChatGPT. This way, we can chat with really long documents without having to load them beforehand in several prompts to ChatGPT.
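The combining step itself is simple string assembly. Here is a hedged sketch; the wrapper wording is an arbitrary choice for illustration, not a fixed LangChain template:

```python
def build_chained_prompt(question: str, fragments: list[str]) -> str:
    """Combine retrieved text fragments with the user's question
    into a single prompt for the language model."""
    context = "\n".join(f"- {fragment}" for fragment in fragments)
    return ("Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}")

prompt = build_chained_prompt(
    "When was ChatGPT launched?",
    ["ChatGPT was launched as a prototype on November 30, 2022."],
)
print(prompt)
```

The model then answers from the supplied fragments alone, so only the relevant slices of a long document ever enter the prompt.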

How to Train Your AI Chatbot Using Langchain, a Vector Database, and ChatGPT?

This mechanism may seem somewhat complex. We have many individual components that must work sequentially together to enable us to chat with long documents.

However, Python and the Langchain library come to the rescue, doing a significant portion of the work for us. Langchain is a library designed to simplify the creation of advanced applications using large language models. It assists in quickly integrating APIs of various language models (including OpenAI), vector databases, and many other things.

So, let's build an AI chatbot that can chat with long documents.

This will be a minimal solution that allows us to chat with a PDF document.

We will do it in several steps:


First, install all the necessary Python libraries:


pip install faiss-cpu openai langchain tiktoken pymupdf

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.faiss import FAISS
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import PyMuPDFLoader

Next, we need to load the OpenAI API token as an environment variable:


import os
os.environ["OPENAI_API_KEY"] = ""  # paste your OpenAI API key here
llm = ChatOpenAI(temperature=0)

In the following step, we'll load our PDF file and immediately split it into smaller chunks, using sentence-based segmentation. This will make the text fragments we send to ChatGPT more understandable and contextually relevant:


loader = PyMuPDFLoader("example_data/layout-parser-paper.pdf")
text_splitter = CharacterTextSplitter(separator=". ", chunk_size=1000, chunk_overlap=100)
documents = loader.load_and_split(text_splitter)

Now, we'll embed our entire PDF document, transforming it into vectors, and save it in the FAISS vector database:


# Initialize the FAISS vector store with the documents and the desired text embeddings:
vectorstore = FAISS.from_documents(documents, OpenAIEmbeddings())

Finally, we'll load a chain that allows us to connect our queries with the results from the vector database and efficiently send "chained" prompts to the ChatGPT model:


# Initialize the RetrievalQA chain with the FAISS retriever and the language model:
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())

Now you can test the system by sending a query to the model:


# Use the run() method of the QA chain to ask questions and get answers:
question = "What is LayoutParser?"
answer = qa_chain.run(question)
# run() returns the answer as a string, generated from the relevant PDF fragments.

The entire code from this tutorial is available for testing here.

3. Chat with Any Document Using Knowbase

Using open-source libraries and writing your own code to chat with documents is not the best solution for everyone.

Along the way, you may encounter many unknowns that you may not always be able to resolve on your own.

The solution that saves time is Knowbase.

Knowbase makes it easy to chat with various types of documents using ChatGPT models.

The process is incredibly straightforward.

You add the documents you want to chat with, and then you're ready to start chatting, asking questions, and summarizing.

That's not all – you can also upload recordings and YouTube videos. Knowbase transcribes them, enabling you to chat with videos as well.

Each response from the chatbot is accompanied by a timestamp that corresponds to the segment of the recording from which the response was derived. You can click on the bubble below the response, and it will take you to the relevant segment in the recording.

All your files will be stored in your Library, ensuring you have access to them whenever you need to retrieve essential information.

ChatGPT's evolution has ushered in a new era of AI-assisted productivity, yet its character limit can be a hurdle. However, the innovative strategies discussed in this article empower users to transcend these boundaries, ensuring that ChatGPT remains an invaluable tool for professionals, students, and researchers across diverse fields.

Also, take a look at: Step-by-Step Guide on Creating a Knowledge Base Chatbot Using Knowbase.
