This repository contains a jupyternotebook file which contains a working RAG demo.
- You need to setup a opensearch database locally. I prefer using Docker.
- You need an OpenAI apikey which some credits on it.
- Setup
- Connect to Opensearch
- Connec to OpenAI
- Create an index in the Opensearch database
- Extraxt text from the PDF
- Chunk the text
- Generate embeddings (from the chunks)
- Index the chunks
- Experiment with kNN results
- Write a query
- Get the top 5 results (Retrieval step)
- Build the context for the LLM (Augmentation step)
- Prompt the LLM (Generation step)
The following list is what comes to mind:
- Add more then 1 document
- Add and use metadata. From which source comes your information?
- Upgrade code to python functions
- build a pipeline