NathanNeelis/llm_rag_demo_pdf

DEMO Large Language Models RAG

This repository contains a Jupyter notebook with a working RAG (retrieval-augmented generation) demo.

Setup

  1. You need to set up an OpenSearch database locally. I prefer using Docker.
  2. You need an OpenAI API key with some credits on it.

In this demo

  1. Setup
  2. Connect to OpenSearch
  3. Connect to OpenAI
  4. Create an index in the OpenSearch database
  5. Extract text from the PDF
  6. Chunk the text
  7. Generate embeddings (from the chunks)
  8. Index the chunks
  9. Experiment with kNN results
  10. Write a query
  11. Get the top 5 results (Retrieval step)
  12. Build the context for the LLM (Augmentation step)
  13. Prompt the LLM (Generation step)
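
Steps 6 to 12 above can be sketched without any external services. The snippet below is a minimal illustration, not the notebook's code: `chunk_words`, `toy_embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical names, and the bag-of-words `toy_embed` is only a stand-in for the OpenAI embeddings the demo uses. In the demo itself the nearest-neighbour search runs inside OpenSearch rather than in Python.

```python
import math
from collections import Counter


def chunk_words(text, size=8, overlap=2):
    """Split text into overlapping windows of `size` words (step 6)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector (step 7)."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=5):
    """Return the k chunks most similar to the query (steps 10 and 11)."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Join the retrieved chunks into a prompt for the LLM (step 12)."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

The string returned by `build_prompt` is what would be sent to the LLM in the generation step. Overlapping chunks are a common choice so that a sentence cut at a chunk boundary still appears whole in at least one chunk.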

Future work

The following list is what comes to mind:

  1. Add more than one document
  2. Add and use metadata, so you can tell which source a piece of information comes from
  3. Refactor the notebook code into Python functions
  4. Build a pipeline
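
As one possible direction for items 2 and 3, the request bodies for OpenSearch could be built by small functions that carry source metadata along with each chunk. This is a sketch under assumptions, not the notebook's code: the function names and the field names `embedding`, `text`, `source`, and `page` are hypothetical, and the vector dimension has to match whichever embedding model is used. With the `opensearch-py` client, bodies like these would typically be passed to `client.indices.create(index=..., body=...)` and `client.search(index=..., body=...)`.

```python
def knn_index_body(dimension):
    """Index definition with a k-NN vector field plus metadata fields."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {"type": "knn_vector", "dimension": dimension},
                "text": {"type": "text"},
                "source": {"type": "keyword"},  # assumed field: file name of the PDF
                "page": {"type": "integer"},    # assumed field: page number of the chunk
            }
        },
    }


def chunk_document(text, embedding, source, page):
    """One document to index: the chunk, its vector, and where it came from."""
    return {"text": text, "embedding": list(embedding), "source": source, "page": page}


def knn_query_body(vector, k=5):
    """k-NN search body that returns text and metadata, not the raw vectors."""
    return {
        "size": k,
        "_source": ["text", "source", "page"],
        "query": {"knn": {"embedding": {"vector": list(vector), "k": k}}},
    }


def format_context(hits):
    """Turn search hits (response["hits"]["hits"]) into a source-tagged context string."""
    lines = []
    for hit in hits:
        doc = hit["_source"]
        lines.append(f"[{doc['source']} p.{doc['page']}] {doc['text']}")
    return "\n".join(lines)
```

Tagging each context line with its source lets the prompt ask the LLM to cite where an answer came from, which becomes important once more than one document is indexed.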

About

A walk through the RAG steps, using OpenSearch as a vector database: extract and embed text from a PDF, retrieve with kNN, and generate with OpenAI.
