Scrape website content, store it in a vector DB, and ask questions about it using a local LLM (Mistral via Ollama). Built with LangChain, FAISS, and Streamlit.
- 🌍 Web scraping using requests + BeautifulSoup
- 🔍 Embedding text chunks via sentence-transformers
- 💾 Semantic search using a FAISS vector database
- 🤖 Local LLM (Mistral via Ollama) for Q&A
- 🖥️ Easy-to-use Streamlit UI
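The scrape → chunk → embed → search pipeline behind these features can be sketched as below. For illustration, a deterministic bag-of-words hasher stands in for the sentence-transformers model and a brute-force numpy cosine search stands in for the FAISS index; the app itself uses those libraries, and the helper names here are illustrative, not the app's actual functions.

```python
import hashlib
import numpy as np

def chunk_text(text: str, size: int = 8) -> list[str]:
    """Split text into word-based chunks of roughly `size` words each."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def bucket(word: str, dim: int) -> int:
    """Deterministic hash bucket for a word (toy stand-in for a real embedder)."""
    return int.from_bytes(hashlib.md5(word.encode()).digest()[:4], "big") % dim

def embed(texts: list[str], dim: int = 64) -> np.ndarray:
    """Toy bag-of-words embedding, L2-normalized so dot product = cosine similarity."""
    vecs = np.zeros((len(texts), dim))
    for row, text in enumerate(texts):
        for word in text.lower().split():
            vecs[row, bucket(word.strip(".,"), dim)] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.clip(norms, 1e-9, None)

def search(index: np.ndarray, query_vec: np.ndarray, k: int = 2) -> list[int]:
    """Return indices of the k most similar chunks (brute-force, FAISS stand-in)."""
    scores = index @ query_vec
    return list(np.argsort(scores)[::-1][:k])

# In the real app, `page_text` comes from requests + BeautifulSoup.
page_text = (
    "FAISS is a library for efficient similarity search. "
    "Streamlit makes it easy to build data apps in Python. "
    "Ollama runs large language models locally."
)
chunks = chunk_text(page_text)
index = embed(chunks)
top = search(index, embed(["efficient similarity search library"])[0], k=1)
print(chunks[top[0]])  # most relevant chunk
```

The same flow applies with the real libraries: swap the toy embedder for `SentenceTransformer.encode` and the numpy search for a FAISS index.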
```bash
git clone https://github.com/AzkaSahar/AI-Web-Scraper.git
cd AI-Web-Scraper
pip install -r requirements.txt
```
- Install and run Ollama on your machine
- Pull the Mistral model:

```bash
ollama pull mistral
```

Run the app:

```bash
streamlit run ai_webscraper.py
```

- Input a website URL
- It scrapes and stores text chunks in a FAISS index
- Ask a question — the app retrieves relevant content and passes it to the LLM
- The LLM answers based on that content
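The final step — passing retrieved content to the local LLM — can be sketched as follows. This assumes Ollama is running on its default port (11434) and uses its standard `/api/generate` endpoint; the prompt template and function names are illustrative, not the app's exact code.

```python
import json
import urllib.request

def build_prompt(question: str, chunks: list[str]) -> str:
    """Stuff the retrieved chunks and the user question into a single prompt."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def ask_ollama(prompt: str, model: str = "mistral") -> str:
    """Send the prompt to a locally running Ollama server and return its answer."""
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

prompt = build_prompt(
    "What does FAISS do?",
    ["FAISS is a library for efficient similarity search."],  # retrieved chunks
)
print(prompt)
# answer = ask_ollama(prompt)  # requires a running Ollama server with mistral pulled
```

Because the model only sees the retrieved chunks, its answers stay grounded in the scraped page rather than its general training data.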
├── ai_webscraper.py # Main Streamlit script
├── requirements.txt
└── README.md
MIT — free to use, modify, and distribute.