
sahnoun11/ThreatLens


🔍 ThreatLens

AI-Powered Log Analysis & Threat Hunting Assistant


Built by Oussama Sahnoun, for the community 🌍

100% free AI assistant that hunts threats in your logs like a senior SOC analyst


🎥 Live Demo

▶️ The full ThreatLens demo is available on YouTube.


📌 Overview

ThreatLens is an open-source, AI-powered log analysis and threat hunting assistant built for SOC analysts and cybersecurity professionals. Upload your Windows Event Logs (.evtx) or Linux logs (.txt), ask questions in plain English, and get threat hunting answers in seconds instead of reviewing the logs by hand.

Built on a fully free stack: Groq API for fast LLM inference, Ollama for local private embeddings, and PostgreSQL + pgvector for semantic search. Embeddings and the vector store stay on your machine; answering a question still sends that question and the retrieved log chunks to the Groq API.

🇹🇳 Built in Tunisia: 100% free stack, no international payment required


⚡ Key Features

| Feature | Description |
|---|---|
| 📂 Log ingestion | Windows EVTX and Linux TXT log support |
| 🌐 URL ingestion | Scrape and index any web page into the knowledge base |
| 🧠 RAG pipeline | Logs are chunked, embedded locally, and stored in pgvector |
| ⚡ Groq LLM | Ultra-fast inference with LLaMA 3.3 70B via the free Groq API |
| 🔒 Private embeddings | Fully local via Ollama; no data is sent to the cloud for embedding |
| 💬 Conversational memory | Multi-turn chat with full history awareness |
| 🆓 100% Free | Groq API + Ollama + Docker, zero cost |

📥 Supported Log Formats

| Format | Extensions | Status |
|---|---|---|
| Windows Event Log | .evtx | ✅ Supported |
| Plain text / Linux logs | .txt | ✅ Supported |
| JSON logs | .json | 🔜 Roadmap |
| CSV logs | .csv | 🔜 Roadmap |
| XML event logs | .xml | 🔜 Roadmap |
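
As a rough illustration of the table above, the sketch below routes an uploaded file by its extension. The function name `classify_log` and the two lookup tables are hypothetical; how app.py actually decides which reader to use is not shown in this README.

```python
from pathlib import Path

# Formats listed as supported in the table above (hypothetical lookup table).
SUPPORTED = {
    ".evtx": "Windows Event Log",
    ".txt": "Plain text / Linux logs",
}

# Formats listed as roadmap items: recognised, but not handled yet.
ROADMAP = {".json", ".csv", ".xml"}


def classify_log(filename: str) -> str:
    """Return the format name for a supported file, or raise for anything else."""
    ext = Path(filename).suffix.lower()
    if ext in SUPPORTED:
        return SUPPORTED[ext]
    if ext in ROADMAP:
        raise NotImplementedError(f"{ext} logs are on the roadmap but not supported yet")
    raise ValueError(f"unsupported log format: {ext or '(no extension)'}")
```

Lower-casing the extension means a file named `Security.EVTX` is treated the same as `security.evtx`.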

🏗️ Architecture

User uploads log (EVTX / TXT) or pastes a URL
              │
              ▼
     Chunker (300 chars/chunk)
     Safe for nomic-embed-text 512-token limit
              │
              ▼
   Ollama (nomic-embed-text)  ──►  pgvector (PostgreSQL)
              │
              ▼
       User asks a question
              │
              ▼
   Semantic search in pgvector
              │
              ▼
   Groq LLM: LLaMA 3.3 70B
              │
              ▼
   Expert threat hunting answer 🎯
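
The chunking and retrieval steps in the diagram can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in app.py: `toy_embed` is a hypothetical bag-of-words stand-in for nomic-embed-text, and the in-memory cosine ranking stands in for the similarity search that pgvector performs inside PostgreSQL. Only the 300-character chunk size comes from the diagram.

```python
import math
from collections import Counter
from typing import List

CHUNK_SIZE = 300  # characters per chunk, as stated in the diagram


def chunk_text(text: str, size: int = CHUNK_SIZE) -> List[str]:
    """Split text into consecutive pieces of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, chunks: List[str], k: int = 3) -> List[str]:
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]
```

In the real pipeline the top-ranked chunks would be passed to the LLM as context alongside the question. Fixed-size character chunks can split a log line in two, which is one reason better chunking strategies are a natural area for contributions.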

🧰 Tech Stack

| Component | Technology |
|---|---|
| UI | Streamlit |
| LLM | Groq API, llama-3.3-70b-versatile (free) |
| Embeddings | Ollama, nomic-embed-text (local, private) |
| Vector DB | PostgreSQL + pgvector (Docker) |
| RAG Framework | phidata |
| EVTX Parser | python-evtx |
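
To give a feel for the EVTX parsing step, here is a hedged sketch that reads records with python-evtx and pulls the event ID out of each record's XML. The helper names `event_id_from_xml` and `iter_evtx_xml` are hypothetical and the file reader in app.py may work differently; the sketch assumes each record is rendered as an `<Event>` element in the standard Windows event namespace.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Optional

# XML namespace used by Windows event records.
EVENT_NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def event_id_from_xml(xml_text: str) -> Optional[int]:
    """Return the EventID of one event rendered as XML, or None if it is absent."""
    root = ET.fromstring(xml_text)
    node = root.find("e:System/e:EventID", EVENT_NS)
    if node is None or not node.text:
        return None
    return int(node.text)


def iter_evtx_xml(path: str) -> Iterator[str]:
    """Yield every record of an .evtx file as an XML string (requires python-evtx)."""
    import Evtx.Evtx as evtx  # lazy import so the helper above works without it

    with evtx.Evtx(path) as log:
        for record in log.records():
            yield record.xml()
```

A typical use would be `[event_id_from_xml(x) for x in iter_evtx_xml("Security.evtx")]`, where the file name is only an example.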

🚀 Quick Start

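Prerequisites

  • Python 3 (with venv)
  • Docker (for the PostgreSQL + pgvector container)
  • Ollama
  • A free Groq API key
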

1. Clone the repo

git clone https://github.com/sahnoun11/threatlens.git
cd threatlens

2. Start the database

docker run -d --name threatlens-db --restart always \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -p 5532:5432 ankane/pgvector
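
For reference, the container settings above translate into a SQLAlchemy-style connection URL roughly as follows. This is a sketch: `build_db_url` is a hypothetical helper, and the README does not show how app.py or assistant.py actually builds its connection string. The `postgresql+psycopg` scheme assumes the psycopg 3 driver listed in requirements.txt.

```python
from urllib.parse import quote


def build_db_url(
    user: str = "ai",
    password: str = "ai",
    host: str = "localhost",
    port: int = 5532,  # host port mapped to the container's 5432
    database: str = "ai",
) -> str:
    """Build a SQLAlchemy URL for the pgvector container started above."""
    # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"postgresql+psycopg://{safe_user}:{safe_password}@{host}:{port}/{database}"
```

With the defaults this yields `postgresql+psycopg://ai:ai@localhost:5532/ai`. The `ai`/`ai` credentials are fine for a local test database but should be changed for anything reachable from a network.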

3. Start Ollama and pull the embedding model

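ollama serve                   # skip if Ollama already runs as a background service; otherwise keep it running
ollama pull nomic-embed-text   # run in a second terminal if "ollama serve" is in the foreground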
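
To check that the embedding model responds, a request can be sent to Ollama's local REST API. The sketch below uses only the standard library and assumes Ollama listens on its default port 11434; the helper names are hypothetical, and ThreatLens itself goes through phidata rather than calling the endpoint this way.

```python
import json
import urllib.request
from typing import List

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # Ollama's default local address


def build_embedding_request(text: str, model: str = "nomic-embed-text") -> urllib.request.Request:
    """Prepare (but do not send) a POST request asking Ollama to embed `text`."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str) -> List[float]:
    """Send the request to a running Ollama server and return the embedding vector."""
    with urllib.request.urlopen(build_embedding_request(text), timeout=30) as resp:
        return json.load(resp)["embedding"]
```

Calling `len(embed("test"))` with the server running should return the vector dimension; a connection error usually means `ollama serve` is not running.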

4. Python setup

python3 -m venv venv
source venv/bin/activate       # Windows: venv\Scripts\activate
pip install -r requirements.txt

5. Set your free Groq API key

export GROQ_API_KEY="your_free_key_here"

Or create a .env file:

GROQ_API_KEY=your_free_key_here
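
Either option makes the key available to the app. The sketch below shows one way the lookup could work using only the standard library: the environment variable is checked first, then a `.env` file. `parse_env` and `get_groq_key` are hypothetical helpers, and the actual app may rely on a library for this instead.

```python
import os
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, ignoring blanks and '#' comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_groq_key(env_path: str = ".env") -> str:
    """Return GROQ_API_KEY from the environment, falling back to a .env file."""
    key = os.environ.get("GROQ_API_KEY")
    if not key and os.path.exists(env_path):
        with open(env_path, encoding="utf-8") as fh:
            key = parse_env(fh.read()).get("GROQ_API_KEY")
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set; export it or add it to .env")
    return key
```

Keep the `.env` file out of version control so the key is not committed by accident.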

6. Launch ThreatLens

streamlit run app.py

Open your browser at http://localhost:8501 🚀


📁 Project Structure

threatlens/
├── app.py              # Streamlit UI + file readers + chunking logic
├── assistant.py        # AI brain: Groq + RAG + SOC analyst prompts
├── requirements.txt    # Python dependencies
├── assets/             # Banner and media files
├── LICENSE             # MIT License
└── README.md

📦 requirements.txt

streamlit>=1.35.0
phidata>=2.4.0
groq>=0.9.0
ollama>=0.2.0
pgvector>=0.2.5
psycopg[binary]>=3.1.0
sqlalchemy>=2.0.0
python-evtx>=0.7.4
evtx>=0.8.2
requests>=2.31.0
beautifulsoup4>=4.12.0
openai>=1.0.0

🧪 Test Datasets

Don't have logs to test with? Here are great free resources:

Windows EVTX:

Linux Logs (TXT):

  • Loghub: SSH, syslog, auth.log samples
  • SecRepo: Apache, DNS, IDS logs

From your own machine:

cp /var/log/auth.log ~/test_auth.txt
cp /var/log/syslog   ~/test_syslog.txt
dmesg > ~/test_dmesg.txt

🎯 Example Questions

Once you upload a log file, try asking:

What failed login attempts are in this log?
Are there any signs of lateral movement?
Summarise all privilege escalation events.
List all unique source IPs and flag suspicious ones.
What happened between 2:00 AM and 3:00 AM?
Are there any indicators of compromise (IOCs)?
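
It can help to cross-check an answer against a simple deterministic count. The sketch below tallies failed SSH password attempts per source IP in an auth.log-style text file. It is an illustration rather than part of ThreatLens: `failed_logins_by_ip` is a hypothetical helper, and the regular expression assumes the usual OpenSSH "Failed password for ... from ..." wording with IPv4 addresses.

```python
import re
from collections import Counter
from typing import Iterable

# Matches e.g. "Failed password for root from 203.0.113.7" and
# "Failed password for invalid user admin from 203.0.113.7".
FAILED_SSH = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\d{1,3}(?:\.\d{1,3}){3})"
)


def failed_logins_by_ip(lines: Iterable[str]) -> Counter:
    """Count failed SSH password attempts per source IPv4 address."""
    counts: Counter = Counter()
    for line in lines:
        match = FAILED_SSH.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

For example, `failed_logins_by_ip(open("test_auth.txt", encoding="utf-8", errors="replace"))` gives a per-IP tally to compare with the assistant's answer to the first question above.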

🤝 Contributing

Contributions are welcome from the community! Feel free to open issues or pull requests for:

  • New log format support (JSON, CSV, XML, Syslog)
  • Better chunking strategies
  • MITRE ATT&CK mapping
  • Dashboard / visualisation features
  • Bug fixes and improvements

📄 License

MIT, see LICENSE for details.


👨‍💻 Built by Oussama Sahnoun

"Analyse faster. Hunt smarter. Stay ahead." 🛡️

⭐ Star this repo if you find it useful!

About

ThreatLens is a free, open-source AI assistant that analyses Windows Event Logs and Linux logs like a senior SOC analyst, powered by Groq LLaMA 3.3 and local embeddings.
