An AI agent that answers founder-level business queries by integrating with monday.com boards. Built to handle messy real-world data with robust cleaning, natural-language understanding, and contextual memory.
- Natural Language Queries: Ask questions in plain English with conversational follow-ups
- Contextual Memory: Remembers previous questions for intelligent follow-ups
- Multi-Board Analytics: Joins deals and work orders for cross-board insights
- AI-Powered Insights: Natural language summaries explaining what the data means
- Interactive Dashboard: Clean Streamlit UI with dynamic visualizations
- ✅ Pipeline Analysis: Deal value, probability, stages, trends
- ✅ Revenue Analytics: Billing, collections, sector performance
- ✅ Risk Assessment: Stalled deals, uncollected invoices
- ✅ Sector Performance: Cross-board comparison (uses joined data)
- ✅ Resource Utilization: Owner workload analysis
- ✅ Operational Metrics: Conversion rates, deal cycle time
- ✅ Handles 52% missing deal values gracefully
- ✅ Detects and reports duplicates
- ✅ Aggregates one-to-many joins (multiple orders per deal)
- ✅ Normalizes inconsistent formats (dates, text, numbers)
- ✅ Transparent quality warnings shown to user
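The one-to-many join handling above can be sketched as follows. This is an illustrative example, not the project's actual code; the column names (`deal_code`, `deal_value`, `order_value`) are assumed for demonstration. The key idea is to aggregate work orders per deal *before* joining, so deal rows are never duplicated:

```python
# Sketch of one-to-many join aggregation: sum multiple work orders per
# deal, then left-join onto the deals table.
import pandas as pd

deals = pd.DataFrame({"deal_code": ["D1", "D2"],
                      "deal_value": [1000, None]})  # null deal_value tolerated
orders = pd.DataFrame({"deal_code": ["D1", "D1", "D2"],
                       "order_value": [300, 200, 400]})

# Aggregate order revenue per deal first, so the join stays one-to-one.
order_totals = orders.groupby("deal_code", as_index=False)["order_value"].sum()
joined = deals.merge(order_totals, on="deal_code", how="left")
print(joined)  # D1 carries 500 in order_value, D2 carries 400
```

A left join keeps deals that have no work orders yet, which a plain inner join would silently drop.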
```
User Query → Gemini LLM (Intent Classification) → BI Engine (Analytics)
     ↓                                                ↓
History Context                           Data Quality Tracking
     ↓                                                ↓
Gemini LLM (Insight Generation)  ←  monday.com API (GraphQL)
                ↓
          Streamlit UI
```
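A minimal sketch of how this flow can be wired, with hypothetical handler names (the real functions live in `src/analytics/bi_engine.py` and may differ). The LLM returns a JSON intent, and a plain dict dispatch routes it to a BI function — this intent-based routing (rather than text-to-SQL) is also why there is no injection surface. The Gemini call is stubbed here as a literal JSON string:

```python
# Intent-based dispatch sketch: LLM output (JSON) selects a handler.
import json

def pipeline_analysis(params):
    return f"pipeline report for {params.get('sector', 'all sectors')}"

def revenue_analysis(params):
    return f"revenue report for {params.get('sector', 'all sectors')}"

HANDLERS = {"pipeline": pipeline_analysis, "revenue": revenue_analysis}

# In the real app this JSON would come from Gemini's JSON-mode response.
llm_response = '{"intent": "revenue", "params": {"sector": "mining"}}'
parsed = json.loads(llm_response)
result = HANDLERS[parsed["intent"]](parsed["params"])
print(result)  # → revenue report for mining
```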
See docs/architecture.md for detailed system design.
- Python 3.12+
- monday.com account (free tier works)
- Google Gemini API key (free tier available)
1. Clone the repository

   ```bash
   git clone https://github.com/SahilKhan101/intelliquery.git
   cd intelliquery
   ```

2. Create a virtual environment

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. Configure the environment

   ```bash
   cp .env.example .env
   ```

   Edit `.env` with your credentials:
   - `GOOGLE_API_KEY=your_gemini_key`
   - `MONDAY_API_KEY=your_monday_key`
   - `DEALS_BOARD_ID=your_board_id`
   - `ORDERS_BOARD_ID=your_board_id`

5. Import data to monday.com
   - See docs/setup_guide.md for step-by-step instructions
   - Import the provided CSV files to monday.com
   - Copy the board IDs into `.env`

6. Run the application

   ```bash
   streamlit run src/main.py
   ```

7. Open in a browser
   - Navigate to http://localhost:8501
   - Enable "Generate AI Insights" in the sidebar for natural-language explanations
   - Start asking business questions!
- "How's our pipeline looking?"
- "Show me deals in the mining sector"
- "What's the total pipeline value for high-probability deals?"
- "Show monthly closed deals for 2025"
- "What's our total revenue this quarter?"
- "Show monthly revenue trend"
- "What's our collection rate?"
- "Revenue for the energy sector"
- "Which deals are at risk?"
- "Show me uncollected invoices"
- "Compare performance across sectors"
- "What's our conversion rate?"
- User: "Show revenue for mining"
- Bot: [shows mining revenue]
- User: "What about energy?" ← Uses context!
- Bot: [shows energy revenue]
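One way this kind of follow-up resolution can work — a hedged sketch, not the project's actual parser: if the new query omits an intent (as in "What about energy?"), inherit the last turn's intent and override only the newly named entity. The field names (`intent`, `sector`) are illustrative:

```python
# Context resolution sketch: merge a partial follow-up query with the
# previous turn so only the changed slot is replaced.
def resolve(query_intent, history):
    if query_intent.get("intent") is None and history:
        merged = dict(history[-1])  # start from the previous turn
        # Override only the slots the follow-up actually specified.
        merged.update({k: v for k, v in query_intent.items() if v is not None})
        return merged
    return query_intent

history = [{"intent": "revenue", "sector": "mining"}]   # "Show revenue for mining"
follow_up = {"intent": None, "sector": "energy"}        # "What about energy?"
print(resolve(follow_up, history))  # → {'intent': 'revenue', 'sector': 'energy'}
```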
| Component | Technology | Why? |
|---|---|---|
| Language | Python 3.12 | Modern, rich ecosystem |
| LLM | Google Gemini 2.5 Flash | Fast, accurate, JSON mode |
| SDK | google.generativeai (native) | More reliable than LangChain |
| Data Source | monday.com GraphQL API | Flexible, single-request fetching |
| UI | Streamlit | Rapid prototyping, interactive |
| Data Processing | Pandas | Perfect for <10K rows |
| Visualization | Plotly | Interactive charts |
Why not LangChain? See docs/architecture.md for rationale.
intelliquery/
├── src/
│ ├── main.py # Streamlit application
│ ├── connectors/
│ │ └── monday_client.py # monday.com GraphQL integration
│ ├── processors/
│ │ ├── data_cleaner.py # Normalization & quality tracking
│ │ └── query_parser.py # LLM-powered intent classification
│ ├── analytics/
│ │ └── bi_engine.py # Business intelligence logic
│ └── utils/
│ └── date_utils.py # Flexible date parsing
├── docs/
│ ├── setup_guide.md # Step-by-step setup instructions
│ └── architecture.md # System design & rationale
├── config/
│ └── settings.py # Environment configuration
├── .env.example # Template for environment variables
├── requirements.txt # Python dependencies
└── README.md # This file
IntelliQuery handles real-world data issues transparently:
- ✅ Missing Values: 52% of deals have null `deal_value` → totals are still calculated from the valid data
- ✅ Inconsistent Dates: Excel serial dates, ISO strings, DD/MM/YYYY → normalized via `parse_date_flexible()`
- ✅ Duplicates: duplicate `deal_code` entries detected → reported in the quality section
- ✅ One-to-Many Joins: deals with multiple work orders → numeric fields aggregated (revenue summed)
- ✅ Text Normalization: "High" vs "high" vs "HIGH" → standardized
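An illustrative re-creation of `parse_date_flexible()`-style handling (the real implementation lives in `src/utils/date_utils.py` and may differ): try Excel serial numbers, then ISO strings, then DD/MM/YYYY, and return `None` for anything unparseable so it can be excluded and reported:

```python
# Flexible date parsing sketch covering the three formats listed above.
from datetime import date, datetime, timedelta

def parse_date_flexible(value):
    if isinstance(value, (int, float)):
        # Excel serial date: epoch 1899-12-30 (absorbs Excel's fictitious
        # 1900 leap day for modern dates).
        return date(1899, 12, 30) + timedelta(days=int(value))
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):  # ISO first, then DD/MM/YYYY
        try:
            return datetime.strptime(value, fmt).date()
        except (TypeError, ValueError):
            continue
    return None  # unparseable → excluded and surfaced as a quality warning

print(parse_date_flexible(45292))         # → 2024-01-01 (Excel serial)
print(parse_date_flexible("2025-01-15"))  # → 2025-01-15
print(parse_date_flexible("15/01/2025"))  # → 2025-01-15
```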
- Data quality warnings shown in expandable UI sections
- Severity-based reporting (🔴 High, 🟡 Medium, 🟢 Low)
- Tells users exactly what data was excluded and why
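The severity-based reporting could be structured roughly like this — a sketch with a hypothetical warning record; the real app renders these inside expandable Streamlit sections:

```python
# Severity-tagged quality warnings, rendered with the icons used above.
from dataclasses import dataclass

@dataclass
class QualityWarning:
    severity: str  # "high" | "medium" | "low"
    message: str   # what was excluded, and why

ICONS = {"high": "🔴", "medium": "🟡", "low": "🟢"}

warnings = [
    QualityWarning("high", "52% of deals missing deal_value; excluded from totals"),
    QualityWarning("low", "3 duplicate deal_code entries detected and reported"),
]

report = [f"{ICONS[w.severity]} {w.message}" for w in warnings]
print("\n".join(report))
```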
Run `python test_api_key.py` to verify your Gemini API key, then confirm:
- ✅ Pipeline analysis works
- ✅ Revenue analysis shows trend charts
- ✅ Risk assessment identifies stalled deals
- ✅ Sector performance shows cross-board analytics
- ✅ Follow-up questions use context
- ✅ Data quality warnings appear
- ✅ Charts render without duplicate ID errors
- ✅ API keys in `.env` (never committed to git)
- ✅ `.gitignore` configured for sensitive files
- ✅ No hardcoded credentials
- ✅ No SQL injection risk (intent-based, not text-to-SQL)
- No Write Capabilities: Read-only access to monday.com (marked as bonus feature)
- Single User: No authentication or multi-tenancy
- Small Dataset Optimized: Best for <10K rows (uses Pandas in-memory)
- No Forecasting: Shows trends but not predictive analytics
- >10K rows: Migrate to PostgreSQL
- >100K rows: Use DuckDB or ClickHouse
- Multiple users: Add authentication & rate limiting
See docs/architecture.md for scaling path.
- Setup Guide: Detailed installation and configuration
- Architecture Document: System design, tech choices, trade-offs
This project fulfills all requirements:
| Requirement | Status | Implementation |
|---|---|---|
| monday.com Integration | ✅ Complete | GraphQL API, dynamic boards |
| Data Resilience | ✅ Complete | Nulls, duplicates, normalization, transparency |
| Query Understanding | ✅ Complete | Intent classification, clarifying questions |
| Business Intelligence | ✅ Complete | 6 analysis types, joined board analytics |
| Natural Language Output | ✅ Complete | AI-generated insights (toggle in sidebar) |
| Setup Instructions | ✅ Complete | docs/setup_guide.md |
| Architecture Doc | ✅ Complete | docs/architecture.md |
| Source Code Quality | ✅ Complete | Clean, documented, error-handled |
- LangChain isn't always the answer: Native SDK gave better reliability
- Data resilience > perfect data: Real-world data is always messy
- Transparency builds trust: Show users what data is missing
- Context matters: Chat history enables intelligent follow-ups
This is a technical assignment project. Feedback welcome via GitHub issues!
Author: Sahil Khan
GitHub: @SahilKhan101
Project: IntelliQuery
Built with ❤️ for data-driven decision making