Arkime is an open source, large scale, full packet capturing, indexing, and database system.
Updated Apr 11, 2026 - C
🔍 Capture and segment multiview 3D expressions using our Multimodal Visual Geometry Grounded Transformer for enhanced visual comprehension.
📈 Build a real-time stock market data pipeline with Snowflake, Kafka, and Power BI for effective analytics and data engineering practices.
📊 Build a real-time data pipeline for YouTube trending analytics using Kafka, Spark, and Airflow for efficient processing and insightful visualizations.
🔍 Query any database from the terminal and optimize data for AI models with ease using usql.
YTsaurus is a scalable and fault-tolerant open-source big data platform.
🌍 Ingest and transform real-time earthquake data using Azure tools to deliver reliable analytics-ready datasets for insightful decision-making.
🚀 Enhance your AI prompts effortlessly with Synapse, a dual-platform app that boosts results through intelligent techniques and multi-provider support.
🚴 Visualize real-time availability of Vélib' bike stations in Paris, showing available bikes and empty docks with interactive Mapbox GL.
🌆 Optimize urban management with smart solutions for efficient resource use and enhanced city living, promoting sustainability and innovation.
🖥️ Implement the Intel 8086 microprocessor on FPGA with original microcode, featuring a simplified interface and compatibility for seamless integration.
🚀 Build high-performance applications with Paimon C++, a native implementation for efficient access to the Paimon datalake format.
🚖 Analyze NYC taxi data to uncover festival trends and predict demand using scalable Big Data techniques with Dask and Apache Spark.
Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.
📄 Ingest documents into structured datasets for LLMs, ensuring numeric integrity and easy export across multiple frameworks with doc2dataset.
ClickHouse® is a real-time analytics database management system.
📊 Analyze website traffic and user engagement to optimize marketing strategies and enhance content performance using detailed hourly session data.
🚀 Simplify data ingestion with Smart Ingest Kit, a lightweight toolkit that uses intelligent chunking for optimized, layout-aware RAG processes.
Multilevel data and operation deduplication mechanism (implementation in C++)