Rust crate for entity parsing - Updated Dec 26, 2022 - Rust
Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a dataset that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). It is necessary when joining datasets on entities that may or may not share a common identifier (e.g., database key, URI, national identification number); a shared identifier may be missing because of differences in record shape, storage location, or curator style or preference.
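As a rough illustration of matching records that share no common identifier, the sketch below compares two name strings by the Jaccard similarity of their normalized word tokens. The function names (`tokens`, `jaccard`, `same_entity`) and the threshold-based decision rule are assumptions made for this example; they are not taken from any of the repositories listed here, and real systems typically combine several such signals.

```rust
use std::collections::HashSet;

/// Lowercase the text, treat every non-alphanumeric character as a
/// separator, and collect the remaining words into a set of tokens.
fn tokens(text: &str) -> HashSet<String> {
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(str::to_string)
        .collect()
}

/// Jaccard similarity of the two token sets: |A ∩ B| / |A ∪ B|.
/// Returns 0.0 when both inputs contain no tokens.
fn jaccard(a: &str, b: &str) -> f64 {
    let (ta, tb) = (tokens(a), tokens(b));
    let union = ta.union(&tb).count();
    if union == 0 {
        return 0.0;
    }
    ta.intersection(&tb).count() as f64 / union as f64
}

/// Hypothetical decision rule: treat two records as the same entity
/// when their token similarity reaches the given threshold.
fn same_entity(a: &str, b: &str, threshold: f64) -> bool {
    jaccard(a, b) >= threshold
}
```

With this sketch, `same_entity("Acme Corp.", "ACME corp", 0.8)` is true because both strings normalize to the same token set, while `same_entity("Acme Corp", "Apex Corp", 0.8)` is false because only one of three distinct tokens is shared.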
DuckDB community extension for locality-sensitive hashing (LSH)
Fuzzy matching made easy
SQL extensions for GoldenMatch — run entity resolution from Postgres (pgrx) and DuckDB (Python UDFs). Rust bridge via pyo3. Part of the Golden Suite.
canon resolves identifiers to canonical forms using versioned registries — normalizing formats, validating checksums, and mapping to canonical IDs deterministically.
Autonomous recursive language-model investigation agent.
CJK-native master data matching engine — multi-signal phonetic, visual, and normalization matching for Chinese/Japanese/Korean records. Built in Rust, runs in the browser via WASM.
Created by Halbert L. Dunn
Released 1946