Description
Audience: data teams building golden records, relevance engineers, BI squads.
What you get: blocking strategies (phonetic keys, n‑gram hashes, LSH); matchers (Jaro‑Winkler/TF‑IDF, embeddings); conflict resolution (field precedence, provenance trust, survivorship); evaluation harness (labeled pairs, PR curves, threshold optimizer, error taxonomy); schemas & Jupyter notebooks.
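As a rough illustration of the n‑gram blocking idea, the sketch below indexes records by character trigrams and keeps only pairs that share at least a minimum number of them, so the more expensive matchers run on far fewer candidates. It is a minimal sketch, not the product's implementation: the function names `ngrams` and `candidate_pairs`, the `min_shared` parameter, and the sample records are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def ngrams(text, n=3):
    """Return the set of character n-grams of a normalized string."""
    # Normalize: lowercase and keep only letters and digits.
    s = "".join(ch for ch in text.lower() if ch.isalnum())
    if len(s) < n:
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def candidate_pairs(records, n=3, min_shared=2):
    """Block records by shared n-grams and return candidate id pairs.

    records: dict mapping record id -> name string (hypothetical layout).
    min_shared: minimum number of n-grams two records must share.
    """
    # Inverted index: n-gram -> ids of the records containing it.
    index = defaultdict(set)
    for rid, name in records.items():
        for gram in ngrams(name, n):
            index[gram].add(rid)

    # Count how many n-grams each pair of records has in common.
    shared = defaultdict(int)
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1

    return {pair for pair, count in shared.items() if count >= min_shared}


# Illustrative data only.
records = {1: "Acme Corp", 2: "ACME Corporation", 3: "Globex Ltd"}
pairs = candidate_pairs(records)
print(pairs)
```

Raising `min_shared` shrinks the candidate set at the cost of possibly missing short or heavily misspelled names; very common n‑grams can also produce large blocks, which is one reason phonetic keys or LSH are offered as alternatives.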
Architecture: works with Pandas/Polars, DuckDB, Elastic, and Neo4j; produces graph or tabular output; merges can be rolled back.
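One way to picture a merge that can be rolled back is a log that remembers which records moved on each merge, so the last merge can be undone exactly. This is only a sketch under stated assumptions: the `MergeLog` class, its methods, and the record-to-cluster dictionary layout are hypothetical and do not describe how the product stores merges.

```python
class MergeLog:
    """Track cluster assignments and keep an undo history of merges."""

    def __init__(self, cluster_of):
        # record id -> cluster id (hypothetical in-memory layout)
        self.cluster_of = dict(cluster_of)
        self.history = []

    def merge(self, keep, absorb):
        """Move every record in cluster `absorb` into cluster `keep`."""
        moved = [r for r, c in self.cluster_of.items() if c == absorb]
        for r in moved:
            self.cluster_of[r] = keep
        # Remember exactly which records moved so the merge can be undone.
        self.history.append((keep, absorb, moved))

    def rollback(self):
        """Undo the most recent merge, restoring the prior assignments."""
        keep, absorb, moved = self.history.pop()
        for r in moved:
            self.cluster_of[r] = absorb


# Illustrative data only.
log = MergeLog({"a": 1, "b": 2, "c": 2})
log.merge(keep=1, absorb=2)
print(log.cluster_of)
log.rollback()
print(log.cluster_of)
```

Storing the list of moved records, rather than only the two cluster ids, is what lets a rollback separate records that were already in the surviving cluster from those that were absorbed.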
KPIs: precision and F1 at the chosen match threshold, false‑merge and false‑split rates, golden‑record completeness. Guardrails: do‑not‑link lists, manual review queues.
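The threshold-based KPIs can be sketched from a list of labeled, scored pairs. The example below is a minimal sketch, not the product's evaluation harness: the names `metrics_at` and `best_threshold` and the sample scores are made up for illustration, and it assumes that the false‑merge rate means the share of true non‑matches that get linked and the false‑split rate means the share of true matches that are left unlinked.

```python
def metrics_at(scored_pairs, threshold):
    """Compute pairwise KPIs for pairs linked when score >= threshold.

    scored_pairs: iterable of (score, is_match) tuples from labeled data.
    """
    tp = fp = fn = tn = 0
    for score, is_match in scored_pairs:
        linked = score >= threshold
        if linked and is_match:
            tp += 1
        elif linked and not is_match:
            fp += 1
        elif is_match:
            fn += 1
        else:
            tn += 1

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Assumed definitions, see the note above.
        "false_merge_rate": fp / (fp + tn) if fp + tn else 0.0,
        "false_split_rate": fn / (tp + fn) if tp + fn else 0.0,
    }


def best_threshold(scored_pairs):
    """Return (threshold, f1) maximizing F1 over the observed scores."""
    pairs = list(scored_pairs)
    candidates = sorted({score for score, _ in pairs})
    return max(
        ((t, metrics_at(pairs, t)["f1"]) for t in candidates),
        key=lambda item: item[1],
    )


# Illustrative labeled pairs only: (similarity score, is a true match).
labeled = [(0.95, True), (0.80, True), (0.70, False),
           (0.60, True), (0.30, False)]
print(metrics_at(labeled, 0.8))
print(best_threshold(labeled))
```

Maximizing F1 treats false merges and false splits as equally costly; where a false merge is harder to undo, picking the lowest threshold that still meets a precision floor may be a better fit.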
