RedactifyAI is a Python package that detects and anonymizes Personally Identifiable Information (PII) in text. It builds on Microsoft’s Presidio for detection and anonymization and on Apache Spark for scalable processing.

Key Features

  • Integration with Presidio to detect and anonymize PII such as names, emails, phone numbers, and more.
  • Scalable anonymization on Apache Spark through PySpark.
  • Custom recognizers to extend PII detection for specific needs.
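
To make the idea of a custom recognizer concrete, here is a minimal, library-free sketch: each recognizer is a regular expression tagged with an entity type, and every match is replaced by a placeholder. This is not RedactifyAI's or Presidio's API. The names `Finding`, `PATTERNS`, `detect`, and `anonymize`, and the `EMPLOYEE_ID` format, are hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One detected span of PII (hypothetical type, for illustration only)."""
    entity_type: str
    start: int
    end: int


# Hypothetical recognizers: entity type -> compiled pattern.
# EMPLOYEE_ID stands in for a domain-specific identifier that a
# general-purpose detector would not know about.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "EMPLOYEE_ID": re.compile(r"\bEMP-\d{6}\b"),
}


def detect(text: str) -> list[Finding]:
    """Return all pattern matches, sorted by start offset."""
    findings = [
        Finding(entity_type, match.start(), match.end())
        for entity_type, pattern in PATTERNS.items()
        for match in pattern.finditer(text)
    ]
    return sorted(findings, key=lambda f: f.start)


def anonymize(text: str) -> str:
    """Replace each finding with an <ENTITY_TYPE> placeholder.

    Replacements run right to left so earlier offsets stay valid.
    Overlapping matches are not handled in this sketch.
    """
    for finding in reversed(detect(text)):
        text = text[:finding.start] + f"<{finding.entity_type}>" + text[finding.end:]
    return text


if __name__ == "__main__":
    print(anonymize("Contact jane.doe@example.com about EMP-123456."))
```

A function like `anonymize` works on one string at a time, so in a Spark setting the same logic would typically be applied to each row of a text column.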

Project Link: https://rokorolev.gitlab.io/redactify-ai/