Policies & Rules

What Is a Policy?

A policy is a named collection of rules with a severity threshold. When you scan a file, the policy determines which rules are applied and which findings are surfaced in the report.

Policies let you tailor scans to specific use cases. A law firm preparing court filings needs different checks than a photographer sharing portfolio images.

How Policies Work

  1. You specify a policy when running a scan (via CLI, API, or the web UI).
  2. Purgit loads all rules associated with that policy.
  3. Each rule inspects the file for a specific type of metadata or embedded data.
  4. Findings at or above the policy's severity threshold are included in the report.
  5. Findings from rules marked autofix: true can be removed automatically during sanitization.
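
The threshold check in step 4 can be sketched as follows. This is a minimal illustration, not Purgit's implementation: the names `Severity`, `Finding`, and `surfaced` are hypothetical, and the only behavior taken from the steps above is that findings at or above the threshold are reported.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels ordered from least to most serious (hypothetical encoding)."""
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Finding:
    """One rule match in a scanned file (hypothetical shape)."""
    rule_id: str
    severity: Severity
    autofix: bool


def surfaced(findings, threshold):
    """Keep only findings at or above the policy's severity threshold."""
    return [f for f in findings if f.severity >= threshold]


findings = [
    Finding("EXIF-GPS-001", Severity.CRITICAL, autofix=True),
    Finding("PDF-META-001", Severity.HIGH, autofix=True),
    Finding("PDF-META-002", Severity.MEDIUM, autofix=True),
]

# With a High threshold, the Medium finding is left out of the report.
report = surfaced(findings, Severity.HIGH)
print([f.rule_id for f in report])  # ['EXIF-GPS-001', 'PDF-META-001']
```

Using an ordered enum keeps the comparison to a single `>=`, so lowering the threshold to Info reports every finding.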

Built-In Policies

| Policy | Rules | Threshold | Description | Best For |
|--------|-------|-----------|-------------|----------|
| strict | 93 | Info | Every rule, every severity level | Highest-risk documents, whistleblowing, legal discovery |
| standard | 71 | Medium | All rules, medium severity and above | General professional use, default for most users |
| minimal | 23 | High | Critical and high severity rules only | Quick checks, low-risk documents |
| legal | 58 | Medium | Legal-profession-specific rule set | Law firms, contracts, court filings |
| healthcare | 47 | Medium | HIPAA-focused rules for PHI | Healthcare organizations, medical records |

Severity Levels

Each rule has a severity level indicating the risk of leaving that metadata in a shared file:

Critical

Must be removed before sharing. These findings represent immediate privacy or security risks.

Examples: Embedded GPS coordinates in photos, tracked changes containing confidential edits, hidden embedded files.

High

Strongly recommended to remove. These expose personal or organizational information that most users would not want shared.

Examples: Author name, creator software and version, document edit history, camera serial number.

Medium

Worth reviewing. May or may not be sensitive depending on context.

Examples: Creation and modification timestamps, document keywords, PDF producer string, image color profile metadata.

Low

Minor metadata with limited privacy impact but still potentially informative to a determined adversary.

Examples: Page count metadata, PDF page layout preferences, image resolution (DPI).

Info

Informational only. No action is recommended, but these findings are reported for completeness under the strict policy.

Examples: PDF version, file structure metadata, compression method.

Rule ID Format

Every rule has a unique identifier following the pattern:

{FORMAT}-{CATEGORY}-{NNN}

Format prefixes:

| Prefix | Format |
|--------|--------|
| PDF | PDF documents |
| EXIF | Image EXIF data (JPEG, PNG, HEIC) |
| IPTC | Image IPTC data |
| XMP | XMP metadata (cross-format) |
| DOCX | DOCX / Office Open XML documents |
| ICC | ICC color profiles |

Category examples:

| Category | Description |
|----------|-------------|
| META | General metadata fields |
| GPS | Geographic location data |
| DEVICE | Device/hardware identifiers |
| REV | Tracked changes / revisions |
| CMT | Comments and annotations |
| EMBED | Embedded objects and files |
| HIST | Edit and version history |
| LINK | External links and references |
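
To work with rule IDs in scripts, the pattern can be split into its three parts. The sketch below is an illustration only: `parse_rule_id` and `RuleId` are hypothetical names, and it assumes the `NNN` part is exactly three digits, as in the examples on this page. Categories are not validated because the list above is given as examples rather than a complete set.

```python
import re
from typing import NamedTuple

# Format prefixes listed in the table above.
FORMAT_PREFIXES = {"PDF", "EXIF", "IPTC", "XMP", "DOCX", "ICC"}

# Assumption: NNN is exactly three digits, as in PDF-META-001.
RULE_ID = re.compile(r"(?P<format>[A-Z]+)-(?P<category>[A-Z]+)-(?P<number>\d{3})")


class RuleId(NamedTuple):
    format: str
    category: str
    number: int


def parse_rule_id(rule_id: str) -> RuleId:
    """Split a rule ID into format, category, and number, or raise ValueError."""
    match = RULE_ID.fullmatch(rule_id)
    if match is None or match["format"] not in FORMAT_PREFIXES:
        raise ValueError(f"not a valid rule ID: {rule_id!r}")
    return RuleId(match["format"], match["category"], int(match["number"]))


print(parse_rule_id("EXIF-GPS-001"))
```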

Common Rule Examples

| Rule ID | Severity | Autofix | Description |
|---------|----------|---------|-------------|
| PDF-META-001 | High | Yes | Author field in PDF metadata |
| PDF-META-002 | Medium | Yes | Keywords field in PDF metadata |
| PDF-META-003 | High | Yes | Creator application in PDF metadata |
| PDF-META-004 | Medium | Yes | Producer string in PDF metadata |
| PDF-META-005 | Medium | Yes | Subject field in PDF metadata |
| EXIF-GPS-001 | Critical | Yes | GPS latitude/longitude coordinates |
| EXIF-GPS-002 | Critical | Yes | GPS altitude |
| EXIF-DEVICE-001 | High | Yes | Camera make and model |
| EXIF-DEVICE-002 | High | Yes | Camera serial number |
| EXIF-DEVICE-003 | Medium | Yes | Lens model and serial |
| DOCX-REV-001 | Critical | Yes | Tracked changes / revision marks |
| DOCX-CMT-001 | High | Yes | Document comments |
| DOCX-META-001 | High | Yes | Author and last modified by |
| DOCX-META-002 | Medium | Yes | Company name |
| DOCX-META-003 | Medium | Yes | Manager name |

Autofix Behavior

Findings from rules marked autofix: true are handled automatically during sanitization, in one of two ways:

  • Removal: The metadata field is deleted entirely (e.g., GPS coordinates, author name).
  • Normalization: The value is replaced with a neutral default (e.g., timestamps set to epoch, producer set to "Purgit").

Rules marked autofix: false require manual review. These are typically findings where automatic removal could corrupt the document or where the user needs to make a judgment call (e.g., embedded images that may or may not be intentional).
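
The difference between removal and normalization can be sketched on a plain dictionary of metadata fields. Everything here is an assumption made for illustration: the field names, the `ACTIONS` table, the `sanitize` function, and the ISO 8601 form of the epoch timestamp. Only the two behaviors (delete the field, or replace it with a neutral default such as "Purgit" for the producer) come from the description above.

```python
REMOVE = "remove"
NORMALIZE = "normalize"

# Hypothetical mapping from metadata field to (action, neutral default).
ACTIONS = {
    "author": (REMOVE, None),
    "gps": (REMOVE, None),
    "created": (NORMALIZE, "1970-01-01T00:00:00Z"),  # assumed epoch representation
    "producer": (NORMALIZE, "Purgit"),
}


def sanitize(metadata: dict) -> dict:
    """Return a copy of metadata with autofixable fields removed or normalized."""
    cleaned = {}
    for field, value in metadata.items():
        action, default = ACTIONS.get(field, (None, None))
        if action == REMOVE:
            continue  # removal: drop the field entirely
        # normalization: swap in the neutral default; otherwise keep the value
        cleaned[field] = default if action == NORMALIZE else value
    return cleaned


before = {"author": "J. Doe", "producer": "SomeWriter 4.2", "title": "Q3 Report"}
print(sanitize(before))  # {'producer': 'Purgit', 'title': 'Q3 Report'}
```

Fields with no entry in the table pass through untouched, which mirrors the idea that findings without autofix are left for manual review.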

Custom Policies

On the Team plan, you can create custom policies that select specific rules by ID or category. Custom policies are managed via the dashboard or API.
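
Selecting rules by ID or by category amounts to a simple filter over the rule catalog. The sketch below is not the dashboard or API format for custom policies, which this page does not describe. The `select_rules` function and the small `CATALOG` list are hypothetical, and the rule IDs are taken from the Common Rule Examples table.

```python
# A few rule IDs from the Common Rule Examples table.
CATALOG = [
    "PDF-META-001",
    "PDF-META-004",
    "EXIF-GPS-001",
    "EXIF-GPS-002",
    "EXIF-DEVICE-002",
    "DOCX-REV-001",
]


def select_rules(catalog, ids=(), categories=()):
    """Return rules whose full ID or category segment was requested."""
    wanted_ids = set(ids)
    wanted_categories = set(categories)
    return [
        rule_id
        for rule_id in catalog
        # The category is the middle segment of {FORMAT}-{CATEGORY}-{NNN}.
        if rule_id in wanted_ids or rule_id.split("-")[1] in wanted_categories
    ]


# One rule picked by ID, plus every rule in the GPS category.
print(select_rules(CATALOG, ids=["DOCX-REV-001"], categories=["GPS"]))
# ['EXIF-GPS-001', 'EXIF-GPS-002', 'DOCX-REV-001']
```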

Last updated: 2026-03-06