ACTION_ID: auto_dedupe_rows
NAME: Auto Dedupe
CATEGORY: Data Operations
CREDITS: 0

Detects duplicate rows on a sheet by comparing values in a single selected
column, then marks or removes the later occurrences.

INDEX:
  1. Inputs
  2. Outputs
  3. How to configure
  4. Key notes
  5. Where it fits in a workflow
  6. When to use

================================================================================
1. INPUTS
================================================================================

auto_dedupe_columns (string, required)
  A reference to the column to dedupe on, e.g. `{{input.email}}`.
  Only one column is supported per instance of this action.

select_dedupe_action (dropdown, optional)
  How duplicates are handled.
    "mark_as_duplicate"  keep the row; tag `value_type` as "duplicate".
    "delete_duplicate"   remove the duplicate row from the sheet.

ignore_case (boolean, optional)
  Case-insensitive matching. `true` treats "user@x.com" and "User@X.com"
  as duplicates; `false` requires exact match.

================================================================================
2. OUTPUTS
================================================================================

The behavior depends on the `select_dedupe_action` chosen:

  "mark_as_duplicate"
    The row is kept on the sheet and tagged as "duplicate" in the
    value_type output. No row is removed. Downstream actions can
    branch on this value (e.g. via a filter step) to skip duplicates
    while preserving them for audit.

  "delete_duplicate"
    The entire duplicate row is removed from the sheet. Only the
    first occurrence (treated as "unique") remains.

Output field:

value_type (string) — "unique" or "duplicate". Emitted for rows
  that remain on the sheet. With "delete_duplicate", removed rows
  produce no output because they no longer exist.
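
The two modes described above can be modeled locally as follows. This is
a minimal sketch for reasoning about the outputs, not the action's actual
implementation: `simulateAutoDedupe` is a hypothetical helper, and the way
it stringifies empty or missing values is an assumption, since this file
does not specify how the real action treats them.

```javascript
// Local model of the documented semantics (hypothetical helper).
// The first occurrence of a value is "unique"; later ones are "duplicate".
function simulateAutoDedupe(rows, column, action, ignoreCase) {
  const seen = new Set();
  const remaining = [];
  for (const row of rows) {
    // Assumption: values are compared as strings.
    const raw = String(row[column]);
    const key = ignoreCase ? raw.toLowerCase() : raw;
    const isDuplicate = seen.has(key);
    seen.add(key);
    if (isDuplicate && action === "delete_duplicate") {
      continue; // the row leaves the sheet, so it emits no value_type
    }
    remaining.push({ ...row, value_type: isDuplicate ? "duplicate" : "unique" });
  }
  return remaining;
}

const sampleRows = [
  { email: "user@x.com" },
  { email: "User@X.com" },
  { email: "other@y.com" },
];

// Keeps all three rows; the second one is tagged "duplicate".
const marked = simulateAutoDedupe(sampleRows, "email", "mark_as_duplicate", true);

// Keeps two rows; the second occurrence is gone.
const deleted = simulateAutoDedupe(sampleRows, "email", "delete_duplicate", true);

// With exact matching, the differing case makes every row "unique".
const exact = simulateAutoDedupe(sampleRows, "email", "mark_as_duplicate", false);
```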

================================================================================
3. HOW TO CONFIGURE
================================================================================

Configure Action body
(PATCH /api/v1/workflows/{workflow_id}/sheets/{sheet_id}/actions/{action_instance_id}):

{
  "inputs": {
    "auto_dedupe_columns": "{{input.email}}",
    "select_dedupe_action": "mark_as_duplicate",
    "ignore_case": true
  }
}

Field-by-field:
  - auto_dedupe_columns   A `{{input.X}}` (or `{{<action>.X}}`) reference
                          to the column to dedupe on. Single column only.
  - select_dedupe_action  "mark_as_duplicate" or "delete_duplicate".
  - ignore_case           Boolean. `true` for case-insensitive matching.
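
A client-side sketch of assembling this body and the PATCH request might
look like the following. The helper names, the base URL
`https://api.example.test`, and the IDs are hypothetical; the check that
the column is a single `{{...}}` reference is an assumption about what
the API accepts; and any authentication headers are omitted because this
file does not describe them. The sketch requires all three inputs
explicitly rather than guessing defaults for the optional ones.

```javascript
const DEDUPE_ACTIONS = ["mark_as_duplicate", "delete_duplicate"];

// Builds the request body shown above, rejecting common mistakes early.
function buildAutoDedupeInputs(column, dedupeAction, ignoreCase) {
  // Assumption: exactly one {{...}} reference, since only one column is supported.
  if (!/^\{\{[^{}]+\}\}$/.test(column)) {
    throw new TypeError("auto_dedupe_columns must be a single {{...}} reference");
  }
  if (!DEDUPE_ACTIONS.includes(dedupeAction)) {
    throw new RangeError("select_dedupe_action must be one of: " + DEDUPE_ACTIONS.join(", "));
  }
  // The string "true" is not accepted; a real boolean is required.
  if (typeof ignoreCase !== "boolean") {
    throw new TypeError("ignore_case must be a boolean, not a string");
  }
  return {
    inputs: {
      auto_dedupe_columns: column,
      select_dedupe_action: dedupeAction,
      ignore_case: ignoreCase,
    },
  };
}

// Assembles (but does not send) the PATCH request for the documented path.
function buildConfigureRequest(baseUrl, ids, body) {
  const path =
    `/api/v1/workflows/${ids.workflowId}/sheets/${ids.sheetId}` +
    `/actions/${ids.actionInstanceId}`;
  return {
    url: baseUrl + path,
    options: {
      method: "PATCH",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const body = buildAutoDedupeInputs("{{input.email}}", "mark_as_duplicate", true);
const request = buildConfigureRequest(
  "https://api.example.test", // hypothetical base URL
  { workflowId: "wf_1", sheetId: "sh_1", actionInstanceId: "act_1" }, // hypothetical IDs
  body
);
```

The resulting `request.url` and `request.options` can be passed to
`fetch(request.url, request.options)` once the real base URL and
credentials are filled in.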

================================================================================
4. KEY NOTES
================================================================================

- Place this action early in the chain so paid downstream steps don't
  spend credits on duplicates.
- Single-column dedupe only. To dedupe by multiple columns, combine
  them upstream via `format_data_using_js_expression` (e.g. concatenate
  email + domain into a single derived column) and reference the
  derived column here.
- The action runs row-by-row in chain order: each row is compared
  against earlier rows on the same sheet, so `value_type` reflects
  whether THIS row is a duplicate of a prior one.
- `ignore_case` is a boolean — pass `true` / `false`, not the strings
  `"true"` / `"false"`.
- Normalize variable-format identifiers in a `format_data_using_js_expression`
  step BEFORE dedupe. URLs especially come in many surface forms across
  sources: a single LinkedIn profile can show up as
  `linkedin.com/in/janedoe`, `https://uk.linkedin.com/in/janedoe`, or
  `https://www.linkedin.com/in/janedoe?utm=...`, all of which point to
  the same person. `auto_dedupe_rows` does a literal string compare
  (case-insensitive if `ignore_case` is `true`), so without
  normalization it treats those three values as three different
  people. Add an upstream formatter that strips
  protocol/query/trailing-slash and folds country subdomains
  (`uk.linkedin.com` → `www.linkedin.com`), then dedupe on the
  formatter's output. The same pattern applies to emails (lowercase,
  strip plus addressing), domains (strip `www.`, protocol, paths), and
  phone numbers (convert to E.164). The canonical LinkedIn URL recipe
  lives in
  https://floqer.com/docs/action-detail/format_data_using_js_expression.txt
  §8.4 (LinkedIn URL normalization).
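
The normalization and multi-column notes above can be sketched in plain
JavaScript as follows. This is an illustration only, not the canonical
recipe from §8.4: all function names are hypothetical, lowercasing the
whole LinkedIn URL and using `|` as the composite-key separator are
assumptions, and how a `format_data_using_js_expression` step receives
inputs and returns a value is not covered in this file. Phone
normalization to E.164 is left out because it needs country context.

```javascript
// Fold LinkedIn profile URL variants into https://www.linkedin.com/in/<slug>.
function normalizeLinkedInUrl(raw) {
  let s = String(raw).trim().toLowerCase(); // assumption: slugs are case-insensitive
  s = s.replace(/^[a-z]+:\/\//, ""); // strip protocol
  s = s.replace(/[?#].*$/, "");      // strip query string and fragment
  s = s.replace(/\/+$/, "");         // strip trailing slashes
  // Fold a missing, "www", or country subdomain into "www".
  s = s.replace(/^(?:[a-z0-9-]+\.)?linkedin\.com/, "www.linkedin.com");
  return "https://" + s;
}

// Lowercase and drop plus addressing ("jane+news@x.com" -> "jane@x.com").
function normalizeEmail(raw) {
  const s = String(raw).trim().toLowerCase();
  const at = s.lastIndexOf("@");
  if (at < 0) return s; // not an email; leave as-is
  const local = s.slice(0, at).replace(/\+.*$/, "");
  return local + "@" + s.slice(at + 1);
}

// Strip protocol, path/query, and a leading "www.".
function normalizeDomain(raw) {
  return String(raw)
    .trim()
    .toLowerCase()
    .replace(/^[a-z]+:\/\//, "")
    .replace(/[\/?#].*$/, "")
    .replace(/^www\./, "");
}

// Combine several normalized values into one derived dedupe column.
// Assumption: "|" does not occur inside the individual values.
function compositeKey(...parts) {
  return parts.join("|");
}

const variants = [
  "linkedin.com/in/janedoe",
  "https://uk.linkedin.com/in/janedoe",
  "https://www.linkedin.com/in/janedoe?utm=...",
];
// All three variants collapse to the same canonical string.
const canonical = variants.map(normalizeLinkedInUrl);

const derivedKey = compositeKey(
  normalizeEmail("Jane.Doe+news@Example.com"),
  normalizeDomain("https://www.Example.com/about?x=1")
);
```

Deduping on the formatter's output rather than the raw column means the
literal string compare sees one value per entity.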

================================================================================
5. WHERE IT FITS IN A WORKFLOW
================================================================================

UPSTREAM — how rows arrive on the sheet
  Rows can land on the sheet from many sources: a direct list ingest,
  a push_data_to_sheet from an upstream sheet that ran multiple
  employee-finder actions, or several inbound triggers / webhooks /
  intent signals routing into the same sheet. Any of these can place
  the same entity on the sheet more than once.

THIS ACTION
  Compares each row's value in the chosen column against earlier rows
  on the same sheet. With "mark_as_duplicate", later occurrences are
  tagged as "duplicate" in the value_type output. With
  "delete_duplicate", later occurrences are removed from the sheet
  entirely.

DOWNSTREAM — typical chains after this action
  mark_as_duplicate:  this -> filter (value_type == "unique") ->
                      enrichment / outreach. Duplicates remain on the
                      sheet for audit but skip the paid steps.
  delete_duplicate:   this -> enrichment / outreach. Only unique rows
                      remain, so downstream actions run normally.

================================================================================
6. WHEN TO USE
================================================================================

Use auto_dedupe_rows to drop or mark duplicates before paid steps.
Pick a column that uniquely identifies the entity as the dedupe column.
For example, on a list of people, use email; on a list of companies,
use domain.

================================================================================

This file is maintained manually. Last updated: 2026-05-20.
Full interactive reference: https://floqer.com/docs/reference
Action catalog: https://floqer.com/docs/action-catalog.txt