ACTION_ID: auto_dedupe_rows
NAME: Auto Dedupe
CATEGORY: Data Operations
CREDITS: 0

Auto dedupe rows in the workflow based on a single selected column.

INDEX:
  1. Inputs
  2. Outputs
  3. How to configure
  4. Key notes
  5. Where it fits in a workflow
  6. When to use

================================================================================
1. INPUTS
================================================================================

auto_dedupe_columns (string, required)
  A reference to the column to dedupe on, e.g. `{{input.email}}`. Only one
  column is supported per instance of this action.

select_dedupe_action (dropdown, optional)
  How duplicates are handled.
    "mark_as_duplicate"   keep the row; tag `value_type` as "duplicate".
    "delete_duplicate"    remove the duplicate row from the sheet.

ignore_case (boolean, optional)
  Case-insensitive matching. `true` treats "user@x.com" and "User@X.com" as
  duplicates; `false` requires exact match.

================================================================================
2. OUTPUTS
================================================================================

The behavior depends on the `select_dedupe_action` chosen:

"mark_as_duplicate"
  The row is kept on the sheet and tagged as "duplicate" in the value_type
  output. No row is removed. Downstream actions can branch on this value
  (e.g. via a filter step) to skip duplicates while preserving them for audit.

"delete_duplicate"
  The entire duplicate row is removed from the workflow. Only the first
  occurrence (treated as "unique") remains on the sheet.

Output field:
  value_type (string): "unique" or "duplicate". Emitted for rows that remain
  on the sheet. With "delete_duplicate", removed rows produce no output
  because they no longer exist.

================================================================================
3. HOW TO CONFIGURE
================================================================================

Configure Action body
(PATCH /api/v1/workflows/{workflow_id}/sheets/{sheet_id}/actions/{action_instance_id}):

  {
    "inputs": {
      "auto_dedupe_columns": "{{input.email}}",
      "select_dedupe_action": "mark_as_duplicate",
      "ignore_case": true
    }
  }

Field-by-field:
  - auto_dedupe_columns    A `{{input.X}}` (or `{{.X}}`) reference to the column
                           to dedupe on. Single column only.
  - select_dedupe_action   "mark_as_duplicate" or "delete_duplicate".
  - ignore_case            Boolean. `true` for case-insensitive matching.

================================================================================
4. KEY NOTES
================================================================================

- Place this action early in the chain so paid downstream steps don't spend
  credits on duplicates.

- Single-column dedupe only. To dedupe by multiple columns, combine them
  upstream via `format_data_using_js_expression` (e.g. concatenate email +
  domain into a single derived column) and reference the derived column here.

- The action runs row-by-row in chain order: each row is compared against
  earlier rows on the same sheet, so `value_type` reflects whether THIS row is
  a duplicate of a prior one.

- `ignore_case` is a boolean: pass `true` / `false`, not the strings `"true"` /
  `"false"`.

- Normalize variable-format identifiers in a `format_data_using_js_expression`
  step BEFORE dedupe. URLs especially come in many surface forms across
  sources: a single LinkedIn profile can show up as `linkedin.com/in/janedoe`,
  `https://uk.linkedin.com/in/janedoe`,
  `https://www.linkedin.com/in/janedoe?utm=...`, all the same person.
  `auto_dedupe_rows` does a literal string compare (case-insensitive if
  `ignore_case: true`); without normalization it treats those three as three
  different people. Add an upstream formatter that strips
  protocol/query/trailing-slash and folds country subdomains
  (`uk.linkedin.com` → `www.linkedin.com`), then dedupe on the formatter's
  output. Same pattern for emails (lowercase, strip plus addressing), domains
  (strip `www.`, protocol, paths), and phone numbers (E.164). Canonical
  LinkedIn URL recipe lives in
  https://floqer.com/docs/action-detail/format_data_using_js_expression.txt
  §8.4 (LinkedIn URL normalization).

================================================================================
5. WHERE IT FITS IN A WORKFLOW
================================================================================

UPSTREAM: how rows arrive on the sheet
  Rows can land on the sheet from many sources: a direct list ingest, a
  push_data_to_sheet from an upstream sheet that ran multiple employee-finder
  actions, or several inbound triggers / webhooks / intent signals routing
  into the same sheet. Any of these can place the same entity on the sheet
  more than once.

THIS ACTION
  Compares each row's value in the chosen column against earlier rows on the
  same sheet. With "mark_as_duplicate", later occurrences are tagged as
  "duplicate" in the value_type output. With "delete_duplicate", later
  occurrences are removed from the sheet entirely.

DOWNSTREAM: typical chains after this action
  mark_as_duplicate:
    this -> filter (value_type == "unique") -> enrichment / outreach.
    Duplicates remain on the sheet for audit but skip the paid steps.
  delete_duplicate:
    this -> enrichment / outreach.
    Only unique rows remain, so downstream actions run normally.

================================================================================
6. WHEN TO USE
================================================================================

Use auto_dedupe_rows to drop or mark duplicates before paid steps. Pick a
unique identifier as the dedupe column. For example, on a list of people, use
email; on a list of companies, use domain.
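
The normalize-then-dedupe pattern from the key notes can be sketched in plain JavaScript as below. This is a minimal illustration under stated assumptions, not the canonical recipe from the formatter docs and not the platform's implementation: the helper names (`normalizeLinkedInUrl`, `normalizeEmail`, `tagDuplicates`) and the `linkedin_norm` column are hypothetical, and `tagDuplicates` is only a toy model of the documented "first occurrence is unique, later ones are duplicate" behavior.

```javascript
// Hypothetical helpers: names and exact rules are illustrative assumptions.

// Strip protocol, query/fragment and trailing slashes, then fold any
// subdomain (uk., www., or none) into www.linkedin.com.
// Letter case is left alone; `ignore_case: true` is assumed to cover it.
function normalizeLinkedInUrl(raw) {
  let s = String(raw).trim();
  s = s.replace(/^https?:\/\//i, "");   // strip protocol
  s = s.replace(/[?#].*$/, "");         // strip query string and fragment
  s = s.replace(/\/+$/, "");            // strip trailing slashes
  s = s.replace(/^([a-z0-9-]+\.)?linkedin\.com/i, "www.linkedin.com");
  return "https://" + s;
}

// Lowercase and drop "+tag" plus addressing from the local part.
function normalizeEmail(raw) {
  const s = String(raw).trim().toLowerCase();
  const at = s.lastIndexOf("@");
  if (at < 0) return s;                 // not an email; leave unchanged
  const local = s.slice(0, at).split("+")[0];
  return local + "@" + s.slice(at + 1);
}

// Toy model of "mark_as_duplicate": rows are visited in order and each one
// is compared against the values already seen in the chosen column.
function tagDuplicates(rows, column, ignoreCase) {
  const seen = new Set();
  return rows.map((row) => {
    const value = String(row[column]);
    const key = ignoreCase ? value.toLowerCase() : value;
    const valueType = seen.has(key) ? "duplicate" : "unique";
    seen.add(key);
    return { ...row, value_type: valueType };
  });
}

// The three surface forms from the key notes collapse to one identifier.
const rows = [
  { linkedin: "linkedin.com/in/janedoe" },
  { linkedin: "https://uk.linkedin.com/in/janedoe" },
  { linkedin: "https://www.linkedin.com/in/janedoe?utm=..." },
].map((r) => ({ ...r, linkedin_norm: normalizeLinkedInUrl(r.linkedin) }));

const tagged = tagDuplicates(rows, "linkedin_norm", true);
console.log(tagged.map((r) => r.value_type));
// prints [ 'unique', 'duplicate', 'duplicate' ]
```

Running the same model on the raw `linkedin` column instead of `linkedin_norm` tags all three rows as "unique", which is the failure mode the normalization step is meant to prevent. In a real workflow the normalization logic would live in the upstream `format_data_using_js_expression` step, and `auto_dedupe_columns` would reference that step's output.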
================================================================================
This file is maintained manually. Last updated: 2026-05-20.
Full interactive reference: https://floqer.com/docs/reference
Action catalog: https://floqer.com/docs/action-catalog.txt