USE_CASE_ID: account_research_and_scoring
NAME: Account research and scoring
CATEGORY: Account research and scoring

The user has a list of target accounts and wants to qualify, score,
and tier them based on time-windowed pressure signals (funding,
leadership changes, product launches, geo expansion, hiring patterns,
tech stack, and so on). The output is a per-account score (0–100),
a tier (T1 / T2 / T3), per-signal evidence, and an AI-generated
outreach brief on top-tier accounts only.

For routing, file structure, and cross-cutting principles, see
https://floqer.com/docs/use-case-catalog.txt. This file does not repeat them.

INDEX:
  1. When to use / when not to use
  2. Inputs and pre-flight clarifications
  3. Outputs
  4. Workflow design
  5. Implementation
  6. Best practices
  7. Common variations
  8. Failure modes and mitigations
  9. Related use cases

================================================================================
1. WHEN TO USE / WHEN NOT TO USE
================================================================================

USE WHEN:
  - The user has an existing list of target accounts (uploaded,
    exported from a CRM, or provided manually) and wants to triage
    them by GTM pressure or fit signals.
  - Qualification criteria include time-windowed signals — funding
    in last X months, leadership change in last X weeks, product
    launches, etc. — that change as time passes.
  - The output is sales-actionable: a tiered list with evidence,
    optionally with AI-written briefs for the top tier.

DO NOT USE WHEN:
  - The user has no list yet — see ICP-based outbound prospecting,
    which discovers companies first then qualifies. This use case
    is list-in-hand only.
  - The trigger is a single intent signal (a specific hiring post
    detected in real time, a tech-stack change) — see Intent-driven
    outreach for trigger-based flows. This use case is batch
    scoring, not event-driven.
  - The qualification is a single deterministic rule (e.g.
    "country == US AND headcount > 100"). Use a JS formatter
    directly. No need for the full pipeline.
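
  A minimal sketch of the single-rule case above, assuming a
  hypothetical row shape with `country` and `headcount` fields. The
  function name `passesRule` is illustrative, and how the expression
  is wired into the formatter is not shown here.

```javascript
// Hypothetical row shape: { country: string, headcount: number | string }.
// Returns "yes" / "no" so the result can be gated on with `is`.
function passesRule(row) {
  // Coerce explicitly: list exports often deliver numbers as strings.
  const headcount = Number(row.headcount);
  const ok =
    row.country === "US" && Number.isFinite(headcount) && headcount > 100;
  return ok ? "yes" : "no";
}

console.log(passesRule({ country: "US", headcount: "250" })); // yes
console.log(passesRule({ country: "DE", headcount: 500 }));   // no
```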

================================================================================
2. INPUTS AND PRE-FLIGHT CLARIFICATIONS
================================================================================

INPUTS
  - Target account list — at minimum company name + domain per row.
    Lists are typically partially pre-qualified (CRM export, scraped
    target list, output from another workflow).
  - ICP definition — usually 1–3 sentences describing the kind of
    company in scope (e.g. "B2B SaaS in the technology vertical,
    excluding services / agencies / hardware-first companies"). Used
    by the ICP gate, NOT for fine qualification.
  - Signal definitions — for each pressure signal you want to score,
    define the time window (e.g. "funding round in the last 6
    months", "leadership change in the last 1 month") and how to
    confirm it (search query patterns, evidence requirements).
  - Tier thresholds — score cutoffs for T1 / T2 / T3 (e.g. T1 ≥ 80,
    T2 ≥ 60, T3 < 60).
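
  One way to write the signal and tier inputs down before building,
  sketched under assumptions: the object shapes and the names
  `signalDefinitions`, `tierThresholds`, and `isWithinWindow` are
  hypothetical, and the windows and cutoffs simply mirror the
  examples above.

```javascript
// Hypothetical config shapes; values mirror the examples in the text.
const signalDefinitions = [
  { name: "funding", windowMonths: 6 },
  { name: "leadership_change", windowMonths: 1 },
];
const tierThresholds = { T1: 80, T2: 60 }; // anything below T2 is T3

// True when eventDate falls inside the window that ends at asOf.
// Month arithmetic is calendar-based, so day overflow (e.g. the 31st)
// can shift the window start by a few days; acceptable for a sketch.
function isWithinWindow(eventDate, windowMonths, asOf) {
  const start = new Date(asOf.getTime());
  start.setUTCMonth(start.getUTCMonth() - windowMonths);
  return eventDate >= start && eventDate <= asOf;
}

const asOf = new Date("2026-05-07T00:00:00Z");
console.log(isWithinWindow(new Date("2026-01-15T00:00:00Z"), 6, asOf)); // true
console.log(isWithinWindow(new Date("2026-03-01T00:00:00Z"), 1, asOf)); // false
```

  Passing `asOf` explicitly keeps the check reproducible: the same
  event can be inside the window on one run and outside it on a later
  one.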

PRE-FLIGHT CHECKS — scan the list first; ask the user only if
ambiguity is detected.

  1. Parent vs operating-brand resolution
     The same corporate group can appear under multiple names in a
     list (e.g. "Bomgar" + "BeyondTrust" both resolving to
     BeyondTrust; "Navicure, now part of Waystar" + "Waystar"
     both resolving to Waystar; "Gatorade" + "PepsiCo" where the
     user may want only Gatorade, not PepsiCo at large).

     Don't force a decision when none is needed. Scan the list
     before building:
       - look for explicit "X, now part of Y" / "(formerly X)" /
         "Y subsidiary" markers in the company name;
       - look for known parent / subsidiary pairs in the same
         list (cross-check via PDL's parent_company_id and
         alternate_company_names if you've already enriched, or
         a quick web check on suspicious entries);
       - look for near-duplicate domains (e.g. waystar.com +
         navicure.com) that resolve to the same entity.

     Only when one of these signals fires, ask the user whether
     to score (a) the parent across subsidiaries, or (b) the
     operating brand strictly as named. Otherwise treat each row
     as its own entity and proceed. Either choice can be valid,
     but picking the one the user did not intend produces
     misleading headcount-by-role numbers and confused outreach
     context.

  2. List quality and dedupe
     If the input list contains exact duplicates or near-duplicates,
     dedupe BEFORE the workflow runs. `auto_dedupe_rows` on
     company_domain handles exact dupes; parent rollup needs PDL
     data and a JS formatter (or a manual pass before ingest).
     Skip this step if a quick scan shows the list is already
     clean.
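
  The pre-flight scan above can be sketched as a small script run
  over the list before ingest. This is an assumption-laden
  illustration: `scanList`, `PARENT_MARKERS`, and the row shape
  `{ name, domain }` are hypothetical, the patterns are not
  exhaustive, and it only catches exact domain repeats. Different
  domains that resolve to the same entity (waystar.com +
  navicure.com) still need enrichment data or a web check.

```javascript
// Patterns for the explicit name markers listed above; illustrative only.
const PARENT_MARKERS = [
  /,\s*now part of\s+/i,
  /\(formerly\s+[^)]+\)/i,
  /\bsubsidiary\b/i,
];

// Returns the rows that warrant a parent-vs-brand question or a dedupe.
function scanList(rows) {
  const flagged = [];
  const seenDomains = new Map(); // normalised domain -> first row index
  rows.forEach((row, index) => {
    if (PARENT_MARKERS.some((re) => re.test(row.name))) {
      flagged.push({ index, reason: "parent marker in name" });
    }
    const domain = row.domain.trim().toLowerCase().replace(/^www\./, "");
    if (seenDomains.has(domain)) {
      flagged.push({
        index,
        reason: `duplicate domain of row ${seenDomains.get(domain)}`,
      });
    } else {
      seenDomains.set(domain, index);
    }
  });
  return flagged;
}

const flagged = scanList([
  { name: "Waystar", domain: "waystar.com" },
  { name: "Navicure, now part of Waystar", domain: "navicure.com" },
  { name: "Waystar Inc", domain: "www.Waystar.com" },
]);
console.log(flagged.length); // 2
```

  An empty result means no question for the user is needed and each
  row can be treated as its own entity.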

================================================================================
3. OUTPUTS
================================================================================

  - Per row: score (0–100, integer), tier (T1 / T2 / T3),
    per-signal status (CONFIRMED / NOT FOUND / UNCERTAIN) with
    dated evidence and source URLs.
  - Top-tier rows only: an AI-generated outreach brief — typically
    2–3 paragraphs covering why the account is hot, the dated
    evidence, and a concrete outreach hook.
  - Side-effect data: ICP fit verdict + reasoning, raw firmographic
    enrichment, per-signal search log (proof of research).

================================================================================
4. WORKFLOW DESIGN
================================================================================

High-level conceptual stages:

   1. ICP fit gate — coarse "is this even in scope" judgment.
   2. Filter on ICP — single chain juncture. All downstream stops
      here for non-ICP rows.
   3. Firmographic enrichment — pull the structured data the score
      formula needs. Default:
      `company_headcount_distribution_by_job_role` (1 credit/row,
      role-counts as top-level number fields). Reach for PDL only
      when you also need broader firmographics.
   4. Numeric extraction — only on the PDL path. Pulls individual
      numbers out of nested raw_arrays. Skippable when the
      upstream is `company_headcount_distribution_by_job_role`,
      whose role-count fields are already top-level and
      number-typed.
   5. Classifier sentinel — only when an upstream gate variable is
      a JS formatter output (typed `string`). Both `filter` and
      `run_if` numeric operators silently fail on string-typed
      upstreams; the sentinel converts the numeric threshold into
      `"yes"` / `"no"` so downstream gates use `is`. Skip when the
      upstream is a natively number-typed field (e.g. directly
      from the headcount-by-role action). See
      https://floqer.com/docs/action-detail/format_data_using_js_expression.txt
      §2 OUTPUTS.
   6. Filter on classifier — gate everything downstream of the
      sales-team-size threshold. One node, AND-able with other
      conditions if needed (see https://floqer.com/docs/action-detail/filter.txt for
      groups + combinators).
   7. Multi-signal time-windowed research — one web agent call per
      row, returning per-signal status + dated evidence + search
      proof.
   8. Score — formatter; deterministic rubric reading the
      per-signal statuses + numeric inputs.
   9. Tier — formatter; threshold-based classification on score.
  10. AI brief — `llm_models` on top-tier rows only; reads the
      structured signal data and writes an AE-facing summary.

Why two `filter` nodes instead of `run_if` on every downstream
action: when several actions share the same row-level prerequisite
(every action after Stage 1 needs the row to be ICP-positive; every
action after Stage 5 needs the row to clear the sales-team
threshold), put the gate at the chain juncture once. `run_if` on
every action is the older, noisier pattern. Reserve `run_if` for
true per-action conditions — like Stage 10's tier gate, which only
applies to the AI brief.

================================================================================
5. IMPLEMENTATION
================================================================================

STAGE 1 — ICP fit gate
  Action: llm_web_agents (Floqer Fast — sonar-agent-fast).
  Why: A coarse single-judgment task. Lightweight tier model,
       cheap. Output: icp_fit (yes/maybe/no) + icp_reasoning.
       Be generous — keep "maybe" in scope; the score downstream
       is the real filter.

STAGE 2 — Filter on ICP
  Action: filter (path_conditions: `icp_fit` is "yes" OR "maybe").
  Why: Single chain juncture for the ICP gate. Every action after
       this point implicitly inherits the gate — no need to copy
       run_if onto each one. Failing rows stop here; downstream
       cells stay queued and never bill.

STAGE 3 — Firmographic enrichment
  Action: company_headcount_distribution_by_job_role is the
          default when all you need from this stage is the
          headcount-by-role signal — 1 credit/row, top-level
          number-typed fields per role (`perc_engineering`,
          `perc_sales`, ..., plus `change_in_*` quarter-over-
          quarter deltas and a `profiles_found_per_quarter`
          sample-size signal). Despite the `perc_` prefix the
          values are raw role-counts, not percentages — see
          https://floqer.com/docs/action-detail/
          company_headcount_distribution_by_job_role.txt §4 KEY
          NOTES.
          Reach for company_enrich_using_people_data_labs (PDL,
          5 credits/row) when you also need broader firmographics
          — funding stage / date, public/private status, industry
          tags, alternate_company_names for parent rollup, etc.
  Why: Headcount-by-role is the dominant input to the score
       formula; the lighter action is the right default. PDL adds
       breadth (workforce trends, corporate structure, funding
       history) at 5x the cost — only worth it when you'll
       actually use those extra fields elsewhere in the chain.

STAGE 4 — Numeric extraction (PDL path only — skip with the
          headcount-by-role action)
  Action: format_data_using_js_expression.
  Why: PDL outputs nested raw_arrays (e.g.
       `employee_count_by_job_role` is an object with sales /
       engineering / marketing keys). Pull the specific numbers
       you need into clean per-row outputs.
       company_headcount_distribution_by_job_role surfaces each
       role count as a top-level number-typed field (e.g.
       `perc_sales`) directly — no extraction needed. Skip this
       stage entirely on the lighter-action path.
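
  A minimal sketch of the extraction on the PDL path. The payload
  shape is an assumption based on the description above (an
  `employee_count_by_job_role` object with a `sales` key), and
  `extractSalesHeadcount` is a hypothetical name. The choice to
  return null on a miss, rather than 0, is also this sketch's own.

```javascript
// Assumed payload: { employee_count_by_job_role: { sales, engineering, ... } }.
function extractSalesHeadcount(pdl) {
  const byRole = pdl ? pdl.employee_count_by_job_role : undefined;
  // Return null on a miss so "no data" is not mistaken for "no sales team".
  if (!byRole || byRole.sales == null) return null;
  const value = Number(byRole.sales);
  return Number.isFinite(value) ? value : null;
}

console.log(
  extractSalesHeadcount({
    employee_count_by_job_role: { sales: 42, engineering: 120 },
  })
); // 42
console.log(extractSalesHeadcount({})); // null
```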

STAGE 5 — Classifier sentinel (only when upstream is a JS
          formatter — typed `string`)
  Action: format_data_using_js_expression.
  Why: JS formatter outputs are typed `string` regardless of what
       the JS returns. Numeric `filter` and `run_if` operators
       (`greater than`, `less than`, `is between`) silently fail
       on string-typed upstreams because the operator is auto-
       derived from the variable's stored type, not its runtime
       value. Convert any numeric threshold gate (e.g. "sales
       team ≥ 30") into a string sentinel ("yes" / "no") here so
       the downstream `filter` can use `is "yes"`. See
       https://floqer.com/docs/action-detail/format_data_using_js_expression.txt
       §4 KEY NOTES for the full pattern.
       Skip this stage if the threshold gate's upstream is a
       natively number-typed field — e.g.
       `company_headcount_distribution_by_job_role.perc_sales`
       (top-level `number`), or `latest_funding_date` /
       `employee_headcount` straight from PDL. Those work with
       numeric `filter` operators directly without a sentinel
       wrapper.
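
  The sentinel itself can be very small. A sketch, assuming the
  upstream value arrives as a string or number and using the "sales
  team ≥ 30" example as the default threshold; `salesTeamSentinel`
  is a hypothetical name.

```javascript
// Converts a numeric threshold check into a "yes" / "no" string
// so the downstream gate can use `is "yes"`.
function salesTeamSentinel(salesHeadcount, threshold = 30) {
  // Upstream formatter outputs are stored as strings, so coerce first.
  const n = Number(salesHeadcount);
  return Number.isFinite(n) && n >= threshold ? "yes" : "no";
}

console.log(salesTeamSentinel("45")); // yes
console.log(salesTeamSentinel(12));   // no
console.log(salesTeamSentinel(null)); // no
```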

STAGE 6 — Filter on classifier
  Action: filter (path_conditions: classifier sentinel `is` "yes").
  Why: Gates the expensive downstream signal research on the
       sales-team-size threshold. Same juncture pattern as Stage
       2. If you need to combine multiple gates (e.g. ICP +
       sales-team-size + funding stage), express them as separate
       leaves AND'd inside one `filter`'s `path_conditions` rather
       than chaining filters.

STAGE 7 — Multi-signal time-windowed research
  Action: llm_web_agents (Floqer Deep — sonar-agent-deep, or
          Parallel Core — parallel-core for stronger grounding).
  Why: One agent call per row with a mission that lists every
       signal and its time window. Bake proof requirements into
       the mission: each signal's evidence field MUST contain a
       URL + date if CONFIRMED, or the queries tried + a "nothing
       in window" note if NOT FOUND. Without proof requirements
       the model can return schema-default "no"s without
       searching.
       Output schema: per-signal `<signal>_status` +
       `<signal>_evidence` fields. Do NOT include a
       `signals_confirmed_count` or `score` field in the agent's
       output — the model's per-field outputs and self-reported
       summaries can drift in the same JSON. Compute aggregations
       downstream. See https://floqer.com/docs/action-detail/llm_web_agents.txt
       §4 KEY NOTES on this constraint.
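
  As an optional complement to the mission wording, a downstream
  formatter could check that the proof requirements were actually
  met. This is a sketch, not part of the documented pipeline:
  `checkProof` is a hypothetical name, evidence is assumed to be a
  plain string, and the date pattern assumes the mission asks for
  ISO dates (YYYY-MM-DD).

```javascript
const URL_RE = /https?:\/\/\S+/;
const DATE_RE = /\b\d{4}-\d{2}-\d{2}\b/; // assumes ISO dates in evidence

// Returns a list of human-readable problems; empty means proof is in place.
function checkProof(agentOutput, signalNames) {
  const problems = [];
  for (const name of signalNames) {
    const status = agentOutput[`${name}_status`];
    const evidence = agentOutput[`${name}_evidence`] || "";
    if (status === "CONFIRMED") {
      if (!URL_RE.test(evidence) || !DATE_RE.test(evidence)) {
        problems.push(`${name}: CONFIRMED without URL + date`);
      }
    } else if (status === "NOT FOUND") {
      if (evidence.trim() === "") {
        problems.push(`${name}: NOT FOUND without queries tried`);
      }
    } else if (status !== "UNCERTAIN") {
      problems.push(`${name}: unexpected status`);
    }
  }
  return problems;
}

console.log(
  checkProof(
    {
      funding_status: "CONFIRMED",
      funding_evidence: "Series B on 2026-02-10, https://example.com/news",
      leadership_change_status: "NOT FOUND",
      leadership_change_evidence: "",
    },
    ["funding", "leadership_change"]
  )
); // [ 'leadership_change: NOT FOUND without queries tried' ]
```

  A row where every signal is NOT FOUND with empty evidence is the
  pattern to watch for: it suggests the agent returned defaults
  without searching.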

STAGE 8 — Score
  Action: format_data_using_js_expression.
  Why: Deterministic rubric reading the per-signal statuses from
       stage 7 + the numeric inputs from stage 3 (or stage 4 on
       the PDL path). Common shape: sum of per-signal weights for
       CONFIRMED + a linear headcount component capped at the top
       end. Total normalised to 0–100.
       For most use cases this is the right tool. In rare cases
       where the rule depends on qualitative judgment that can't
       be expressed as fixed math, an `llm_models` call with a
       high-reasoning model is a viable but expensive fallback —
       see https://floqer.com/docs/action-detail/llm_models.txt §6 WHEN TO USE.
       The formatter is the default.
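
  The common shape described above, as a sketch. The weights, the
  headcount cap, the signal names, and the `sales_headcount` field
  are all illustrative assumptions; the real rubric comes from the
  user's signal definitions.

```javascript
// Illustrative weights; one entry per researched signal.
const WEIGHTS = {
  funding: 30,
  leadership_change: 25,
  product_launch: 15,
  geo_expansion: 10,
};
const HEADCOUNT_MAX_POINTS = 20; // size of the linear headcount component
const HEADCOUNT_CAP = 200;       // headcount at which that component maxes out

function scoreAccount(row) {
  // Fixed weight per CONFIRMED signal; NOT FOUND and UNCERTAIN add nothing.
  let signalPoints = 0;
  for (const [signal, weight] of Object.entries(WEIGHTS)) {
    if (row[`${signal}_status`] === "CONFIRMED") signalPoints += weight;
  }
  // Linear in headcount, capped at the top end.
  const headcount = Math.max(0, Number(row.sales_headcount) || 0);
  const headcountPoints =
    (Math.min(headcount, HEADCOUNT_CAP) / HEADCOUNT_CAP) * HEADCOUNT_MAX_POINTS;
  // Normalise against the maximum achievable total so the result is 0-100.
  const maxPoints =
    Object.values(WEIGHTS).reduce((a, b) => a + b, 0) + HEADCOUNT_MAX_POINTS;
  return Math.round((100 * (signalPoints + headcountPoints)) / maxPoints);
}

console.log(
  scoreAccount({
    funding_status: "CONFIRMED",
    leadership_change_status: "CONFIRMED",
    product_launch_status: "NOT FOUND",
    geo_expansion_status: "UNCERTAIN",
    sales_headcount: 100,
  })
); // 65
```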

STAGE 9 — Tier
  Action: format_data_using_js_expression.
  Why: Threshold classification on the score from stage 8.
       Returns the tier as a string ("T1" / "T2" / "T3") for the
       downstream brief gate.
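
  A sketch of the tier formatter, using the example cutoffs from
  section 2 (T1 ≥ 80, T2 ≥ 60) as defaults. `tierFor` is a
  hypothetical name, and mapping an unreadable score to "T3" is this
  sketch's own choice.

```javascript
function tierFor(score, thresholds = { T1: 80, T2: 60 }) {
  // The score arrives from the upstream formatter as a string.
  const s = Number(score);
  if (!Number.isFinite(s)) return "T3"; // unreadable score: lowest tier
  if (s >= thresholds.T1) return "T1";
  if (s >= thresholds.T2) return "T2";
  return "T3";
}

console.log(tierFor(80));   // T1
console.log(tierFor("65")); // T2
console.log(tierFor(59));   // T3
```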

STAGE 10 — AI brief (top-tier only)
  Action: llm_models (Claude Sonnet 4.6, GPT-5.4, or any
          heavy-tier writing model — see
          https://floqer.com/docs/action-detail/llm_models.txt §8 MODELS REFERENCE).
  Why: Writing task. Reads the structured signal data + score and
       produces a 2–3 paragraph AE-facing brief. Strictly writing
       — no fresh research. If the brief needs context the signal
       research did not capture, add an upstream `llm_web_agents`
       step for that context; do not fold it into this action.
       Use `run_if tier is "T1"` here (per-action gate, single
       downstream consumer — `filter` is overkill for one
       action).

DATA PASSING:
  Carry company name and domain from stage 1 onward — they are
  needed for the ICP gate, the signal research mission, and the
  AI brief. Reference upstream outputs as
  {{action_instance_id.field_name}} or, for nested PDL fields,
  pull them through a stage-4 formatter first.

================================================================================
6. BEST PRACTICES
================================================================================

  - ICP gate is coarse, scoring is the real filter. Do not
    over-qualify in stage 1. The point is to eliminate rows that
    are obviously out of scope (consumer brands, services-only
    firms, hardware companies) — not to identify the perfect fit.
    Scoring downstream handles fine qualification, and input
    lists are usually somewhat pre-qualified to begin with.
  - Bake per-signal proof into the research mission. Each signal's
    evidence field must contain a URL + date if CONFIRMED, or
    queries tried + "nothing in window" if NOT FOUND. Without
    this, web agents can return schema-defaults without searching.
  - Compute aggregations in code, not in the model. Never ask the
    research web agent to output a `signals_confirmed_count`,
    `total_score`, or `tier`. Per-field outputs and self-reported
    counts/scores can drift in the same JSON response. Compute
    these in formatters reading the per-field outputs.
  - Use string-sentinel classifiers for any numeric threshold
    gate on a JS formatter output. Formatter outputs are typed
    `string` regardless of return type, so numeric `filter` and
    `run_if` operators silently fail on them. Wrap the threshold
    check in a small classifier formatter that returns
    "yes"/"no", and gate on `is "yes"`. Natively number-typed
    upstream fields do not need the wrapper.
  - Three-role separation. Web agent does research, formatter
    does math, `llm_models` does writing. Never blend roles in a
    single action — mixing produces hallucinated content (LLM
    doing research) or fragile counts (LLM doing aggregation).
  - Filter at chain junctures, not run_if on every action. When
    several downstream actions share a row-level prerequisite,
    express it once with a `filter` action upstream of them all
    rather than copying `run_if` onto each downstream action.
    `filter` also supports AND/OR composition across multiple
    leaves and groups, which `run_if` (single condition) cannot
    do. Reserve `run_if` for true per-action gates (e.g. the
    final AI brief that only fires for T1).
  - Re-run on a cadence. Scores are point-in-time. A "last 6
    months" funding round ages out of the window over time. If
    the output is used as ongoing intel, re-run weekly or
    monthly. With `cache_enabled` set on the sheet, identical-
    input rows skip re-running unless inputs change — refresh by
    adjusting `cache_since` or by deleting and re-adding rows.
  - Cost scaling guidance (1,000-row reference). Stage 3 is the
    biggest variable — PDL is 5 credits/row,
    company_headcount_distribution_by_job_role is 1 credit/row;
    pick the latter unless you need the broader firmographics PDL
    surfaces. For a 1,000-row list with ~70% ICP pass rate, expect
    roughly:
        Stage 1 ICP web agent (Floqer Fast)            ~500 credits
        Stages 2, 6 (filter)                           0
        Stage 3 firmographic enrichment
            with PDL                                   ~3,500 credits
            with company_headcount_distribution_by_job_role
                                                       ~700 credits
        Stages 4, 5, 8, 9 (formatters)                 0
            (Stages 4 + 5 are skippable on the
             headcount-by-role path — `perc_sales` is
             already a top-level number field)
        Stage 7 signal web agent (Deep/Core)           ~350-1,750 credits
        Stage 10 AI brief on T1 only                   varies by model;
                                                       typically a few $ at
                                                       ~10% T1 rate
    Net: ~1.5-2.5k credits + a few dollars of LLM cost on the
    lighter path; ~4-6k credits on the PDL path.
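
  The arithmetic behind the reference table can be sketched as a
  small estimator. The per-row rates for stages 1 and 7 are
  back-derived from the table (500 credits over 1,000 rows, and
  350-1,750 credits over 700 rows), not published prices, and
  `estimateCredits` is a hypothetical name. The sketch sends every
  ICP-passing row to stage 7, so it ignores rows dropped at the
  stage 6 filter and its high end is an upper bound. LLM cost for
  the brief is not included.

```javascript
// Rates inferred from the 1,000-row reference table above.
const STAGE1_PER_ROW = 0.5;      // ICP web agent, all rows
const STAGE7_PER_ROW_LOW = 0.5;  // signal web agent, low end
const STAGE7_PER_ROW_HIGH = 2.5; // signal web agent, high end

function estimateCredits({ rows, icpPassRate, usePdl }) {
  const icpRows = rows * icpPassRate;
  const stage1 = rows * STAGE1_PER_ROW;
  const stage3 = icpRows * (usePdl ? 5 : 1); // PDL vs headcount-by-role
  return {
    low: Math.round(stage1 + stage3 + icpRows * STAGE7_PER_ROW_LOW),
    high: Math.round(stage1 + stage3 + icpRows * STAGE7_PER_ROW_HIGH),
  };
}

console.log(estimateCredits({ rows: 1000, icpPassRate: 0.7, usePdl: false }));
// { low: 1550, high: 2950 }
console.log(estimateCredits({ rows: 1000, icpPassRate: 0.7, usePdl: true }));
// { low: 4350, high: 5750 }
```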

================================================================================
7. COMMON VARIATIONS
================================================================================

  - 8-signal GTM pressure model. Augment stage 7 to research:
    Series B+ funding, new CRO / VP Sales, sales org
    restructuring, new product / category, geo expansion, active
    RevOps / SalesOps / DemandGen postings, RepVue attainment
    below threshold, fragmented tech stack across overlapping
    GTM categories. Each becomes a per-signal status field in the
    agent's output. The score formula adds a fixed weight per
    CONFIRMED signal.
  - Account research without scoring. Drop stages 8–10 and stop
    after stage 7. The per-signal evidence is sometimes enough
    for an SDR to manually triage. Useful when the rubric is
    unclear or the team prefers qualitative review.
  - LLM-driven scoring as a fallback. When the rubric depends on
    qualitative judgment that can't be expressed as fixed math
    (e.g. weighing a $50M Series C heavier than a $5M seed even
    though both are "funding CONFIRMED"), replace stage 8 with
    an `llm_models` call using a high-reasoning model. Treat as
    the exception; the formatter is the default.
  - Tier-2 brief. By default only T1 gets the AI brief. To brief
    T2 as well, add a second `llm_models` action gated on
    `tier is "T2"` with a different (shorter / lower-priority)
    prompt.

================================================================================
8. FAILURE MODES AND MITIGATIONS
================================================================================

  - Sonar Deep silent-failure on signal research.
    Symptom: all `*_status` fields return "NOT FOUND" / "no";
    web_citations is empty.
    Mitigation: bake per-signal proof requirements into the
    mission (URL + date for CONFIRMED, queries tried for
    NOT FOUND). Sonar Deep is a strong web researcher when the
    prompt forces it to surface evidence; it short-circuits when
    the prompt doesn't. Parallel Core (parallel-core) is a
    grounding-stricter alternative.

  - Numeric run_if comparison silently fails.
    Symptom: a downstream gate based on `headcount >= 30`
    (against a JS formatter output) doesn't fire; the row halts.
    Mitigation: classifier-sentinel pattern. Convert numeric
    thresholds into "yes"/"no" outputs and gate via `is`.

  - Halt-on-skip causes rows to "run forever".
    Symptom: rows whose ICP verdict is "no" (or whose sales
    headcount is below threshold) stay in `running` state on the
    sheet — `filter` and `run_if` halts both behave this way for
    failing rows.
    Mitigation: this is documented behaviour, not a fault. The
    cosmetic stuck-row state can be cleared by setting
    `continue_workflow_if_run_condition_not_met: true` on the
    `run_if`-gated action that's halting the chain (see
    https://floqer.com/docs/action-detail/filter.txt §7 WHEN NOT TO USE) — failing
    rows then advance past that action with empty outputs instead
    of halting. Use this on rows you want to mark `complete` for
    audit purposes. For the use-case-level filter nodes (Stages 2
    and 6), halting is the desired behaviour and you should leave
    it.

  - Model self-reports counts that drift from per-signal outputs.
    Symptom: signals_confirmed_count reports 5 but counting
    individual `*_status: CONFIRMED` fields gives 6.
    Mitigation: don't include count / score / tier fields in the
    research agent's output schema. Compute them in formatters
    reading the per-field outputs.

  - PDL miss for tiny / under-indexed companies.
    Symptom: PDL returns no role-headcount data; sales_headcount
    extracts as 0; the classifier sentinel returns "no"; chain
    halts.
    Mitigation: for under-indexed segments, swap PDL for a web
    agent that estimates headcount from the company website or
    LinkedIn. See https://floqer.com/docs/use-case-detail/icp_outbound_prospecting.txt
    STAGE 3 alternative for the web-agent-as-finder pattern.

  - Parent / subsidiary double-counting.
    Symptom: the same corporate group scores twice (e.g. Bomgar
    + BeyondTrust both score, with different evidence each time).
    Mitigation: dedupe pre-flight and confirm parent-vs-brand
    intent with the user. See section 2 (PRE-FLIGHT
    CLARIFICATIONS).
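
  For the count-drift failure mode above, the confirmed count can be
  recomputed from the per-signal fields in a formatter. A sketch,
  assuming the agent output is a flat object whose status fields end
  in `_status`; `countConfirmed` is a hypothetical name.

```javascript
// Counts CONFIRMED statuses directly from the per-signal fields,
// ignoring any count the model may have reported itself.
function countConfirmed(agentOutput) {
  return Object.entries(agentOutput).filter(
    ([key, value]) => key.endsWith("_status") && value === "CONFIRMED"
  ).length;
}

console.log(
  countConfirmed({
    funding_status: "CONFIRMED",
    funding_evidence: "2026-02-10, https://example.com/news",
    leadership_change_status: "NOT FOUND",
    geo_expansion_status: "CONFIRMED",
  })
); // 2
```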

================================================================================
9. RELATED USE CASES
================================================================================

  - ICP-based outbound prospecting
    (https://floqer.com/docs/use-case-detail/icp_outbound_prospecting.txt)
    Discovery-first variant. Account scoring assumes a list
    exists; ICP outbound builds one.

  - Intent-driven outreach
    Trigger-based variant. Account scoring is a batch operation;
    intent-driven runs per detected signal in real time.

  - Account research only — drop scoring stages, stop at the
    signal-research output. See section 7 (Common variations).

================================================================================

This file is maintained manually. Last updated: 2026-05-07.
Full interactive reference: https://floqer.com/docs/reference
Action catalog: https://floqer.com/docs/action-catalog.txt