How audience generation works

When you submit a project, Candor runs a multi-stage pipeline that researches your audience, extracts signals from evidence, and organises everything into segments. Here’s what happens at each stage and what it means for the audience you end up with.

The pipeline at a glance

Audience generation runs as a single background job that moves through about a dozen stages in sequence. The setup wizard tells you the typical duration when you submit (around 25-35 minutes for most projects). The progress screen shows which stage is running and updates in real time. You can close the tab; the job keeps running in the background.

Stage 1: Document readiness

If you uploaded research documents, Candor waits until they’re parsed and embedded before starting. This usually takes under a minute per document, though larger files take longer. If a document fails to parse, you’ll see a warning and the pipeline continues without it.

Stage 2: Broad search

Candor generates a set of search queries from your audience description, then runs them on the web. The goal is breadth: get an initial picture of the audience, the problems they discuss publicly, and the language they use. This pass is region-aware: queries are tuned to the region you pinned during setup.

Stage 3: Signal extraction

Every source (web page or uploaded document chunk) is read and broken into discrete signals: behaviors, pain points, beliefs, constraints, goals, preferences, and decision rules. Each signal is tagged with type, sentiment, intensity, and a provenance link back to its source. Documents you uploaded are weighted higher than web evidence.
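To make the shape of a signal concrete, here is a minimal Python sketch of what a tagged signal might look like. The class and field names (Signal, SignalType, source_url, from_upload), the value scales, and the weights are assumptions made for illustration; they are not Candor's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class SignalType(Enum):
    # The seven signal types named in this article.
    BEHAVIOR = "behavior"
    PAIN_POINT = "pain_point"
    BELIEF = "belief"
    CONSTRAINT = "constraint"
    GOAL = "goal"
    PREFERENCE = "preference"
    DECISION_RULE = "decision_rule"


@dataclass(frozen=True)
class Signal:
    text: str
    type: SignalType
    sentiment: float   # assumed scale: -1.0 (negative) to 1.0 (positive)
    intensity: float   # assumed scale: 0.0 (mild) to 1.0 (strong)
    source_url: str    # provenance link back to the source
    from_upload: bool  # True when the source is an uploaded document

    def weight(self) -> float:
        # Hypothetical weights: uploaded documents count more than web pages.
        return 2.0 if self.from_upload else 1.0


signals = [
    Signal("Exports reports to a spreadsheet every Friday",
           SignalType.BEHAVIOR, 0.0, 0.4, "upload://interviews.pdf#chunk-12", True),
    Signal("Finds onboarding confusing",
           SignalType.PAIN_POINT, -0.6, 0.7, "https://example.com/forum/thread-1", False),
]
total_weight = sum(s.weight() for s in signals)
```

The point of the sketch is that every signal carries its own provenance and its own weight, so later stages can trace a claim back to a source and favor your uploaded research.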

Stage 4: Gap-fill iteration

After the broad pass, Candor audits what it has and identifies gaps: signal types that are under-represented, claims that lack supporting evidence, or segments of the audience that aren’t well-covered. It then runs targeted searches to fill those gaps, one query at a time, until the audience is balanced or the search budget is exhausted.
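The loop described above can be sketched in a few lines of Python. This is only an illustration of the "one query at a time, until balanced or out of budget" idea: the function names, the min_per_type threshold, and the search callback are hypothetical, and the sketch only covers under-represented signal types, not the other kinds of gap.

```python
from collections import Counter
from typing import Callable

SIGNAL_TYPES = ["behavior", "pain_point", "belief", "constraint",
                "goal", "preference", "decision_rule"]


def find_gaps(type_counts: Counter, min_per_type: int) -> list[str]:
    """Return the signal types that have fewer than min_per_type signals."""
    return [t for t in SIGNAL_TYPES if type_counts[t] < min_per_type]


def gap_fill(type_counts: Counter, search: Callable[[str], int],
             budget: int, min_per_type: int = 3) -> int:
    """Run one targeted search per iteration until balanced or out of budget.

    search(signal_type) is a stand-in for a targeted web search; it returns
    how many new signals of that type were found. Returns the searches used.
    """
    used = 0
    while used < budget:
        gaps = find_gaps(type_counts, min_per_type)
        if not gaps:
            break  # every type meets the threshold: the audience is balanced
        # Target the most under-represented type first.
        target = min(gaps, key=lambda t: type_counts[t])
        type_counts[target] += search(target)
        used += 1
    return used


counts = Counter({"behavior": 5, "pain_point": 4, "belief": 3, "constraint": 1,
                  "goal": 3, "preference": 3, "decision_rule": 0})
# Pretend every targeted search yields two new signals.
searches_used = gap_fill(counts, search=lambda t: 2, budget=10)
```

In this toy run the loop stops after three searches because every type has reached the threshold. If searches kept returning nothing, it would stop when the budget ran out instead.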

Stage 5: Critic pass

A separate agent reviews the assembled signal set for weak evidence, contradictions, region drift, and over-reliance on single sources. Issues are surfaced on the audience review screen as a critic banner so you know what to look at.
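As an example of one of these checks, the sketch below flags over-reliance on a single source by measuring what share of signals cite the same page. The function name and the 40% threshold are assumptions for illustration; the article does not say how Candor's critic actually decides.

```python
from collections import Counter


def over_relied_sources(source_urls: list[str], max_share: float = 0.4) -> list[str]:
    """Return pages cited by more than max_share of all signals."""
    if not source_urls:
        return []
    page_counts = Counter(source_urls)
    total = len(source_urls)
    return sorted(url for url, n in page_counts.items() if n / total > max_share)


# One entry per signal: the page each signal cites.
urls = ["https://example.com/a"] * 3 + ["https://example.org/b", "https://example.net/c"]
flagged = over_relied_sources(urls)
```

Here three of five signals cite the same page, so that page is flagged. A flag like this would not block anything; it would only tell you where to look.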

Stage 6: Segmentation

Candor groups signals into segments by behavioral and psychographic dimensions, not demographics alone. Each segment gets a primary differentiating dimension (the criterion that explains why members of this segment make different choices from others) plus an evidence depth indicator showing how well-grounded it is.

When something goes wrong

If a stage fails, the pipeline stops and shows a failure banner with a retry button. Common causes are transient web-search errors, rate limits, or evidence too sparse to segment cleanly. Retry usually works for the first two; for the third, broaden your audience description or add more evidence and create a new project.
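The guidance above comes down to a simple rule: retry for transient problems, start a new project when the evidence is too sparse. A minimal Python sketch of that rule follows; the enum and function names are hypothetical and are not part of any Candor API.

```python
from enum import Enum


class FailureCause(Enum):
    TRANSIENT_SEARCH_ERROR = "transient_search_error"
    RATE_LIMIT = "rate_limit"
    SPARSE_EVIDENCE = "sparse_evidence"


# Causes where pressing retry usually works.
RETRYABLE = {FailureCause.TRANSIENT_SEARCH_ERROR, FailureCause.RATE_LIMIT}


def suggested_action(cause: FailureCause) -> str:
    """Map a failure cause to the action this article recommends."""
    if cause in RETRYABLE:
        return "retry"
    # Sparse evidence: broaden the description or add evidence in a new project.
    return "new_project"


action = suggested_action(FailureCause.SPARSE_EVIDENCE)
```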

Common questions

Most of the generation time is spent on web search and signal extraction. Candor runs an initial broad pass, then iterates through gap-fill searches to balance the audience. Each iteration includes a search, a read of every returned page, and signal extraction from that content. There's no way to compress this without dropping quality. The pipeline also runs a critic pass after gap-fill. If you want faster, narrower audiences, upload more evidence up front (your own documents) to reduce the search budget needed.

The critic pass looks for four kinds of issue: weak evidence (signals that lean on a single thin source), contradictions (signals that conflict without explanation), region drift (evidence from outside the region you pinned creeping in), and over-reliance on single sources (too many signals citing the same page). Issues are surfaced on the audience review screen as a critic banner so you know what to look at before you select segments. The critic doesn't block; it flags. You decide whether to proceed, regenerate, or add more evidence.

Segmentation needs enough signal variance to find distinct groups. If the evidence is too thin or too uniform, Candor will either produce one giant catch-all segment or fail the segmentation stage with a clear error. The fix is to broaden your audience description, upload more documents, or both. Reuse research you already have. Anything you didn't upload is something Candor has to search the web for, which produces shallower signal than your existing primary research.

To regenerate an audience, create a new project with the same audience description and adjusted evidence (more documents, a refined description, a different region). The original project doesn't get overwritten. Rerunning from scratch usually produces meaningfully different output, since the gap-fill iteration is non-deterministic in which sources it pulls. If the issue is one segment looking weak, you don't need to rerun everything. The segment selection gate lets you exclude weak segments without rebuilding the audience.

Region affects the output significantly. It tunes the search queries (different countries produce different sources), affects which sources get prioritised, and propagates into the persona world-context snapshot. A study pinned to Brazil will find Brazilian sources and produce personas operating in a Brazilian context. Pinning a region matters most when your audience is country-specific or subject to different regulations across regions. For a global audience, "Global" is a valid choice and Candor will treat it as such.
