For Cities

Find out what actually works — before you scale it.

City departments make consequential decisions about how to reach residents, design forms, sequence services, and allocate enforcement. Most of those decisions are made without evidence. We help departments run small, ethical pilots to test those decisions before committing to them at scale.

Start a conversation →

See a sample published report

Where we work

Four departments. Dozens of testable questions.

These are not academic hypotheticals. Each is a question that comes up in city operations, can be tested with existing staff and data, and would change what a department does if answered.

Benefits & human services

→Which outreach message most increases Medicaid renewal completion?
→Does a simplified one-page form reduce incomplete SNAP applications?
→Does a same-day callback offer increase benefits enrollment among eligible non-enrollees?

Prior evidence

LA SNAP flexible interview pilot found a 23 percentage point increase in enrollment. Pennsylvania SNAP simplification reduced processing time by 40%.

Permitting & licensing

→Does an online status tracker reduce permit inquiry calls to staff?
→Which checklist format produces the fewest incomplete applications?
→Does a pre-submission review offer reduce permit revision cycles?

Prior evidence

Phoenix permit portal upgrade cut processing time by 60%. Boone County online permits eliminated 80% of in-person visits.

Tax & revenue

→Does a social norm message ('most of your neighbors have already paid') increase on-time property tax payment?
→Does simplifying penalty letters increase timely settlement of overdue accounts?
→Does a payment plan offer in the first notice reduce referrals to collections?

Prior evidence

Indonesia tax penalty simplification raised settlement by 32%. Dominican Republic social norm letter raised timely payment by 5 percentage points.

Public safety

→Does a court date text reminder reduce failure-to-appear rates?
→Which communication approach most increases response to community violence intervention outreach?
→Does a landlord education letter reduce housing code violations in targeted properties?

Prior evidence

Court date text reminders reduced failure-to-appear (FTA) rates by 26% in multiple jurisdictions. Philadelphia rental license letters reduced violations measurably.

How it works

Four principles behind every pilot.

Start with a question you don't know the answer to

The best civic experiments test genuine uncertainty. If you already know the answer, you don't need an experiment — you need an implementation plan. We help teams identify the 3–5 questions where evidence would actually change what they do.

Your staff runs the program; we handle the science

City staff implement the intervention. We design the study, handle randomization, run the analysis, and write the report. No new hires, no new systems. If you can send emails, you can run a randomized pilot.
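To give a sense of what "handle randomization" can mean in practice, here is a minimal Python sketch of seeded, balanced random assignment. It is an illustration under assumptions, not our production procedure: the function name assign_arms, the arm labels, and the seed value are all hypothetical.

```python
import random
from collections import Counter


def assign_arms(recipient_ids, arms=("control", "treatment"), seed=2024):
    """Randomly assign each recipient to one arm, keeping arm sizes balanced.

    Hypothetical helper for illustration. A fixed seed makes the assignment
    reproducible, so the same list can be re-derived during an audit.
    """
    rng = random.Random(seed)
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(recipient_ids)
    rng.shuffle(shuffled)
    # Deal recipients to arms in rotation; arm sizes differ by at most one.
    return {rid: arms[i % len(arms)] for i, rid in enumerate(shuffled)}


if __name__ == "__main__":
    ids = [f"resident-{n:04d}" for n in range(1, 11)]
    assignment = assign_arms(ids)
    print(Counter(assignment.values()))
```

City staff would then send each message version to the recipients in the matching arm. Real pilots often add stratification (for example, by neighborhood or language), which this sketch leaves out.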

90 days from question to published result

Most civic experiments can be designed, run, and reported in a single quarter. We work within budget cycles, election cycles, and staff capacity — not academic publication timelines.

Null results are published equally

If the intervention doesn't work, the report says so — clearly, with a replication protocol so other cities don't repeat the same test. The field learns from failures as much as successes.

Sample output

See what a published experiment report actually looks like.

A complete published report from a library attendance experiment: pre-registration notice, design table, results with p-values, limitations, and replication notes. This is the standard we hold every experiment to.

Read the report →

Common questions

What city staff usually ask first.

Do we need IRB approval to run a randomized pilot?

Most low-risk civic experiments — message testing, process simplification, outreach sequencing — do not require full IRB review. They fall under quality improvement exemptions or existing authority to test service delivery. We will advise you on the applicable standard for your jurisdiction and intervention type, and have designed pilots that passed city attorney review in under two weeks.

What if we run the pilot and the result is null?

We publish it. A documented null result is a contribution — it tells other cities that this approach doesn't work under these conditions, saving them from rerunning the same experiment. We frame null results as learning, not failure, and our published reports make clear that the city made a rigorous, responsible decision to test rather than assume.

How many people do we need for a valid experiment?

It depends on your baseline rate and the size of effect worth detecting. For most outreach and process interventions, you need a few hundred to a few thousand eligible recipients per arm. We run a power calculation in the first call. If your eligible population is too small for a clean result, we'll say so — and may suggest a waitlist design or a longer time window instead.
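For readers who want to see the arithmetic, the sketch below estimates recipients per arm using the standard normal-approximation formula for comparing two proportions. It is a rough illustration, not the exact calculation we run on a call: the function name and the defaults (5% two-sided significance, 80% power) are assumptions, and the baseline and lift in the example are made up.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, lift, alpha=0.05, power=0.80):
    """Approximate recipients needed per arm to detect an absolute lift.

    baseline: current completion rate, e.g. 0.40
    lift:     smallest absolute change worth detecting, e.g. 0.05
    Hypothetical helper; assumes two equal-sized arms and a two-sided test.
    """
    p1, p2 = baseline, baseline + lift
    if not (0 < p1 < 1 and 0 < p2 < 1) or lift == 0:
        raise ValueError("rates must be strictly between 0 and 1, lift nonzero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Illustrative inputs only: a 40% baseline and a 5 percentage point lift.
    print(sample_size_per_arm(baseline=0.40, lift=0.05))
```

Halving the lift you want to detect roughly quadruples the required sample, which is why the size of effect worth detecting matters as much as the baseline rate.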

What does it cost?

Initial partnerships are structured around city capacity. For a first pilot (under 3,000 participants, one treatment arm compared against a control), there is no charge. We treat first pilots as learning investments and use the published result to build the evidence base. Larger multi-arm or multi-department work involves a fee we discuss openly before any commitment.

Who owns the data and the findings?

The city owns its administrative data. The published report is released under CC-BY 4.0 — the city is credited as the implementing partner, the findings are publicly available, and the replication protocol is shared with other jurisdictions. You retain full control of whether to act on the findings.

Get started

Run your first pilot this quarter.

A 45-minute call is enough to identify a testable question and sketch a design. We'll tell you honestly if a pilot makes sense — and what a realistic result would look like for your context.

Start the conversation →