Automated Review Friction
Sentry runs three automated reviewers on its PRs: sentry[bot] (bug prediction), sentry-warden[bot] (correctness/security checking), and cursor[bot] (Cursor’s agentic review). Combined, they post 23.7% of all substantive review comments in our high-friction sample, and 68.3% of high-friction PRs have bot review activity.
These bots are not noise. They find real bugs — Pydantic type errors, missing exception handlers, key mismatches, security holes. Every finding is something a developer must read, evaluate, and address.
But here is the central insight of this section:
The same bug categories repeat across different PRs. When the same agentic-review pattern fires on PR after PR, that pattern should be promoted from “expensive LLM review on every commit” to “cheap deterministic check on every commit.”
Agentic review is the right tool for finding a recurring pattern the first few times. It is the wrong tool for enforcing it once the pattern is known.
The Volume
Section titled “The Volume”| Metric | Value |
|---|---|
| Total substantive bot review comments | 139 |
| Total substantive human review comments | 448 |
| Bot share of substantive review | 23.7% |
| PRs with bot review activity | 41 / 60 |
| Share of high-friction PRs with bots | 68.3% |
| Average bot comment length | 1,565 characters |
| Bot | Substantive Comments | Role |
|---|---|---|
sentry[bot] | 81 | Internal bug prediction reviewer |
sentry-warden[bot] | 50 | Internal correctness/security checker |
cursor[bot] | 4 | Cursor agentic review (most filtered as templates) |
github-advanced-security[bot] | 3 | GitHub security advisor |
getsentry-bot | 1 | Internal automation |
sentry[bot] and sentry-warden[bot] together account for 94% of substantive bot reviews.
What the Bots Are Actually Catching
Section titled “What the Bots Are Actually Catching”When we read the actual bot findings, they cluster into a small number of recurring categories. The clusters below were identified by reading the 139 substantive bot comments in our sample. Each cluster includes the pattern, examples, and the kind of deterministic check that would replace the agentic review for that pattern.
Pattern 1: Missing exception handling on model fetches
Section titled “Pattern 1: Missing exception handling on model fetches”
QuerySubscription.DoesNotExist not handled in get_metric_issue_context—MetricIssueContext.from_group_event()callsQuerySubscription.objects.get(id=...)without handlingDoesNotExist. If the subscription is deleted between alert creation and notification rendering, this will crash. —sentry-warden[bot]on #111674
get_from_cache raises ValueError because DetectorGroup has no cache_fields defined— The baseBaseManagerhas emptycache_fieldsby default, soget_from_cachewill raise. —sentry-warden[bot]on #111714
If
project_idisNoneintrigger_coding_agent_launch,Project.objects.get_from_cachefails silently —sentry[bot]on #111691
Replace with: A custom Ruff/Semgrep rule that flags Model.objects.get(...) and Model.objects.get_from_cache(...) calls inside functions that don’t have a corresponding except Model.DoesNotExist: (or wrapping try). For get_from_cache, also check that the model defines cache_fields. This is a 30-line rule that catches the pattern forever.
Pattern 2: .filter().first() with unreachable except DoesNotExist
Section titled “Pattern 2: .filter().first() with unreachable except DoesNotExist”
AttributeError on None when Group is not found— The function uses.filter().first()which returnsNonewhen no record exists, but theexcept Group.DoesNotExisthandler is unreachable because.filter().first()never raises exceptions. —sentry-warden[bot]on #111724
Replace with: A trivial AST lint: if a try block contains .filter(...).first() and the except clause catches *.DoesNotExist, flag as unreachable. This pattern is mechanical — no LLM needed.
Pattern 3: Direct dict access on external API responses
Section titled “Pattern 3: Direct dict access on external API responses”
KeyError when GitHub API response lacks 'installations' key— Line 908 uses direct dict accessinstalled_orgs["installations"]on the GitHub API response. If the API returns an error or malformed response, this will raiseKeyError. —sentry-warden[bot]on #111728
Replace with: An ADR that requires external API responses to be parsed through TypedDicts or Pydantic models with explicit field validation. A lint rule can flag direct ["key"] access on variables that come from requests.get(...).json() or gh_client.* calls.
Pattern 4: Type system misuse in serializers
Section titled “Pattern 4: Type system misuse in serializers”
TypeError when saving defaultCodingAgent: str | None union type is not callable— ThedefaultCodingAgentoption tuple usesstr | Noneas the type parameter. The code callstype_(data[key])to convert the value. However,str | Noneis atypes.UnionTypewhich cannot be called as a function. —sentry-warden[bot]on #111697
Pydantic validation error when coding_agent is 'seer' with integration_id— The code creates aSeerAutomationHandoffConfigurationwithtarget='seer'. However, the Pydantic model’stargetfield isLiteral['cursor_background_agent', 'claude_code_agent', ...]which excludes'seer'. —sentry-warden[bot]on #111697
Replace with: Stricter mypy/pyright configuration in CI, plus a custom check that the type annotation passed to option registration is a concrete callable type, not a UnionType. The Pydantic Literal mismatch is caught by mypy if strict mode is enabled on the relevant module.
Pattern 5: Key/identifier mismatch between writer and reader
Section titled “Pattern 5: Key/identifier mismatch between writer and reader”
Organization option key mismatch causes setting to have no effect— The new option registers with keysentry:default_automated_run_stopping_pointbut the code that consumes this value reads from keysentry:default_stopping_point. Any value set via the API will be stored but never read. —sentry-warden[bot]on #111697
Replace with: This is one of the most valuable bot findings — and one of the easiest to convert to a check. Sentry already has a registry for organization options. A test fixture can assert that every key written through OrganizationOption.objects.set_value(...) corresponds to a registered key, and every read site (get_value(...)) uses a key that exists in the registry. Bonus: extract option keys into a typed enum so the type system catches the typo at edit time.
Pattern 6: Non-exhaustive if/elif blocks causing UnboundLocalError
Section titled “Pattern 6: Non-exhaustive if/elif blocks causing UnboundLocalError”The
if/elifblock checking thedatatype is not exhaustive. A new subclass ofBaseMetricAlertNotificationDatawould cause anUnboundLocalErrorbecausemetric_issue_contextwould not be assigned. —sentry[bot]on #111674
Replace with: Use Python 3.10+ structural pattern matching with explicit case _: exhaustiveness, OR use mypy’s assert_never() pattern for tagged unions. Both are catchable by static analysis once the convention is in place.
Pattern 7: Companion list updates missed
Section titled “Pattern 7: Companion list updates missed”The
python-litestarplatform is missing from several product onboarding arrays inplatformCategories.tsx, which will prevent its onboarding documentation from loading for Logs, Metrics, Profiling, and Replay. —sentry[bot]on #111522
The
python-litestarplatform is missing from the Pythonplatformsset inmonitorQuickStartGuide.tsx, causing it to be miscategorized in the Crons monitor form. —sentry[bot]on #111522
In a single PR (#111522), the bot found the same kind of bug four separate times — adding a platform to one list while forgetting the other lists.
Replace with: Refactor the platform registry into a single source of truth that derives all the companion lists from a typed registry. This is a one-time refactor that eliminates an entire bug class. Alternatively, a CODEOWNERS-style consistency check: when a PR modifies platformCategories.tsx, verify that any new platform identifier appears in all the relevant arrays.
Pattern 8: High-cardinality metrics tags
Section titled “Pattern 8: High-cardinality metrics tags”The circuit breaker for Sentry App webhooks uses a per-app slug for its metrics key, creating high-cardinality data because a
metrics_keyoverride is not provided. —sentry[bot]on #111723
Replace with: A lint rule that flags metrics.incr(name, tags={"slug": ...}) (or similar high-cardinality patterns) without an explicit metrics_key override or an allowlist comment. This is a known anti-pattern with a known fix.
Pattern 9: State not reset in catch blocks
Section titled “Pattern 9: State not reset in catch blocks”The form’s saving state is not reset in the
catchblock on submission failure, leaving the submit button permanently busy and preventing retries. —sentry[bot]on #111490
Replace with: An ESLint rule (or React custom hook convention) that requires try/finally for any state setter inside an async submission handler — setSaving(true) in try, setSaving(false) in finally. The convention pre-empts the error path bug entirely.
Pattern 10: Truthiness checks removed → None saved as explicit value
Section titled “Pattern 10: Truthiness checks removed → None saved as explicit value”The removal of the truthiness check for
stopping_pointcausesNoneto be explicitly saved as a project option, overriding the intended default value of"code_changes". —sentry[bot]on #111697
Replace with: An ADR (and a lint rule) that says option setters must reject None unless the option’s schema explicitly allows it. Combined with Pattern 4 (typed option registry), this becomes a static check.
The Pattern Recurrence Matrix
Section titled “The Pattern Recurrence Matrix”| Pattern | PRs in Sample with Match | Bot(s) Reporting | Cost of Deterministic Check |
|---|---|---|---|
Missing DoesNotExist handling | 3+ | warden, sentry-bot | Low — Ruff/Semgrep rule |
.filter().first() + unreachable except | 1+ | warden | Trivial — AST lint |
| Direct dict access on API responses | 2+ | warden | Medium — TypedDict adoption + lint |
UnionType not callable | 1 | warden | Low — mypy strict mode |
Pydantic Literal mismatch | 1 | warden | Trivial — mypy strict mode |
| Option key mismatch (writer/reader) | 1 | warden | Low — typed option registry |
| Non-exhaustive if/elif | 1 | sentry-bot | Medium — assert_never convention |
| Companion list miss (platform registry) | 1 (4 instances) | sentry-bot | Medium — refactor to single source |
| High-cardinality metrics tag | 1 | sentry-bot | Low — lint rule |
| Form state not reset on error | 1 | sentry-bot | Trivial — ESLint rule |
None saved as explicit value | 1 | sentry-bot | Low — schema validation |
These 10 patterns alone account for the majority of substantive bot findings in our sample. Each has a deterministic-check replacement that costs vastly less than agentic review per-PR.
The Cost Argument
Section titled “The Cost Argument”Agentic review costs roughly:
- LLM inference: A 1,500-character bot review comment implies hundreds of thousands of tokens of input (the diff + context) and meaningful output. At current model pricing, this is on the order of tens of cents per finding.
- Developer attention: 1,565-char findings take 30-60 seconds to read and evaluate. Multiplied across 139 findings in 60 PRs, that’s roughly 70-140 minutes of cumulative developer attention in our sample alone. Extrapolated to Sentry’s full PR volume, the time investment is substantial.
- Latency: Agentic review runs after the PR is opened. A bug found post-open is more expensive than a bug caught at edit time by a type checker or lint rule.
A deterministic check costs:
- CI runtime: Microseconds to milliseconds per file.
- Developer attention at edit time: Lint errors surface in the IDE before the PR is even opened. Catching the bug at edit time is roughly 100x cheaper than catching it in PR review.
- Zero per-PR LLM inference cost.
The economic and ergonomic case is clear: every bot finding that recurs is a candidate for promotion to a deterministic check.
What Bots Should Continue Doing
Section titled “What Bots Should Continue Doing”Agentic review still has a legitimate role:
- First-time pattern discovery — finding novel bug shapes that don’t match any known rule yet
- Cross-file reasoning — bugs that span multiple files in ways that AST-level rules can’t easily capture
- Semantic correctness — “this code does the wrong thing” findings that require understanding intent
- Security review for novel attack surfaces — where an attacker’s mental model is needed
The goal is not to eliminate bot review. The goal is to continuously promote findings from agentic review to deterministic rules, so that the bot’s attention is freed up for the genuinely novel cases.
Implications for ADR Proposals
Section titled “Implications for ADR Proposals”This insight reshapes ADR 6 — instead of a “triage protocol” for bot findings, the right ADR is a promotion protocol: a workflow for converting recurring bot findings into lint rules, type checks, or schema validations.
It also strengthens several other ADRs:
- ADR 2 (API Contract Evolution) — many bot findings (option key mismatch,
Nonepropagation, Pydantic Literal mismatches) are catchable if the API contract is typed and registry-driven - ADR 3 (Test Evidence Matrix) — bot test findings (assertion mismatches, missing edge cases) become test scaffold templates
- ADR 4 (Frontend Conventions) — the form-state-not-reset bot finding becomes a React convention, not an ad-hoc review comment
Caveats
Section titled “Caveats”- Sample bias: This analysis is based on the top 60 friction PRs, not all 500. Bot activity in low-friction PRs may differ.
- Pattern counts are minimums: The “PRs in sample with match” column shows occurrences in our 60-PR deep sample. Across Sentry’s full PR volume, each pattern likely recurs much more frequently. A formal recurrence analysis would require pattern-tagging the bot findings systematically.
- Substantive vs. noise filter: We exclude template comments (BUGBOT_REVIEW summaries, BACKEND_TEST_FAILURES) but count substantive line-level bot reviews.
- Bots are evolving: Sentry’s internal bots (
sentry[bot],sentry-warden[bot]) are relatively new. As more recurring patterns are promoted to deterministic checks, the bots’ findings should shift toward genuinely novel cases.