Commit 8bb1f6a

Author: Omega Agent
feat: Write the feasibility report: findings, recommended default page strategy, sync/async decision, failure-mode handling, open risks
Task: beads-15f3da Spec: evaluate-post-verification-rendered-page-as-activa-19
1 parent 4d6f36b commit 8bb1f6a

1 file changed

Lines changed: 263 additions & 0 deletions

@@ -0,0 +1,263 @@
# Feasibility Report: Post-Verification Rendered Page as Activation Signal

- **Spec:** `evaluate-post-verification-rendered-page-as-activa-19`
- **Task:** 4.2: Write the feasibility report (findings, recommended default
  page strategy, sync/async decision, failure-mode handling, open risks)
- **Status:** Final. Inputs: tasks 1.x, 2.x, 3.x, and 4.1 of this spec.
- **Owner:** Get Started squad (team:omega)
- **Date:** 2026-04-30

This report is the single consolidated answer to the four acceptance criteria
of the spike. Everything below is drawn from the numbered investigation
documents in `docs/spikes/`; those remain the source of truth for method and
raw data. This report is the squad-review surface and the input to task 4.3
(MVP scope for the follow-up implementation ticket).

Non-goals of the spec (production UI, changes to the render API / crawler /
verification endpoint, batch rendering, screenshots, visual diffing, pricing /
quota / accounting changes, i18n / a11y / analytics polish, caching work, and
refactors of `features/integrationWizard` beyond the throwaway prototype) are
respected throughout. Nothing in this report proposes work inside those areas.

## 1. TL;DR

| Acceptance criterion | Verdict |
|---|---|
| Feasibility of post-verification render is clear | **Feasible.** No changes to this repo required. `GET /render?url=…` already exposes everything the wizard needs. |
| Page selection approach decided | **Verified domain root (`https://<verified-domain>/`)**, with an optional user override pre-filled to the same value. |
| Failure cases documented | Five render-path cases + two wizard-side cases, all non-blocking in the UI. Enumerated in §5. |
| Recommended implementation path defined | **Async-first** call to `/render` from the wizard backend at verification-success, with a **3 s opportunistic sync window**. MVP is a single wizard PR; no render-service change. See §6. |

## 2. Feasibility of post-verification render

**Feasible with the existing render path, with no changes to this repo.**
Confirmed by reading `lib/server.js`, `lib/index.js`, `lib/util.js`, and
`lib/browsers/chrome.js` during tasks 1.2, 2.1, 2.3, and 2.4.

Key facts that make it feasible today:

- Single entry point: `GET /render?url=<url>&renderType=html` (and the POST
  equivalent) hits `server.onRequest` in `lib/server.js`. No new endpoint,
  no new route, no crawler or sitemap dependency (task 2.1).
- Input validation is already at the boundary: `valid-url.isWebUri` rejects
  malformed input with `400` before the plugin chain runs
  (`lib/server.js:221-225`). The wizard does not need to re-validate.
- Response contract is already what the wizard needs: status code from the
  origin is propagated (`req.prerender.statusCode`, `lib/server.js:303`),
  configurable `renderErrorStatusCode` (default `504`) with
  `x-prerender-504-reason` header for renderer-side failures
  (`lib/server.js:484-486`), and rendered HTML in the body.
- Bounded worst case: `pageLoadTimeout` (default 20 s, `lib/server.js:9`),
  `waitAfterLastRequest` (500 ms, `lib/server.js:5`), and the 60 s
  per-request hang watchdog (`lib/server.js:241`) together cap how long any
  single call can take.
- Auth is already solved for service-to-service use: if deployed behind
  `basicAuth`, the wizard backend reuses the same credential the crawler
  uses. No new auth path is required (task 1.2).

The spec's reliability bar (≥ 95% success on verified URLs) and latency
budget (P50 ≤ 6 s, P95 ≤ 12 s, hard ceiling 20 s) are achievable with the
verified-root selection because the root is the URL whose reachability we
just proved during verification: DNS, TLS, and bot-gating are known-good at
call time (task 2.4). The budget is still tight against the 20 s
`pageLoadTimeout`, which is why §4.2 recommends against blocking the
wizard on the render.
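To make the entry point listed in the key facts concrete, the following is a minimal sketch of how a caller might assemble the `GET /render` URL. The function name `buildRenderUrl`, its parameters, and the example base URL are hypothetical and do not come from this repo; the sketch also assumes the render service decodes a percent-encoded `url` query parameter in the usual way.

```javascript
// Sketch only: `buildRenderUrl` and its parameter names are hypothetical.
function buildRenderUrl(prerenderBaseUrl, verifiedDomain, overrideUrl) {
  // Default target is the verified domain root; an override replaces it.
  const target = overrideUrl || `https://${verifiedDomain}/`;

  // Note: resolving "/render" against the base drops any path prefix
  // the base URL may carry.
  const renderUrl = new URL("/render", prerenderBaseUrl);

  // searchParams percent-encodes the target, so a query string inside the
  // target URL cannot be confused with the render request's own parameters.
  renderUrl.searchParams.set("url", target);
  renderUrl.searchParams.set("renderType", "html");
  return renderUrl.toString();
}

// Example with a made-up internal host name.
const exampleRenderUrl = buildRenderUrl("http://prerender.internal:3000", "example.com");
```

Encoding through `searchParams` rather than string concatenation matters mostly for the override case, where a user-supplied URL may carry its own `?` and `&` characters.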

## 3. Method (summary)

Detail lives in the individual spike documents; this is the précis so the
report stands alone.

- **Entry point & auth scoping:** read of `lib/server.js`, `lib/index.js`,
  `lib/plugins/{basicAuth,whitelist,blacklist}.js`; see
  `post-verification-render-entry-point.md`.
- **Feasibility, without crawler dependency:** read of `lib/server.js`,
  `lib/util.js`, confirmation that no crawler/sitemap/queue code is reachable
  from the render path; see `post-verification-render-feasibility.md`.
- **Failure enumeration & reproduction:** read of `lib/browsers/chrome.js`
  + live repros against the local server on branch tip `e83e20d`; see
  `post-verification-render-failure-cases.md`.
- **Page-selection comparison:** three-strategy evaluation (verified root,
  user-provided, small fixed set) against reliability and latency from the
  onboarding-render sample; see
  `evaluate-post-verification-rendered-page-activation-page-selection.md`.
- **Sync vs. async decision:** latency and failure-mode data applied to the
  onboarding flow; see `sync-vs-async-decision.md` (task 4.1).

## 4. Decisions

### 4.1 Page selection: verified domain root, with override

**Decision: default to `https://<verified-domain>/`.** Optional user override,
pre-filled with the verified root.

Rationale (full detail in task 2.4 doc):

- The root is the URL whose reachability we just proved during verification,
  so the conditional probability of render success is materially higher than
  for an arbitrary user-supplied URL.
- Single render call, so P95 latency is bounded by one `pageLoadTimeout`.
- No probe/fallback code path, no backend caching, no render-accounting
  change, all of which are explicitly out of scope for this spec.
- Override input covers legitimate edge cases (SPA shells, locale redirects,
  marketing-only homepages) without introducing a probe-and-pick path.

**Explicitly rejected:** the "small fixed set" probe-and-pick strategy. Each
failed probe costs a full `pageLoadTimeout` before fall-through, which breaks
the latency budget and compounds onboarding-render accounting (non-goal).
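The latency argument against probe-and-pick can be checked with simple arithmetic. The sketch below assumes probes run sequentially and that every probe fails by timing out, which is the worst case the paragraph above describes; the set size of three is a hypothetical example, not a number from the spike.

```javascript
// Values taken from the report: default pageLoadTimeout and the spec's hard ceiling.
const PAGE_LOAD_TIMEOUT_S = 20;
const HARD_CEILING_S = 20;

// Worst case for sequential probing: each failed probe burns one full
// pageLoadTimeout before the next URL is tried.
function worstCaseProbeSeconds(probeCount) {
  return probeCount * PAGE_LOAD_TIMEOUT_S;
}

// Hypothetical fixed set of three candidate URLs versus the single verified root.
const fixedSetWorstCase = worstCaseProbeSeconds(3); // 60
const verifiedRootWorstCase = worstCaseProbeSeconds(1); // 20
const fixedSetBreaksBudget = fixedSetWorstCase > HARD_CEILING_S; // true
```

Under these assumptions only the single-URL strategy can stay at or under the 20 s ceiling in the worst case.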

### 4.2 Sync vs. async: async-first with a 3 s opportunistic sync window

**Decision: async-first. Never block the wizard on the render.** Fire the
render the moment verification succeeds; if it returns within a 3 s wall-clock
budget, show it inline; otherwise advance the wizard and update the preview
slot in place when the render finally resolves (success or failure).

Rationale (full detail in task 4.1 doc, `sync-vs-async-decision.md`):

- P95 latency is too close to the 20 s `pageLoadTimeout` for a blocking
  wizard step to be acceptable. Timeouts and navigate errors only resolve
  after the full page-load budget elapses, so a sync UX pays worst-case
  latency on every failure.
- A fully deferred UX (email / later surface) loses the "Aha" moment the
  spike is trying to create: a non-trivial fraction of homepages do render
  quickly, and those users deserve inline feedback.
- The 3 s sync budget is well above typical fast-mode render time and well
  below hang-perception thresholds for a secondary widget inside a completed
  step. Tunable; the follow-up ticket should instrument actual hit rate.

The verification step itself is **always** marked complete regardless of
render outcome. The preview is additive, never a gate.
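One way to express the opportunistic sync window is a race between the in-flight render and a timer, where losing the race does not cancel the render. The helper below is a sketch under that assumption; `raceSyncWindow` and the shape of its result object are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper. Resolves with { inline: true, ... } when the render
// settles inside the window, or { inline: false } when the window elapses
// first. The render promise itself is never cancelled here.
function raceSyncWindow(renderPromise, windowMs = 3000) {
  let timer;
  const windowElapsed = new Promise((resolve) => {
    timer = setTimeout(() => resolve({ inline: false }), windowMs);
  });

  // Convert both outcomes into values so a failed render inside the window
  // is still shown inline instead of surfacing as a rejection.
  const settled = renderPromise.then(
    (result) => ({ inline: true, ok: true, result }),
    (error) => ({ inline: true, ok: false, error })
  );

  return Promise.race([settled, windowElapsed]).finally(() => clearTimeout(timer));
}
```

When the result has `inline: false`, the caller would advance the wizard and attach its own handlers to the original render promise to update the preview slot later.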

## 5. Failure-mode handling

Consolidates task 2.3 (`post-verification-render-failure-cases.md`) and the
wizard-side surface from task 1.2. All states are non-blocking; all are
derived from `statusCode`, body, and `x-prerender-504-reason` alone, so no new
render-service signal is required.

### 5.1 Render-path cases (originate in this repo)

| # | Case | Server signal | HTTP seen by wizard | Wizard UX state |
|---|---|---|---|---|
| 1 | Renderer error (Chrome-side) | `renderErrorStatusCode` branch in `chrome.js`, `x-prerender-504-reason` set | `504` (default `RENDERING_ERROR_STATUS_CODE`) | "We couldn't render `<url>` automatically." Retry button; no auto-retry in MVP. |
| 2 | Non-2xx upstream (3xx/4xx/5xx propagated) | `tab.prerender.statusCode = params.response.status` | Origin status (e.g. `404`, `500`, `301`) | Show status + short explanation ("Your site returned `403`. If this page is behind login, try a public URL."). Offer override. |
| 3 | Blocked / auth-gated page | Origin `401`/`403`, or `200` with login HTML / WAF interstitial | Pass-through status, or `200` with thin body | Hard case: same as #2. Soft case: flag as "rendered, but content looks thin"; needs a body-length / login-heuristic check on the wizard side. |
| 4 | Empty / unparseable payload | `parseHtmlFromPage` resolves with empty body | `200` (or passed-through) with ~empty body | "Preview looks empty." Treat as render failure. Offer override. |
| 5 | Page-load timeout | `pageLoadTimeout` + `tab.prerender.timedout` | `timeoutStatusCode` if set, else captured status | "Preview is taking longer than expected." Retry button. |

### 5.2 Operational / wizard-side cases

| # | Case | Trigger | Wizard UX state |
|---|---|---|---|
| 6 | Invalid URL (shouldn't happen with verified root; can with override) | `valid-url.isWebUri` rejects → `400` | Inline input error on the override field. |
| 7 | Service unavailable / auth rejected / domain not allow-listed | `401` (basicAuth), `404` (whitelist/blacklist), network failure | Generic "Preview temporarily unavailable, try again" banner. **Never** blocks the wizard. |
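The two tables can be read as a decision order over the response. The following sketch shows one possible ordering; the function name, the state names, and the plain `{ status, headers, body }` input shape (with lower-cased header keys, or `null` for a network failure) are all assumptions made for illustration. It does not try to tell a service-level `401`/`404` (case 7) apart from the same status passed through from the origin, and it leaves the soft "thin content" variant of case 3 to a separate body heuristic.

```javascript
// Hypothetical classifier. `timeoutStatusCode` must be supplied by the caller
// if the render service is configured with one; otherwise timeouts cannot be
// told apart from other captured statuses in this sketch.
function classifyRenderResponse(response, { timeoutStatusCode } = {}) {
  if (response === null) return "unavailable"; // case 7: network failure

  const { status, headers = {}, body = "" } = response;

  if (status === 400) return "invalid_url"; // case 6
  if (headers["x-prerender-504-reason"]) return "renderer_error"; // case 1
  if (timeoutStatusCode !== undefined && status === timeoutStatusCode) {
    return "timeout"; // case 5
  }
  if (status < 200 || status > 299) return "origin_status"; // cases 2 and 3 (hard)
  if (body.trim().length === 0) return "empty"; // case 4
  return "success";
}
```

Checking the `x-prerender-504-reason` header before the status range means an origin that itself returns `504` is reported as an origin status rather than a renderer error.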

### 5.3 Properties the MVP relies on

- **Every case is observable from the response alone.** No new telemetry, no
  new endpoint, no new header is needed for the MVP UI to distinguish them.
- **No case blocks verification completion.** The wizard's primary CTA stays
  enabled at all times. This is the invariant the async-first decision
  protects.
- **Retry is user-initiated.** The spike deliberately does not auto-retry:
  cascading retries on the long tail would break the latency budget for any
  user left watching.

## 6. Recommended implementation path (for task 4.3 / the follow-up ticket)

Scope so the follow-up fits in a single wizard PR. No render-service change.

1. **Call site.** After verification resolves successfully, the wizard
   backend issues one
   `GET <PRERENDER_BASE_URL>/render?url=<verifiedRoot>&renderType=html&followRedirects=true`
   using the existing crawler credential. Pre-fill an override input with
   the verified root. Pass a neutral user-agent so the rendered output
   matches what a real visitor sees, not the crawler UA.
2. **Kickoff timing.** Fire the render immediately when verification succeeds,
   not when the user clicks "Next." The sync window is a race against the
   render, not against user think-time.
3. **Sync window.** 3 s wall-clock from kickoff. If the render resolves within
   the window, show the preview inline. Otherwise, advance the wizard with a
   "Your preview is rendering…" affordance in the preview slot.
4. **Async resolution.** When the render ultimately resolves, update the
   preview slot in place (success, failure, or empty). If the user has moved
   on, the finished preview is available on the post-onboarding / integration
   settings page.
5. **UI states.** `loading`, `success` (iframe / sandbox, plus "looks right?
   / try a different URL"), and the seven failure states from §5.1–5.2. All
   required by the universal async-UI rule; none require backend changes.
6. **Sandbox.** Render output goes into a sandboxed iframe; treat as
   untrusted third-party HTML; do not persist beyond the request/response
   lifecycle in the MVP.
7. **Client-side timeout cap.** Hard cap the wizard's own fetch at slightly
   above `pageLoadTimeout` (e.g. 25 s) so a stuck tab cannot wedge the
   preview slot indefinitely.
8. **Scope guardrails (explicit deferrals).** No caching, no batching, no
   screenshots, no before/after diff, no per-customer quota accounting, no
   analytics instrumentation, no i18n / a11y polish; all are spec non-goals.
   Task 4.3 is the place to lock this list down for the implementation
   ticket.

Rough sizing (not binding; task 4.3 refines):

- **Frontend:** 1 panel, 1 sandboxed iframe, state machine covering the
  seven failure states and the sync / async handoff. Small.
- **Backend:** 1 service-to-service call at the verification-success
  handler, reusing existing credentials and HTTP client. Smaller.
- **Ops:** no new service, no new env var, no new dashboard. Monitor the
  3 s sync-window hit rate post-launch to tune the budget.
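The backend call in steps 1 and 7 could look roughly like the sketch below: a single request with a hard cap enforced through `AbortController`. The function name `fetchRenderWithCap`, the injected `fetchImpl` parameter, and the returned object shape are hypothetical; the HTTP client is injected because the report only says the existing client is reused, without naming it. Any client with a `fetch`-compatible interface (such as the global `fetch` in Node 18+) would fit.

```javascript
// Hypothetical wrapper: one render call, capped at `capMs` on the caller side.
// `headers` is where the existing credential and a neutral user-agent would go.
async function fetchRenderWithCap(renderUrl, { fetchImpl, capMs = 25000, headers = {} }) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), capMs);
  try {
    const res = await fetchImpl(renderUrl, { headers, signal: controller.signal });
    return {
      status: res.status,
      reason: res.headers.get("x-prerender-504-reason"),
      body: await res.text(),
    };
  } catch (error) {
    // Aborts and network failures are both reported as "no response";
    // the caller maps this to the transient-unavailable state.
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

Returning `null` instead of throwing keeps the preview path from ever propagating an exception into the verification-success handler, which matches the rule that the preview never gates the wizard.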

## 7. Open risks and follow-ups

All items here are **out of scope for this spike**. They are tracked here so
they are not lost going into task 4.3 and the follow-up implementation.

1. **Sync-window hit rate is not measured yet.** 3 s is the starting point;
   the follow-up should instrument how often the inline race wins and tune
   from data, not intuition.
2. **SPA-shell false negatives.** A `200` with a near-empty body, returned
   when hydration has not finished, will read as "empty payload" in §5.1
   case 4. Body-length heuristics are an MVP wizard concern, not a
   render-service concern; the threshold is a judgement call the follow-up
   makes.
3. **Soft bot / WAF blocks are indistinguishable from a thin-content page.**
   Same class as (2); same treatment. Accept false-negative activation
   signals as a known limit of the MVP; document in the UI copy.
4. **Post-wizard surfacing.** Whether the preview should persist in the
   integration's ongoing settings view is a product decision, not a
   rendering decision. Flagged for the follow-up.
5. **Override abuse / retries.** A customer spamming override + retry is
   rate-limited by the render service's own connection concurrency, but the
   spec explicitly defers quota/accounting. If abuse is observed post-launch,
   mitigate in the wizard backend, not here.
6. **User-agent choice.** A neutral UA keeps the rendered output
   representative but means the wizard cannot reuse crawler-UA-specific
   cached renders on the render service. This is a deliberate trade-off.
7. **Browser health.** The render server restarts Chrome periodically
   (`browserTryRestartPeriod`) and may reject with `503` when the browser
   isn't connected. The async-first design absorbs this; the follow-up just
   needs to treat `503` as the transient-unavailable state in §5.2 case 7.
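For risks 2 and 3, a wizard-side body-length heuristic might be as simple as the sketch below. It is deliberately crude: tags are stripped with regular expressions rather than a real HTML parser, and the 200-character threshold is a made-up placeholder, since the report leaves the actual value to the follow-up. All names here are hypothetical.

```javascript
// Placeholder threshold; the real value is a judgement call for the follow-up.
const MIN_VISIBLE_TEXT_CHARS = 200;

// Rough estimate of visible text: drop script/style blocks, then all tags,
// then collapse whitespace. Not a substitute for proper HTML parsing.
function visibleTextLength(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, " ")
    .replace(/<style\b[^>]*>[\s\S]*?<\/style>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim().length;
}

// True when a rendered page carries so little text that it is likely an
// unhydrated SPA shell, a login wall, or a WAF interstitial.
function looksThin(html, minChars = MIN_VISIBLE_TEXT_CHARS) {
  return visibleTextLength(html) < minChars;
}
```

A heuristic like this will misjudge some legitimately sparse homepages, which is the false-negative limit the list above already accepts for the MVP.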

## 8. Acceptance-criteria mapping

| Criterion | Section |
|---|---|
| Feasibility of post-verification render is clear | §2 |
| Page selection approach decided | §4.1 |
| Failure cases documented | §5 |
| Recommended implementation path defined | §6 |

## 9. Non-goals respected

This report does not:

- Propose changes to the render API, the crawler, or the verification
  endpoint.
- Propose multi-URL batching, screenshots, visual diffing, or before/after
  comparison.
- Propose pricing, quota, or render-accounting changes for onboarding renders.
- Propose caching beyond what the existing render path already provides.
- Propose i18n, a11y, or analytics instrumentation for the eventual UI.
- Propose any refactor of `features/integrationWizard` beyond hosting the
  throwaway prototype covered in task 3.1.
- Ship production UI; the wizard implementation is the follow-up ticket
  scoped by task 4.3 on the basis of this report.
