Add integration testing with testcontainers and Playwright#442
Add integration testing with testcontainers and Playwright#442
Conversation
The integration-tests crate defines traits, error types, and helpers consumed by Docker-dependent tests. Without Docker the compiler sees them as unused and -D warnings promotes them to errors. A crate-level allow keeps CI green.
Proves that a next/script with strategy="afterInteractive" executes after a client-side navigation through the trusted-server proxy. The test SPA-navigates from / to /dashboard, waits for the script's global marker, asserts it ran exactly once, and confirms no document-level request fired (true SPA, not full reload).
…hub.com-prkjr:IABTechLab/trusted-server into feature/integration-testing-with-testcontainers
aram356
left a comment
There was a problem hiding this comment.
Solid integration testing infrastructure. The RuntimeEnvironment + FrontendFramework matrix design is clean and extensible, the two-layer approach (Rust HTTP + Playwright browser) gives good coverage, and the assertion library having its own unit tests is a nice touch.
Cross-cutting notes
📝 Separate Cargo.lock — The 3400-line crates/integration-tests/Cargo.lock is disconnected from the workspace lock. If error-stack (or any shared dep) is updated in the workspace but not here, they silently diverge. Consider documenting the upgrade process or adding a CI check that validates shared dependency versions stay in sync.
📝 Viceroy config placeholders — viceroy-template.toml uses key = "placeholder" / data = "placeholder" for all KV stores. This means integration tests don't exercise KV store functionality, which is fine but worth a comment in the config noting this intentional scope limitation.
⛏ .gitignore missing trailing newline — The last line lacks a POSIX-compliant trailing newline.
What's done well
👍 Drop impl on ViceroyHandle ensures process cleanup even on panics — good defensive design.
👍 Assertion functions (assertions.rs) have thorough unit tests covering both passing and failing cases.
👍 Browser tests use hydration detection (data-hydrated attribute) before SPA navigation — avoids the classic "clicked before client router was ready" flake.
👍 Health check endpoint is namespaced (/__trusted-server/health) to minimize publisher route conflicts.
👍 Shell scripts detect native target dynamically via rustc -vV — no hardcoded platform assumptions.
👍 Smart use of PHP built-in server for WordPress fixture — avoids MySQL dependency while still testing real HTML processing behavior.
| // Sequential execution: all tests share a single origin port (8888) | ||
| workers: 1, | ||
| use: { | ||
| baseURL: process.env.VICEROY_BASE_URL || "http://127.0.0.1:7878", |
There was a problem hiding this comment.
🔧 baseURL is resolved at config-load time, before globalSetup runs.
Playwright evaluates the config module before calling globalSetup. The env var VICEROY_BASE_URL is only set in global-setup.ts:53 after Viceroy binds to a random port (infra.ts:88). By that point the config has already captured the 7878 fallback.
Consider writing the Viceroy URL to the state file (already done at global-setup.ts:57) and reading it in each test via a Playwright fixture, or use Playwright's webServer config to manage the Viceroy lifecycle and let Playwright assign the baseURL itself.
| WORKDIR /app | ||
|
|
||
| COPY package.json ./ | ||
| RUN npm install |
There was a problem hiding this comment.
🔧 npm install without a lockfile resolves caret ranges (^14.0.0, ^18.2.0 in package.json) fresh on every build.
An upstream Next.js 14.x patch or React 18.x release can silently change test behavior without any repo change, causing CI flakiness that's hard to reproduce locally.
Fix: generate a package-lock.json in the fixture directory, COPY it alongside package.json, and switch to npm ci.
|
|
||
| # Cleanup trap: stop any leftover containers on failure | ||
| cleanup() { | ||
| docker ps -q \ |
There was a problem hiding this comment.
🔧 Multiple --filter ancestor= flags in one docker ps call are AND-joined by Docker — a container can't descend from both images simultaneously, so this always returns zero results.
On failure paths the trap silently does nothing and can leave port-8888 containers behind.
Fix: split into two separate calls:
cleanup() {
docker ps -q --filter ancestor=test-nextjs:latest 2>/dev/null | xargs -r docker stop 2>/dev/null || true
docker ps -q --filter ancestor=test-wordpress:latest 2>/dev/null | xargs -r docker stop 2>/dev/null || true
}| let result = match (method, path.as_str()) { | ||
| // Health check endpoint for integration tests and monitoring. | ||
| // Namespaced to avoid shadowing publisher routes. | ||
| (Method::GET, "/__trusted-server/health") => { |
There was a problem hiding this comment.
❓ This adds a hard intercept for GET /__trusted-server/health returning 200 ok before the catch-all publisher proxy.
Previously this path fell through to the origin. The /__trusted-server/ namespace makes collision unlikely, but if any publisher origin serves content at this path, behavior regresses from origin content to a forced health response.
Is the intent to keep this in production, or should it be gated behind a build-time flag so it's only active during testing?
| /** Stop a Docker container by ID. */ | ||
| export function stopContainer(containerId: string): void { | ||
| try { | ||
| execSync(`docker stop ${containerId}`, { timeout: 10_000 }); |
There was a problem hiding this comment.
🔧 execSync with string interpolation passes containerId through the shell. If the value is malformed (e.g. from a corrupted state file), this is a command injection vector. Same pattern at lines 47–54 for docker run.
Fix: use execFileSync('docker', ['stop', containerId], { timeout: 10_000 }) which bypasses the shell entirely.
| "nextjs" | ||
| } | ||
|
|
||
| fn build_container(&self, origin_port: u16) -> TestResult<ContainerRequest<GenericImage>> { |
There was a problem hiding this comment.
📝 build_container always returns Ok(...) here and in wordpress.rs. If the trait method is fallible for future container builders, a doc comment on the trait explaining when it might fail would help. Otherwise simplifying to ContainerRequest<GenericImage> directly avoids suggesting fallibility that doesn't exist.
| // Give scripts a moment to execute and log errors | ||
| await page.waitForTimeout(2000); | ||
|
|
||
| // Suppress benign errors: |
There was a problem hiding this comment.
⛏ Hard-coded waitForTimeout(2000) is flaky — too short on slow CI runners, wasted time on fast ones. Consider expect.poll() or polling for a specific console state instead.
| /// # Errors | ||
| /// | ||
| /// Returns [`TestError::RuntimeNotReady`] if the runtime does not respond within timeout. | ||
| pub fn wait_for_ready(base_url: &str, health_path: &str) -> TestResult<()> { |
There was a problem hiding this comment.
♻️ This wait_for_ready (30 retries, with root-path fallback) and wait_for_container in integration.rs:129 (60 retries, no fallback) do essentially the same thing with different timeouts. Consider unifying into one function with a configurable retry count to reduce duplication and make timeout behavior explicit at each call site.
| --target x86_64-unknown-linux-gnu | ||
| -- --include-ignored --test-threads=1 | ||
| env: | ||
| INTEGRATION_ORIGIN_PORT: ${{ env.ORIGIN_PORT }} |
There was a problem hiding this comment.
🤔 The browser-tests and integration-tests jobs duplicate ~80 lines of identical setup: checkout, Rust toolchain, Viceroy cache, WASM build, and both Docker image builds. When any of these steps change (e.g. Viceroy install command), one job can be updated while the other is missed.
Consider extracting the shared setup into a composite action or reusable workflow.
| branches: [main] | ||
| pull_request: | ||
| pull_request_review: | ||
| types: [submitted] |
There was a problem hiding this comment.
🌱 pull_request_review trigger means integration tests re-run on every review submission, even when no code changed. Combined with the pull_request trigger, an approval on an existing PR triggers a full redundant run. Consider removing this trigger or using workflow_dispatch for on-demand runs.
| # docker run -p 3000:3000 test-nextjs:latest | ||
|
|
||
| # --- Build stage --- | ||
| FROM node:20-alpine AS builder |
There was a problem hiding this comment.
🔧 We should pass version of node from .tool-versions
Summary
/healthendpoint and CI workflow for running both test suites separately from the main CIChanges
crates/fastly/src/main.rs/healthGET endpoint returning 200 with "ok" bodycrates/integration-tests/Cargo.tomlcrates/integration-tests/tests/common/assert_form_action_rewritten(9 unit tests)crates/integration-tests/tests/environments/crates/integration-tests/tests/frameworks/crates/integration-tests/tests/integration.rstest_all_combinations, per-framework tests)crates/integration-tests/fixtures/configs/crates/integration-tests/fixtures/frameworks/wordpress//wp-admin/stubcrates/integration-tests/fixtures/frameworks/nextjs/crates/integration-tests/browser/crates/integration-tests/browser/tests/shared/crates/integration-tests/browser/tests/nextjs/crates/integration-tests/browser/tests/wordpress/.github/workflows/integration-tests.ymlscripts/integration-tests.shscripts/integration-tests-browser.shcrates/integration-tests/README.mdINTEGRATION_TESTS_PLAN.mdCargo.tomlCargo.lockTest coverage
HTTP-level (Rust + testcontainers)
HtmlInjectionScriptServingAttributeRewritingScriptServingUnknownFile404WordPressAdminInjectionNextJsRscFlightNextJsServerActionsNextJsApiRouteNextJsFormActionBrowser-level (Playwright + Chromium)
script-injectionscript-bundlenavigation(4-page SPA chain + back button + deferred route script)api-passthroughform-rewritingadmin-injectionTotals: 12 Next.js + 5 WordPress browser tests, 0 flaky
Known gaps
AuctionRequestbut is not yet populated upstream. Requires implementation before it can be tested.testcontainerscrate; browser tests usedocker run/docker stopvia Playwright's globalSetup/globalTeardown (Node.js cannot use the Rust crate).Closes
Closes #398
Test plan
cargo test --workspace(excluding integration-tests, which requires native target)cargo clippy --all-targets --all-features -- -D warnings(excluding integration-tests)cargo fmt --all -- --checkcd crates/js/lib && npm run format-- passescd docs && npm run format-- passescargo test -p integration-tests --target aarch64-apple-darwin --no-run./scripts/integration-tests.sh-- all pass./scripts/integration-tests-browser.sh-- 17 pass, 0 failcargo clippy --manifest-path crates/integration-tests/Cargo.toml --tests-- cleanChecklist
unwrap()in production code — useexpect("should ...")logmacros (notprintln!)