Skip to content

test: smoke-test proxy IT dispatch workflow (DO NOT MERGE)#800

Closed
vikrantpuppala wants to merge 2 commits into
mainfrom
test/proxy-test-dispatch-smoke
Closed

test: smoke-test proxy IT dispatch workflow (DO NOT MERGE)#800
vikrantpuppala wants to merge 2 commits into
mainfrom
test/proxy-test-dispatch-smoke

Conversation

@vikrantpuppala
Copy link
Copy Markdown
Contributor

Purpose

End-to-end smoke test of the IT dispatch workflow added in #799. Not for merge — closing after the smoke test confirms the workflow behaves correctly.

Diff

A 5-line docstring addition in `src/databricks/sql/backend/kernel/init.py` — touches `src/` so the changed-paths filter in `trigger-tests-pr` fires.

Expected behaviour to verify

Phase 1 — PR opened (this commit)

  • `skip-integration-tests-pr` runs and posts a synthetic `success` `Python Proxy Tests` check.
  • `trigger-tests-pr` and `merge-queue-python` skip (no label, no merge_group).
  • Verify `Python Proxy Tests` shows up as green in the check rollup.

Phase 2 — apply `integration-test` label

  • `trigger-tests-pr` runs.
    • `Detect changed driver paths` → `python: true` (we touched `src/`).
    • `Sanitize PR title` → strips quote/newline/etc. from the title.
    • `Dispatch Python tests to internal repo` → either succeeds (if secrets are installed on the App + this repo) or fails with a 403/401 (if secrets aren't installed yet → expected, will be addressed separately).
  • If dispatch succeeds: comment posted on PR; driver-test repo's `python-proxy-tests.yml` workflow runs against this branch's HEAD.
  • If dispatch fails: `Fail check on dispatch error` step posts a `failure` check on `Python Proxy Tests` so we know the gate didn't silently pass.

Phase 3 — push a new commit

  • `remove-label-on-new-commit` runs.
    • Removes the `integration-test` label.
    • Posts the "Integration test approval reset" comment.

Phase 4 — cleanup

  • Close PR. No merge.

Out of scope

This PR doesn't verify the merge queue path (`merge-queue-python` job). That requires merge queue to be enabled on `main` branch protection (operational step, separate from this PR).

This pull request and its description were AI-assisted by Isaac.

Trivial docstring addition touching `src/` to exercise the
.github/workflows/trigger-integration-tests.yml dispatch path
end-to-end:

  - On PR open: skip-integration-tests-pr should post a synthetic
    `success` Python Proxy Tests check.
  - Add `integration-test` label: trigger-tests-pr should
    dispatch to databricks/databricks-driver-test.
  - Push a new commit: remove-label-on-new-commit should clear
    the label and post the security-reset comment.

Not for merge — close after smoke test confirms the workflow
behaves end-to-end.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
@vikrantpuppala vikrantpuppala added the integration-test Triggers proxy-based integration tests; auto-removed on new commits. label May 22, 2026
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
@github-actions github-actions Bot removed the integration-test Triggers proxy-based integration tests; auto-removed on new commits. label May 22, 2026
@github-actions
Copy link
Copy Markdown

Integration test approval reset.

New commits were pushed to this PR. The integration-test label has been automatically removed for security.

A maintainer must re-review the changes and re-add the label to trigger tests again.

Latest commit: 9ed6916

@vikrantpuppala vikrantpuppala deleted the test/proxy-test-dispatch-smoke branch May 22, 2026 10:16
@vikrantpuppala
Copy link
Copy Markdown
Contributor Author

Smoke-test results

End-to-end exercise of the IT dispatch workflow added in #799. Results:

What worked ✅

  • Phase 1 — PR opened
    • `skip-integration-tests-pr` ran and posted a synthetic `Python Proxy Tests: success` check.
    • `trigger-tests-pr` + `merge-queue-python` correctly skipped (no label, no merge_group).
  • Phase 3 — push new commit (synchronize event)
    • `remove-label-on-new-commit` ran, removed the `integration-test` label, and posted the "Integration test approval reset" comment.
    • `skip-integration-tests-pr` re-ran on the new SHA and posted a fresh synthetic-success.
    • Verified: `gh pr view --json labels` returned empty; comment shows up in PR timeline from `github-actions[bot]`.

What broke (expected, secrets not installed) ⚠️

  • Phase 2 — label applied, dispatch attempted
    • `trigger-tests-pr` ran but failed at "Generate GitHub App Token (internal repo)" because `INTEGRATION_TEST_APP_ID` / `INTEGRATION_TEST_PRIVATE_KEY` aren't installed on this repo yet.

Real bug found 🐛

While we expected dispatch to fail without the secrets, the failure handler (`Fail check on dispatch error`) also failed with `Input required and not supplied: github-token`. Reason: my failure step uses the App-token-generated public-repo token (`steps.public-token.outputs.token`), but `public-token` itself depends on the same App secrets that are missing. When the secrets are absent the failure handler can't post a check-run, so the gate just stays green from Phase 1's stale synthetic success — exactly the silent-pass anti-pattern we're trying to avoid.

Note this is the same shape as the canonical `adbc-drivers/databricks` workflow, so it's a latent issue there too. Fix is to fall back to `${{ github.token }}` (the default workflow token, which has `checks: write` because we already declared that permission) in the failure handler.

Follow-ups

  1. Fix the failure handler in `trigger-integration-tests.yml` — change the `Fail check on dispatch error` step to use `${{ github.token }}` instead of `steps.public-token.outputs.token`. Will open a follow-up PR.
  2. Install the App secrets (`INTEGRATION_TEST_APP_ID` / `INTEGRATION_TEST_PRIVATE_KEY`) on this repo + add `databricks-sql-python` to the App's installation. Operational task — already documented in ci: add label-gated proxy-test dispatch to databricks-driver-test #799's description.
  3. Enable merge queue + mark `Python Proxy Tests` required on `main` branch protection. After [Cleanup] Move connector files to root #2.

vikrantpuppala added a commit that referenced this pull request May 25, 2026
* ci(integration-tests): use github.token for check-run posters

Follow-up to #799. The dispatch failure handlers and auto-pass
steps were posting check-runs with `steps.public-token.outputs.token`,
which is itself an App-token-generating step.

That created a silent-failure trap: if the App secrets are
missing or rotated, the App-token step fails, then the failure
handler also fails (no token to authenticate with), and the gate
sits green from the earlier `skip-integration-tests-pr` job's
synthetic-success check — the exact silent-pass anti-pattern
the failure handler exists to prevent.

Discovered by exercising the dispatch end-to-end on a draft PR
before the App secrets were installed (#800 closed). The
canonical adbc-drivers/databricks workflow has the same latent
bug — fix not yet upstreamed there.

The fix is to use the default workflow `${{ github.token }}` for
all check-posting steps. The default token already has
`checks: write` because each job declares the permission.
`steps.public-token` is no longer referenced anywhere; the
generation step is removed to keep the workflow tidy.

The App token is still used (correctly) for the actual dispatch
call into databricks-driver-test, where cross-repo write access
is required.

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>

* test(retry): isolate retry tests from shared telemetry executor

The e2e retry tests in retry_test_mixins.py assert exact urllib3 call
counts (e.g. `mock_validate_conn.call_count == 6`) while mocking
urllib3 globally via `_validate_conn` / `_get_conn`. The shared
`TelemetryClientFactory` executor — populated by *prior* tests in the
same worker — keeps firing background telemetry pushes that land on the
same mocked urllib3 layer, inflating the counts and tripping the
assertions.

Drain the factory and route any new `initialize_telemetry_client` call
to `NoopTelemetryClient` for the duration of the three tests that
assert exact counts:

  - test_oserror_retries
  - test_retry_max_count_not_exceeded
  - test_retry_exponential_backoff

Factory state is fully restored on exit so subsequent tests see the
real telemetry pipeline.

Co-authored-by: Isaac

---------

Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant