Skip to content

fix(firestore-bigquery-change-tracker): unmask BQ errors and stop spurious table updates#2817

Open
cabljac wants to merge 2 commits into
nextfrom
fix/bq-change-tracker-bugs
Open

fix(firestore-bigquery-change-tracker): unmask BQ errors and stop spurious table updates#2817
cabljac wants to merge 2 commits into
nextfrom
fix/bq-change-tracker-bugs

Conversation

@cabljac
Copy link
Copy Markdown
Contributor

@cabljac cabljac commented May 8, 2026

Summary

Two small but high-impact fixes in firestore-bigquery-change-tracker. Both bugs were diagnosed in detail by the issue reporters with the offending lines called out.

#2801: dead-letter path crashes on warm containers

bigquery/handleFailedTransactions.ts called db.settings({ ignoreUndefinedProperties: true }) inside the exported async body. The Firestore Node SDK contract requires settings() be called at most once and before any other method runs on the singleton. On warm Cloud Functions containers both conditions are violated by the second invocation; the resulting Firestore has already been initialized throw masks the real BigQuery insert failure that put us on the dead-letter path in the first place.

Fix: guard with a per-instance `Set`, plus a `try/catch` so that even if the singleton was used elsewhere in the process before this module's first call, the dead-letter write to the backup collection still proceeds and the real BQ error surfaces upstream.

#2194: `tableRequiresUpdate` always returns true

Two independent bugs in the same function:

  1. Clustering check:
    ```ts
    const configCluster = JSON.stringify(config.clustering); // "null" when unconfigured
    const tableCluster = JSON.stringify(metadata.clustering?.fields || []); // "[]"
    if (configCluster !== tableCluster) return true; // always true
    ```
    Fixed by normalising null/undefined to `[]` before stringifying: `JSON.stringify(config.clustering || [])`.

  2. `pathParamsColExists` strict-equality against `undefined`:
    `bigquery/index.ts` assigned the result of `fields.find(...)` (which is `Field | undefined`) to a parameter typed `boolean`. The check `!!config.wildcardIds !== pathParamsColExists` then resolved to `false !== undefined` -> always true.
    Fixed by coercing in the caller (`!!fields.find(...)`). Applied to `documentIdColExists` and `oldDataColExists` too since the function signature already promises `boolean` for all three (downstream checks happened to tolerate `undefined` for those two, but the type was being violated).

Test plan

  • `npm run build` (`tsc`) clean in `firestore-bigquery-export/firestore-bigquery-change-tracker`
  • CI integration tests against the configured BQ project (the existing `tests` directory hits a real BQ dataset; not runnable locally without `PROJECT_ID` and credentials)

The diff is intentionally tight (3 files, +23 / -8). I haven't added new integration tests since they all target a live BigQuery project; happy to add unit-level mocks for both fixes in a follow-up if reviewers prefer.

…rious table updates

Fixes #2801, fixes #2194.

#2801 - handleFailedTransactions

The dead-letter path called `db.settings({ ignoreUndefinedProperties: true })`
on every invocation. `Firestore.settings()` throws if the singleton has already
been used, so on warm Cloud Functions containers the second invocation crashed
with "Firestore has already been initialized", masking the original BigQuery
insert error that triggered the dead-letter path in the first place.

Guard the call with a per-instance Set and a try/catch: settings is applied
once per process per instance, and any throw is swallowed so the dead-letter
write to the backup collection still runs and the real BQ error surfaces.

#2194 - tableRequiresUpdate always returning true

Two bugs:

1. Clustering check: `JSON.stringify(config.clustering)` returns "null" when
   clustering is not configured, but `JSON.stringify(metadata.clustering?.fields || [])`
   returns "[]". They never match, so the table is always flagged as needing
   update. Normalise null/undefined to `[]` before stringifying.

2. `pathParamsColExists` was assigned the result of `fields.find()` (which is
   `Field | undefined`), then strict-compared against a boolean
   (`!!config.wildcardIds !== pathParamsColExists`). When the column is absent,
   `false !== undefined` is always true. The function signature already
   declares these params as `boolean`; coerce in the caller. Same fix applied
   to `documentIdColExists` and `oldDataColExists` for type-honesty (they
   already worked correctly under truthy checks but the type was being violated).
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request improves the robustness of the BigQuery change tracker by handling undefined clustering configurations and safely applying Firestore settings during failed transactions to avoid runtime errors. It also ensures boolean values for column existence checks. Feedback suggests enhancing type safety for the Firestore instance ID and using the more idiomatic Array.prototype.some() method instead of !!find() for checking the presence of specific columns.

Comment thread firestore-bigquery-export/firestore-bigquery-change-tracker/src/bigquery/index.ts Outdated
Comment thread firestore-bigquery-export/firestore-bigquery-change-tracker/src/bigquery/index.ts Outdated
Comment thread firestore-bigquery-export/firestore-bigquery-change-tracker/src/bigquery/index.ts Outdated
- handleFailedTransactions: make instanceId optional and drop non-null
  assertion at the call site so the helper matches the optional type on
  ChangeTrackerConfig.firestoreInstanceId
- bigquery/index.ts: use Array.prototype.some() instead of !!find() for
  the column-existence checks (documentId, pathParams, oldData)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant