Created a softball scraper, edited unique sport constraint#69
Conversation
- Integrated Firebase for user login and signup using Firebase ID tokens.
- Updated login and signup mutations to handle Firebase authentication.
- Modified user schema to reflect changes in authentication method.
- Updated .gitignore to include Firebase service account key.

- Introduced a new function to extract scoring summaries for softball games, including inning and score details.
- Updated the database index setup to ensure uniqueness of game entries by including the time field.
- Added an 'inning' field to the BoxScoreEntryType for better data representation.
📝 Walkthrough

The pull request integrates Firebase Admin SDK for authentication, replacing NetID-based auth with Firebase ID token verification. It adds a softball game parser, updates the game database index schema, and extends GraphQL types to support inning-level scoring data.
Sequence Diagram

```mermaid
sequenceDiagram
    actor Client
    participant App as Flask App
    participant Firebase
    participant DB as MongoDB
    Client->>App: POST login(id_token)
    App->>Firebase: verify_id_token(id_token)
    Firebase-->>App: decoded token with firebase_uid
    App->>DB: query users by firebase_uid
    DB-->>App: user document
    App->>App: generate JWT (access_token, refresh_token)
    App-->>Client: access_token, refresh_token
```
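The flow in the diagram above can be sketched in plain Python. The verifier and user store are stubbed here and merely stand in for `firebase_admin.auth.verify_id_token`, MongoDB, and the app's real JWT factory; all names and token formats below are illustrative, not the PR's actual code.

```python
# Minimal sketch of the login sequence: verify the Firebase ID token,
# look the user up by firebase_uid, then mint app-level tokens.

def verify_id_token(id_token):
    # Stub for firebase_admin.auth.verify_id_token: returns decoded claims
    # on success, raises on an invalid token.
    if id_token != "valid-token":
        raise ValueError("invalid token")
    return {"uid": "abc123", "email": "user@example.com"}

# Stand-in for db["users"], keyed by firebase_uid.
USERS = {"abc123": {"firebase_uid": "abc123", "name": "Test User"}}

def login(id_token):
    decoded = verify_id_token(id_token)   # Firebase verification step
    user = USERS.get(decoded["uid"])      # DB lookup by firebase_uid
    if user is None:
        raise LookupError("User not found.")
    # Stand-ins for the app's JWT helpers (e.g. flask_jwt_extended).
    return {
        "access_token": f"access:{user['firebase_uid']}",
        "refresh_token": f"refresh:{user['firebase_uid']}",
    }
```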
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (2 warnings)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/scrapers/games_scraper.py (1)
245-264: ⚠️ Potential issue | 🟠 Major — Lookup keys don't include `time`, so updates can target the wrong game.

Given the uniqueness/index change includes `time`, these fetches still key by date/location/opponent without `game_time`. If multiple games match, Lines 266–271 pick the first row and may overwrite the wrong record.

Proposed fix direction:

```diff
 curr_game = GameService.get_game_by_tournament_key_fields(
     city,
     game_data["date"],
     game_data["gender"],
     location,
     game_data["sport"],
     state,
+    game_time,
 )
 if not curr_game:
     curr_game = GameService.get_game_by_key_fields(
         city,
         game_data["date"],
         game_data["gender"],
         location,
         team.id,
         game_data["sport"],
         state,
+        game_time,
     )
```

If service signatures don't yet accept `time`, update them to keep lookup semantics aligned with the DB uniqueness constraint.

Also applies to: 266-271
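The time-aware lookup with a controlled fallback, as suggested above, could take roughly this shape. This is a sketch only: the collection stand-in mimics pymongo's `find_one`, and the field names and signature are assumptions, not the real `GameService`/`game_repository.py` interfaces.

```python
# Sketch of a time-aware game lookup honoring a unique index that
# includes "time", with a documented fallback for source rows
# that carry no time value.

class FakeCollection:
    """In-memory stand-in for a pymongo collection (exact-match queries)."""
    def __init__(self, docs):
        self.docs = docs

    def find_one(self, query):
        for d in self.docs:
            if all(d.get(k) == v for k, v in query.items()):
                return d
        return None

def find_game(collection, *, date, location, opponent_id, time=None):
    key = {"date": date, "location": location, "opponent_id": opponent_id}
    if time:
        # Primary path: include time, matching the new unique index.
        return collection.find_one({**key, "time": time})
    # Fallback, used only when the source record has no time;
    # may be ambiguous for doubleheaders.
    return collection.find_one(key)

games = FakeCollection([
    {"date": "2024-04-01", "location": "Ithaca, N.Y.", "opponent_id": 7, "time": "1:00 PM"},
    {"date": "2024-04-01", "location": "Ithaca, N.Y.", "opponent_id": 7, "time": "4:00 PM"},
])
```

Without the `time` filter, a doubleheader like the one above always resolves to the first row, which is exactly the overwrite hazard the review describes.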
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scrapers/games_scraper.py` around lines 245 - 264, The lookup calls to GameService.get_game_by_tournament_key_fields and GameService.get_game_by_key_fields are missing the game time, so they can match the wrong record; update those two call sites to pass the game's time (e.g., game_data["time"] or game_time) and then update the corresponding GameService method signatures and implementations to accept and use the time parameter in the query/unique-key lookup so the DB uniqueness/index that includes time is honored.

app.py (1)
150-175: ⚠️ Potential issue | 🟡 Minor — Duplicate `DefaultArgs` class definition.

The `DefaultArgs` class and the conditional block are defined twice (lines 151-158 and lines 167-175). This appears to be an unintentional duplication.

Proposed fix — remove the duplicate block:

```diff
 # Only parse arguments when running directly (not when imported by gunicorn)
 if __name__ == "__main__":
     args = parse_args()
 else:
     # Default args when imported by gunicorn
     class DefaultArgs:
         no_scrape = False
         no_daily_sun = False

     args = DefaultArgs()


 def signal_handler(sig, frame):
     sys.exit(0)


 signal.signal(signal.SIGINT, signal_handler)
 signal.signal(signal.SIGTERM, signal_handler)

-# Only parse arguments when running directly (not when imported by gunicorn)
-if __name__ == "__main__":
-    args = parse_args()
-else:
-    # Default args when imported by gunicorn
-    class DefaultArgs:
-        no_scrape = False
-        no_daily_sun = False
-    args = DefaultArgs()

 # Only run scraping tasks if not disabled
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app.py` around lines 150 - 175, Duplicate conditional blocks define DefaultArgs and set args twice; remove the redundant second block (the repeated "if __name__ == '__main__': ... else: class DefaultArgs ... args = DefaultArgs()") so only one parse_args()/DefaultArgs flow remains. Keep a single occurrence that uses parse_args() when running as __main__ and defines DefaultArgs with no_scrape and no_daily_sun when imported, and ensure references to parse_args, DefaultArgs, and args are unchanged elsewhere.

src/database.py (1)
73-87: ⚠️ Potential issue | 🟠 Major — Uniqueness key changed, but repository lookups still ignore `time`.

After Line 83 adds `time` to uniqueness, the current lookup paths in `src/repositories/game_repository.py` (Lines 106-131 and 134-181) and `src/services/game_service.py` (Lines 77-93) still search without `time`. This can return ambiguous matches or update the wrong game record for doubleheaders.

A practical fix is to pass `time` through the service/repository lookup methods for primary matching, with a controlled fallback only when source data has no time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/database.py` around lines 73 - 87, The uniqueness index "uniq_game_key_with_time" on game_collection now includes the "time" field, but lookups still omit time; update the repository methods in src/repositories/game_repository.py to accept a time argument (e.g., extend the primary lookup functions that build the unique key) and include "time" in the query filter used to find or update games, and update the service layer methods in src/services/game_service.py that call those repository methods to pass through the time value; implement a controlled fallback in the repository: if the incoming source record has no time, perform a secondary query without time (explicitly documented and used only when time is null/empty) so doubleheaders are resolved correctly while preserving the new unique index behavior.

src/scrapers/game_details_scrape.py (1)
148-168: ⚠️ Potential issue | 🟠 Major — Restore `period` on hockey scoring events.

Line 150 still parses the period, but it is no longer included in the emitted event. That makes `BoxScoreEntryType.period` resolve to `null` for every hockey score and drops data that existing clients may already depend on.

🐛 Proposed fix:

```diff
 summary.append({
     'team': team,
+    'period': period,
     'time': time,
     'scorer': scorer,
     'assist': assist,
     'cor_score': cornell_score,
     'opp_score': opp_score,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scrapers/game_details_scrape.py` around lines 148 - 168, The loop in game_details_scrape.py parses the period into the local variable period but doesn’t include it in the emitted scoring event, causing BoxScoreEntryType.period to be null; update the scoring_events creation inside the for loop (the summary.append call) to include the parsed period by adding a 'period': period entry in the dict so each appended scoring event contains the period value (ensure the key name matches the expected BoxScoreEntryType.period).
🧹 Nitpick comments (4)
src/scrapers/games_scraper.py (1)
128-129: Use robust text extraction for `result` normalization.

`result_tag.text.strip().replace("\n", " ")` is brittle with nested tags and mixed whitespace. Prefer `get_text(" ", strip=True)` and optionally collapse repeated spaces.

Proposed refactor:

```diff
-    game_data["result"] = result_tag.text.strip().replace("\n", " ")
+    game_data["result"] = re.sub(r"\s+", " ", result_tag.get_text(" ", strip=True))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scrapers/games_scraper.py` around lines 128 - 129, Replace the brittle normalization of the result field (currently assigning game_data["result"] = result_tag.text.strip().replace("\n", " ")) with a robust extraction using result_tag.get_text(" ", strip=True) and collapse repeated whitespace (e.g., via a single-space collapse) before assigning into game_data["result"]; locate the assignment that references result_tag and game_data["result"] and update it to use get_text(" ", strip=True) plus a whitespace-normalization step to ensure nested tags and mixed whitespace are handled.

requirements.txt (1)
14-14: Consider pinning the `firebase-admin` version.

The `firebase-admin` package is added without a version pin. For reproducible builds and to avoid unexpected breaking changes, consider pinning to a specific version (e.g., `firebase-admin==7.3.0`).

Note: Most packages in this file remain unpinned. If pinning is not adopted as a project-wide practice, this suggestion is optional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@requirements.txt` at line 14, Pin the firebase-admin dependency in requirements.txt by replacing the unpinned entry "firebase-admin" with a specific version specifier (for example "firebase-admin==7.3.0") to ensure reproducible installs; update any related dependency documentation or CI that relies on this file if necessary and run a quick install/test to verify compatibility.

src/types.py (1)
39-57: Document `inning` separately in `BoxScoreEntryType`.

The type docs still describe `period` as "period or inning", but the schema now exposes both fields. Updating the attribute list here will keep the GraphQL contract clearer for clients.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/types.py` around lines 39 - 57, Update the BoxScoreEntryType docstring to document `inning` separately from `period`: in the BoxScoreEntryType class comment (above the attributes list) add a distinct bullet describing `inning` (e.g., "inning: The inning of the event" or similar) and adjust the existing `period` bullet to no longer say "period or inning", so both `period` and `inning` are clearly and separately described in the class docstring.

src/scrapers/game_details_scrape.py (1)
72-77: Mirror `inning` into `period` for compatibility.

Lines 312-314 still treat `period` as the generic fallback when `time` is missing. Emitting only `inning` makes softball the one summary shape that skips that path. Either set both fields here or teach the fallback about `inning`.

♻️ Proposed compatibility tweak:

```diff
 summary.append({
     'team': team,
+    'period': inning,
     'inning': inning,
     'description': desc,
     'cor_score': cornell_score,
     'opp_score': opp_score
 })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scrapers/game_details_scrape.py` around lines 72 - 77, The summary objects only set 'inning' but other code (lines handling missing 'time') expects a generic 'period'; update the dict passed to summary.append to include 'period': inning (i.e., mirror the inning value into the 'period' key) so the summary shape remains compatible with existing fallback logic that reads 'period'. Ensure you modify the summary.append call that builds the dict (the block adding 'team', 'inning', 'description', 'cor_score', 'opp_score') to add the 'period' key set to the same inning variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app.py`:
- Line 28: Remove the unused import symbol "auth" from the top-level import in
app.py (the statement importing from firebase_admin: credentials, auth) — keep
only the used "credentials" import so the file doesn't import auth
unnecessarily; ensure no other references to auth remain in app.py before
committing.
- Around line 26-45: The deployment is missing configuration for the Firebase
service account used by initialize_firebase: ensure SERVICE_ACCOUNT_PATH
(GOOGLE_APPLICATION_CREDENTIALS) is documented in .env_template, make the
Firebase key file available to containers by mounting it in docker-compose.yml
(add a volume mapping and set the container env GOOGLE_APPLICATION_CREDENTIALS
to the mounted path), and update the Dockerfile or the docker-compose service
env to set the GOOGLE_APPLICATION_CREDENTIALS environment variable (or use
Docker secrets) so that SERVICE_ACCOUNT_PATH is present when
initialize_firebase() runs; keep references to SERVICE_ACCOUNT_PATH and
initialize_firebase in the changes so reviewers can find the initialization
logic.
In `@src/mutations/login_user.py`:
- Around line 17-22: Replace the bare except around
firebase_auth.verify_id_token with targeted exception handling: catch the
firebase_admin.auth exceptions (e.g., InvalidIdTokenError, ExpiredIdTokenError,
RevokedIdTokenError) and other decode-related errors (ValueError) as err and
re-raise GraphQLError using exception chaining (raise GraphQLError("Invalid or
expired token.") from err); after a successful verify_id_token call, validate
the returned payload before indexing (use decoded.get("uid") or check "uid" in
decoded) and raise a clear GraphQLError if uid is missing to avoid KeyError.
Ensure you reference firebase_auth.verify_id_token and the decoded["uid"] access
points when applying these changes.
In `@src/mutations/signup_user.py`:
- Around line 19-24: Replace the broad try/except around
firebase_auth.verify_id_token and the direct decoded["uid"] access with targeted
exception handling: catch the specific firebase_auth token errors (e.g.,
firebase_auth.InvalidIdTokenError, firebase_auth.ExpiredIdTokenError,
firebase_auth.RevokedIdTokenError) and ValueError as separate except clauses,
re-raising GraphQLError with the original exception chained (use "from e");
after verify_id_token succeeds, use decoded.get("uid") and if it is None raise a
GraphQLError("Token missing uid") chained to a KeyError to preserve context.
Ensure you update the verify_id_token and decoded["uid"] usages accordingly.
- Around line 25-36: The current check-then-insert in signup_user.py (the
find_one on db["users"] followed by insert_one creating user_doc with
firebase_uid/email/name) can race under concurrent requests; add a unique index
on firebase_uid in your DB setup and change the insertion flow to either use
find_one_and_update(..., upsert=True) to atomically create-if-not-exists or keep
insert_one but catch the duplicate key error (e.g., DuplicateKeyError) around
db["users"].insert_one to convert it into the GraphQLError("User already
exists.") so concurrent inserts are handled safely.
---
Outside diff comments:
In `@app.py`:
- Around line 150-175: Duplicate conditional blocks define DefaultArgs and set
args twice; remove the redundant second block (the repeated "if __name__ ==
'__main__': ... else: class DefaultArgs ... args = DefaultArgs()") so only one
parse_args()/DefaultArgs flow remains. Keep a single occurrence that uses
parse_args() when running as __main__ and defines DefaultArgs with no_scrape and
no_daily_sun when imported, and ensure references to parse_args, DefaultArgs,
and args are unchanged elsewhere.
In `@src/database.py`:
- Around line 73-87: The uniqueness index "uniq_game_key_with_time" on
game_collection now includes the "time" field, but lookups still omit time;
update the repository methods in src/repositories/game_repository.py to accept a
time argument (e.g., extend the primary lookup functions that build the unique
key) and include "time" in the query filter used to find or update games, and
update the service layer methods in src/services/game_service.py that call those
repository methods to pass through the time value; implement a controlled
fallback in the repository: if the incoming source record has no time, perform a
secondary query without time (explicitly documented and used only when time is
null/empty) so doubleheaders are resolved correctly while preserving the new
unique index behavior.
In `@src/scrapers/game_details_scrape.py`:
- Around line 148-168: The loop in game_details_scrape.py parses the period into
the local variable period but doesn’t include it in the emitted scoring event,
causing BoxScoreEntryType.period to be null; update the scoring_events creation
inside the for loop (the summary.append call) to include the parsed period by
adding a 'period': period entry in the dict so each appended scoring event
contains the period value (ensure the key name matches the expected
BoxScoreEntryType.period).
In `@src/scrapers/games_scraper.py`:
- Around line 245-264: The lookup calls to
GameService.get_game_by_tournament_key_fields and
GameService.get_game_by_key_fields are missing the game time, so they can match
the wrong record; update those two call sites to pass the game's time (e.g.,
game_data["time"] or game_time) and then update the corresponding GameService
method signatures and implementations to accept and use the time parameter in
the query/unique-key lookup so the DB uniqueness/index that includes time is
honored.
---
Nitpick comments:
In `@requirements.txt`:
- Line 14: Pin the firebase-admin dependency in requirements.txt by replacing
the unpinned entry "firebase-admin" with a specific version specifier (for
example "firebase-admin==7.3.0") to ensure reproducible installs; update any
related dependency documentation or CI that relies on this file if necessary and
run a quick install/test to verify compatibility.
In `@src/scrapers/game_details_scrape.py`:
- Around line 72-77: The summary objects only set 'inning' but other code (lines
handling missing 'time') expects a generic 'period'; update the dict passed to
summary.append to include 'period': inning (i.e., mirror the inning value into
the 'period' key) so the summary shape remains compatible with existing fallback
logic that reads 'period'. Ensure you modify the summary.append call that builds
the dict (the block adding 'team', 'inning', 'description', 'cor_score',
'opp_score') to add the 'period' key set to the same inning variable.
In `@src/scrapers/games_scraper.py`:
- Around line 128-129: Replace the brittle normalization of the result field
(currently assigning game_data["result"] = result_tag.text.strip().replace("\n",
" ")) with a robust extraction using result_tag.get_text(" ", strip=True) and
collapse repeated whitespace (e.g., via a single-space collapse) before
assigning into game_data["result"]; locate the assignment that references
result_tag and game_data["result"] and update it to use get_text(" ",
strip=True) plus a whitespace-normalization step to ensure nested tags and mixed
whitespace are handled.
In `@src/types.py`:
- Around line 39-57: Update the BoxScoreEntryType docstring to document `inning`
separately from `period`: in the BoxScoreEntryType class comment (above the
attributes list) add a distinct bullet describing `inning` (e.g., "inning: The
inning of the event" or similar) and adjust the existing `period` bullet to no
longer say "period or inning", so both `period` and `inning` are clearly and
separately described in the class docstring.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 557283ab-514c-473f-913e-9baac1fb90b2
📒 Files selected for processing (10)
.gitignore, app.py, requirements.txt, src/database.py, src/mutations/login_user.py, src/mutations/signup_user.py, src/schema.py, src/scrapers/game_details_scrape.py, src/scrapers/games_scraper.py, src/types.py
```python
import os
import firebase_admin
from firebase_admin import credentials, auth

SERVICE_ACCOUNT_PATH = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")


def initialize_firebase():
    if not firebase_admin._apps:
        if not SERVICE_ACCOUNT_PATH:
            raise ValueError(
                "GOOGLE_APPLICATION_CREDENTIALS is not set. Set it to your firebase-service-account-key.json path."
            )
        cred = credentials.Certificate(SERVICE_ACCOUNT_PATH)
        firebase_admin.initialize_app(cred)
        logging.info("Firebase app initialized.")
    return firebase_admin.get_app()


initialize_firebase()
```
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Check if .env_template includes the Firebase credential variable
echo "=== Checking .env_template ==="
cat .env_template 2>/dev/null || echo ".env_template not found"
echo ""
echo "=== Checking docker-compose.yml for volume mounts ==="
cat docker-compose.yml 2>/dev/null | grep -A5 "volumes:" || echo "No volumes section found"
```

Repository: cuappdev/score-backend
Length of output: 498
🏁 Script executed:

```shell
# Check file size and read app.py to verify all claims
wc -l app.py
echo "=== Full app.py contents ==="
cat -n app.py
```

Repository: cuappdev/score-backend
Length of output: 8592
🏁 Script executed:

```shell
# Check if Dockerfile exists and examine it
if [ -f Dockerfile ]; then
  echo "=== Dockerfile found ==="
  cat -n Dockerfile
else
  echo "Dockerfile not found"
fi
```

Repository: cuappdev/score-backend
Length of output: 391
Deployment configuration incomplete for Firebase.

The Firebase initialization requires GOOGLE_APPLICATION_CREDENTIALS, but the deployment setup is missing key configurations:

- `.env_template` doesn't document this required variable
- `docker-compose.yml` doesn't mount the Firebase key file into containers
- `Dockerfile` doesn't configure the environment variable
This will cause startup failures in containerized deployments.
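One possible shape for the missing wiring is sketched below. The service name, key filename, and mount path are illustrative assumptions, not taken from this repo's actual compose file:

```yaml
services:
  app:
    environment:
      # Point initialize_firebase() at the mounted key file
      GOOGLE_APPLICATION_CREDENTIALS: /run/secrets/firebase-service-account-key.json
    volumes:
      # Keep the real key out of the image; mount it read-only at runtime
      - ./firebase-service-account-key.json:/run/secrets/firebase-service-account-key.json:ro
```

Mounting at runtime (rather than copying the key into the image via the Dockerfile) keeps the credential out of the image layers; Docker secrets would be a stricter variant of the same idea.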
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app.py` around lines 26 - 45, The deployment is missing configuration for the
Firebase service account used by initialize_firebase: ensure
SERVICE_ACCOUNT_PATH (GOOGLE_APPLICATION_CREDENTIALS) is documented in
.env_template, make the Firebase key file available to containers by mounting it
in docker-compose.yml (add a volume mapping and set the container env
GOOGLE_APPLICATION_CREDENTIALS to the mounted path), and update the Dockerfile
or the docker-compose service env to set the GOOGLE_APPLICATION_CREDENTIALS
environment variable (or use Docker secrets) so that SERVICE_ACCOUNT_PATH is
present when initialize_firebase() runs; keep references to SERVICE_ACCOUNT_PATH
and initialize_firebase in the changes so reviewers can find the initialization
logic.
Are these set in both dev and prod servers, @claiireyu?
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
```python
try:
    decoded = firebase_auth.verify_id_token(id_token)
except Exception:
    raise GraphQLError("Invalid or expired token.")

firebase_uid = decoded["uid"]
```
Improve exception handling for token verification.
Two issues flagged by static analysis:
- Catching bare `Exception` obscures the root cause and catches unrelated errors (e.g., `KeyboardInterrupt` in Python 2, programming bugs)
- Missing exception chaining (`from err` or `from None`) loses debugging context

Additionally, accessing `decoded["uid"]` can raise `KeyError` if the token structure is unexpected.
Proposed fix:

```diff
 def mutate(self, info, id_token):
     try:
         decoded = firebase_auth.verify_id_token(id_token)
-    except Exception:
-        raise GraphQLError("Invalid or expired token.")
-
-    firebase_uid = decoded["uid"]
+        firebase_uid = decoded["uid"]
+    except (ValueError, KeyError) as err:
+        raise GraphQLError("Invalid or expired token.") from None
+    except Exception as err:
+        raise GraphQLError("Invalid or expired token.") from None

     user = db["users"].find_one({"firebase_uid": firebase_uid})
```

Alternatively, catch the specific `firebase_admin.auth.InvalidIdTokenError` and related exceptions for more precise error handling.
📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
try:
    decoded = firebase_auth.verify_id_token(id_token)
    firebase_uid = decoded["uid"]
except (ValueError, KeyError) as err:
    raise GraphQLError("Invalid or expired token.") from None
except Exception as err:
    raise GraphQLError("Invalid or expired token.") from None
```
🧰 Tools
🪛 Ruff (0.15.6)

- [warning] 19-19: Do not catch blind exception: `Exception` (BLE001)
- [warning] 20-20: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/mutations/login_user.py` around lines 17 - 22, Replace the bare except
around firebase_auth.verify_id_token with targeted exception handling: catch the
firebase_admin.auth exceptions (e.g., InvalidIdTokenError, ExpiredIdTokenError,
RevokedIdTokenError) and other decode-related errors (ValueError) as err and
re-raise GraphQLError using exception chaining (raise GraphQLError("Invalid or
expired token.") from err); after a successful verify_id_token call, validate
the returned payload before indexing (use decoded.get("uid") or check "uid" in
decoded) and raise a clear GraphQLError if uid is missing to avoid KeyError.
Ensure you reference firebase_auth.verify_id_token and the decoded["uid"] access
points when applying these changes.
```python
try:
    decoded = firebase_auth.verify_id_token(id_token)
except Exception:
    raise GraphQLError("Invalid or expired token.")

firebase_uid = decoded["uid"]
```
Improve exception handling for token verification.
Same issues as in `login_user.py`:

- Catching bare `Exception` is too broad
- Missing exception chaining loses debugging context
- `decoded["uid"]` access can raise `KeyError`
Proposed fix:

```diff
 def mutate(self, info, id_token, name=None, email=None):
     try:
         decoded = firebase_auth.verify_id_token(id_token)
-    except Exception:
-        raise GraphQLError("Invalid or expired token.")
-
-    firebase_uid = decoded["uid"]
+        firebase_uid = decoded["uid"]
+    except (ValueError, KeyError) as err:
+        raise GraphQLError("Invalid or expired token.") from None
+    except Exception as err:
+        raise GraphQLError("Invalid or expired token.") from None

     if db["users"].find_one({"firebase_uid": firebase_uid}):
```

🧰 Tools
🪛 Ruff (0.15.6)

- [warning] 21-21: Do not catch blind exception: `Exception` (BLE001)
- [warning] 22-22: Within an `except` clause, raise exceptions with `raise ... from err` or `raise ... from None` to distinguish them from errors in exception handling (B904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/mutations/signup_user.py` around lines 19 - 24, Replace the broad
try/except around firebase_auth.verify_id_token and the direct decoded["uid"]
access with targeted exception handling: catch the specific firebase_auth token
errors (e.g., firebase_auth.InvalidIdTokenError,
firebase_auth.ExpiredIdTokenError, firebase_auth.RevokedIdTokenError) and
ValueError as separate except clauses, re-raising GraphQLError with the original
exception chained (use "from e"); after verify_id_token succeeds, use
decoded.get("uid") and if it is None raise a GraphQLError("Token missing uid")
chained to a KeyError to preserve context. Ensure you update the verify_id_token
and decoded["uid"] usages accordingly.
```python
if db["users"].find_one({"firebase_uid": firebase_uid}):
    raise GraphQLError("User already exists.")

email = email or decoded.get("email")
user_doc = {
    "net_id": net_id,
    "firebase_uid": firebase_uid,
    "email": email,
    "favorite_game_ids": [],
}
if name is not None:
    user_doc["name"] = name
if email is not None:
    user_doc["email"] = email
result = db["users"].insert_one(user_doc)
```
Potential race condition in user creation.
The check-then-insert pattern (find_one followed by insert_one) is vulnerable to race conditions under concurrent signup requests with the same Firebase UID. Two requests could pass the existence check simultaneously and both insert.
Consider using a unique index on firebase_uid and handling the duplicate key error, or using find_one_and_update with upsert=True.
Proposed fix using unique index + exception handling

First, ensure a unique index exists on `firebase_uid` in your database setup:

```python
db["users"].create_index("firebase_uid", unique=True)
```

Then handle the duplicate key error:

```diff
-    if db["users"].find_one({"firebase_uid": firebase_uid}):
-        raise GraphQLError("User already exists.")
-
     email = email or decoded.get("email")
     user_doc = {
         "firebase_uid": firebase_uid,
         "email": email,
         "favorite_game_ids": [],
     }
     if name is not None:
         user_doc["name"] = name
-    result = db["users"].insert_one(user_doc)
+    try:
+        result = db["users"].insert_one(user_doc)
+    except pymongo.errors.DuplicateKeyError:
+        raise GraphQLError("User already exists.") from None
     identity = str(result.inserted_id)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/mutations/signup_user.py` around lines 25 - 36, The current
check-then-insert in signup_user.py (the find_one on db["users"] followed by
insert_one creating user_doc with firebase_uid/email/name) can race under
concurrent requests; add a unique index on firebase_uid in your DB setup and
change the insertion flow to either use find_one_and_update(..., upsert=True) to
atomically create-if-not-exists or keep insert_one but catch the duplicate key
error (e.g., DuplicateKeyError) around db["users"].insert_one to convert it into
the GraphQLError("User already exists.") so concurrent inserts are handled
safely.
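The atomic "create if not exists" semantics the review asks for can be illustrated without a live MongoDB. In this stand-in sketch, `dict.setdefault` under a lock plays the role of the single atomic server-side operation; names and the in-memory store are illustrative, not the project's real code.

```python
# Pure-Python stand-in for MongoDB's atomic create-if-not-exists.
# In the real code this would be one pymongo call (or insert_one guarded
# by a unique index); setdefault under a lock gives the same guarantee
# here: no check-then-insert gap between "does it exist?" and "create it".

import threading

users = {}                 # keyed by firebase_uid, stand-in for db["users"]
_lock = threading.Lock()   # pymongo gets this atomicity server-side

def signup(firebase_uid, email):
    new_doc = {"firebase_uid": firebase_uid, "email": email, "favorite_game_ids": []}
    with _lock:
        existing = users.setdefault(firebase_uid, new_doc)
    if existing is not new_doc:
        # Another request already created this user: reject as a duplicate.
        raise ValueError("User already exists.")
    return existing
```

With pymongo, a comparable single round-trip would be `db["users"].find_one_and_update({"firebase_uid": uid}, {"$setOnInsert": new_doc}, upsert=True)`, which creates the document only when it does not already exist.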
JoshD94
left a comment
lgtm after you address the coderabbit issues
Overview
Created a softball scraper, edited unique sport constraint
Changes Made
Test Coverage
Tested locally with GraphQL
Summary by CodeRabbit
Release Notes
New Features
Improvements
Chores