
Created a softball scraper, edited unique sport constraint #69

Open
claiireyu wants to merge 2 commits into master from claire/softballScraper

Conversation


@claiireyu claiireyu commented Mar 25, 2026

Overview

Created a softball scraper, edited unique sport constraint

Changes Made

  • Introduced a new function to extract scoring summaries for softball games, including inning and score details.
  • Updated the database index setup to ensure uniqueness of game entries by including the time field.
  • Added an 'inning' field to the BoxScoreEntryType for better data representation.
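For context, one scoring-summary entry produced by the new softball parser might look like the dict below. This is a sketch: the key names (team, inning, description, cor_score, opp_score) are taken from this PR's diff, but the values are invented for illustration.

```python
# Hypothetical example of one softball scoring event. Key names come from
# the PR diff; the values are made up for illustration only.
event = {
    "team": "CORNELL",
    "inning": "3rd",
    "description": "Two-run double to left center",
    "cor_score": 2,
    "opp_score": 0,
}
```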

Test Coverage

Tested locally with GraphQL.

Summary by CodeRabbit

Release Notes

  • New Features

    • Transitioned to Firebase-based authentication for user login and signup
    • Added support for parsing and displaying softball game scoring events
  • Improvements

    • Enhanced game result text formatting to preserve spacing
    • Improved hockey game scoring event tracking
  • Chores

    • Updated authentication configuration and added firebase-admin dependency

- Integrated Firebase for user login and signup using Firebase ID tokens.
- Updated login and signup mutations to handle Firebase authentication.
- Modified user schema to reflect changes in authentication method.
- Updated .gitignore to include Firebase service account key.
- Introduced a new function to extract scoring summaries for softball games, including inning and score details.
- Updated the database index setup to ensure uniqueness of game entries by including the time field.
- Added an 'inning' field to the BoxScoreEntryType for better data representation.
@coderabbitai

coderabbitai bot commented Mar 25, 2026

📝 Walkthrough


The pull request integrates Firebase Admin SDK for authentication, replacing NetID-based auth with Firebase ID token verification. It adds a softball game parser, updates the game database index schema, and extends GraphQL types to support inning-level scoring data.

Changes

  • Firebase Infrastructure (.gitignore, app.py, requirements.txt): Added Firebase Admin SDK dependency, initialized Firebase via environment variable, and ignored the service account key file.
  • Authentication Migration (src/mutations/login_user.py, src/mutations/signup_user.py, src/schema.py): Replaced net_id argument with id_token in login and signup mutations; added Firebase ID token verification; updated mutation descriptions to reflect token-based auth.
  • Game Data & Parsing (src/scrapers/game_details_scrape.py, src/scrapers/games_scraper.py, src/types.py): Added softball_summary parser for per-inning scoring events; updated hockey_summary to match "CORNELL" and removed period field; normalized result whitespace; added inning field to BoxScoreEntryType.
  • Database Schema (src/database.py): Updated unique compound index on game collection to include the time field, narrowing the uniqueness constraint with an explicit index name.
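The compound index could be declared roughly as below. This is a sketch: only the index name uniq_game_key_with_time is confirmed later in this review; the field list is an assumption pieced together from the lookup keys the review mentions (city, date, gender, location, opponent, sport, state) plus the newly added time field.

```python
# Assumed field list for the unique compound index on the game collection;
# only the name "uniq_game_key_with_time" is confirmed by this review.
index_fields = [
    ("city", 1), ("date", 1), ("gender", 1), ("location", 1),
    ("opponent_id", 1), ("sport", 1), ("state", 1), ("time", 1),
]

# With pymongo this would be applied roughly as:
#   game_collection.create_index(
#       index_fields, unique=True, name="uniq_game_key_with_time"
#   )
```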

Sequence Diagram

sequenceDiagram
    actor Client
    participant App as Flask App
    participant Firebase
    participant DB as MongoDB
    
    Client->>App: POST login(id_token)
    App->>Firebase: verify_id_token(id_token)
    Firebase-->>App: decoded token with firebase_uid
    App->>DB: query users by firebase_uid
    DB-->>App: user document
    App->>App: generate JWT (access_token, refresh_token)
    App-->>Client: access_token, refresh_token

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Firebase tokens hop into view,
NetID auth says "we're through!"
Softballs bat and innings score,
Indices grow forevermore—
Auth flows fresh, the schema's tight,
CodeRabbit approves the migration right! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

  • Description check ⚠️ Warning — The description covers the Overview, Changes Made, and Test Coverage sections of the template, but it lacks critical details about the Firebase authentication changes and is incomplete relative to the actual changeset scope. Resolution: add details about the Firebase authentication integration (token verification, login/signup changes) and clarify which changes are in this PR versus future work in a Next Steps section.
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 30.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

  • Title check ✅ Passed — The title accurately summarizes the two main changes: adding a softball scraper and modifying the uniqueness constraint with the time field.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claire/softballScraper

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
src/scrapers/games_scraper.py (1)

245-264: ⚠️ Potential issue | 🟠 Major

Lookup keys don’t include time, so updates can target the wrong game.

Given the uniqueness/index change includes time, these fetches still key by date/location/opponent without game_time. If multiple games match, Line 266–Line 271 picks the first row and may overwrite the wrong record.

Proposed fix direction
     curr_game = GameService.get_game_by_tournament_key_fields(
         city,
         game_data["date"],
         game_data["gender"],
         location,
         game_data["sport"],
         state,
+        game_time,
     )

     if not curr_game:
         curr_game = GameService.get_game_by_key_fields(
             city,
             game_data["date"],
             game_data["gender"],
             location,
             team.id,
             game_data["sport"],
             state,
+            game_time,
         )

If service signatures don’t yet accept time, update them to keep lookup semantics aligned with the DB uniqueness constraint.

Also applies to: 266-271

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scrapers/games_scraper.py` around lines 245 - 264, The lookup calls to
GameService.get_game_by_tournament_key_fields and
GameService.get_game_by_key_fields are missing the game time, so they can match
the wrong record; update those two call sites to pass the game's time (e.g.,
game_data["time"] or game_time) and then update the corresponding GameService
method signatures and implementations to accept and use the time parameter in
the query/unique-key lookup so the DB uniqueness/index that includes time is
honored.
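The fix direction above amounts to including the game time in the lookup filter so it matches the unique index. A minimal illustration; the helper name and field names are hypothetical:

```python
# Hypothetical helper: the same key fields as the existing lookups, plus
# the time field so the filter aligns with the new uniqueness constraint.
def build_game_filter(city, date, gender, location, sport, state, game_time):
    return {
        "city": city,
        "date": date,
        "gender": gender,
        "location": location,
        "sport": sport,
        "state": state,
        "time": game_time,  # newly included so doubleheaders stay distinct
    }
```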
app.py (1)

150-175: ⚠️ Potential issue | 🟡 Minor

Duplicate DefaultArgs class definition.

The DefaultArgs class and the conditional block are defined twice (lines 151-158 and lines 167-175). This appears to be an unintentional duplication.

Proposed fix - remove the duplicate block
 # Only parse arguments when running directly (not when imported by gunicorn)
 if __name__ == "__main__":
     args = parse_args()
 else:
     # Default args when imported by gunicorn
     class DefaultArgs:
         no_scrape = False
         no_daily_sun = False
     args = DefaultArgs()

 def signal_handler(sig, frame):
     sys.exit(0)


 signal.signal(signal.SIGINT, signal_handler)
 signal.signal(signal.SIGTERM, signal_handler)

-# Only parse arguments when running directly (not when imported by gunicorn)
-if __name__ == "__main__":
-    args = parse_args()
-else:
-    # Default args when imported by gunicorn
-    class DefaultArgs:
-        no_scrape = False
-        no_daily_sun = False
-    args = DefaultArgs()

 # Only run scraping tasks if not disabled
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app.py` around lines 150 - 175, Duplicate conditional blocks define
DefaultArgs and set args twice; remove the redundant second block (the repeated
"if __name__ == '__main__': ... else: class DefaultArgs ... args =
DefaultArgs()") so only one parse_args()/DefaultArgs flow remains. Keep a single
occurrence that uses parse_args() when running as __main__ and defines
DefaultArgs with no_scrape and no_daily_sun when imported, and ensure references
to parse_args, DefaultArgs, and args are unchanged elsewhere.
src/database.py (1)

73-87: ⚠️ Potential issue | 🟠 Major

Uniqueness key changed, but repository lookups still ignore time.

After Line 83 adds time to uniqueness, the current lookup paths in src/repositories/game_repository.py (Lines 106-131 and 134-181) and src/services/game_service.py (Lines 77-93) still search without time. This can return ambiguous matches or update the wrong game record for doubleheaders.

A practical fix is to pass time through the service/repository lookup methods for primary matching, with a controlled fallback only when source data has no time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/database.py` around lines 73 - 87, The uniqueness index
"uniq_game_key_with_time" on game_collection now includes the "time" field, but
lookups still omit time; update the repository methods in
src/repositories/game_repository.py to accept a time argument (e.g., extend the
primary lookup functions that build the unique key) and include "time" in the
query filter used to find or update games, and update the service layer methods
in src/services/game_service.py that call those repository methods to pass
through the time value; implement a controlled fallback in the repository: if
the incoming source record has no time, perform a secondary query without time
(explicitly documented and used only when time is null/empty) so doubleheaders
are resolved correctly while preserving the new unique index behavior.
src/scrapers/game_details_scrape.py (1)

148-168: ⚠️ Potential issue | 🟠 Major

Restore period on hockey scoring events.

Line 150 still parses the period, but it is no longer included in the emitted event. That makes BoxScoreEntryType.period resolve to null for every hockey score and drops data that existing clients may already depend on.

🐛 Proposed fix
                 summary.append({
                     'team': team,
+                    'period': period,
                     'time': time,
                     'scorer': scorer,
                     'assist': assist,
                     'cor_score': cornell_score,
                     'opp_score': opp_score,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scrapers/game_details_scrape.py` around lines 148 - 168, The loop in
game_details_scrape.py parses the period into the local variable period but
doesn’t include it in the emitted scoring event, causing
BoxScoreEntryType.period to be null; update the scoring_events creation inside
the for loop (the summary.append call) to include the parsed period by adding a
'period': period entry in the dict so each appended scoring event contains the
period value (ensure the key name matches the expected
BoxScoreEntryType.period).
🧹 Nitpick comments (4)
src/scrapers/games_scraper.py (1)

128-129: Use robust text extraction for result normalization.

result_tag.text.strip().replace("\n", " ") is brittle with nested tags and mixed whitespace. Prefer get_text(" ", strip=True) and optionally collapse repeated spaces.

Proposed refactor
-            game_data["result"] = result_tag.text.strip().replace("\n", " ")
+            game_data["result"] = re.sub(r"\s+", " ", result_tag.get_text(" ", strip=True))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scrapers/games_scraper.py` around lines 128 - 129, Replace the brittle
normalization of the result field (currently assigning game_data["result"] =
result_tag.text.strip().replace("\n", " ")) with a robust extraction using
result_tag.get_text(" ", strip=True) and collapse repeated whitespace (e.g., via
a single-space collapse) before assigning into game_data["result"]; locate the
assignment that references result_tag and game_data["result"] and update it to
use get_text(" ", strip=True) plus a whitespace-normalization step to ensure
nested tags and mixed whitespace are handled.
requirements.txt (1)

14-14: Consider pinning firebase-admin version.

The firebase-admin package is added without a version pin. For reproducible builds and to avoid unexpected breaking changes, consider pinning to a specific version (e.g., firebase-admin==7.3.0).

Note: Most packages in this file remain unpinned. If pinning is not adopted as a project-wide practice, this suggestion is optional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@requirements.txt` at line 14, Pin the firebase-admin dependency in
requirements.txt by replacing the unpinned entry "firebase-admin" with a
specific version specifier (for example "firebase-admin==7.3.0") to ensure
reproducible installs; update any related dependency documentation or CI that
relies on this file if necessary and run a quick install/test to verify
compatibility.
src/types.py (1)

39-57: Document inning separately in BoxScoreEntryType.

The type docs still describe period as “period or inning”, but the schema now exposes both fields. Updating the attribute list here will keep the GraphQL contract clearer for clients.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/types.py` around lines 39 - 57, Update the BoxScoreEntryType docstring to
document `inning` separately from `period`: in the BoxScoreEntryType class
comment (above the attributes list) add a distinct bullet describing `inning`
(e.g., "inning: The inning of the event" or similar) and adjust the existing
`period` bullet to no longer say "period or inning", so both `period` and
`inning` are clearly and separately described in the class docstring.
src/scrapers/game_details_scrape.py (1)

72-77: Mirror inning into period for compatibility.

Lines 312-314 still treat period as the generic fallback when time is missing. Emitting only inning makes softball the one summary shape that skips that path. Either set both fields here or teach the fallback about inning.

♻️ Proposed compatibility tweak
                 summary.append({
                         'team': team, 
+                        'period': inning,
                         'inning': inning,
                         'description': desc,
                         'cor_score': cornell_score,
                         'opp_score': opp_score
                     })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/scrapers/game_details_scrape.py` around lines 72 - 77, The summary
objects only set 'inning' but other code (lines handling missing 'time') expects
a generic 'period'; update the dict passed to summary.append to include
'period': inning (i.e., mirror the inning value into the 'period' key) so the
summary shape remains compatible with existing fallback logic that reads
'period'. Ensure you modify the summary.append call that builds the dict (the
block adding 'team', 'inning', 'description', 'cor_score', 'opp_score') to add
the 'period' key set to the same inning variable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app.py`:
- Line 28: Remove the unused import symbol "auth" from the top-level import in
app.py (the statement importing from firebase_admin: credentials, auth) — keep
only the used "credentials" import so the file doesn't import auth
unnecessarily; ensure no other references to auth remain in app.py before
committing.
- Around line 26-45: The deployment is missing configuration for the Firebase
service account used by initialize_firebase: ensure SERVICE_ACCOUNT_PATH
(GOOGLE_APPLICATION_CREDENTIALS) is documented in .env_template, make the
Firebase key file available to containers by mounting it in docker-compose.yml
(add a volume mapping and set the container env GOOGLE_APPLICATION_CREDENTIALS
to the mounted path), and update the Dockerfile or the docker-compose service
env to set the GOOGLE_APPLICATION_CREDENTIALS environment variable (or use
Docker secrets) so that SERVICE_ACCOUNT_PATH is present when
initialize_firebase() runs; keep references to SERVICE_ACCOUNT_PATH and
initialize_firebase in the changes so reviewers can find the initialization
logic.

In `@src/mutations/login_user.py`:
- Around line 17-22: Replace the bare except around
firebase_auth.verify_id_token with targeted exception handling: catch the
firebase_admin.auth exceptions (e.g., InvalidIdTokenError, ExpiredIdTokenError,
RevokedIdTokenError) and other decode-related errors (ValueError) as err and
re-raise GraphQLError using exception chaining (raise GraphQLError("Invalid or
expired token.") from err); after a successful verify_id_token call, validate
the returned payload before indexing (use decoded.get("uid") or check "uid" in
decoded) and raise a clear GraphQLError if uid is missing to avoid KeyError.
Ensure you reference firebase_auth.verify_id_token and the decoded["uid"] access
points when applying these changes.

In `@src/mutations/signup_user.py`:
- Around line 19-24: Replace the broad try/except around
firebase_auth.verify_id_token and the direct decoded["uid"] access with targeted
exception handling: catch the specific firebase_auth token errors (e.g.,
firebase_auth.InvalidIdTokenError, firebase_auth.ExpiredIdTokenError,
firebase_auth.RevokedIdTokenError) and ValueError as separate except clauses,
re-raising GraphQLError with the original exception chained (use "from e");
after verify_id_token succeeds, use decoded.get("uid") and if it is None raise a
GraphQLError("Token missing uid") chained to a KeyError to preserve context.
Ensure you update the verify_id_token and decoded["uid"] usages accordingly.
- Around line 25-36: The current check-then-insert in signup_user.py (the
find_one on db["users"] followed by insert_one creating user_doc with
firebase_uid/email/name) can race under concurrent requests; add a unique index
on firebase_uid in your DB setup and change the insertion flow to either use
find_one_and_update(..., upsert=True) to atomically create-if-not-exists or keep
insert_one but catch the duplicate key error (e.g., DuplicateKeyError) around
db["users"].insert_one to convert it into the GraphQLError("User already
exists.") so concurrent inserts are handled safely.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 557283ab-514c-473f-913e-9baac1fb90b2

📥 Commits

Reviewing files that changed from the base of the PR and between 8ebefa3 and f19c92e.

📒 Files selected for processing (10)
  • .gitignore
  • app.py
  • requirements.txt
  • src/database.py
  • src/mutations/login_user.py
  • src/mutations/signup_user.py
  • src/schema.py
  • src/scrapers/game_details_scrape.py
  • src/scrapers/games_scraper.py
  • src/types.py

Comment on lines +26 to +45
import os
import firebase_admin
from firebase_admin import credentials, auth

SERVICE_ACCOUNT_PATH = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")


def initialize_firebase():
    if not firebase_admin._apps:
        if not SERVICE_ACCOUNT_PATH:
            raise ValueError(
                "GOOGLE_APPLICATION_CREDENTIALS is not set. Set it to your firebase-service-account-key.json path."
            )
        cred = credentials.Certificate(SERVICE_ACCOUNT_PATH)
        firebase_admin.initialize_app(cred)
        logging.info("Firebase app initialized.")
    return firebase_admin.get_app()


initialize_firebase()

@coderabbitai coderabbitai bot Mar 25, 2026


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if .env_template includes the Firebase credential variable
echo "=== Checking .env_template ==="
cat .env_template 2>/dev/null || echo ".env_template not found"

echo ""
echo "=== Checking docker-compose.yml for volume mounts ==="
cat docker-compose.yml 2>/dev/null | grep -A5 "volumes:" || echo "No volumes section found"

Repository: cuappdev/score-backend

Length of output: 498


🏁 Script executed:

# Check file size and read app.py to verify all claims
wc -l app.py
echo "=== Full app.py contents ===" 
cat -n app.py

Repository: cuappdev/score-backend

Length of output: 8592


🏁 Script executed:

# Check if Dockerfile exists and examine it
if [ -f Dockerfile ]; then
  echo "=== Dockerfile found ==="
  cat -n Dockerfile
else
  echo "Dockerfile not found"
fi

Repository: cuappdev/score-backend

Length of output: 391


Deployment configuration incomplete for Firebase.

The Firebase initialization requires GOOGLE_APPLICATION_CREDENTIALS but the deployment setup is missing key configurations:

  1. .env_template doesn't document this required variable
  2. docker-compose.yml doesn't mount the Firebase key file into containers
  3. Dockerfile doesn't configure the environment variable

This will cause startup failures in containerized deployments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app.py` around lines 26 - 45, The deployment is missing configuration for the
Firebase service account used by initialize_firebase: ensure
SERVICE_ACCOUNT_PATH (GOOGLE_APPLICATION_CREDENTIALS) is documented in
.env_template, make the Firebase key file available to containers by mounting it
in docker-compose.yml (add a volume mapping and set the container env
GOOGLE_APPLICATION_CREDENTIALS to the mounted path), and update the Dockerfile
or the docker-compose service env to set the GOOGLE_APPLICATION_CREDENTIALS
environment variable (or use Docker secrets) so that SERVICE_ACCOUNT_PATH is
present when initialize_firebase() runs; keep references to SERVICE_ACCOUNT_PATH
and initialize_firebase in the changes so reviewers can find the initialization
logic.

Contributor


Are these set in both dev and prod servers @claiireyu ?



Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +17 to +22
try:
    decoded = firebase_auth.verify_id_token(id_token)
except Exception:
    raise GraphQLError("Invalid or expired token.")

firebase_uid = decoded["uid"]
⚠️ Potential issue | 🟡 Minor

Improve exception handling for token verification.

Two issues flagged by static analysis:

  1. Catching bare Exception obscures the root cause and catches unrelated errors (e.g., KeyboardInterrupt in Python 2, programming bugs)
  2. Missing exception chaining (from err or from None) loses debugging context

Additionally, accessing decoded["uid"] can raise KeyError if the token structure is unexpected.

Proposed fix
     def mutate(self, info, id_token):
         try:
             decoded = firebase_auth.verify_id_token(id_token)
-        except Exception:
-            raise GraphQLError("Invalid or expired token.")
-
-        firebase_uid = decoded["uid"]
+            firebase_uid = decoded["uid"]
+        except (ValueError, KeyError) as err:
+            raise GraphQLError("Invalid or expired token.") from None
+        except Exception as err:
+            raise GraphQLError("Invalid or expired token.") from None
+
         user = db["users"].find_one({"firebase_uid": firebase_uid})

Alternatively, catch the specific firebase_admin.auth.InvalidIdTokenError and related exceptions for more precise error handling.

🧰 Tools
🪛 Ruff (0.15.6)

[warning] 19-19: Do not catch blind exception: Exception

(BLE001)


[warning] 20-20: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mutations/login_user.py` around lines 17 - 22, Replace the bare except
around firebase_auth.verify_id_token with targeted exception handling: catch the
firebase_admin.auth exceptions (e.g., InvalidIdTokenError, ExpiredIdTokenError,
RevokedIdTokenError) and other decode-related errors (ValueError) as err and
re-raise GraphQLError using exception chaining (raise GraphQLError("Invalid or
expired token.") from err); after a successful verify_id_token call, validate
the returned payload before indexing (use decoded.get("uid") or check "uid" in
decoded) and raise a clear GraphQLError if uid is missing to avoid KeyError.
Ensure you reference firebase_auth.verify_id_token and the decoded["uid"] access
points when applying these changes.
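The targeted handling this prompt describes can be sketched generically. GraphQLError and the Firebase call are replaced with self-contained stand-ins here; the real mutation would catch firebase_admin.auth exceptions such as InvalidIdTokenError, ExpiredIdTokenError, and RevokedIdTokenError rather than the bare ValueError used in this sketch:

```python
class GraphQLError(Exception):
    """Stand-in for graphql.GraphQLError so the sketch runs standalone."""


def verify_or_raise(id_token, verify):
    # `verify` stands in for firebase_auth.verify_id_token.
    try:
        decoded = verify(id_token)
    except ValueError as err:
        # Chain the original exception so debugging context survives.
        raise GraphQLError("Invalid or expired token.") from err
    uid = decoded.get("uid")  # avoid KeyError on unexpected payloads
    if uid is None:
        raise GraphQLError("Token missing uid.")
    return uid
```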

Comment on lines +19 to +24
try:
    decoded = firebase_auth.verify_id_token(id_token)
except Exception:
    raise GraphQLError("Invalid or expired token.")

firebase_uid = decoded["uid"]

⚠️ Potential issue | 🟡 Minor

Improve exception handling for token verification.

Same issues as in login_user.py:

  1. Catching bare Exception is too broad
  2. Missing exception chaining loses debugging context
  3. decoded["uid"] access can raise KeyError
Proposed fix
     def mutate(self, info, id_token, name=None, email=None):
         try:
             decoded = firebase_auth.verify_id_token(id_token)
-        except Exception:
-            raise GraphQLError("Invalid or expired token.")
-
-        firebase_uid = decoded["uid"]
+            firebase_uid = decoded["uid"]
+        except KeyError as err:
+            raise GraphQLError("Token payload is missing a uid.") from err
+        except (firebase_auth.InvalidIdTokenError,
+                firebase_auth.ExpiredIdTokenError,
+                ValueError) as err:
+            raise GraphQLError("Invalid or expired token.") from err
+
         if db["users"].find_one({"firebase_uid": firebase_uid}):
🧰 Tools
🪛 Ruff (0.15.6)

[warning] 21-21: Do not catch blind exception: Exception

(BLE001)


[warning] 22-22: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mutations/signup_user.py` around lines 19 - 24, Replace the broad
try/except around firebase_auth.verify_id_token and the direct decoded["uid"]
access with targeted exception handling: catch the specific firebase_auth token
errors (e.g., firebase_auth.InvalidIdTokenError,
firebase_auth.ExpiredIdTokenError, firebase_auth.RevokedIdTokenError) and
ValueError as separate except clauses, re-raising GraphQLError with the original
exception chained (use "from e"); after verify_id_token succeeds, use
decoded.get("uid") and if it is None raise a GraphQLError("Token missing uid")
chained to a KeyError to preserve context. Ensure you update the verify_id_token
and decoded["uid"] usages accordingly.

Comment on lines +25 to 36
if db["users"].find_one({"firebase_uid": firebase_uid}):
raise GraphQLError("User already exists.")

email = email or decoded.get("email")
user_doc = {
"net_id": net_id,
"firebase_uid": firebase_uid,
"email": email,
"favorite_game_ids": [],
}
if name is not None:
user_doc["name"] = name
if email is not None:
user_doc["email"] = email
result = db["users"].insert_one(user_doc)

⚠️ Potential issue | 🟠 Major

Potential race condition in user creation.

The check-then-insert pattern (find_one followed by insert_one) is vulnerable to race conditions under concurrent signup requests with the same Firebase UID. Two requests could pass the existence check simultaneously and both insert.

Consider using a unique index on firebase_uid and handling the duplicate key error, or using find_one_and_update with upsert=True.

Proposed fix using unique index + exception handling

First, ensure a unique index exists on firebase_uid in your database setup:

db["users"].create_index("firebase_uid", unique=True)

Then handle the duplicate key error:

-        if db["users"].find_one({"firebase_uid": firebase_uid}):
-            raise GraphQLError("User already exists.")
-
         email = email or decoded.get("email")
         user_doc = {
             "firebase_uid": firebase_uid,
             "email": email,
             "favorite_game_ids": [],
         }
         if name is not None:
             user_doc["name"] = name
-        result = db["users"].insert_one(user_doc)
+        try:
+            result = db["users"].insert_one(user_doc)
+        except pymongo.errors.DuplicateKeyError:
+            raise GraphQLError("User already exists.") from None
         identity = str(result.inserted_id)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/mutations/signup_user.py` around lines 25 - 36, The current
check-then-insert in signup_user.py (the find_one on db["users"] followed by
insert_one creating user_doc with firebase_uid/email/name) can race under
concurrent requests; add a unique index on firebase_uid in your DB setup and
change the insertion flow to either use find_one_and_update(..., upsert=True) to
atomically create-if-not-exists or keep insert_one but catch the duplicate key
error (e.g., DuplicateKeyError) around db["users"].insert_one to convert it into
the GraphQLError("User already exists.") so concurrent inserts are handled
safely.
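A sketch of the `find_one_and_update(..., upsert=True)` alternative the prompt mentions. Field names follow the diff above; the pymongo call itself is shown only in comments since no database is available here. Note that MongoDB copies the equality fields of the filter into the inserted document on upsert, so `firebase_uid` is deliberately left out of `$setOnInsert`:

```python
def signup_upsert_spec(firebase_uid, email, name=None):
    """Build the filter and update documents for an atomic create-if-missing signup.

    $setOnInsert only takes effect when the upsert actually inserts a new
    document, so an existing user document is left untouched.
    """
    filter_doc = {"firebase_uid": firebase_uid}
    on_insert = {
        "email": email,
        "favorite_game_ids": [],
    }
    if name is not None:
        on_insert["name"] = name
    return filter_doc, {"$setOnInsert": on_insert}

# Applied with pymongo (sketch):
#   from pymongo import ReturnDocument
#   before = db["users"].find_one_and_update(
#       *signup_upsert_spec(uid, email, name),
#       upsert=True,
#       return_document=ReturnDocument.BEFORE,
#   )
#   if before is not None:  # a document already existed under this uid
#       raise GraphQLError("User already exists.")
```

With `ReturnDocument.BEFORE`, the call returns `None` exactly when the upsert inserted a new document, which makes the "already exists" case detectable in a single atomic operation instead of check-then-insert.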

Contributor

@JoshD94 JoshD94 left a comment

lgtm after you address the coderabbit issues
