Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRenames WhitelistUser → UserDataAccess, adds api_access_tier and view_user_data, moves ApiAccessTier to users.constants, updates related migrations, serializers, utils, views, URLs, frontend types/API/UI, admin, and tasks to use "data access" terminology and behavior. Changes
Sequence Diagram(s)sequenceDiagram
participant FE as Frontend (Client)
participant API as Backend API (misc.views)
participant Logic as Business Logic (misc.utils)
participant DB as Database (UserDataAccess)
FE->>API: GET /get-data-access-status?post_id&project_id
API->>Logic: get_data_access_status(user, post_id, project_id)
Logic->>DB: query UserDataAccess (user, project/post/null, view_user_data, api_access_tier)
DB-->>Logic: matching entries / tiers
Logic-->>API: (has_data_access, view_deanonymized_data)
API-->>FE: JSON { has_data_access, view_deanonymized_data }
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@comments/services/feed.py`:
- Around line 73-79: The current checks treat author_is_staff as truthy so False
is treated like "not provided"; update the conditional logic to detect presence
and explicit True/False values: use "author_is_staff is not None" to detect a
provided boolean and "author_is_staff is True" / "author_is_staff is False" for
behavior decisions. Concretely, change the branch conditions around author and
author_is_staff (the if that currently reads "if author is not None and
author_is_staff", the "elif author_is_staff", and related qs.filter calls) to
explicitly check for is not None and compare to True/False, and implement the
corresponding filters (author_id, author__is_staff=True, author__is_staff=False,
and parent=None where needed).
In `@users/serializers.py`:
- Around line 138-142: get_reduced_api_restriction_projects is returning
duplicate project IDs and loads full WhitelistUser objects; change the query on
user.whitelists to select only project_id and deduplicate in the DB by using
values_list('project_id', flat=True).distinct() combined with the existing
project_id__isnull=False filter so the method returns a lean, unique list of
project IDs without instantiating full model instances.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 912be90c-bc22-4669-8060-485dd631298d
📒 Files selected for processing (8)
comments/serializers/common.pycomments/services/feed.pymisc/migrations/0008_whitelistuser_view_forecaster_data.pymisc/models.pymisc/utils.pyusers/migrations/0016_user_api_access_tier.pyusers/models.pyusers/serializers.py
🚀 Preview EnvironmentYour preview environment is ready!
Details
ℹ️ Preview Environment InfoIsolation:
Limitations:
Cleanup:
|
addresses main site parts of primary spec of #4466 add bot_benchmarking to api access tiers add author_is_staff optional param to comments endpoint
…/api-access/comments-and-bot_benchmarking
2e0916b to
55a2c5f
Compare
…/api-access/endpoint-updates
…from user data permissions - Rename WhitelistUser model to UserDataAccess across backend, frontend, and migrations - Replace view_forecaster_data field with view_user_data (default False) to explicitly gate user-level data access, separate from API tier grants - Add api_access_tier field to UserDataAccess for project/post-scoped API tier overrides - Extract ApiAccessTier enum to users/constants.py - Rename all related identifiers: whitelists -> data_accesses, is_whitelisted -> has_data_access, get-whitelist-status -> get-data-access-status - Update /users/me endpoint: rename reduced_api_restriction_projects to project_data_access, only return it when ?with_data_access=true is passed - Migration backfills view_user_data=True for all existing rows
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
misc/utils.py (1)
27-45:⚠️ Potential issue | 🟠 MajorCheck all projects attached to the post, not just
default_project.
utils/views.py:156-172now treats anypost.projectsmembership as sufficient forhas_data_access. Here, Lines 29-31 only matchpost.default_project, and Lines 37-45 reuse that same narrowed project for the admin shortcut. Users whose access is granted through a non-default project on the post will get a false negative from this helper.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@misc/utils.py` around lines 27 - 45, The helper narrows checks to post.default_project which misses permissions granted via other projects on a post; update the logic in the has_data_access helper to iterate over post.projects (or use post.projects.all()) instead of using post.default_project: when post_id is set, collect data_access_entries for every project in post.projects and also check ProjectUserPermission.objects.filter(user=user, project__in=post.projects.all(), permission=ObjectPermission.ADMIN).exists() so the admin shortcut and data_access_entries include all projects attached to the post rather than only the default_project.utils/csv_utils.py (1)
214-236:⚠️ Potential issue | 🔴 CriticalTighten score scoping for
only_include_user_idsand anonymous callers.
user_forecastsis re-scoped after Line 146, but the score branch is not. If a non-privileged caller providesonly_include_user_ids, Lines 221-228 will return those users’ score rows. And whenuserisNone, Line 232 collapses toQ(user__isnull=True) | Q(), which matches every score. Reapply the caller restriction after the optional ID filter, and use onlyQ(user__isnull=True)for anonymous exports.Possible tightening
- elif only_include_user_ids: + elif only_include_user_ids: + allowed_user_ids = set(only_include_user_ids) + if not (has_data_access or is_staff): + allowed_user_ids &= {user.id} if user else set() # only include user-specific scores for the given user_ids scores = scores.filter( - Q(user_id__in=only_include_user_ids) | Q(user__isnull=True) + Q(user_id__in=allowed_user_ids) | Q(user__isnull=True) ) archived_scores = archived_scores.filter( - Q(user_id__in=only_include_user_ids) | Q(user__isnull=True) + Q(user_id__in=allowed_user_ids) | Q(user__isnull=True) ) elif not (has_data_access or is_staff): # only include user-specific scores for the logged-in user scores = scores.filter( - Q(user__isnull=True) | (Q(user=user) if user else Q()) + Q(user__isnull=True) + | (Q(user=user) if user else Q(user__isnull=True)) ) archived_scores = archived_scores.filter( - Q(user__isnull=True) | (Q(user=user) if user else Q()) + Q(user__isnull=True) + | (Q(user=user) if user else Q(user__isnull=True)) )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@utils/csv_utils.py` around lines 214 - 236, The scores/archived_scores query branch incorrectly allows broader access when only_include_user_ids is set and when user is None; update the logic so that after applying the optional only_include_user_ids filter you reapply the caller restriction when not (has_data_access or is_staff), and for anonymous callers (user is None) use only Q(user__isnull=True) instead of Q(user__isnull=True) | Q(), i.e. ensure scores and archived_scores are additionally filtered to (Q(user__isnull=True) | Q(user=user)) for logged-in callers and to Q(user__isnull=True) for anonymous callers while still honoring only_include_user_ids.
🧹 Nitpick comments (1)
misc/models.py (1)
111-120: Disallow deanonymized-only grants that never take effect.Lines 111-119 allow
view_deanonymized_data=Truewhileview_user_data=False, butmisc/utils.pyfilters entries byview_user_data=Truebefore it checks deanonymization. That makes these rows look more permissive in admin than they are at runtime. A smallCheckConstraintor matching admin validation would keep the permission matrix consistent.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@misc/models.py` around lines 111 - 120, Add a constraint and/or admin validation to prevent rows where view_deanonymized_data is True while view_user_data is False: in the model that defines the fields view_user_data and view_deanonymized_data, add a CheckConstraint enforcing "NOT view_deanonymized_data OR view_user_data" (i.e., view_deanonymized_data implies view_user_data) and add matching clean()/ModelAdmin form validation to reject or warn on such combinations so admin UI and runtime filtering remain consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@misc/utils.py`:
- Around line 27-45: The helper narrows checks to post.default_project which
misses permissions granted via other projects on a post; update the logic in the
has_data_access helper to iterate over post.projects (or use
post.projects.all()) instead of using post.default_project: when post_id is set,
collect data_access_entries for every project in post.projects and also check
ProjectUserPermission.objects.filter(user=user, project__in=post.projects.all(),
permission=ObjectPermission.ADMIN).exists() so the admin shortcut and
data_access_entries include all projects attached to the post rather than only
the default_project.
In `@utils/csv_utils.py`:
- Around line 214-236: The scores/archived_scores query branch incorrectly
allows broader access when only_include_user_ids is set and when user is None;
update the logic so that after applying the optional only_include_user_ids
filter you reapply the caller restriction when not (has_data_access or
is_staff), and for anonymous callers (user is None) use only
Q(user__isnull=True) instead of Q(user__isnull=True) | Q(), i.e. ensure scores
and archived_scores are additionally filtered to (Q(user__isnull=True) |
Q(user=user)) for logged-in callers and to Q(user__isnull=True) for anonymous
callers while still honoring only_include_user_ids.
---
Nitpick comments:
In `@misc/models.py`:
- Around line 111-120: Add a constraint and/or admin validation to prevent rows
where view_deanonymized_data is True while view_user_data is False: in the model
that defines the fields view_user_data and view_deanonymized_data, add a
CheckConstraint enforcing "NOT view_deanonymized_data OR view_user_data" (i.e.,
view_deanonymized_data implies view_user_data) and add matching
clean()/ModelAdmin form validation to reject or warn on such combinations so
admin UI and runtime filtering remain consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b4f09573-398e-498b-8ce8-e437f1a7a28f
📒 Files selected for processing (17)
front_end/src/app/(main)/questions/[id]/components/download_question_data_modal/index.tsxfront_end/src/services/api/posts/posts.shared.tsfront_end/src/types/utils.tsmisc/admin.pymisc/migrations/0008_whitelistuser_api_access_tier.pymisc/models.pymisc/urls.pymisc/utils.pymisc/views.pyusers/constants.pyusers/models.pyusers/serializers.pyusers/views.pyutils/csv_utils.pyutils/serializers.pyutils/tasks.pyutils/views.py
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
misc/migrations/0008_whitelistuser_api_access_tier.py (1)
23-35:⚠️ Potential issue | 🟠 MajorBackfill
api_access_tierfor existing rows before relying on the new default.Line 32 sets
"restricted"as the default, and the only data migration at Line 49-52 updatesview_user_dataonly. Existing renamed whitelist entries can lose prior API-tier behavior unlessapi_access_tieris explicitly migrated.Suggested migration update
+def set_existing_api_access_tier(apps, schema_editor): + UserDataAccess = apps.get_model("misc", "UserDataAccess") + UserDataAccess.objects.all().update(api_access_tier="benchmarking") + + class Migration(migrations.Migration): @@ migrations.AddField( model_name="userdataaccess", name="api_access_tier", @@ default="restricted", @@ ), + migrations.RunPython( + set_existing_api_access_tier, + migrations.RunPython.noop, + ), migrations.RunPython( set_existing_view_user_data, migrations.RunPython.noop, ),Also applies to: 49-52
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@misc/migrations/0008_whitelistuser_api_access_tier.py` around lines 23 - 35, The migration adds userdataaccess.api_access_tier with default "restricted" but does not backfill existing rows, so add a RunPython data migration in this migration (or immediately after) that iterates UserDataAccess (model_name="userdataaccess") and sets api_access_tier based on existing state (e.g., preserve prior whitelist/legacy tier fields or map existing boolean/enum fields to "restricted"/"benchmarking"/"unrestricted"), implement an idempotent forward function and a no-op or reversible backward function, and ensure this RunPython runs before any code that relies on the new column (also update the existing data migration that updates view_user_data at lines ~49-52 to include api_access_tier backfill for renamed whitelist entries). Ensure the migration imports migrations.RunPython and uses the model retrieved via apps.get_model to avoid import-time models.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@users/migrations/0016_user_api_access_tier.py`:
- Around line 17-20: Create a new Django data migration that updates existing
rows where users_user.api_access_tier == "bot_benchmarking" to the new value
"benchmarking": implement a RunPython forwards function that queries the User
model (apps.get_model("users", "User")) and performs an
update(queryset.update(api_access_tier="benchmarking")) for those records;
include a reverse function that optionally maps "benchmarking" back to
"bot_benchmarking" (or leave as noop if irreversible) and add the RunPython
operation to the migration's operations list so already-migrated DBs are
corrected.
---
Duplicate comments:
In `@misc/migrations/0008_whitelistuser_api_access_tier.py`:
- Around line 23-35: The migration adds userdataaccess.api_access_tier with
default "restricted" but does not backfill existing rows, so add a RunPython
data migration in this migration (or immediately after) that iterates
UserDataAccess (model_name="userdataaccess") and sets api_access_tier based on
existing state (e.g., preserve prior whitelist/legacy tier fields or map
existing boolean/enum fields to "restricted"/"benchmarking"/"unrestricted"),
implement an idempotent forward function and a no-op or reversible backward
function, and ensure this RunPython runs before any code that relies on the new
column (also update the existing data migration that updates view_user_data at
lines ~49-52 to include api_access_tier backfill for renamed whitelist entries).
Ensure the migration imports migrations.RunPython and uses the model retrieved
via apps.get_model to avoid import-time models.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 26a03672-fcc7-4247-84cf-c06b91c0ca58
📒 Files selected for processing (3)
misc/migrations/0008_whitelistuser_api_access_tier.pyusers/constants.pyusers/migrations/0016_user_api_access_tier.py
elisescu
left a comment
There was a problem hiding this comment.
Had some inline comments.
…/api-access/endpoint-updates
|
@SylvainChevalier working now |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
users/serializers.py (2)
10-10: Unused import.
ApiAccessTieris imported but not referenced anywhere in this file. Theget_project_data_accessmethod returns raw string values from the database rather than using the enum.♻️ Remove unused import or use it for filtering
If filtering by tier is not needed, remove the import:
-from users.constants import ApiAccessTierAlternatively, if restricted-tier entries should be excluded (per earlier discussion), use the import:
def get_project_data_access(self, user: User): entries = ( - user.data_accesses.filter(project_id__isnull=False) + user.data_accesses.filter(project_id__isnull=False).exclude( + api_access_tier=ApiAccessTier.RESTRICTED + ) .values("project_id", "api_access_tier") .distinct() ) return list(entries)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@users/serializers.py` at line 10, The file currently imports ApiAccessTier but never uses it; either remove the unused import or update get_project_data_access to use ApiAccessTier to filter/translate DB strings (e.g., compare the DB value against ApiAccessTier members or map DB strings to ApiAccessTier enums and exclude entries like ApiAccessTier.RESTRICTED). Locate the import statement "ApiAccessTier" and the function "get_project_data_access" to implement the chosen fix so the import is either removed or correctly applied for filtering/mapping.
93-93: Unused field declaration.
project_data_accessis declared as aSerializerMethodFieldbut is not added toMeta.fields, so it will not be serialized. If this is intentional (field only used in subclass), consider removing this line from the parent class to avoid confusion.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@users/serializers.py` at line 93, The parent serializer in users/serializers.py declares project_data_access = serializers.SerializerMethodField() but never includes it in Meta.fields, so either remove this unused declaration from the parent serializer, add "project_data_access" to that serializer's Meta.fields if it should be serialized, or move the SerializerMethodField declaration (and its get_project_data_access method) into the specific subclass that actually uses it; update the relevant serializer (the parent class where project_data_access is declared, or the subclass that consumes it) accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@users/serializers.py`:
- Around line 139-148: Add a Meta override on UserPrivateDataAccessSerializer to
include the SerializerMethodField project_data_access in the output: update the
class UserPrivateDataAccessSerializer (which currently defines
get_project_data_access) to declare an inner Meta that inherits or mirrors
UserPrivateSerializer.Meta but adds "project_data_access" to the fields tuple so
the SerializerMethodField is actually serialized; reference
UserPrivateDataAccessSerializer, UserPrivateSerializer, Meta.fields, and
project_data_access when making the change.
---
Nitpick comments:
In `@users/serializers.py`:
- Line 10: The file currently imports ApiAccessTier but never uses it; either
remove the unused import or update get_project_data_access to use ApiAccessTier
to filter/translate DB strings (e.g., compare the DB value against ApiAccessTier
members or map DB strings to ApiAccessTier enums and exclude entries like
ApiAccessTier.RESTRICTED). Locate the import statement "ApiAccessTier" and the
function "get_project_data_access" to implement the chosen fix so the import is
either removed or correctly applied for filtering/mapping.
- Line 93: The parent serializer in users/serializers.py declares
project_data_access = serializers.SerializerMethodField() but never includes it
in Meta.fields, so either remove this unused declaration from the parent
serializer, add "project_data_access" to that serializer's Meta.fields if it
should be serialized, or move the SerializerMethodField declaration (and its
get_project_data_access method) into the specific subclass that actually uses
it; update the relevant serializer (the parent class where project_data_access
is declared, or the subclass that consumes it) accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 51a13678-9397-440f-b37f-d2b38de895bf
📒 Files selected for processing (4)
misc/migrations/0008_whitelistuser_api_access_tier.pymisc/models.pyusers/serializers.pyusers/views.py
🚧 Files skipped from review as they are similar to previous changes (2)
- users/views.py
- misc/models.py
…/api-access/endpoint-updates
addresses main site part of optional feature of #4466
followup to #4488
Summary
Renames WhitelistUser model to UserDataAccess and decouples API access tier grants from user-level data permissions.
Key changes
Model rename: WhitelistUser → UserDataAccess, with related_name updated from whitelists to data_accesses across all FKs
New view_user_data field: Replaces view_forecaster_data. Previously, the existence of a whitelist entry implied user data access. Now entries can exist solely for API tier overrides — only entries with view_user_data=True grant user-level data access. Existing rows are backfilled to True.
ApiAccessTier extracted to users/constants.py; BOT_BENCHMARKING renamed to BENCHMARKING
/users/me endpoint: reduced_api_restriction_projects replaced with project_data_access, only returned when ?with_data_access=true is passed
Renamed API surface: get-whitelist-status/ → get-data-access-status/, response key is_whitelisted → has_data_access
Frontend updated to match all backend renames (types, API client, download modal)
All changes consolidated into the existing 0008 migration
Summary by CodeRabbit
New Features
Refactor