feat: Add comprehensive abuse detection and prevention system by doubledare704 · Pull Request #44 · pinchbench/api

doubledare704 · 2026-04-11T13:18:32Z

This PR implements a robust abuse detection and prevention system for the benchmark API, including rate limiting, heuristic validation, and administrative controls for suspicious submissions.

Key Features

🔒 Rate Limiting
2. IP-based rate limiting: 50 submissions per 24-hour window per token/IP combination
3. Database-backed tracking: New submission_rate_limits table with efficient indexing
4. Automatic enforcement: Submissions exceeding limits are rejected with 429 status
🧠 Heuristic Validation
6. Timing validation: Reject submissions with impossibly low execution times (< 0.001s)
7. Cost consistency checks: Flag submissions with suspicious cost/token ratios
8. Automatic flagging: Suspicious submissions are marked for admin review
👨‍💼 Administrative Controls
10. Flagging system: is_flagged and flag_reason columns on submissions
11. Admin UI: Toggle flagged status for submissions
12. Filtered endpoints: Leaderboard and submissions APIs exclude flagged results by default
13. Opt-in viewing: include_flagged=true query parameter to view flagged submissions
🧪 Testing Infrastructure
15. Vitest setup: Comprehensive test suite with better-sqlite3 and miniflare
16. MockD1Database: SQLite wrapper for D1-compatible testing
17. Unit tests: Security utils, heuristics, and API endpoints
18. Integration tests: Full API workflow testing

Database Changes
New Table: submission_rate_limits
Why added: To track submission attempts per token and IP for rate limiting enforcement. Includes indexes on token_id, [ip], and created_at for efficient querying and cleanup.

Modified Table: submissions
Why added: To mark suspicious submissions for administrative review and filtering. The is_flagged index enables fast filtering of flagged submissions from public endpoints.

Migration Notes
Run 20260408_add_submission_limits_and_flags.sql to add the new table and columns
Existing submissions will have is_flagged = 0 by default
No data loss or breaking changes for existing functionality

Testing
All tests pass (62/62) with the new abuse detection features fully covered. The system maintains backward compatibility while adding comprehensive protection against abuse.

Closes #20

…tion Add comprehensive abuse detection and prevention system: - Implement IP-based rate limiting (50 submissions per 24h) with new submission_rate_limits table - Add heuristic validation for timing (reject impossibly fast submissions) and cost consistency (flag suspicious cost/token ratios) - Add is_flagged and flag_reason columns to submissions table with admin UI to toggle flagged status - Filter flagged submissions from leaderboard and submissions endpoints by default (opt-in via include_flagged query param) test(api): add vitest infrastructure and comprehensive abuse detection tests Add test suite covering rate limiting, heuristic validation, and API filtering: - Set up vitest with better-sqlite3 and miniflare for Cloudflare Workers testing - Add unit tests for security utils (rate limiting, token hashing, IP extraction) - Add unit tests for heuristics (timing validation, cost consistency checks) - Add integration tests for API endpoints (leaderboard filtering, submission handling) - Include MockD1Database wrapper for D1-compatible SQLite testing - Remove remote: true from wrangler.jsonc for local development - Expose is_flagged and flag_reason in submission responses build(db): add rate limiting table and flagging columns to submissions Add migration for abuse detection infrastructure: - Create submission_rate_limits table with token_id, ip, and timestamp tracking - Add indexes on token_id, ip, and created_at for efficient querying - Add is_flagged and flag_reason columns to submissions table - Add index on is_flagged for filtering suspicious submissions refactor(test): extract schema loader to eliminate duplication across test files Centralize database schema initialization by creating a shared schema-loader module. Replace inline SQL in test/setup.ts, test/api.test.ts, and test/security.test.ts with initDbFromSchema() that loads from the main schema.sql file. Also add missing @types/better-sqlite3 dependency and clean up mock-db.ts unused methods.

src/utils/security.ts

src/routes/results.ts

kilo-code-bot · 2026-04-11T13:21:46Z

Code Review Summary

Status: No Issues Found | Recommendation: Merge

All three previously flagged issues have been addressed across the follow-up commits. The abuse detection system is well-structured and the fixes are correct.

✅ All Issues Resolved

File	Issue
`src/utils/security.ts`	`OR ip = ?` DoS vector — fixed by separating token and IP limit queries
`src/routes/results.ts`	TOCTOU race condition — fixed by INSERT-before-check approach
`src/utils/security.ts`	Off-by-one (`< limit` → `<= limit`) — now correctly allows exactly 50 submissions

Files Reviewed

src/utils/security.ts
src/routes/results.ts
src/utils/heuristics.ts
src/routes/admin.ts
src/routes/leaderboard.ts
src/routes/submissions.ts
src/types.ts
migrations/20260408_add_submission_limits_and_flags.sql
schema.sql
test/api.test.ts
test/heuristics.test.ts
test/security.test.ts
test/mock-db.ts
test/setup.ts

_{Reviewed by claude-sonnet-4.6 · 327,009 tokens}

…se prevention BREAKING SECURITY FIX - Addresses code review issues from kilo-code-bot Problems Fixed: - **Race Condition (TOCTOU)**: Two concurrent requests could both pass the check and exceed the 50-submission limit. Fixed by implementing atomic INSERT-then-COUNT. - **Independent Token/IP Limits**: Changed `checkSubmissionLimit` to check token and IP limits separately (already had OR removed, now properly decoupled). Changes: - Refactored `recordSubmissionAttempt()` to atomically: 1. Insert the rate limit record first 2. Query the limit counts after insertion 3. Return allowed status and remaining quota - Moved rate limit gate from pre-flight check (before payload validation) to post-validation stage, immediately before database insert. This ensures: - Invalid submissions don't consume quota - Atomic check-and-insert prevents race conditions - Updated `checkSubmissionLimit()` as deprecated API (kept for backward compatibility). New code uses `recordSubmissionAttempt()` for actual enforcement. - Fixed tests: - Updated "should track submissions by both token and IP independently" to verify that quota exhaustion on one token doesn't block different tokens on same IP - Updated "should record submission attempts" to validate returned limit status - Fixed invalid test UUID in rate limit rejection test All 62 tests passing.

src/utils/security.ts

The INSERT-first approach means tokenCount already includes the current request after insertion. Using `<` (less-than) enforced a limit of 49 instead of 50. Changed from: const allowed = tokenCount < limit && ipCount < limit; To: const allowed = tokenCount <= limit && ipCount <= limit; This ensures exactly 50 submissions per 24h are allowed: - Request 50: count=50 → 50 <= 50 ✅ allowed - Request 51: count=51 → 51 <= 50 ❌ blocked Fixes critical bug reported by kilo-code-bot in code review. All tests passing (62/62).

ScuttleBot · 2026-04-13T14:01:18Z

👋 Hi @doubledare704! I'm @olearycrew's OpenClaw bot doing a triage pass.

This PR looks great — the Kilo Code Review flagged zero remaining issues and recommended merge. The abuse detection system with rate limiting, heuristic validation, and admin controls is exactly what we outlined in issue #20.

All 62 tests passing, CI green, and mergeable. This one looks ready for @olearycrew to review and merge.

Nice work! 🦀

kilo-code-bot bot reviewed Apr 11, 2026

View reviewed changes

src/utils/security.ts Outdated Show resolved Hide resolved

src/routes/results.ts Outdated Show resolved Hide resolved

kilo-code-bot bot reviewed Apr 12, 2026

View reviewed changes

src/utils/security.ts Outdated Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add comprehensive abuse detection and prevention system#44

feat: Add comprehensive abuse detection and prevention system#44
doubledare704 wants to merge 3 commits intopinchbench:mainfrom
doubledare704:feature/20-abuse-detection-prevention

doubledare704 commented Apr 11, 2026

Uh oh!

Uh oh!

Uh oh!

kilo-code-bot bot commented Apr 11, 2026 •

edited

Loading

Uh oh!

Uh oh!

ScuttleBot commented Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

doubledare704 commented Apr 11, 2026

Uh oh!

Uh oh!

Uh oh!

kilo-code-bot bot commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Summary

✅ All Issues Resolved

Uh oh!

Uh oh!

ScuttleBot commented Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

kilo-code-bot bot commented Apr 11, 2026 •

edited

Loading