## Description

At times we need to filter on a large number of prefixes, and passing them all as CLI prefix-filter parameters is insufficient. We should allow users to provide an input file containing the filtering criteria.

This may involve updating `bgpkit-parser` to handle large numbers of filters efficiently. We could design a general filter-file format (e.g. JSON) and accept that file as input.
## Concrete Use Cases

### 1. Prefix-list filtering from RIB extraction
The most common pattern that hits this limit is BGP outage/visibility investigation:
- Extract the prefix list for a target ASN from a pre-event RIB dump
- Use that prefix list to filter subsequent BGP update files
This is necessary because BGP withdrawal messages carry no AS path or origin ASN — you can only filter withdrawals by prefix. The current workaround is:
```bash
# Step 1: Extract prefix list from RIB
monocle parse -o <ASN> /tmp/rib_pre.gz 2>/dev/null | \
    cut -d'|' -f5 | sort -u > /tmp/prefixes.txt

# Step 2: Build comma-separated string
PFXS=$(tr '\n' ',' < /tmp/prefixes.txt | sed 's/,$//')

# Step 3: Pass as -p argument
monocle parse -p "$PFXS" /tmp/updates.gz
```
This breaks down at scale:
| ASN size | Approx. prefixes | `-p` arg size | Works? |
|---|---|---|---|
| Small ISP | ~200 | ~3.6 KB | Yes |
| Medium ISP | ~1,000 | ~18 KB | Yes |
| Large carrier | ~5,000 | ~90 KB | Marginal |
| Tier-1 / hyper | ~10,000+ | ~180 KB+ | Exceeds `ARG_MAX` on many systems |
The `ARG_MAX` limit on Linux is typically ~2 MB, but the effective limit for a single argument can be much lower (~128-256 KB). Even below `ARG_MAX`, very long arguments cause performance issues during shell expansion.
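These limits can be inspected directly. A minimal Python sketch using the POSIX `sysconf` interface; the per-argument figure is derived from Linux's `MAX_ARG_STRLEN` (32 pages), which `sysconf` does not expose, so that part is a Linux-specific assumption:

```python
import os

# Total budget for argv + environment (POSIX ARG_MAX); Linux typically
# reports 2097152 (2 MB) here.
arg_max = os.sysconf("SC_ARG_MAX")

# Linux caps a *single* argument at MAX_ARG_STRLEN = 32 pages, which is
# not exposed via sysconf, so we derive it from the page size.
# On 4 KB pages this is 131072 (~128 KB) -- well below ARG_MAX.
per_arg = 32 * os.sysconf("SC_PAGESIZE")

print(f"ARG_MAX: {arg_max}, per-argument cap (Linux): {per_arg}")
```

A 10,000-prefix `-p` string (~180 KB) therefore overflows the per-argument cap even though it fits comfortably inside `ARG_MAX` as a whole.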
### 2. Country-level investigation

When investigating a country-level event, you may need to filter by all prefixes originated by ASNs in that country — potentially tens of thousands of prefixes. This is impractical with `-p` on the command line.
### 3. Repeated parsing with the same filter set

In a typical investigation, the same prefix list is applied to 10-40+ update files sequentially. Each invocation re-parses the comma-separated `-p` argument from scratch. A file-based input that is parsed once and reused across invocations would be more efficient.
## Proposed Design

### Filter file format (JSON)
```json
{
  "prefixes": ["192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32"],
  "origin_asns": [64496, 64497],
  "peer_asns": [174, 6939],
  "as_path_regex": "174 64496$",
  "elem_type": "w",
  "communities": ["64496:100", "64496:200"]
}
```
All fields are optional; when multiple fields are present, they combine with AND logic (the same semantics as the existing CLI filters).
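To make the AND semantics concrete, here is a minimal Python sketch of how an implementation might evaluate such a filter file against a single parsed element. The element is represented as a plain dict with hypothetical field names (`prefix`, `origin_asn`, ...), not monocle's actual internal types, and the "any listed community matches" interpretation is an assumption:

```python
import json
import re

def load_filters(path):
    """Load the proposed JSON filter file; every field is optional."""
    with open(path) as f:
        return json.load(f)

def matches(elem, filt):
    """AND-combine all present filter fields against one BGP element.

    `elem` is a hypothetical dict mirroring monocle's pipe-delimited
    output fields; this sketches the proposed semantics only.
    """
    if "prefixes" in filt and elem.get("prefix") not in set(filt["prefixes"]):
        return False
    if "origin_asns" in filt and elem.get("origin_asn") not in filt["origin_asns"]:
        return False
    if "peer_asns" in filt and elem.get("peer_asn") not in filt["peer_asns"]:
        return False
    if "as_path_regex" in filt and not re.search(filt["as_path_regex"],
                                                 elem.get("as_path", "")):
        return False
    if "elem_type" in filt and elem.get("elem_type") != filt["elem_type"]:
        return False
    # Assumption: the community filter passes if ANY listed community
    # appears on the element.
    if "communities" in filt and not set(filt["communities"]) & set(
            elem.get("communities", [])):
        return False
    return True
```

An element passes only if it survives every field present in the file, which is exactly how stacking multiple CLI flags behaves today.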
### CLI integration

```bash
# Use filter file instead of CLI flags
monocle parse --filter-file /tmp/filters.json /tmp/updates.gz
monocle search --filter-file /tmp/filters.json -t 2025-09-01T12:00:00Z -d 2h

# Filter file can be combined with CLI flags (AND logic)
monocle parse --filter-file /tmp/filters.json -c rrc00 /tmp/updates.gz
```
### Alternative: plain text prefix list
For the most common case (prefix-only filtering), also support a simple newline-delimited file:
```bash
# One prefix per line
monocle parse --prefix-file /tmp/prefixes.txt /tmp/updates.gz
```
This is the most ergonomic option for the RIB-extract-then-filter-updates workflow, since `monocle parse -o <ASN> rib.gz | cut -d'|' -f5 | sort -u` already produces a newline-delimited prefix list.
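The consuming side of that file is equally simple. A Python sketch with hypothetical helper names: load the newline-delimited list into a set once, then keep only the pipe-delimited element lines whose prefix field (the same field 5 that `cut -d'|' -f5` extracts) appears in the set. Exact string match only — no covering-prefix lookup:

```python
def load_prefixes(path):
    """Read a newline-delimited prefix list (the format `sort -u` emits)."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

def filter_lines(lines, prefixes):
    """Yield pipe-delimited element lines whose 5th field is in `prefixes`.

    The field position matches the `cut -d'|' -f5` step above; matching
    is exact string equality, not longest-prefix matching.
    """
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) >= 5 and fields[4] in prefixes:
            yield line
```

Because the set is built once, applying the same list to dozens of update files avoids re-parsing a giant comma-separated argument on every invocation.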
## Interaction with #82
If #82 adds RIB snapshot queries with `--sqlite-path` output, the extracted prefix list could be queried from SQLite and fed as a filter file to subsequent update analysis — replacing the current fragile shell pipeline.
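If that lands, the shell pipeline could collapse into a single query. A hypothetical Python sketch — the table and column names (`rib`, `prefix`, `origin_asn`) are assumptions, since the actual schema depends on what #82 implements:

```python
import sqlite3

def prefixes_for_asn(db_path, asn):
    """Return the distinct prefixes originated by `asn`.

    Assumes a hypothetical table rib(prefix TEXT, origin_asn INTEGER);
    the real schema would come from #82.
    """
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT DISTINCT prefix FROM rib WHERE origin_asn = ?", (asn,)
        )
        # Sorted + distinct matches what `sort -u` produces today.
        return sorted(r[0] for r in rows)
    finally:
        con.close()
```

Writing the result one prefix per line yields exactly the file `--prefix-file` would consume.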