feat: add --filter-file and --prefix-file for large filter sets #128

Open
digizeph wants to merge 2 commits into main from feat/filter-file-117

@digizeph

Member

Fixes #117

Adds two new flags to monocle parse and monocle search for loading filters from files instead of cramming everything into CLI args.

--prefix-file <PATH> — plain text, one CIDR per line. Supports # comments and blank lines. This is the ergonomic option for the common workflow:

monocle parse -o 64496 rib.gz | cut -d'|' -f5 | sort -u > prefixes.txt
monocle parse --prefix-file prefixes.txt updates.gz
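
The prefix-file format described above (one CIDR per line, `#` comments, blank lines) can be read with a few lines of standard-library Rust. This is a minimal sketch, not the code in this PR: `parse_prefix_lines` is a hypothetical name, it assumes a comment is a whole line starting with `#`, and it leaves CIDR validation to a later step such as `ParseFilters::validate()`.

```rust
/// Hypothetical helper (not monocle's actual API): extract prefix strings
/// from the text of a prefix file, skipping blank lines and `#` comments.
fn parse_prefix_lines(text: &str) -> Vec<String> {
    text.lines()
        .map(str::trim)
        // Assumption: a comment is a line whose first non-blank character is '#'.
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .map(String::from)
        .collect()
}
```

Returning plain strings keeps the loader independent of any prefix type, so the same validation path can serve both CLI and file input.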

--filter-file <PATH> — JSON file with all filter dimensions:

{
  "prefixes": ["192.0.2.0/24"],
  "origin_asns": ["64496"],
  "peer_asns": ["174", "6939"],
  "communities": ["64496:100"],
  "elem_type": "w",
  "include_sub": true,
  "start_ts": "2025-01-01T00:00:00Z",
  "duration": "2h"
}

All fields optional. Same string syntax as CLI flags including ! negation.

Merge semantics — file filters combine with CLI flags:

  • Vec fields (prefixes, ASNs, communities): unioned (OR)
  • Scalar fields (as_path, elem_type, time): CLI takes precedence
  • Boolean flags (include_super/sub): OR-ed

Both file types can be used together; prefixes from both are unioned.
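
The three merge rules above can be sketched as follows. The `Filters` struct here is a simplified stand-in with illustrative field names, not the real `ParseFilters`, and `merge_into` is written as a free function only to keep the example short.

```rust
/// Simplified stand-in for the filter set; field names are illustrative only.
#[derive(Debug, PartialEq)]
struct Filters {
    prefix: Vec<String>,
    elem_type: Option<String>,
    include_sub: bool,
}

/// Merge file-provided filters into the CLI-provided ones.
fn merge_into(file: Filters, cli: &mut Filters) {
    // Vec fields: unioned by appending the file values to the CLI values.
    cli.prefix.extend(file.prefix);
    // Scalar fields: the CLI value wins; the file value only fills a gap.
    if cli.elem_type.is_none() {
        cli.elem_type = file.elem_type;
    }
    // Boolean flags: OR-ed, so either source can turn the flag on.
    cli.include_sub |= file.include_sub;
}
```

With `-t a` on the command line and `"elem_type": "w"` in the file, this sketch keeps `a`; a prefix given on the CLI and another in the file both end up in the final list.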

New module src/lens/parse/filter_file.rs with full rust-docs and examples. 36 tests covering deserialization, merge semantics, real file I/O via temp files, validation through ParseFilters::validate(), and combined CLI+file scenarios.

Add file-based filter input to monocle parse and monocle search:

- --filter-file <PATH>: JSON file with structured filters (prefixes,
  origin_asns, peer_asns, communities, as_path_regex, elem_type, time
  range, include_super/sub). All fields optional.
- --prefix-file <PATH>: newline-delimited prefix list, one CIDR per
  line. Supports # comments and blank lines. Designed for the common
  RIB-extract then filter-updates workflow.

File filters merge with CLI flags: union (OR) within each dimension,
AND across dimensions. CLI takes precedence for scalar fields
(as_path, elem_type, time); boolean flags (include_super/sub) are OR-ed.
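
To illustrate "OR within a dimension, AND across dimensions", here is a rough sketch with hypothetical types (`Elem`, `MatchFilters`). It assumes an empty dimension imposes no constraint and uses exact string equality for prefixes, so it ignores include_super/sub and `!` negation.

```rust
/// Illustrative element and filter types; these are not monocle's real structs.
struct Elem {
    prefix: String,
    origin_asn: u32,
}

struct MatchFilters {
    prefixes: Vec<String>,
    origin_asns: Vec<u32>,
}

/// An empty dimension places no constraint (assumption); a non-empty one
/// matches if ANY of its values equals the element's value (OR).
fn dimension_matches<T: PartialEq>(allowed: &[T], value: &T) -> bool {
    allowed.is_empty() || allowed.contains(value)
}

/// Every dimension must match for the element to pass (AND).
fn elem_matches(f: &MatchFilters, e: &Elem) -> bool {
    dimension_matches(&f.prefixes, &e.prefix)
        && dimension_matches(&f.origin_asns, &e.origin_asn)
}
```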

New module src/lens/parse/filter_file.rs with FilterFile struct,
load/merge functions, comprehensive rust-docs with examples, and 36
unit + integration tests covering deserialization, merge semantics,
real file I/O (temp files), validation, and combined CLI+file scenarios.

Fixes #117

Copilot AI left a comment

Pull request overview

Adds support for loading large filter sets from files for monocle parse and monocle search, avoiding CLI ARG_MAX limitations and enabling reusable filter definitions. This fits into the CLI layer by extending parse/search argument parsing, and into the lens layer by adding a reusable filter-file loader/merger.

Changes:

  • Introduced src/lens/parse/filter_file.rs to load JSON filter files and newline-delimited prefix lists, and merge them into ParseFilters.
  • Added --filter-file and --prefix-file flags to monocle parse and monocle search, merging file filters with CLI-provided filters.
  • Documented the new flags in README.md and added an entry to CHANGELOG.md.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/lens/parse/mod.rs | Exposes the new filter_file module and documents file-based filter usage in ParseFilters docs. |
| src/lens/parse/filter_file.rs | Implements JSON + prefix-list loading and merge semantics, plus unit tests. |
| src/bin/commands/search.rs | Adds CLI flags and merges file filters into SearchFilters.parse_filters. |
| src/bin/commands/parse.rs | Adds CLI flags and merges file filters into ParseFilters before parsing. |
| README.md | Documents --filter-file / --prefix-file usage and formats. |
| CHANGELOG.md | Adds an “Unreleased changes” entry describing the new flags and semantics. |

Comment thread src/lens/parse/filter_file.rs
Comment thread src/lens/parse/filter_file.rs Outdated
Comment on lines +364 to +368
// peer_ip is Vec<IpAddr> — parse string values
for ip_str in self.peer_ips {
    if let Ok(ip) = ip_str.trim().parse() {
        filters.peer_ip.push(ip);
    }
Comment on lines +375 to +379
if filters.elem_type.is_none() {
    // Map string "a"/"w" to ParseElemType
    if let Some(et) = self.elem_type.as_deref() {
        filters.elem_type = match et.to_lowercase().as_str() {
            "a" | "announce" | "announcement" => Some(ParseElemType::A),
Comment on lines +562 to +581
// Load and merge file-based filters into CLI filters
if let Some(ref pf) = filter_file {
    match FilterFile::load(pf) {
        Ok(ff) => ff.merge_into(&mut filters.parse_filters),
        Err(e) => {
            eprintln!("ERROR: {}", e);
            std::process::exit(1);
        }
    }
}
if let Some(ref pf) = prefix_file {
    match load_prefix_file(pf) {
        Ok(prefixes) => merge_prefix_file(prefixes, &mut filters.parse_filters),
        Err(e) => {
            eprintln!("ERROR: {}", e);
            std::process::exit(1);
        }
    }
}

Address PR review feedback:

- Add #[serde(deny_unknown_fields)] to FilterFile so typos like
  "origin_asn" (missing trailing s) are rejected at parse time instead
  of being silently ignored.
- merge_into now returns Result and errors on invalid peer_ips entries
  instead of silently skipping them, matching CLI behavior.
- merge_into errors on unrecognized elem_type values instead of mapping
  to None and silently dropping the filter.
- search command validates parse_filters immediately after merging
  file-based filters so invalid prefixes/ASNs/communities fail fast
  with actionable errors, not per-file later.

Updated tests to expect errors for invalid peer_ips and elem_type.
Added test for deny_unknown_fields. All callers updated for Result
return type.

Addresses PR review feedback.
@digizeph

Member Author

Addressed all four review comments in 6d77d7b:

Line 214 (unknown JSON fields): Added #[serde(deny_unknown_fields)] to FilterFile. Typos like origin_asn (missing trailing s) now fail at parse time with a clear error listing the valid field names.

Line 368 (invalid peer_ips): merge_into now returns Result and errors on invalid peer IP strings instead of silently skipping them. Error message: Invalid peer IP 'not_an_ip' in filter file: must be a valid IP address.
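
The change from skipping invalid entries to rejecting them can be sketched like this. It is a standalone approximation: `parse_peer_ips` is a hypothetical helper, it returns a `String` error rather than whatever error type `merge_into` uses, and the message wording is taken from the comment above.

```rust
use std::net::IpAddr;

/// Hypothetical helper: parse every peer IP string, failing on the first
/// invalid entry instead of silently dropping it.
fn parse_peer_ips(raw: &[&str]) -> Result<Vec<IpAddr>, String> {
    raw.iter()
        .map(|s| {
            s.trim().parse::<IpAddr>().map_err(|_| {
                format!(
                    "Invalid peer IP '{}' in filter file: must be a valid IP address",
                    s
                )
            })
        })
        // Collecting into Result stops at the first Err and returns it.
        .collect()
}
```

Collecting an iterator of `Result` values into `Result<Vec<_>, _>` gives all-or-nothing behavior, which is what makes a typo in the file surface as an error rather than as a quietly narrower filter.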

Line 379 (invalid elem_type): merge_into now errors on unrecognized elem_type values instead of mapping to None and silently dropping the filter. Error message: Invalid elem_type 'x' in filter file: must be 'a' (announce) or 'w' (withdraw).

Line 581 (search validation): Added filters.parse_filters.validate() call immediately after merging file-based filters in the search command, so invalid prefixes/ASNs/communities from the file fail fast with actionable errors instead of failing per-file later during parsing.

Tests updated: test_merge_into_peer_ips_invalid_skipped → test_merge_into_peer_ips_invalid_errors, test_merge_into_elem_type_invalid_ignored → test_merge_into_elem_type_invalid_errors, plus new test_filter_file_deny_unknown_fields. All callers updated for the Result return type.

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Comment on lines +361 to +365
filters.prefix.extend(self.prefixes);
filters.origin_asn.extend(self.origin_asns);
filters.peer_asn.extend(self.peer_asns);
filters.communities.extend(self.communities);

Comment on lines +383 to +394
if let Some(et) = self.elem_type.as_deref() {
    filters.elem_type = match et.to_lowercase().as_str() {
        "a" | "announce" | "announcement" => Some(ParseElemType::A),
        "w" | "withdraw" | "withdrawal" => Some(ParseElemType::W),
        _ => {
            return Err(anyhow!(
                "Invalid elem_type '{}' in filter file: must be 'a' (announce) or 'w' (withdraw)",
                et
            ));
        }
    };
}
Comment on lines +583 to +586
if let Err(e) = filters.parse_filters.validate() {
    eprintln!("ERROR: {}", e);
    return;
}
Development

Successfully merging this pull request may close these issues.

Filter large number of prefixes with a input file

2 participants