WAV redaction + rework Mergeable/Redactions/handler shape, fold transform/ into handler/ #150
Merged
…ion) handler API
- Move Mergeable from codec to ontology and implement on the four
*Location types; overlap detection and merge semantics are a
Location concern, not a Redaction concern.
- Flatten Redactions<S, R> to Vec<(S, R)>; conflict policy uses
S::overlaps and requires both S::try_merge and R::try_merge to
succeed for Merge.
- Drop range fields duplicated by Locations from *Redaction payloads
(AudioRedaction.time_span, ImageRedaction.bounding_box,
TabularRedaction.start/end). TextRedaction keeps start/end since
TextLocation is line-level and the range is intra-line.
- Reintroduce *Transform traits as blanket impls over *Handler that
iterate Redactions and dispatch each (location, redaction) pair to
the handler's narrow redact_at hook. AudioTransform pre-sorts
right-to-left by time_span.start_us so Remove ops don't shift
indices for later calls.
- Add hound-based WAV redaction; MP3 returns an explicit
"not supported" error.
- Move buffer-mutation helpers (apply_*_redaction, ImageOps) from
transform/ to handler/{text,image,audio,tabular}/ next to their
only consumers. transform/ now owns just the iteration protocol.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
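The flattened `Redactions<S, R>` and its `Merge` policy can be sketched roughly as follows. This is a minimal illustration, not the nvisy implementation: the trait signatures, the `Reject` variant, the `Conflict` error, the `Span` and `Mask` types, and the `Clone` bounds are all assumptions made to keep the example self-contained. Only the first overlapping entry is considered.

```rust
// Hypothetical trait shapes; the real nvisy definitions may differ.
trait Overlap {
    fn overlaps(&self, other: &Self) -> bool;
}
trait Mergeable: Sized {
    fn try_merge(self, other: Self) -> Option<Self>;
}

// Illustrative location: a half-open range.
#[derive(Debug, Clone, PartialEq)]
struct Span { start: usize, end: usize }

impl Overlap for Span {
    fn overlaps(&self, other: &Self) -> bool {
        self.start < other.end && other.start < self.end
    }
}
impl Mergeable for Span {
    fn try_merge(self, other: Self) -> Option<Self> {
        // Overlapping spans merge into their union.
        self.overlaps(&other).then(|| Span {
            start: self.start.min(other.start),
            end: self.end.max(other.end),
        })
    }
}

// Illustrative payload: merges only with an identical payload.
#[derive(Debug, Clone, PartialEq)]
struct Mask(String);

impl Mergeable for Mask {
    fn try_merge(self, other: Self) -> Option<Self> {
        (self == other).then_some(self)
    }
}

#[derive(Debug, Clone, Copy)]
enum ConflictPolicy { Reject, Merge } // `Reject` is an assumed variant

#[derive(Debug, PartialEq)]
enum InsertError { Conflict } // assumed error shape

struct Redactions<S, R> {
    items: Vec<(S, R)>, // flat, ordered by insertion
    policy: ConflictPolicy,
}

impl<S: Overlap + Mergeable + Clone, R: Mergeable + Clone> Redactions<S, R> {
    fn new(policy: ConflictPolicy) -> Self {
        Self { items: Vec::new(), policy }
    }

    fn insert(&mut self, loc: S, red: R) -> Result<(), InsertError> {
        // No overlap with an existing location: append in insertion order.
        let Some(idx) = self.items.iter().position(|(s, _)| s.overlaps(&loc)) else {
            self.items.push((loc, red));
            return Ok(());
        };
        match self.policy {
            ConflictPolicy::Reject => Err(InsertError::Conflict),
            ConflictPolicy::Merge => {
                // Both the location and the payload must agree to merge.
                let (s, r) = self.items[idx].clone();
                match (s.try_merge(loc), r.try_merge(red)) {
                    (Some(s), Some(r)) => {
                        self.items[idx] = (s, r);
                        Ok(())
                    }
                    _ => Err(InsertError::Conflict),
                }
            }
        }
    }
}

fn main() {
    let mut batch = Redactions::new(ConflictPolicy::Merge);
    batch.insert(Span { start: 0, end: 5 }, Mask("[NAME]".into())).unwrap();
    batch.insert(Span { start: 3, end: 8 }, Mask("[NAME]".into())).unwrap();
    println!("{:?}", batch.items);
}
```

Cloning the existing entry before merging keeps the stored pair untouched when either `try_merge` fails; a real implementation could avoid the clone.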
…nsform/ into handler/
- Fold each *Transform trait into the matching *Handler as a provided
default method redact(Redactions<L, R>) that loops redact_at in
insertion order. AudioHandler::redact overrides the default to sort
right-to-left by time_span.start_us so AudioOutput::Remove doesn't
shift later sample indices. One trait per modality instead of two;
no separate blanket-impl extension trait.
- Move all transform/ contents into handler/:
  - *Redaction / *Output structs into handler/{text,image,audio,tabular}/
    next to the redact_at hook they feed.
  - Redactions<S, R>, ConflictPolicy, InsertError into handler/ as
    the cross-modality batching primitives.
  - Mergeable re-exported from handler/ (still defined in ontology).
- Delete transform/ entirely; nothing in the engine or codec imports
from it anymore. Engine and downstream consumers now go through
nvisy_codec::handler::{*Redaction, Redactions, ConflictPolicy, ...}.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
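Why the audio `redact` sorts right-to-left can be seen in a small sketch. Everything here is a simplified stand-in: `AudioLocation` holds sample indices rather than the PR's `time_span.start_us`, the batch is a plain `Vec` instead of `Redactions<L, R>`, errors are `String`s, and `Samples` is a hypothetical in-memory buffer.

```rust
// Simplified stand-ins; the real nvisy types carry more information.
#[derive(Debug, Clone, Copy)]
struct AudioLocation { start: usize, end: usize } // half-open sample range

#[derive(Debug, Clone, Copy)]
enum AudioOutput { Silence, Remove }

trait AudioHandler {
    /// Narrow hook: apply one redaction at one location.
    fn redact_at(&mut self, loc: &AudioLocation, out: AudioOutput) -> Result<(), String>;

    /// Provided method: apply a whole batch. Sorting by descending start
    /// means a `Remove` never shifts the indices of redactions still to come.
    fn redact(&mut self, mut batch: Vec<(AudioLocation, AudioOutput)>) -> Result<(), String> {
        batch.sort_by(|a, b| b.0.start.cmp(&a.0.start));
        for (loc, out) in batch {
            self.redact_at(&loc, out)?;
        }
        Ok(())
    }
}

// Hypothetical in-memory buffer of mono i16 samples.
struct Samples(Vec<i16>);

impl AudioHandler for Samples {
    fn redact_at(&mut self, loc: &AudioLocation, out: AudioOutput) -> Result<(), String> {
        if loc.start > loc.end || loc.end > self.0.len() {
            return Err(format!("range {}..{} out of bounds", loc.start, loc.end));
        }
        match out {
            AudioOutput::Silence => self.0[loc.start..loc.end].fill(0),
            AudioOutput::Remove => {
                self.0.drain(loc.start..loc.end);
            }
        }
        Ok(())
    }
}

fn main() {
    let mut audio = Samples((0..10).collect());
    // Inserted left-to-right; `redact` reorders them before applying.
    let batch = vec![
        (AudioLocation { start: 2, end: 4 }, AudioOutput::Remove),
        (AudioLocation { start: 6, end: 8 }, AudioOutput::Remove),
    ];
    audio.redact(batch).unwrap();
    println!("{:?}", audio.0);
}
```

Applied in insertion order instead, the first `Remove` would shorten the buffer by two samples, so the second range `6..8` would cut the samples originally at indices 8 and 9 rather than 6 and 7.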
…itten sites
- ContentKind: replace four hand-written is_*() predicates with
derive_more::IsVariant. Note: is_text_based renamed to is_text to
match the variant name (no callers in the workspace).
- Redactions<S, R>: replace the hand-written owned IntoIterator impl
with derive_more::IntoIterator on the `items` field.
- Entities: extend the existing derive_more::IntoIterator to also
derive the ref and ref_mut variants via
#[into_iterator(owned, ref, ref_mut)] on the field, deleting the
hand-written `&'a Entities` impl.
- Add the `is_variant` derive_more feature to nvisy-core and the
`into_iterator` feature to nvisy-codec.
TextData was also a candidate (collapse its From<String> and From<&str>
into derive_more::From with #[from(forward)]) but HipStr<'static>'s
From impls are parameterized on the backing storage (Arc) and forward
mode can't see them through the wrapper. Left as hand-written.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rename
The sed-driven path rewrite produced split imports
(`use super::apply_*;` + `use crate::handler::*Redaction;` instead of
a single grouped import, plus a few
`crate::handler::TextData; use crate::handler::Handler;` pairs that
should fold together). Nightly rustfmt regrouped them.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… uses location range
Brings text in line with the other modalities: the redaction payload
now carries only `output`, the byte range comes from the containing
TextLocation. One coordinate system, not two.
- TextRedaction::new(output): drops start/end fields. Mergeable
collapses to (output == other.output).then_some(self), matching
Image/Audio/Tabular.
- apply_text_redaction now takes explicit start/end parameters (same
shape as apply_tabular_redaction).
- Each text handler's redact_at (txt, json, html, pdf) finds the
line / node / page / span containing
location.start_offset..end_offset, computes span-relative offsets,
and forwards them to the apply helper. The "exact start match"
requirement is gone; entity-shaped locations (substrings) now work,
not just whole-line locations.
- Engine RedactionApplicator drops the now-redundant
TextRedaction::new(loc.start_offset, loc.end_offset, output) →
TextRedaction::new(output).
- Tests updated. Added redact_substring_within_line covering the
entity-shaped case explicitly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
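The span-relative offset computation described in this commit might look something like the sketch below for the plain-text case. The function bodies, the `Option` return type, and the line-splitting strategy are assumptions for illustration; only the names `TextLocation`, `start_offset`, `end_offset`, `apply_text_redaction`, and `redact_at` come from the commit message, and their real signatures may differ.

```rust
// Document-wide byte range, as described in the commit message.
struct TextLocation { start_offset: usize, end_offset: usize }

/// Hypothetical stand-in for the apply helper: replaces the byte range
/// `start..end` of `span` with `output`. Returns `None` when the range is
/// inverted, out of bounds, or not on UTF-8 character boundaries.
fn apply_text_redaction(span: &str, start: usize, end: usize, output: &str) -> Option<String> {
    if start > end || !span.is_char_boundary(start) || !span.is_char_boundary(end) {
        return None;
    }
    Some(format!("{}{}{}", &span[..start], output, &span[end..]))
}

/// Finds the line containing the location, converts the document-wide
/// offsets into line-relative ones, and forwards them to the helper.
fn redact_at(doc: &str, loc: &TextLocation, output: &str) -> Option<String> {
    let mut line_start = 0;
    for line in doc.split_inclusive('\n') {
        let line_end = line_start + line.len();
        if loc.start_offset >= line_start && loc.end_offset <= line_end {
            let redacted = apply_text_redaction(
                line,
                loc.start_offset - line_start, // span-relative start
                loc.end_offset - line_start,   // span-relative end
                output,
            )?;
            return Some(format!("{}{}{}", &doc[..line_start], redacted, &doc[line_end..]));
        }
        line_start = line_end;
    }
    None // the location does not fit inside any single line
}

fn main() {
    let doc = "name: Alice\nage: 30\n";
    // An entity-shaped location: a substring, not the whole line.
    let loc = TextLocation { start_offset: 6, end_offset: 11 };
    println!("{:?}", redact_at(doc, &loc, "[REDACTED]"));
}
```

Because the containing line is found by range containment rather than by an exact start match, a substring location such as a single name works the same way as a whole-line location.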
Summary
End-to-end audio redaction (WAV) plus a substantial cleanup of the redaction-handler API: relocate `Mergeable` to `Location`, drop range fields that duplicated location data from each `*Redaction`, flatten `Redactions<S, R>` to `Vec<(S, R)>`, narrow handler hooks to `redact_at(&Location, Redaction)`, and collapse the now-unnecessary `transform/` module into `handler/`.
Architectural changes
- `Mergeable` moved to ontology, implemented on `*Location`. Overlap detection and merge semantics are a Location concern, not a Redaction concern. `Redactions<S, R>` now requires `S: Overlap + Mergeable, R: Mergeable`; the `Merge` policy needs both to succeed.
- Flattened `Redactions<S, R>` to a `Vec<(S, R)>` ordered by insertion. Bucketing-by-key was masquerading as overlap detection.
- Dropped range fields from `*Redaction` payloads. `AudioRedaction.time_span`, `ImageRedaction.bounding_box`, and `TabularRedaction.start/end` all duplicated data already on the matching `*Location`. `TextRedaction` keeps its `start/end` because `TextLocation` is line-level, so the range is intra-line.
- Narrowed handler hooks to `redact_at(&Location, Redaction)`. Each `*Handler` capability trait gets a provided default `redact(Redactions<L, R>)` that loops `redact_at`. `AudioHandler::redact` overrides the default to sort right-to-left by `time_span.start_us` so `AudioOutput::Remove` doesn't shift later sample indices (this was the bug that motivated the redesign).
- `transform/` folded into `handler/`. `Redactions`, `ConflictPolicy`, `InsertError`, and the per-modality `*Redaction` / `*Output` types all live under `handler/` now. `transform/` is deleted.
- Buffer-mutation helpers (`apply_*_redaction`, `ImageOps`) moved to `handler/{text,image,audio,tabular}/` next to their only consumers.
WAV redaction
- `WavHandler::redact_at` decodes via `hound`, applies the sample-level mutation, and re-encodes. Supported: `i8`/`i16`/`i32` PCM and `f32` IEEE float.
- `Mp3Handler::redact_at` returns an explicit "MP3 redaction is not supported" error (no pure-Rust encoder; libmp3lame out of scope). The pipeline fails fast at the first redaction with a clear message; convert to WAV upstream.
- The right-to-left sort in `AudioHandler::redact` means callers passing multiple `Remove` redactions get correct indices without each handler having to remember to sort.
Minor cleanups
- `ContentKind`: replace four hand-written `is_*()` predicates with `derive_more::IsVariant`. `is_text_based` renamed to `is_text` to match the variant (no callers in the workspace).
- `Redactions<S, R>`: `IntoIterator` via `derive_more::IntoIterator` on the `items` field.
- `Entities`: extend the existing derive to include ref and ref_mut iterators via `#[into_iterator(owned, ref, ref_mut)]`; delete the hand-written `&'a Entities` impl.
Test plan
- `cargo check --workspace --all-features`: clean
- `cargo test --workspace --all-features`: all green (425+ tests across 11 crates)
- `cargo clippy --workspace --all-features --no-deps`: clean
- `cargo +nightly fmt --all`: clean
- `cargo doc --workspace --all-features --no-deps`: no rustdoc warnings (pre-existing nvisy-cli/nvisy-server filename collision unrelated)
- `wav_handler.rs` (silence + remove across mono/stereo/i16/i32/f32)
- `mp3_handler.rs` tests
🤖 Generated with Claude Code