StuffBucket is a personal capture-and-retrieve app for text, snippets, links, and documents, organized by tags and collections, with a strong emphasis on durability and user ownership.
Key guarantees:
- Metadata and file payloads are synced via CloudKit.
- On macOS, synced documents are materialized into a user-selected Finder folder for user visibility.
- Links are persisted, including saved HTML snapshots, to avoid link rot (e.g. NYTimes articles).
- Metadata is stored in Core Data, synced via CloudKit.
- First-class support for iOS, iPadOS, and macOS.
- Full-text search with relevance ranking and filters.
- Search uses a custom centered search bar on iOS and the standard system toolbar search field on macOS.
- Tag editing uses platform-appropriate text input behavior.
- Optional AI assistance (summaries, key points, tags) powered by Claude or ChatGPT, opt-in and user-controlled.
Non-goals (initial versions): collaboration, web client.
- Ignore local screenshots (
Screenshots/) and personal notes (resume.txt) in git. - Use context-scoped Core Data fetches in UI to avoid multi-bundle entity ambiguity.
- Item detail fetches include sort descriptors (required by NSFetchedResultsController).
- Disable verbose Core Data debug logging (WAL checkpoint spam).
Every captured object is an Item:
- Snippet – short plain text
- Link – URL + metadata + persisted HTML snapshot
- Document – local file stored in the app container and synced via CloudKit payloads
All item types support:
- Tags
- Tags are case-insensitive; duplicates collapse to the first-seen casing.
- Collections (via tag-based pseudo-collections)
- Optional attachments: text, link, and document can co-exist on any item.
- Attachments can be added/edited after creation in the item detail view.
typeis treated as the creation kind (how the item started), not a capability limiter.
Collections are implemented as special tags with a collection: prefix:
- A tag
collection:ProjectXplaces the item in the "ProjectX" collection. - The UI surfaces collections separately from regular tags, displaying just the collection name.
- Items can belong to multiple collections by having multiple
collection:tags. - Collection names are case-preserved but matched case-insensitively.
- Safari bookmark imports map folder paths to
collection:tags.
StuffBucket keeps working files in app-local storage:
<App Container>/Library/Application Support/StuffBucket/
├── Links/
│ └── <uuid>/
│ ├── page.html
│ ├── reader.html
│ └── assets/
└── Documents/
└── <uuid>/
└── <original filename>
Extracted fallback caches are stored separately:
<App Container>/Library/Caches/
├── ExtractedArchives/<uuid>/...
└── ExtractedDocuments/<uuid>/...
On macOS, "Show in Finder" copies documents into a user-selected folder:
<User Selected Folder>/
└── StuffBucket/
└── Documents/
└── <uuid>/
└── <original filename>
Notes:
- This Finder path is a materialized copy for user visibility, not the sync source of truth.
- Link archives are not materialized to the user folder; they remain in app-local storage/caches.
- CloudKit container ID:
iCloud.com.digitalhandstand.stuffbucketapp.
When a link is saved, StuffBucket must:
- Store the original URL.
- Fetch and store:
- Page title
- Author (if available)
- Publication date (best-effort)
- Decode common HTML entities in metadata values
- Persist the article content to avoid link rot.
- Automatically archive pending link items on app launch/activation to ensure every link is archived.
For each Link item:
- Capture the rendered DOM using
WKWebView(non-persistent data store). - Extract asset URLs (images, srcset, source tags, stylesheets, icons).
- Download assets into a local
assets/folder and rewrite HTML/CSS references to local paths. - Save both:
page.html(full page snapshot)reader.html(reader-mode extraction)
- Fallback to a raw
URLSessionHTML fetch if WebKit capture fails. - Save as:
StuffBucket/Links/<uuid>/page.html
If full HTML capture fails:
- Store a raw HTML snapshot (without asset rewriting) when available.
- Mark link as "partial archive" in metadata. If both rendered and raw capture fail:
- Mark link as "failed archive."
Archives are synced via CloudKit bundle payloads:
- Local archive files (
page.html,reader.html,assets) are written to app-local storage. - The archive folder is compressed into
archiveZipDataviaArchiveBundle(LZFSE) for CloudKit replication. - On another device,
ArchiveResolverloads local files first; if missing, it extractsarchiveZipDataintoExtractedArchives/<uuid>/. archiveZipDatais omitted when bundle creation fails or exceeds the configured sync size limit.
When opening an archive:
- Load from local archive files when available.
- Otherwise extract from CloudKit bundle to local cache and open.
- iOS/iPadOS: open the locally stored
page.html(andreader.html) inside StuffBucket. - macOS: open the archived HTML in the default browser.
- If local files are missing, extract from the CloudKit bundle and open from cache.
- If archives are missing, show an unavailable state.
- Secondary action: Tapping the displayed URL opens the original link in the default browser.
- When captured from the macOS share sheet, StuffBucket should foreground and surface the new item quickly.
- Archive status badges should update live as the archive completes.
- Searchable
- Lightweight
- Preserves links and structure
- Future-proof
Documents are synced via CloudKit bundle payloads:
- Imported/attached documents are copied to app-local storage (
Documents/<uuid>/<filename>). documentZipDatacurrently stores a direct document payload (raw file bytes) for CloudKit replication.DocumentResolversupports both formats when resolving on another device:- legacy compressed bundle (extract)
- current raw payload (write directly to extracted cache)
- macOS "Show in Finder" materializes a copy to the user-selected folder.
When opening a document:
- First attempt to load from local file storage.
- If unavailable, extract from CloudKit bundle to local cache.
- All file I/O operations run on background threads to prevent UI hangs.
Size policy:
- Sync/import limit is configurable via
SyncPolicy.maxFileSizeMB(default 512 MB, min 50, max 4096). - Document imports/attachments above the limit are rejected with a user-visible error.
id: UUIDtype: enum { snippet, link, document }title: String?textContent: String?// optional snippet body for any itemtags: [String]// includes regular tags and collection: prefixed tagstrashedAt: Date?// when item was moved to trash (nil = not trashed)createdAt: DateupdatedAt: DatecollectionID: UUID?// legacy, unused - collections now via tagssource: enum { manual, share_sheet, safari_bookmarks, import }sourceExternalID: String?// stable ID for external sync (e.g. Safari)sourceFolderPath: String?// external folder path (e.g. Safari bookmark folder)documentRelativePath: String?// local path hint: Documents// (optional on any item)documentZipData: Binary?// document sync payload for CloudKit (currently raw file bytes)
linkURL: String?// optional on any itemlinkTitle: String?linkAuthor: String?linkPublishedDate: Date?htmlRelativePath: String?// Links//page.htmlarchiveStatus: enum { full, partial, failed }assetManifestJSON: String?// legacy field; currently cleared/unused in archive flowarchiveZipData: Binary?// compressed archive bundle (CloudKit source of truth)
aiSummary: String?aiArtifactsJSON: String?// structured AI outputs (key points, entities, tags)aiModelID: String?aiUpdatedAt: Date?
- Core Data +
NSPersistentCloudKitContainer - CloudKit container:
iCloud.com.digitalhandstand.stuffbucketapp - Core Data schema is CloudKit-compatible (optional fields or defaults for non-optional fields).
- View context uses
NSMergeByPropertyObjectTrumpMergePolicyandautomaticallyMergesChangesFromParent = true. - Tag normalization/deduplication is handled in app logic (case-insensitive, with explicit
collection:andtrashcanhandling).
- CloudKit file sync is implemented through Core Data binary payloads:
archiveZipDatafor link archivesdocumentZipDatafor documents
- Primary working files are local (
Application Support/StuffBucket/...). - Resolver behavior is local-first:
- If primary local files exist, use them.
- If not, extract/write from CloudKit payload into local cache (
ExtractedArchives,ExtractedDocuments).
- Cache cleanup removes extracted cache copies once a primary local copy exists.
StorageMigration.migrateLocalStorageIfNeeded()is intentionally no-op in current CloudKit-only mode.
- Reset controls are available in Settings (development/test use).
- Reset Local Data:
- deletes
Item+SearchIndexMetadata - deletes local storage root + extracted caches
- clears materialized Finder copies
- resets local search index
- deletes
- Reset Local + CloudKit Data additionally purges non-default private CloudKit zones in
iCloud.com.digitalhandstand.stuffbucketapp.
- Items can be moved to trash via a "Move to Trash" action in the item detail view.
- Trashed items are marked with a
trashcantag andtrashedAttimestamp. - Trashed items are hidden from the main view and search results.
- Searching for
trashcanreveals trashed items. - Trashed items can be restored via "Restore from Trash" action.
- Items in trash for more than 10 days are permanently deleted on app launch.
- Permanent deletion removes:
- The Core Data record (synced via CloudKit)
- Local archive files (Links//)
- Local document files (Documents//)
- Local cache files
- Search index entries
- Deletion propagates across devices via CloudKit sync.
- Debug-only destructive controls are removed from the UI before release.
- Core Data verbose debug logging is disabled to avoid WAL checkpoint spam in output.
- In-app quick add:
- New Snippet creates a text capture item immediately.
- Add Link prompts for a URL and saves it as a Link item.
- Import Document uses the Files picker and copies the file into StuffBucket/Documents.
- Saving a URL triggers:
- Immediate metadata save
- Background HTML fetch
- Archive status indicator
- Share extension captures URLs from Safari and queues them for the main app to import on launch.
- Share extension accepts URL attachments or plain-text URL payloads.
- Share extension accepts image attachments (e.g. Photos) and imports them as Document items.
- Share extension treats file:// URLs as Document items and prefers image attachments over file URLs.
- Document titles default to the shared filename (not the full file path).
- iOS share sheet includes a comment field for optional snippets/tags.
- Share sheet comment text supports quotes for snippets: double quotes (straight/smart) and single quotes used as quote boundaries become
textContentjoined by newlines; apostrophes inside words and quotes inside quoted segments are ignored; unquoted tokens become tags (#tag supported). - Share extension opens StuffBucket after capture to surface new items immediately.
- App listens for share-capture notifications to import while already running.
- Share extension bundle identifiers are prefixed by the main app bundle identifier.
- Share extension version numbers match the parent app.
- Share extension Info.plists include required bundle metadata for installation.
- App Group for share handoff storage:
group.com.digitalhandstand.stuffbucket.app. - macOS share extension follows the standard NSExtensionRequestHandling flow.
- On import, the app fetches metadata and persists an HTML snapshot when possible.
- In-app quick add:
- New Snippet creates a text capture item immediately.
- Add Link prompts for a URL and saves it as a Link item.
- Import Document uses a file picker and copies the file into StuffBucket/Documents.
- Paste URL
- Drag URL from browser
- Drag files from Finder anywhere in the window to import documents.
- Services / Share menu
- Share extension captures URLs from Safari and queues them for the main app to import on launch.
- Share extension accepts URL attachments or plain-text URL payloads.
- Share extension accepts image attachments (e.g. Photos) and imports them as Document items.
- Share extension treats file:// URLs as Document items and prefers image attachments over file URLs.
- Document titles default to the shared filename (not the full file path).
- Share extension opens StuffBucket after capture to surface new items immediately.
- Share sheet comment text supports quotes for snippets: double quotes (straight/smart) and single quotes used as quote boundaries become
textContentjoined by newlines; apostrophes inside words and quotes inside quoted segments are ignored; unquoted tokens become tags (#tag supported). - The macOS app activates when opened via the share URL to bring new items into view.
- App listens for share-capture notifications to import while already running.
- Share extension bundle identifiers are prefixed by the main app bundle identifier.
- Share extension version numbers match the parent app.
- Share extension Info.plists include required bundle metadata for installation.
- App Group for share handoff storage:
group.com.digitalhandstand.stuffbucket.app. - On import, the app fetches metadata and persists an HTML snapshot when possible.
- One-time import during onboarding or later via Settings (macOS).
- Sources (macOS):
- User-granted access to Safari bookmarks file, or HTML export from Safari.
- Imported bookmarks become Link items (optional background archive).
- Folder structure maps to
collection:tags (e.g.,collection:Recipes); a defaultSafaritag is applied.
- Default view surfaces Collections and Tags with counts (collections first, then tags).
- Tags list shows regular tags only (excludes
collection:prefixed tags). - Collections list shows collection names extracted from
collection:tags. - Selecting a tag or collection pre-fills search with
tag:/collection:filters. - Recent items list is shown above tags and collections.
- Link items display an archive status badge (Pending / Archived / Partial / Failed).
- Empty states surface primary capture actions (Add Link, Import Document).
- Search results show all item tags (including
collection:tags) under each item, with active tag/collection filters wrapped in brackets.
- Tag editing is available on the item detail view (comma-separated input).
- Title is editable in the item detail view (empty title falls back to derived display title).
- macOS tag input is left-aligned (no right-justified value column).
- Collection assignment is available separately from tag editing.
- macOS collection input is left-aligned (no right-justified value column).
- macOS tag/collection inputs omit placeholder text and are explicitly left-aligned.
- macOS content editor text is left-aligned.
- Tags display excludes
collection:prefixed tags (shown in collections section). - De-duplication based on URL + title + folder path; keep a sync link when possible.
- Document items show the filename and a "Show in Finder" action on macOS.
- macOS list rows expose "Show in Finder" for documents via context menu.
- iOS "Open Document" previews via QuickLook from local files or extracted CloudKit bundles.
- Document URL resolution falls back to CloudKit bundle extraction when local files are missing.
- If local files are incomplete, the app falls back to CloudKit bundle extraction (same as link archives).
- Document file operations run on background threads to prevent UI hangs.
- Full-text search across:
- Snippets
- Extracted text from saved HTML
- Filenames
- User-added annotations
- Tags, collections, and AI summaries (if present)
- Relevance ranking with field weighting (title > tags > content).
- Diacritic-insensitive matching.
- Phrase search with quotes and prefix matching.
- Filters:
type:,tag:,collection:,source:. - Tag/collection filters quote values with punctuation so hyphenated tags match (e.g.
tag:customer-service). - Sort by relevance or recency.
- macOS tag and collection lists support command-click to accumulate multiple filters in the search bar.
- Incremental indexing as items change.
- Seed/rebuild the index on app launch to reconcile deletes while the app was closed.
- Background reindex if HTML snapshots or files update.
- Index remains local and is rebuilt if corrupted.
- Typical queries return in <200ms on device.
- Large archives remain responsive with streaming results.
- Bulk import creates Link items with
source = safari_bookmarks. - Preserve bookmark titles and URLs; store Safari folder path in
sourceFolderPath. - Optional "Archive on import" to fetch HTML in the background.
- For HTML export imports, retain a synthetic
sourceExternalIDderived from URL + path.
- De-dup by
sourceExternalIDwhen present; fall back to URL + folder path. - If a URL already exists, merge metadata and keep the earliest Item ID.
- Tag suggestions for items based on content analysis.
- AI analyzes item title, snippet, URL, and archived article text.
- Suggestions prefer existing library tags over creating new ones.
- Actions are always user-initiated (manual "Suggest Tags" button).
- Anthropic Claude (claude-sonnet-4, claude-opus-4, claude-3.5-sonnet, claude-3.5-haiku)
- OpenAI GPT (gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo)
- User selects provider and model in AI Settings.
- Default models:
claude-sonnet-4-20250514(Anthropic),gpt-4o(OpenAI).
- BYOK (bring your own key) - user provides their own API keys.
- Keys stored in both iCloud Key-Value Store (cross-device sync) and UserDefaults (local fallback).
- Keys are never logged or exported.
- API key validation on save (test request to verify key works).
- User opens item detail view.
- User taps "Suggest Tags" button (visible when API key is configured).
- Sheet displays with loading state while AI analyzes content.
- Suggested tags shown with checkboxes (all pre-selected by default).
- Tags already in library marked with "existing" badge.
- User reviews, adjusts selection, and taps "Apply" to add tags.
- Content sent to AI includes: title, snippet (truncated), URL, article text (truncated).
- No automatic/background AI calls - always user-initiated.
- AI outputs (suggested tags) are applied directly to items, not stored separately.
- Badge showing:
- "Archived" / "Partial" / "Live only"
- Clear disclosure:
- "This page is saved locally"
- Paywalled content (e.g. NYTimes):
- Capture occurs using user’s authenticated session when possible (WKWebView-based fetch).
- Otherwise fall back to reader/plain text.
- Dynamic sites / JS-heavy pages:
- Snapshot post-load DOM via WebKit.
- Legal note:
- This is personal archival, not redistribution.
- Safari import:
- Bookmarks file unavailable or locked; fallback to HTML import.
- Invalid or empty URLs; skip with a report.
- AI:
- Network failures or rate limits; surface error and allow retry.
- Large items exceed model limits; summarize extractively or chunk with user confirmation.
- Saved NYTimes article remains readable offline after original URL changes.
- Archive opens from local files when present, or from extracted CloudKit payload on devices missing local files.
- Search finds words inside archived articles.
- Safari import can ingest 500+ bookmarks on macOS.
- User can generate AI tag suggestions using their API key and apply them to items.
- v0.2: HTML-backed link persistence
- v0.3: Full-text search, Safari bookmarks import, AI tag suggestions (Claude & OpenAI)
- v0.4: CloudKit file payload sync, local-first resolvers, macOS Finder materialization, configurable sync size limits
- Unit tests cover search query parsing/builder output, tag list encoding/decoding, and link metadata parsing with HTML entity decoding on both iOS and macOS targets.
- macOS unit tests run with an app host configuration so Xcodegen builds execute them reliably.
- Core Data item creation in tests uses context-scoped entity lookup to avoid entity ambiguity warnings when multiple models load.
- Xcodegen project generation mirrors Xcode-recommended settings for macOS targets (app sandbox/network/app groups), asset symbol generation, and the current Xcode version.
End of specification