Ruby binding for Apple's Vision framework on macOS / Apple Silicon. Calls VNRecognizeTextRequest (OCR) and VNDetectFaceRectanglesRequest (face rectangles) directly from Ruby via Swift Package Manager and a thin C bridge. Built on swift_gem.
OCR functionality overlaps with rb-vision-ocrmac, which remains as an OCR-only study piece. This gem is the broader Vision binding.
- macOS 12+, Apple Silicon
- Swift 6.3+ (SE-0495 @c attribute) for the library build. Install via swiftly.
- Ruby 3.2+, Bundler 4.x
- Xcode Command Line Tools only if you want to run examples/vision_mac.swift. The pure-Swift sample must run under xcrun swift; swiftly's 6.3 swift binary cannot JIT-link Apple system frameworks (Vision, AppKit) in interpret mode. The library build itself does not need CLT. Install with xcode-select --install.
bundle add rb-vision-mac
gem install rb-vision-mac
require "vision_mac"
VisionMac.recognize_text("path/to/image.png")
# => "Detected text line 1\nDetected text line 2\n..."
VisionMac.detect_faces("path/to/photo.png")
# => "0.123\t0.456\t0.234\t0.345\n..." # x, y, width, height in normalized 0..1 coords
recognize_text uses ja-JP + en-US, .accurate, with language correction. detect_faces returns CGRect values normalized to the image (Vision's coordinate system: origin at lower-left). On Vision-side failure (unreadable image content, OS error, 30s timeout) the methods return "". A missing path raises Errno::ENOENT rather than silently returning "", so callers can distinguish bad input from a genuine empty result.
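Because detect_faces returns a plain string, callers usually need to parse it before use. The following is a minimal sketch, not part of the gem: FaceRect, parse_face_rects, and to_pixels are hypothetical names, and the code assumes one face per line with four tab-separated fields in the order x, y, width, height, as in the example output above. It also converts Vision's lower-left-origin normalized rectangle into top-left-origin pixel coordinates, which most drawing and cropping tools expect.

```ruby
# Hypothetical helper, not part of the gem: holds one normalized rectangle
# (0..1 coordinates, origin at the lower-left corner of the image).
FaceRect = Struct.new(:x, :y, :width, :height) do
  # Convert to [x, y, width, height] in pixels with a top-left origin.
  def to_pixels(image_width, image_height)
    px_x = x * image_width
    # Flip the vertical axis: the rectangle's top edge sits at y + height
    # when measured from the bottom of the image.
    px_y = (1.0 - y - height) * image_height
    [px_x, px_y, width * image_width, height * image_height].map(&:round)
  end
end

# Assumes one face per line, fields separated by tabs.
def parse_face_rects(output)
  output.each_line(chomp: true).reject(&:empty?).map do |line|
    FaceRect.new(*line.split("\t").map { |field| Float(field) })
  end
end

# Sample string standing in for VisionMac.detect_faces("path/to/photo.png").
sample = "0.25\t0.5\t0.5\t0.25\n"
p parse_face_rects(sample).first.to_pixels(1000, 800)
# prints [250, 200, 500, 200]
```

An empty string (no faces, or a Vision-side failure) parses to an empty array, so the caller can keep branching on emptiness.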
Or open an IRB console with the gem preloaded:
bundle exec rake console
This gem is a thin pass-through wrapper around Vision. It does no image preprocessing: no rotation, scaling, orientation correction, page splitting, or layout normalization. If Vision can't read the image as-is, the methods return "".
Vision has known weak spots that show up in real workloads:
- Vertical Japanese book pages with densely-packed text columns (recognize_text): Vision often fails to detect any text regions and returns 0 observations. Vertical writing is supported in principle, but region segmentation gives up on book-page layouts with many narrow columns side-by-side. Workaround at the caller: rotate the page 90° so columns become rows, or upscale low-resolution scans (≲ 1000px on the long side) before passing the path in.
- Low-resolution scans (recognize_text): sub-1000px images sometimes return zero observations even for clean horizontal text. Upscale before calling.
- Multi-page PDFs / multi-region images: split into per-page / per-region images upstream; the methods take one image at a time.
- Skew, heavy noise, faint text: deskew / denoise / contrast-boost in the caller.
- Faces under unusual angles / occlusion / low light (detect_faces): Vision's VNDetectFaceRectanglesRequest may miss faces; preprocessing (lighting normalization, rotation) is the caller's call.
Detection of these cases is also the caller's job. Both methods return "" for "Vision succeeded but found nothing" and "Vision could not segment the image" — the gem does not distinguish them. Callers that need to retry with preprocessing should branch on output.empty? and apply their own fallback chain.
A missing path is the one failure mode this gem does surface as an exception (Errno::ENOENT), since that is unambiguously bad input and never a legitimate empty result.
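One way a caller might structure that fallback chain is sketched below. This is an illustration under assumptions, not part of the gem: recognize_with_fallbacks is a hypothetical name, recognizer is any callable that takes a path and returns a String (for example VisionMac.method(:recognize_text)), and each preprocessor is a caller-supplied callable that writes a modified copy of the image and returns its path. The demo uses stub lambdas so it runs without the gem or any image library.

```ruby
# Hypothetical helper, not part of the gem: try the image as-is, then each
# preprocessing step in order, and return the first non-empty result.
def recognize_with_fallbacks(path, recognizer:, preprocessors: [])
  output = recognizer.call(path)
  return output unless output.empty?

  preprocessors.each do |preprocess|
    # Each step is assumed to return the path of a new, preprocessed image.
    output = recognizer.call(preprocess.call(path))
    return output unless output.empty?
  end

  "" # every attempt came back empty
end

# Stubs standing in for VisionMac.recognize_text and a real rotation step.
fake_recognizer = ->(p) { p.end_with?(".rotated.png") ? "detected text" : "" }
fake_rotate     = ->(p) { "#{p}.rotated.png" }

puts recognize_with_fallbacks("page.png",
                              recognizer: fake_recognizer,
                              preprocessors: [fake_rotate])
# prints detected text
```

The helper deliberately does not rescue anything, so an Errno::ENOENT from a missing path still reaches the caller instead of being retried as if it were an empty result.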
example.rb at the repo root demonstrates both methods end-to-end:
bundle exec ruby example.rb path/to/image.png
It defaults to test/fixtures/sample_jp.png if no argument is given.
A self-contained Swift script lives at examples/vision_mac.swift for sanity-checking Vision behavior without going through Ruby:
xcrun swift examples/vision_mac.swift path/to/image.png
Use xcrun swift (Xcode toolchain), not bare swift from swiftly: swiftly 6.3's interpret mode does not JIT-link Apple system frameworks (Vision, AppKit) and fails at startup with symbol-resolution errors. Xcode's swift uses dyld and works as-is.
bundle install
bundle exec rake test
rake test automatically compiles the Swift Package (swift build -c release) and links the C bridge into lib/vision_mac/vision_mac.bundle before running the spec, via Rake::ExtensionTask.
To run only the build step: bundle exec rake compile.
MIT.