Onyx

Private, on-device AI chat for iPhone, powered by Apple MLX.

Fork it, extend it, and ship your own private AI app.



Important

Requires a physical iPhone 15 or later. The iOS Simulator has no Metal GPU and cannot run inference.

Warning

iOS 27 beta is not yet supported. The app may crash at launch on the iOS 27 beta (under investigation). Supported: iOS 17.0 – iOS 26.x.


✨ Features

🔒 Fully private: All inference runs on-device. No API keys, no accounts, no server calls after the model download.
⚡ Real-time streaming: Tokens appear as fast as the model generates them.
💎 Ships with Llama 3.2 1B Instruct: 4-bit quantised, ≈ 860 MB. Downloads publicly (no HuggingFace token) and activates automatically.
📦 Simple model management: Download, activate, and uninstall models from a clean Models tab. Add more models with one line of code.
⚙️ Settings tab: System prompt editor and developer toggles.
🧠 Memory-safe: A RAM gate blocks incompatible loads; the model auto-unloads when the app is backgrounded or receives a memory warning.
🦺 Swift 6 strict concurrency: Actors throughout; zero data races by construction.

🎬 Demo

Onyx demo: downloading the model and chatting on device

🚀 Getting started

1. Clone and open

git clone https://github.com/your-org/Onyx.git
cd Onyx
open Onyx/Onyx.xcodeproj

2. Set your team

Open Onyx.xcodeproj → Targets → Onyx → Signing & Capabilities, set your development team, and change the bundle ID from kiraa.Onyx to one you own.

3. Run on a physical device

Important: On-device inference requires a physical iPhone 15 or later. The iOS Simulator has no Metal GPU.

Select your device from the scheme picker, then Product → Run (⌘R).

4. Download a model

  1. Tap the Models tab.
  2. Tap Download next to Llama 3.2 1B Instruct (4-bit) (≈ 860 MB; use Wi-Fi).
  3. The model activates automatically when the download completes.
  4. Switch to the Chat tab and start chatting.

📖 New to iOS development? The step-by-step QUICKSTART takes you from a fresh Mac to a working app in under 30 minutes.


🛠 For developers

What you get out of the box

| Capability | File |
| --- | --- |
| On-device MLX inference | MLXModelManager.swift |
| Real-time streaming token output | ChatProvider.swift + generateFromModel() |
| Multi-turn conversation history with auto-trimming | MLXConversationHistory.swift |
| Resumable HuggingFace model downloader (5-phase) | ChatModelDownloader.swift |
| Hardware RAM gate + background/low-memory unload | HardwareProfile.swift, ChatMemoryGate.swift, OnyxApp.swift |
| Model catalog (ships with Llama 3.2 1B) | ChatModelCatalog.swift |
| User settings (Settings tab, SettingsView) | OnyxSettings.swift, PreferencesView.swift |
| Installed/active model registry | ChatModelRegistry.swift |
| Sandbox-safe file path helpers | OnyxPaths.swift |

What is intentionally left out

  • No persistence: conversations reset on restart. Trivial to add (see below).
  • No accounts or API keys: model downloads are public and unauthenticated; nothing leaves the device.
  • No theming: plain system colors throughout; swap in your own design tokens.
  • No analytics or crash reporting: add the SDK of your choice.

This deliberate minimalism keeps the diff small when you diverge from the skeleton.

Quick extension recipes

Add a model (one line):

Open ChatModelCatalog.swift and append to ChatModelCatalog.all (the catalog ships with a single model, Llama 3.2 1B Instruct):

ChatModelDescriptor(
    id: "mlx-community/my-model-4bit",
    displayName: "My Model (4-bit)",
    family: .other,
    approxSizeBytes: Int64(4.0 * 1_073_741_824),  // ≈ 4 GB
    filePatterns: ChatModelCatalog.defaultFilePatterns,
    summary: "One-line description shown in the Models tab."
)

The downloader, registry, and Models tab UI pick it up automatically; no other changes are needed. Browse available models at huggingface.co/mlx-community.
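The approxSizeBytes argument above is a raw byte count written as a GiB multiple. As a minimal sketch, the two helpers below (hypothetical names, not part of Onyx) make that arithmetic explicit and show how a byte count maps back to the MB figure quoted in the Models tab.

```swift
import Foundation

/// Hypothetical helper (not part of Onyx): converts a size in GiB to bytes,
/// matching the `Int64(4.0 * 1_073_741_824)` expression in the descriptor.
func bytes(fromGiB gib: Double) -> Int64 {
    Int64(gib * 1_073_741_824)  // 1 GiB = 1024^3 bytes
}

/// Hypothetical helper: approximate size in MiB, rounded to the nearest integer.
func mebibytes(fromBytes bytes: Int64) -> Int {
    Int((Double(bytes) / 1_048_576).rounded())
}

let fourGiB = bytes(fromGiB: 4.0)
print(fourGiB)                        // 4294967296
print(mebibytes(fromBytes: fourGiB))  // 4096
```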

Add conversation persistence:

// Encode turns and write them to the app's data directory.
// Assumes MLXConversationHistory.Turn conforms to Codable.
let historyURL = OnyxPaths.baseDirectory().appending(path: "history.json")
let turns = await ChatProvider.shared.history.turns
let data = try JSONEncoder().encode(turns)
try data.write(to: historyURL, options: .atomic)

// Restore on launch:
let saved = try Data(contentsOf: historyURL)
let restoredTurns = try JSONDecoder().decode([MLXConversationHistory.Turn].self, from: saved)
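The recipe above can be tried outside the app with a self-contained round trip. This is only a sketch: the Turn struct here is a hypothetical stand-in (the real MLXConversationHistory.Turn fields are not shown in this README), and saveTurns/loadTurns are illustrative names. It also handles the first-launch case where no history file exists yet.

```swift
import Foundation

// Hypothetical stand-in for MLXConversationHistory.Turn; the real type's
// fields are not shown here and it may need Codable conformance added.
struct Turn: Codable, Equatable {
    let role: String
    let content: String
}

/// Writes turns as JSON, replacing any existing file atomically.
func saveTurns(_ turns: [Turn], to url: URL) throws {
    let data = try JSONEncoder().encode(turns)
    try data.write(to: url, options: .atomic)
}

/// Returns the saved turns, or an empty history when no file exists yet.
func loadTurns(from url: URL) throws -> [Turn] {
    guard FileManager.default.fileExists(atPath: url.path) else { return [] }
    let data = try Data(contentsOf: url)
    return try JSONDecoder().decode([Turn].self, from: data)
}

// Round trip through a temporary file.
let url = FileManager.default.temporaryDirectory
    .appendingPathComponent("onyx-history-\(UUID().uuidString).json")
let original = [Turn(role: "user", content: "Hello"),
                Turn(role: "assistant", content: "Hi there")]
do {
    try saveTurns(original, to: url)
    let restored = try loadTurns(from: url)
    print(restored == original)
    try? FileManager.default.removeItem(at: url)
} catch {
    print("Persistence failed: \(error)")
}
```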

🏗 Architecture

Onyx/
├── Core Runtime
│   ├── OnyxPaths.swift               - Sandbox-safe paths (AppSupport/Onyx/)
│   ├── MLXErrors.swift               - Typed errors (metalUnavailable, modelNotInstalled, …)
│   ├── MLXModelManager.swift         - actor: ModelContainer lifecycle + generateFromModel()
│   └── MLXConversationHistory.swift  - actor: turn history, 16 K char / 10-pair auto-trim
│
├── Model Catalog & Registry
│   ├── ChatModelCatalog.swift        - Curated list of downloadable models
│   ├── ChatModelRegistry.swift       - actor: installed / active model tracking
│   ├── ChatModelDownloader.swift     - actor: 5-phase HuggingFace download + retry
│   ├── HardwareProfile.swift         - sysctl RAM/GPU detection; canLoadModel()
│   └── ChatMemoryGate.swift          - Pre-flight RAM check before load
│
├── Chat Layer
│   └── ChatProvider.swift            - @MainActor @Observable: UI ↔ MLX bridge
│
└── Views
    ├── OnyxApp.swift                 - @main entry point (no SwiftData)
    ├── ContentView.swift             - TabView: Chat + Models + Settings
    ├── ChatView.swift                - Scrollable chat UI with input bar
    ├── MessageBubble.swift           - Markdown-rendering message row
    ├── ThinkingDotsView.swift        - Animated 3-dot waiting indicator
    ├── ModelsView.swift              - Download / activate / uninstall list
    ├── DownloadRow.swift             - Live-progress model card
    └── PreferencesView.swift         - Settings tab (SettingsView)
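The "16 K char / 10-pair auto-trim" note on MLXConversationHistory.swift can be illustrated with a small sketch. The exact trimming rules are not shown in this README, so the function below is an assumption: it drops the oldest user/assistant pair until the history fits both limits. Turn, characterCount, and trimmed are hypothetical names.

```swift
// Hypothetical stand-in for a history turn.
struct Turn: Equatable {
    let role: String
    let content: String
}

/// Total number of characters across all turns.
func characterCount(_ turns: [Turn]) -> Int {
    turns.reduce(0) { $0 + $1.content.count }
}

/// Drops the oldest turns, two at a time, until the history fits both limits.
/// The defaults mirror the "16 K char / 10-pair" note; the real rules may differ.
func trimmed(_ turns: [Turn], maxPairs: Int = 10, maxCharacters: Int = 16_000) -> [Turn] {
    var kept = turns
    while true {
        let tooManyTurns = kept.count > maxPairs * 2
        // Always keep the most recent pair, even if it alone exceeds the budget.
        let tooLong = kept.count > 2 && characterCount(kept) > maxCharacters
        guard tooManyTurns || tooLong else { break }
        kept.removeFirst(min(2, kept.count))
    }
    return kept
}

// 12 short pairs: the two oldest pairs are dropped.
let history = (0..<24).map { Turn(role: $0 % 2 == 0 ? "user" : "assistant", content: "t\($0)") }
print(trimmed(history).count)  // 20
```

Removing whole pairs keeps the history starting on a user turn, which most chat templates expect.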

Data flow

User types → ChatView
  → ChatProvider.respond(to:)
    → MLXConversationHistory.buildMessages(systemPrompt:)
      → MLXModelManager.ensureLoaded(modelId:)     ← loads model lazily if needed
        → generateFromModel(container:messages:)   ← nonisolated, off main thread
          → AsyncStream<String>
            → ChatView appends tokens to the streaming bubble in real time
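The last two hops of the flow above, an AsyncStream<String> consumed token by token, can be sketched without MLX. This is an illustration only: makeTokenStream is a hypothetical stand-in for generateFromModel(), and collect stands in for the code that appends to the streaming bubble.

```swift
/// Hypothetical stand-in for model output: yields pre-split tokens one by one.
func makeTokenStream(_ tokens: [String]) -> AsyncStream<String> {
    AsyncStream { continuation in
        for token in tokens {
            continuation.yield(token)
        }
        continuation.finish()  // ends the consumer's `for await` loop
    }
}

/// Consumes the stream the way a chat bubble might: append each token as it arrives.
func collect(_ stream: AsyncStream<String>) async -> String {
    var bubble = ""
    for await token in stream {
        bubble += token  // in a real view model this would update observable state
    }
    return bubble
}

let reply = await collect(makeTokenStream(["Hel", "lo", ", ", "world"]))
print(reply)  // Hello, world
```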

Concurrency model

| Component | Isolation | Reason |
| --- | --- | --- |
| MLXModelManager | actor | Single owner of ModelContainer |
| MLXConversationHistory | actor | Turn array written from UI and inference tasks |
| ChatModelDownloader | actor | Background download; pub/sub via AsyncStream |
| ChatModelRegistry | actor | File I/O to active.txt and model directories |
| ChatProvider | @MainActor | View-model; drives SwiftUI @Observable state |
| generateFromModel() | nonisolated | GPU-intensive; must not block the main thread |
| sysctl helpers | nonisolated | Called at app launch before any actor exists |

The build setting SWIFT_DEFAULT_ACTOR_ISOLATION = MainActor makes every unannotated declaration @MainActor-isolated. Functions that must run off the main thread are explicitly marked nonisolated.
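The difference between actor-isolated and nonisolated members can be seen in a small sketch. ModelSlot is a hypothetical actor written for this example; it is not Onyx's MLXModelManager and only mirrors the pattern of a single owner for mutable state plus a nonisolated helper.

```swift
/// Hypothetical actor (not Onyx's MLXModelManager): owns one piece of mutable state.
actor ModelSlot {
    private var loadedModelID: String?

    // Actor-isolated: callers must `await`, so mutations are serialized.
    func load(_ id: String) { loadedModelID = id }
    func unload() { loadedModelID = nil }
    func current() -> String? { loadedModelID }

    /// `nonisolated`: touches no actor state, so it can be called synchronously
    /// from any context without hopping onto the actor.
    nonisolated func displayName(for id: String) -> String {
        id.split(separator: "/").last.map { String($0) } ?? id
    }
}

let slot = ModelSlot()
await slot.load("mlx-community/my-model-4bit")
print(await slot.current() ?? "none")
print(slot.displayName(for: "mlx-community/my-model-4bit"))  // my-model-4bit
await slot.unload()
```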


🔨 Build commands

# Resolve Swift packages and build for Simulator (UI compiles; no inference)
xcodebuild -project Onyx/Onyx.xcodeproj -scheme Onyx \
  -destination 'platform=iOS Simulator,name=iPhone 16' \
  -resolvePackageDependencies

xcodebuild build -project Onyx/Onyx.xcodeproj -scheme Onyx \
  -destination 'platform=iOS Simulator,name=iPhone 16'

# On-device inference: connect an iPhone 15+ and select it in Xcode's scheme picker, then ⌘R

📱 Hardware requirements

| Device | RAM | Status |
| --- | --- | --- |
| iPhone 15 (base) | 6 GB | ✅ Supported |
| iPhone 15 Pro / Max | 8 GB | ✅ Supported |
| iPhone 16 (all models) | 8 GB | ✅ Supported |
| iPad Pro M2+ | 8–16 GB | ✅ Supported |
| iOS Simulator | N/A | ❌ No Metal GPU; UI works, inference does not |

OS support: iOS 17.0 – 26.x. iOS 27 beta is not yet supported; see the warning at the top.

The com.apple.developer.kernel.increased-memory-limit entitlement allows the app to keep a 2 GB model resident on 6 GB devices. Call MLXModelManager.shared.unloadModel() when entering the background to free memory:

// In OnyxApp.swift (or a scene delegate):
.onChange(of: scenePhase) { _, phase in
    if phase == .background {
        Task { await MLXModelManager.shared.unloadModel() }
    }
}
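A pre-flight RAM check of the kind ChatMemoryGate.swift performs might look like the sketch below. The thresholds (a 1 GiB reserve and a 60% usable fraction) are assumptions chosen for illustration, not Onyx's actual rule, and this canLoadModel signature is hypothetical. ProcessInfo.processInfo.physicalMemory is a real Foundation API.

```swift
import Foundation

/// Hypothetical pre-flight check (not Onyx's actual rule): allow a load only if
/// the model plus a fixed reserve fits in a fraction of physical RAM.
func canLoadModel(modelBytes: UInt64,
                  physicalMemoryBytes: UInt64,
                  reserveBytes: UInt64 = 1_073_741_824,    // assumed 1 GiB for OS + app
                  usableFraction: Double = 0.6) -> Bool {  // assumed memory budget
    let budget = UInt64(Double(physicalMemoryBytes) * usableFraction)
    return modelBytes + reserveBytes <= budget
}

let gib: UInt64 = 1_073_741_824
let deviceRAM = ProcessInfo.processInfo.physicalMemory
print(canLoadModel(modelBytes: 860 * 1_048_576, physicalMemoryBytes: deviceRAM))
print(canLoadModel(modelBytes: 4 * gib, physicalMemoryBytes: 6 * gib))  // false
```

Keeping the check a pure function of its inputs makes it easy to unit-test against each device class in the table above.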

🎛 Customisation reference

| Key | Storage | Default | Description |
| --- | --- | --- | --- |
| onyx.systemPrompt | UserDefaults | "You are a helpful AI assistant…" | System prompt injected before every conversation |
| onyx.logPrompts | UserDefaults | true | Log outgoing prompts to stdout (📨 [Onyx]) |

Change settings at runtime:

// System prompt
ChatProvider.shared.systemPrompt = "You are a pirate. Respond only in pirate speak."

// Silence debug logging
OnyxSettings.shared.logPrompts = false
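Because both keys live in UserDefaults, they can also be read directly. The sketch below assumes the keys are stored as a plain String and Bool (how OnyxSettings stores them internally is not shown here); SettingsSnapshot is a hypothetical type, and the fallback prompt is the truncated placeholder from the table, not the full default text.

```swift
import Foundation

/// Reads the two documented keys, falling back to the documented defaults.
/// Hypothetical type; the storage format is an assumption.
struct SettingsSnapshot: Equatable {
    var systemPrompt: String
    var logPrompts: Bool

    init(defaults: UserDefaults) {
        // Placeholder fallback: the full default prompt text is elided in the README.
        systemPrompt = defaults.string(forKey: "onyx.systemPrompt") ?? "You are a helpful AI assistant…"
        // object(forKey:) distinguishes "never set" from an explicit false,
        // which bool(forKey:) cannot do.
        logPrompts = (defaults.object(forKey: "onyx.logPrompts") as? Bool) ?? true
    }
}

// Use a separate suite so the example does not touch the app's real settings.
let defaults = UserDefaults(suiteName: "onyx.readme.example") ?? .standard
defaults.removeObject(forKey: "onyx.logPrompts")
print(SettingsSnapshot(defaults: defaults).logPrompts)
defaults.set(false, forKey: "onyx.logPrompts")
print(SettingsSnapshot(defaults: defaults).logPrompts)
defaults.removeObject(forKey: "onyx.logPrompts")
```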

Model downloads

Downloads come straight from public mlx-community repos on HuggingFace; no account, token, or API key is needed. Gated or private repos are not supported by this build.
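For a public repo, each file is reachable at a plain HTTPS URL with no authorization header. The sketch below builds such a URL using HuggingFace's resolve/<revision>/<file> layout; whether ChatModelDownloader constructs its URLs this way is an assumption, and huggingFaceFileURL, the repo ID, and the file name are illustrative.

```swift
import Foundation

/// Builds the public download URL for one file in a HuggingFace repo.
/// Hypothetical helper; not taken from ChatModelDownloader.
func huggingFaceFileURL(repoID: String, fileName: String, revision: String = "main") -> URL? {
    var components = URLComponents()
    components.scheme = "https"
    components.host = "huggingface.co"
    components.path = "/\(repoID)/resolve/\(revision)/\(fileName)"
    return components.url
}

if let url = huggingFaceFileURL(repoID: "mlx-community/my-model-4bit", fileName: "config.json") {
    print(url.absoluteString)
    // https://huggingface.co/mlx-community/my-model-4bit/resolve/main/config.json
}
```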


🀝 Contributing

  1. Fork the repo and create a feature branch: git checkout -b feature/my-improvement
  2. Make your changes. The easiest first contribution is adding a model: one line in ChatModelCatalog.swift.
  3. Run the simulator build to confirm it compiles cleanly (see Build commands).
  4. Open a pull request and describe what it adds and why it belongs in a skeleton.

All contributions are welcome: new model descriptors, UI improvements, documentation, and tests.


📄 License

Apache 2.0. See LICENSE.

Built on:


Onyx 0.1 beta · Made for people who want their AI conversations to stay on their phone. 💎
