Dynosol/imessage-lm

iMessage LM

Fine-tune a language model (Qwen3) on your iMessage history to create a chatbot that texts like you, and explore your message history along the way.

Overview

This project extracts your iMessage conversations, processes them into training pairs, and fine-tunes a Qwen3 model using Unsloth for fast, memory-efficient training.

Setup

# Clone and install dependencies
git clone <repo-url>
cd imessage-lm
uv sync

# Create .env file with your credentials
cp .env.example .env
# Edit .env with your Hugging Face token, repo name, and bot name

Environment Variables

Create a .env file:

HF_TOKEN=your_huggingface_token
HF_REPO=your_username/your_repo_name
BOT_NAME=YourBot
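
The notebooks need these three values at runtime. As a minimal sketch of how they could be read and validated using only the standard library: the helper names `parse_env_file` and `load_settings` are hypothetical, and the project may instead rely on a package such as python-dotenv.

```python
import os
from pathlib import Path

# The three variables listed in the .env example above.
REQUIRED_KEYS = ("HF_TOKEN", "HF_REPO", "BOT_NAME")


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_path: str = ".env") -> dict[str, str]:
    """Combine .env values with the process environment (environment wins)."""
    path = Path(env_path)
    file_values = parse_env_file(path.read_text()) if path.exists() else {}
    settings = {k: os.environ.get(k, file_values.get(k, "")) for k in REQUIRED_KEYS}
    missing = [k for k, v in settings.items() if not v]
    if missing:
        raise RuntimeError(f"Missing settings: {', '.join(missing)}")
    return settings
```

Failing early on a missing value avoids discovering a bad token only at the end of a training run, when the adapter is pushed.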

Workflow

1. Export iMessage Database

Run the dump_imessage_db.py script from imessage-dump to export your iMessage database, then place the exported file (chat.db) in imessage-sqlite-output-goes-here/.
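
Before moving on, it can help to confirm that the exported file is a readable SQLite database. The following is a small sketch, not part of the project: the `list_tables` helper is hypothetical, and the path assumes the file is named chat.db as in the project layout.

```python
import sqlite3
from pathlib import Path

# Assumed location and file name, based on the project layout.
DB_PATH = Path("imessage-sqlite-output-goes-here") / "chat.db"


def list_tables(db_path) -> list[str]:
    """Return the names of all tables in a SQLite file, sorted by name."""
    conn = sqlite3.connect(str(db_path))
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Check existence first: sqlite3.connect would create an empty file otherwise.
    if DB_PATH.exists():
        print(list_tables(DB_PATH))
    else:
        print(f"No database found at {DB_PATH}")
```

An empty list or an error here suggests the export did not complete or the file was placed in the wrong directory.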

2. Process Messages

Run finetune/clean.ipynb to:

  • Extract messages from the SQLite database
  • Filter out 2FA codes, automated messages, and noise
  • Create conversation pairs (their message -> your reply)
  • Export to finetune_data.jsonl
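
The filtering and pairing steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in clean.ipynb: it assumes the messages of one conversation are already available as chronological `(is_from_me, text)` tuples, the 2FA regex is a simple heuristic, and the `prompt`/`response` keys in the output are hypothetical.

```python
import json
import re

# Heuristic for 2FA-style messages: a 4-8 digit number on the same line as
# a word such as "code" or "verification", in either order.
CODE_PATTERN = re.compile(
    r"\b(code|verification|passcode|otp)\b.*\b\d{4,8}\b"
    r"|\b\d{4,8}\b.*\b(code|verification|passcode|otp)\b",
    re.IGNORECASE,
)


def looks_automated(text: str) -> bool:
    """Return True if the message resembles a verification-code text."""
    return bool(CODE_PATTERN.search(text))


def build_pairs(messages):
    """Turn chronological (is_from_me, text) tuples into training pairs.

    Consecutive messages from the same sender are merged into one turn.
    A pair is emitted whenever a turn of theirs is followed by a turn of yours.
    """
    turns = []  # list of (from_me, [texts])
    for is_from_me, text in messages:
        text = (text or "").strip()
        if not text or looks_automated(text):
            continue
        from_me = bool(is_from_me)
        if turns and turns[-1][0] == from_me:
            turns[-1][1].append(text)
        else:
            turns.append((from_me, [text]))

    pairs = []
    for prev, cur in zip(turns, turns[1:]):
        if not prev[0] and cur[0]:
            pairs.append(
                {"prompt": "\n".join(prev[1]), "response": "\n".join(cur[1])}
            )
    return pairs


def write_jsonl(pairs, path="finetune_data.jsonl"):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for pair in pairs:
            f.write(json.dumps(pair, ensure_ascii=False) + "\n")
```

Merging consecutive messages into a single turn reflects how people often send several short texts in a row; whether the notebook does the same is not stated here.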

3. Fine-tune Model

Run finetune/main.ipynb to:

  • Load Qwen3-1.7B with 4-bit quantization
  • Fine-tune with LoRA using your conversation data
  • Push the adapter to HuggingFace Hub
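
For orientation, loading the model and attaching a LoRA adapter with Unsloth might look like the sketch below. The Hub model id, sequence length, and LoRA hyperparameters are placeholder assumptions rather than values taken from main.ipynb, the function names are hypothetical, and the training loop itself is omitted.

```python
import os

# Assumptions for illustration only; main.ipynb may use different values.
MODEL_NAME = "Qwen/Qwen3-1.7B"
MAX_SEQ_LENGTH = 2048
LORA_KWARGS = dict(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    bias="none",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)


def load_model_with_lora():
    """Load the base model in 4-bit and wrap it with a LoRA adapter."""
    # Imported lazily because Unsloth requires a GPU-enabled environment.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=MODEL_NAME,
        max_seq_length=MAX_SEQ_LENGTH,
        load_in_4bit=True,
    )
    model = FastLanguageModel.get_peft_model(model, **LORA_KWARGS)
    return model, tokenizer


def push_adapter(model, tokenizer):
    """Upload the trained adapter and tokenizer using the .env settings."""
    repo = os.environ["HF_REPO"]
    token = os.environ["HF_TOKEN"]
    model.push_to_hub(repo, token=token)
    tokenizer.push_to_hub(repo, token=token)
```

Only the adapter weights are uploaded in this sketch, which keeps the pushed repository small compared with a full merged model.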

4. Inference

Run inference.ipynb to chat with your fine-tuned model.

Project Structure

imessage-lm/
├── finetune/
│   ├── clean.ipynb      # Data extraction and processing
│   ├── explore.ipynb    # Data exploration
│   └── main.ipynb       # Model fine-tuning
├── inference.ipynb      # Run inference with trained model
├── imessage-sqlite-output-goes-here/  # Place chat.db here
├── finetune_data.jsonl  # Generated training data (gitignored)
├── pyproject.toml
└── .env                 # Your credentials (gitignored)

License

MIT
