mboard8070/article-gen

Article-Gen

AI-powered multi-modal content generation and publishing platform that creates professional articles with AI-generated images and publishes them to Substack.

Case Study

Problem: Publishing AI-assisted articles usually means moving between separate tools for topic discovery, drafting, image generation, captioning, and CMS upload. Each handoff is manual glue work that has to be done before an article is ready for review.

What I built: Article-Gen is a Streamlit workflow that generates long-form articles, creates contextual Flux images, captions those images with LLaVA, and publishes the complete result to Substack as a draft.

Architecture: The app pulls or accepts article prompts, uses a local Gemma 2 9B text-generation route through Ollama, generates three Flux.1-dev images for the article body, captions images with LLaVA 1.5, and sends the assembled draft through a Substack publisher module.
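The flow described above can be sketched as a single function that takes each stage as a callable. This is a minimal illustration, not the code in app.py: the `Draft` type, the `build_draft` name, the rule that the first non-empty line is the title, and the image prompt wording are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Draft:
    """Assembled article ready to hand to a publisher (hypothetical type)."""
    title: str
    body: str
    images: List[bytes] = field(default_factory=list)
    captions: List[str] = field(default_factory=list)


def build_draft(
    prompt: str,
    generate_text: Callable[[str], str],
    generate_image: Callable[[str], bytes],
    caption_image: Callable[[bytes], str],
    image_count: int = 3,
) -> Draft:
    """Run text generation, then image generation, then captioning, in that order."""
    body = generate_text(prompt)
    # Assumption: the first non-empty line of the generated text is the title.
    title = next((ln.strip() for ln in body.splitlines() if ln.strip()), "Untitled")
    # Assumption: image prompts are derived from the title; the real prompts may differ.
    images = [
        generate_image(f"{title} (image {i + 1} of {image_count})")
        for i in range(image_count)
    ]
    # Each caption is produced from the image itself, mirroring the LLaVA step.
    captions = [caption_image(img) for img in images]
    return Draft(title=title, body=body, images=images, captions=captions)
```

Passing the stages in as callables keeps the sketch independent of Ollama, Flux, and LLaVA, so each stage can be swapped for a stub when testing the assembly logic.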

Technical depth: The repo includes Docker support, local and remote Flux app variants, Substack integration, custom prompt mode, training-data preparation, PEFT/LoRA-style fine-tuning scripts, and model merge utilities.

Proof: The README documents the full workflow, model parameters, Docker setup, local setup, required credentials, and publishing path from generated article to Substack draft.

Tradeoffs: The system targets draft creation rather than unattended publishing. Keeping Substack output as drafts preserves editorial review while still automating the repetitive generation and formatting work.

Features

  • Automatic Article Generation: Fetches trending news in the Science, Technology, and Business categories and generates professional articles of 800-1000 words
  • Custom Generation Mode: Provide your own prompts for full control over article and image content
  • AI Image Generation: Creates three contextual images per article (opening, middle, closing) using Flux.1-dev
  • Automatic Image Captioning: Generates professional captions for each image using the LLaVA vision model
  • Substack Integration: Publishes complete articles with images as drafts directly to your Substack publication
  • Fine-tuning Support: Train custom writing styles using LoRA adapters on the Gemma model
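As an illustration of the opening, middle, and closing image positions mentioned in the feature list, the sketch below interleaves three image references with a list of paragraphs. The function name `place_images` and the exact placement rule (before the first paragraph, at the midpoint, after the last paragraph) are assumptions; the app may position images differently.

```python
from typing import List


def place_images(paragraphs: List[str], image_refs: List[str]) -> List[str]:
    """Interleave three image references at the opening, middle, and end of an article."""
    if len(image_refs) != 3:
        raise ValueError("expected exactly three image references")
    opening, middle, closing = image_refs
    # Integer division puts the middle image before the second half of the text.
    mid = len(paragraphs) // 2
    return [opening, *paragraphs[:mid], middle, *paragraphs[mid:], closing]
```

For example, four paragraphs and three image references produce a seven-item sequence with the middle image between the second and third paragraphs.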

Tech Stack

  • Text Generation: Google Gemma 2 9B (via Ollama)
  • Image Generation: Flux.1-dev (Black Forest Labs)
  • Image Captioning: LLaVA 1.5 7B
  • UI Framework: Streamlit
  • News Source: Google News API
  • Fine-tuning: PEFT/LoRA with HuggingFace Transformers

Requirements

  • NVIDIA GPU with CUDA support
  • Docker with NVIDIA GPU support (recommended)
  • HuggingFace API token (for gated models)
  • Substack account (for publishing)

Installation

Using Docker (Recommended)

# Clone the repository
git clone <repo-url>
cd article-gen

# Configure environment variables
cp .env.example .env
# Edit .env with your credentials

# Start the application
docker-compose up -d

Access the application at http://localhost:8502

Local Installation

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the application
streamlit run app.py

Configuration

Create a .env file with the following variables:

HF_TOKEN=your_huggingface_token
SUBSTACK_PUBLICATION_URL=https://yourname.substack.com
SUBSTACK_EMAIL=your_email@gmail.com
SUBSTACK_PASSWORD=your_password
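A startup check along the following lines can catch a missing variable before any model is loaded. It is a sketch under assumptions: the `load_config` helper is hypothetical, and it treats all four variables as required, which may be stricter than the app itself (with cookie-based authentication the email and password might not be needed).

```python
import os
from typing import Dict, Mapping, Optional

# Variable names taken from the .env listing; treating all of them as required is an assumption.
REQUIRED_KEYS = (
    "HF_TOKEN",
    "SUBSTACK_PUBLICATION_URL",
    "SUBSTACK_EMAIL",
    "SUBSTACK_PASSWORD",
)


def load_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Calling `load_config()` with no argument reads from the process environment, so the values must already have been exported or loaded from .env by whatever mechanism the app uses.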

For cookie-based Substack authentication, export your browser cookies with the EditThisCookie extension and save them as substack_cookies.json.
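To show roughly what an exported cookie file can be turned into, the sketch below converts a JSON list of cookie objects into a name-to-value mapping. It assumes the export is a JSON array whose entries carry `domain`, `name`, and `value` fields, which is how EditThisCookie exports typically look; the `cookies_from_export` helper and the domain filter are illustrative and are not taken from substack_publisher.py.

```python
import json
from typing import Dict


def cookies_from_export(raw_json: str, domain_suffix: str = "substack.com") -> Dict[str, str]:
    """Turn a browser cookie export (a JSON list of objects) into a name -> value mapping."""
    entries = json.loads(raw_json)
    cookies: Dict[str, str] = {}
    for entry in entries:
        # Exported domains often carry a leading dot, e.g. ".substack.com".
        domain = entry.get("domain", "").lstrip(".")
        if domain == domain_suffix or domain.endswith("." + domain_suffix):
            cookies[entry["name"]] = entry["value"]
    return cookies
```

The resulting dictionary is in the shape that common HTTP clients accept for a cookies argument. Reading the file itself is left to the caller, for example by passing the text of substack_cookies.json as `raw_json`.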

Usage

Generate Articles

  1. Launch the Streamlit app
  2. Choose generation mode:
    • Automatic: Select a news topic and let the AI generate content
    • Custom: Enter your own article and image prompts
  3. Review the generated article and images
  4. Publish to Substack as a draft

Fine-tune Custom Writing Style

# Prepare training data
python prepare_training_data.py --input your_articles.json --output training_data.jsonl

# Train LoRA adapter
python train_lora.py --epochs 3 --batch-size 1 --lr 2e-4

# Merge adapter with base model
python merge_lora.py
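The data preparation step can be pictured as a conversion from a list of articles to one JSON object per line. The sketch below is not the logic of prepare_training_data.py: the input fields (`title`, `content`), the output fields (`prompt`, `completion`), and the prompt wording are all assumptions chosen for illustration.

```python
import json
from typing import Dict, Iterable, List


def to_jsonl_records(articles: Iterable[Dict[str, str]]) -> List[str]:
    """Convert article dicts into JSONL lines, skipping entries with no title or content."""
    lines: List[str] = []
    for article in articles:
        title = article.get("title", "").strip()
        content = article.get("content", "").strip()
        if not title or not content:
            continue  # incomplete entries would only add noise to training
        # Hypothetical record layout: an instruction-style prompt plus the article text.
        record = {"prompt": f"Write an article titled: {title}", "completion": content}
        lines.append(json.dumps(record, ensure_ascii=False))
    return lines
```

Writing the result with `"\n".join(lines)` yields a file in which every line parses as an independent JSON object, which is the usual meaning of the .jsonl extension.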

Project Structure

article-gen/
├── app.py                    # Main Streamlit app (remote Flux API)
├── app_local.py              # Streamlit app (local Flux model)
├── substack_publisher.py     # Substack integration module
├── train_lora.py             # LoRA fine-tuning script
├── prepare_training_data.py  # Training data preparation
├── merge_lora.py             # Merge LoRA with base model
├── Modelfile                 # Ollama model configuration
├── requirements.txt          # Python dependencies
├── Dockerfile                # Container configuration
└── docker-compose.yml        # Docker Compose setup

Model Parameters

Model        Parameter        Value
Gemma 2 9B   Max tokens       2048
Gemma 2 9B   Temperature      0.7
Flux.1-dev   Resolution       1024x1024
Flux.1-dev   Inference steps  30
LLaVA 1.5    Max new tokens   80
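For the Gemma settings in the table, the sketch below shows how they could be passed to Ollama's /api/generate endpoint. Several details are assumptions: the model tag `gemma2:9b` (the repo's Modelfile may define a differently named model), the mapping of "Max tokens" to Ollama's `num_predict` option, and the default host `http://localhost:11434`. The function names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict


def build_generate_payload(prompt: str, model: str = "gemma2:9b") -> Dict[str, Any]:
    """Build a non-streaming request body using the Gemma values from the table."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response instead of a token stream
        "options": {"temperature": 0.7, "num_predict": 2048},
    }


def generate(prompt: str, host: str = "http://localhost:11434") -> str:
    """Send the request to a running Ollama server and return the generated text."""
    data = json.dumps(build_generate_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        host + "/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Separating the payload builder from the network call makes the parameter mapping easy to inspect without a running Ollama server.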

License

MIT License
