fix: Streaming stalls mid-generation with Ollama models, requires user input to resume #50

@jstrockin

Bug Description

When using Ollama models, streaming output stalls mid-generation.
The agent loop pauses and waits for user input before continuing.
Typing anything (Enter or a short message) causes generation to resume.
Expected behaviour: streaming continues uninterrupted without user intervention.

Steps to Reproduce

1. Run kit with any Ollama model (tested: ollama/gemma4:latest, ollama/qwen3:14b)
2. Ask a question that produces a long or multi-step response
3. Observe that the output stops mid-generation
4. Type anything and press Enter; generation resumes

Relevant Code / Configuration

temperature: 0.6
top-k: 20
top-p: 0.95
max-tokens: 8192
thinking-level: "off"
max-steps: 50

Affected Component

Streaming / Ollama provider

Kit Version

dev (go install github.com/mark3labs/kit@latest, built 2026-05-25)

Additional Context

Ollama version: 0.30.6
Hardware: Apple M4 Mac mini
No keepalive or poll-interval flags available in kit --help.
Issue occurs consistently across multiple sessions and models.

Checklist

  • I've searched existing issues and this hasn't been reported yet
  • I've tested with the latest version of Kit

Metadata

Labels

bug: Something isn't working
