MLX engine: agent.py shows no response in chat mode (missing streaming support)

### Environment

- **Device**: Macmini m4
- **Python**: 3.x
- **Mode**: Option B (MLX + 9B)
- **Model**: `mlx-community/Qwen3.5-9B-MLX-4bit`

### Description

When using `mlx_engine.py` as the backend (Option B), `agent.py` detects the model correctly and intent classification works, but **no response is displayed** for regular chat messages.

The MLX server logs show all requests returning `200`, so the server is processing requests successfully — the issue is in the response format.

### Steps to Reproduce

1. Start MLX engine:
   ```bash
   python3 mlx/mlx_engine.py
   ```

2. In another terminal, start agent:
   ```bash
   python3 agent.py
   ```

3. Type any message (e.g., "hello") and press Enter

4. The spinner shows "classifying" → "thinking", then returns to the prompt with **no output**

### Expected Behavior

The model's response should be displayed, just like when using `llama-server` (Option A).

### Actual Behavior

- The prompt returns with no visible response
- The MLX engine logs show successful `200` responses:

```
"POST /v1/chat/completions HTTP/1.1" 200 -
"POST /v1/chat/completions HTTP/1.1" 200 -
"GET /props HTTP/1.1" 200 -
```

```

  🍎 mac code
  claude code, but it runs on your Mac for free

  model  Qwen3.5-9b-MLX  local
  tools  search · fetch · exec · files
  cost   $0.00/hr  Apple M4 Metal · localhost:8000

─────────────────────────────────────────────────────────
  type / to see all commands

  auto ? > hello




  auto ? > 
```

### Root Cause Analysis

After reading the code, I believe the issue is that `mlx_engine.py` does **not support streaming responses**.

In `agent.py`, chat responses go through `stream_llm()` (line 525), which sends `"stream": true` and expects **Server-Sent Events (SSE)** format:

```
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{"content":"!"}}]}
data: [DONE]
```

However, `mlx_engine.py`'s `_handle_chat()` (line 246) always returns a single JSON response, ignoring the `stream` parameter. When `agent.py` tries to parse the response as SSE, it gets nothing.

Non-streaming calls (like `classify_intent()` via `llm_call()`) work fine because they don't send `"stream": true`.

### Suggested Fix

Add SSE streaming support to `mlx_engine.py` when `stream=True` is requested, using `mlx_lm.stream_generate()` to yield tokens incrementally in the SSE format that `agent.py` expects.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MLX engine: agent.py shows no response in chat mode (missing streaming support) #2

Environment

Description

Steps to Reproduce

Expected Behavior

Actual Behavior

Root Cause Analysis

Suggested Fix

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

MLX engine: agent.py shows no response in chat mode (missing streaming support) #2

Description

Environment

Description

Steps to Reproduce

Expected Behavior

Actual Behavior

Root Cause Analysis

Suggested Fix

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions