Better support for multiclass/multilabel classifiers #360

Our current `SequenceClassificationResult` proto and the corresponding REST response shape assume single-label classification: we return a single `predicted_index` and an optional `predicted_label`. That is wrong for multilabel models, where multiple labels can be active simultaneously. Users with multilabel models currently get incorrect results with no error or warning.

**Changes needed:**

Refactor `SequenceClassificationResult` to drop `predicted_index` and `predicted_label` in favor of a `repeated ClassificationLabel` field, where each entry contains the index, label string, and score for that prediction. For single-label models the field will have one entry; for multilabel models it will contain every label whose sigmoid score cleared the threshold. Raw `logits` stay as-is.

```proto
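// One predicted label for a sequence.
message ClassificationLabel {
  uint32 index = 1;  // Position of the label in the model's output vector.
  string label = 2;  // Human-readable label string.
  float score = 3;   // Score for this prediction.
}

message SequenceClassificationResult {
  // Raw logits, unchanged from the current message.
  repeated float logits = 1;
  // One entry for single-label models; every label that cleared the
  // threshold for multilabel models.
  repeated ClassificationLabel labels = 2;
}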
```

The same fix should be applied to the REST response shape for consistency.
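As a sketch of what the REST body might look like after the change, the snippet below builds a response dictionary and serializes it to JSON. The key names simply mirror the proposed proto fields; that mapping, the `to_rest_response` helper, and the example label names and scores are assumptions, not the current API.

```python
import json


def to_rest_response(logits, labels):
    """Build a REST-style body from logits and (index, label, score) tuples.

    Hypothetical helper: key names are assumed to mirror the proto fields.
    """
    return {
        "logits": list(logits),
        "labels": [
            {"index": index, "label": label, "score": score}
            for index, label, score in labels
        ],
    }


# Illustrative multilabel result with two active labels.
body = to_rest_response(
    [2.0, -1.0, 0.5],
    [(0, "toxic", 0.88), (2, "insult", 0.62)],
)
print(json.dumps(body, indent=2))
```

A single-label model would produce the same shape with a one-element `labels` list, so clients can parse both cases with the same code path.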

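Note that this is a breaking change to the proto: clients that read `predicted_index` or `predicted_label` will need to switch to the new `labels` field.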
