Skip to content

server: add "schema" and validation#24150

Open
ngxson wants to merge 4 commits into
masterfrom
xsn/server_schema
Open

server: add "schema" and validation#24150
ngxson wants to merge 4 commits into
masterfrom
xsn/server_schema

Conversation

@ngxson
Copy link
Copy Markdown
Contributor

@ngxson ngxson commented Jun 4, 2026

Overview

Add the notion of server_schema that will define the input schema in a more systematic way.

This is actually something I wanted to do since a long time ago. Motivations for this proposal:

  1. To address bugs like GHSA-8947-pfff-2f3c due to lack of input validation
  2. Generalize the validation logic
  3. Inline documentation, can be exported to markdown in the future

TODO in follow-up PRs:

  • Migrate other schema (for example, chat completion) to using this system
  • Export schema to markdown
  • Allow linking arg.cpp <--> server-schema.cpp via an enum? (not sure how it will be useful)

Example of the code:

// old way
params.sampling.top_k              = json_value(data, "top_k",               defaults.sampling.top_k);

// new way
add((new field_num("top_k", params.sampling.top_k))
    ->set_limits(0, INT32_MAX)
    ->set_desc("Limit the next token selection to the K most probable tokens (0 = disabled)"));

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: to spot and fix mistakes during the migration

@ngxson ngxson requested a review from a team as a code owner June 4, 2026 23:34
@ServeurpersoCom
Copy link
Copy Markdown
Contributor

ServeurpersoCom commented Jun 5, 2026

Testing this on my pod. It seems to me that the OAI spec can have sampling values less than 0 even if it is rarely useful in practice, such as encouraging repetitions with frequency_penalty. ( https://developers.openai.com/api/reference/python/resources/completions/methods/create search for "-2")

auto schema = make_llama_cmpl_schema(params_base, params);

// eval all fields in the schema
for (const auto & f : schema) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type errors return the raw nlohmann message without the field name, maybe wrap the eval loop in a try/catch to prepend it so the client knows which param failed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I added it in the last commit(s), along with some corrections for the numerical limits. PTAL

example error message:

"message": "Field 'min_keep': Value must be between 0 <= value <= 2147483647, but got -100"

"message": "Field 'min_keep': [json.exception.type_error.302] type must be number, but is string",

Comment thread tools/server/server-schema.h Outdated
using field_handler = std::function<void(field_eval_context &, const json &)>;

struct field {
std::set<const char *> name;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::set<const char *> compares pointer addresses, not strings, so the alias order is not the insertion order. I got max_tokens winning over n_predict on my pod just by changing TU link order. A std::vector fixes it and it's a 2 line change, tested here.

Suggested change
std::set<const char *> name;
std::vector<const char *> name;

And :

-        name.insert(n);
+        name.push_back(n);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, nice catch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants