server: add "schema" and validation #24150
Testing this on my pod. It seems to me that the OAI spec allows sampling values below 0, even if that is rarely useful in practice, for example a negative frequency_penalty to encourage repetition (see https://developers.openai.com/api/reference/python/resources/completions/methods/create and search for "-2").
auto schema = make_llama_cmpl_schema(params_base, params);

// eval all fields in the schema
for (const auto & f : schema) {
Type errors return the raw nlohmann message without the field name. Maybe wrap the eval loop in a try/catch that prepends it, so the client knows which param failed?
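A minimal sketch of that suggestion, using only the standard library so it stays self-contained. The names `field_sketch`, `check_range`, and `eval_field` are hypothetical and not taken from the PR; the real code evaluates `json` values through `field_handler`, while this sketch uses a plain integer. The range-error wording is modeled on the example message reported later in this thread.

```cpp
#include <cstdint>
#include <functional>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a schema entry: a field name plus a handler
// that throws when the value is not acceptable.
struct field_sketch {
    std::string name;
    std::function<void(long long)> handler;
};

// Hypothetical range check; throws with a message that has no field name,
// similar to what a raw library error would look like.
inline void check_range(long long v, long long lo, long long hi) {
    if (v < lo || v > hi) {
        throw std::invalid_argument(
            "Value must be between " + std::to_string(lo) + " <= value <= " +
            std::to_string(hi) + ", but got " + std::to_string(v));
    }
}

// Evaluate one field. Any exception thrown by the handler is rethrown
// with the field name prepended, so the caller can tell which param failed.
inline void eval_field(const field_sketch & f, long long value) {
    try {
        f.handler(value);
    } catch (const std::exception & e) {
        throw std::invalid_argument("Field '" + f.name + "': " + e.what());
    }
}
```

Catching `std::exception` by reference also covers library exception types that derive from it, which is why a single catch clause is enough in this sketch.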
Yes, I added it in the latest commit(s), along with some corrections to the numerical limits. PTAL.
Example error messages:
"message": "Field 'min_keep': Value must be between 0 <= value <= 2147483647, but got -100"
"message": "Field 'min_keep': [json.exception.type_error.302] type must be number, but is string"
using field_handler = std::function<void(field_eval_context &, const json &)>;

struct field {
    std::set<const char *> name;
std::set<const char *> compares pointer addresses, not strings, so the alias order is not the insertion order. I got max_tokens winning over n_predict on my pod just by changing TU link order. A std::vector fixes it, and it's a two-line change (tested here).
- std::set<const char *> name;
+ std::vector<const char *> name;
And:
- name.insert(n);
+ name.push_back(n);
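To illustrate the point about ordering, here is a small standard-library sketch. `first_alias` and `cstr_less` are hypothetical helpers, not part of the PR, and the assumption that `n_predict` is declared before `max_tokens` comes from the comment above. A `std::vector` keeps declaration order. A set with a content-based comparator is deterministic but sorted alphabetically, so it still does not express alias priority. The default `std::set<const char *>` orders by address, which depends on where the literals end up in memory.

```cpp
#include <cstring>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: the highest-priority alias is the first one declared,
// which a std::vector preserves regardless of where the strings live.
inline std::string first_alias(const std::vector<const char *> & names) {
    return names.empty() ? std::string() : std::string(names.front());
}

// Comparator that orders C strings by content rather than by address.
// With it, a set is deterministic, but its order is alphabetical,
// not insertion order.
struct cstr_less {
    bool operator()(const char * a, const char * b) const {
        return std::strcmp(a, b) < 0;
    }
};

// Hypothetical helper: first element of a content-ordered set of aliases.
inline std::string first_sorted(const std::set<const char *, cstr_less> & names) {
    return names.empty() ? std::string() : std::string(*names.begin());
}
```

With aliases declared as `{"n_predict", "max_tokens"}`, the vector yields `n_predict` first, while the content-ordered set yields `max_tokens` first, so only the vector matches the intended priority.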
Overview
Add the notion of server_schema that will define the input schema in a more systematic way.
This is actually something I have wanted to do for a long time. Motivations for this proposal:
TODO in follow-up PRs:
Example of the code:
Requirements