support quant #540
Code Review
This pull request introduces support for quantization in the SGLang backend by adding a new sglang_quantization argument to the SGLangBackendArgs class. The changes include updating the argument parser, the factory method, and the keyword argument conversion. A review comment suggests accessing the new attribute directly in the from_args method to maintain consistency with existing fields, as the attribute is guaranteed to be present.
    if hasattr(args, "target_batch_size") and hasattr(args, "max_length")
    else None
    ),
    sglang_quantization=getattr(args, "sglang_quantization", None),
For consistency with the other sglang_* fields in this method (lines 189-200), you should access args.sglang_quantization directly. Since this argument is explicitly added to the parser in the add_args method of this class, it is guaranteed to be present in the args namespace when from_args is called.
Suggested change:
    - sglang_quantization=getattr(args, "sglang_quantization", None),
    + sglang_quantization=args.sglang_quantization,
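To illustrate why direct access is safe here, the following is a minimal sketch of the add_args / from_args pattern. It is not the project's code: the class name BackendArgsSketch, the flag spelling --sglang-quantization, and the value "fp8" are assumptions for illustration only. It shows that once the parser registers the argument with a default, the attribute always exists on the namespace, and that direct access fails loudly if the argument was never registered.

```python
import argparse
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackendArgsSketch:
    """Hypothetical stand-in for SGLangBackendArgs; not the project's real class."""

    sglang_quantization: Optional[str] = None

    @staticmethod
    def add_args(parser: argparse.ArgumentParser) -> None:
        # The flag spelling is an assumption. argparse derives the attribute
        # name `sglang_quantization` from it and stores the default when the
        # flag is omitted, so the attribute is always set after parsing.
        parser.add_argument("--sglang-quantization", type=str, default=None)

    @classmethod
    def from_args(cls, args: argparse.Namespace) -> "BackendArgsSketch":
        # Direct access raises AttributeError if add_args was never applied,
        # whereas getattr(args, "sglang_quantization", None) would silently
        # fall back to None and hide the wiring mistake.
        return cls(sglang_quantization=args.sglang_quantization)


parser = argparse.ArgumentParser()
BackendArgsSketch.add_args(parser)

# Flag omitted: the parser default is used.
unset = BackendArgsSketch.from_args(parser.parse_args([]))

# Flag provided: "fp8" is only an illustrative value.
chosen = BackendArgsSketch.from_args(
    parser.parse_args(["--sglang-quantization", "fp8"])
)
```

The getattr form would only be preferable if from_args could receive a namespace built by a parser that never called add_args, which the review comment says is not the case in this class.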