BUG: Duplicate bias addition in PseudoQuant4x16NoMasterFn causes incorrect inference output #16

In pseudoquant_linear_fns.py, the forward pass of the class PseudoQuant4x16NoMasterFn (lines 151-155) contains the following code:
```
        y = torch.nn.functional.linear(x_flat_dq, weight_dq, bias)

        y = y.unflatten(dim=0, sizes=x.shape[:-1])
        if bias is not None:
            y += bias
```
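torch.nn.functional.linear already adds the bias when one is passed, so the manual addition (y += bias) adds it a second time. The output is therefore offset by one extra copy of the bias, which gives incorrect inference output whenever the module has a bias term.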
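The arithmetic can be checked without PyTorch. The sketch below is only an illustration, not the repository's code: `linear` is a hypothetical pure-Python stand-in for torch.nn.functional.linear (computing x @ weight.T + bias), `forward_buggy` and `forward_fixed` are made-up names, and the dequantization and flatten/unflatten steps are left out. One possible fix, shown here, is simply to drop the manual `y += bias`.

```python
def linear(x, weight, bias=None):
    """Hypothetical stand-in for torch.nn.functional.linear: x @ weight.T + bias."""
    out = []
    for row in x:
        # One output value per weight row (i.e. per output feature).
        y = [sum(xi * wi for xi, wi in zip(row, w_row)) for w_row in weight]
        if bias is not None:
            y = [yi + bi for yi, bi in zip(y, bias)]
        out.append(y)
    return out


def forward_buggy(x, weight, bias=None):
    # Mirrors the reported pattern: bias goes into linear() AND is added again.
    y = linear(x, weight, bias)
    if bias is not None:
        y = [[yi + bi for yi, bi in zip(row, bias)] for row in y]
    return y


def forward_fixed(x, weight, bias=None):
    # Assumed fix: let linear() apply the bias exactly once.
    return linear(x, weight, bias)


x = [[1, 2], [3, 4]]
weight = [[1, 0], [0, 1], [1, 1]]
bias = [10, 20, 30]

buggy = forward_buggy(x, weight, bias)
fixed = forward_fixed(x, weight, bias)

# The buggy output differs from the correct one by exactly one extra bias.
extra = [[b - f for b, f in zip(b_row, f_row)] for b_row, f_row in zip(buggy, fixed)]
print(fixed)
print(buggy)
print(extra)
```

With bias set to None the two functions agree, which is consistent with the problem only showing up for modules that have a bias term.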
