BUG: Duplicate bias addition in PseudoQuant4x16NoMasterFn causes incorrect inference output #16

@manyizhang

Description

In the file pseudoquant_linear_fns.py, the forward pass of the class PseudoQuant4x16NoMasterFn (lines 151-155) contains the following code:

        y = torch.nn.functional.linear(x_flat_dq, weight_dq, bias)

        y = y.unflatten(dim=0, sizes=x.shape[:-1])
        if bias is not None:
            y += bias

torch.nn.functional.linear already adds the bias, so the manual addition (y += bias) adds it a second time. This produces incorrect inference output whenever the module has a bias term.
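A minimal sketch of the effect and one possible fix follows. It is not the repository's code: plain random tensors stand in for the dequantized x_flat_dq and weight_dq, and the names forward_buggy and forward_fixed are hypothetical. It only illustrates that the output is offset by exactly one extra bias, and that dropping the manual addition matches an ordinary linear layer.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Assumption: plain tensors stand in for the dequantized activations/weights.
x = torch.randn(2, 3, 8)        # (batch, seq, in_features)
weight = torch.randn(4, 8)      # (out_features, in_features)
bias = torch.randn(4)           # (out_features,)


def forward_buggy(x, weight, bias):
    """Mirrors the snippet in this issue: bias is applied twice."""
    x_flat = x.flatten(end_dim=-2)
    y = F.linear(x_flat, weight, bias)          # bias added here...
    y = y.unflatten(dim=0, sizes=x.shape[:-1])
    if bias is not None:
        y += bias                               # ...and again here
    return y


def forward_fixed(x, weight, bias):
    """Hypothetical fix: let F.linear handle the bias, no manual addition."""
    x_flat = x.flatten(end_dim=-2)
    y = F.linear(x_flat, weight, bias)
    return y.unflatten(dim=0, sizes=x.shape[:-1])


reference = F.linear(x, weight, bias)           # what a plain linear layer gives
buggy = forward_buggy(x, weight, bias)
fixed = forward_fixed(x, weight, bias)

# The buggy output differs from the reference by exactly one extra bias.
print(torch.allclose(buggy - reference, bias.expand_as(reference), atol=1e-5))
print(torch.allclose(fixed, reference, atol=1e-5))
```

An equivalent alternative would be to pass None as the bias to torch.nn.functional.linear and keep the manual addition; either way the bias should be applied exactly once.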
