compressed nan/inf representations used for new fp8? e4m3 e5m2 #16

via https://dblalock.substack.com/p/2022-9-18-arxiv-roundup-reliable

> [FP8 Formats for Deep Learning](https://arxiv.org/abs/2209.05433)
> A group of NVIDIA, ARM, and Intel researchers got fp8 training working reliably, with only a tiny accuracy loss compared to fp16.

> 8-bit floating point (FP8) binary interchange format consisting of two encodings - E4M3 (4-bit exponent and 3-bit mantissa) and E5M2 (5-bit exponent and 2-bit mantissa). While E5M2 follows IEEE 754 conventions for representation of special values, E4M3's dynamic range is extended by not representing infinities and having only one mantissa bit-pattern for NaNs. We demonstrate the efficacy of the FP8 format on a variety of image and language tasks, effectively matching the result quality achieved by 16-bit training sessions.
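
To make the encoding difference concrete, here is a minimal decoding sketch based only on the bit layouts described in the abstract. The function names (`decode_fp8`, `decode_e4m3`, `decode_e5m2`) and the `ieee_specials` flag are hypothetical, chosen for illustration; this is not code from the paper or from any library.

```python
import math


def decode_fp8(byte: int, exp_bits: int, man_bits: int, ieee_specials: bool) -> float:
    """Decode one FP8 byte laid out as sign | exponent | mantissa (illustrative helper)."""
    bias = (1 << (exp_bits - 1)) - 1          # 7 for E4M3, 15 for E5M2
    exp_max = (1 << exp_bits) - 1             # all-ones exponent field
    man_max = (1 << man_bits) - 1             # all-ones mantissa field

    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & exp_max
    man = byte & man_max

    if ieee_specials and exp == exp_max:
        # IEEE 754 convention: all-ones exponent is reserved for Inf and NaN.
        return sign * math.inf if man == 0 else math.nan
    if not ieee_specials and exp == exp_max and man == man_max:
        # E4M3 convention: only the all-ones mantissa pattern is NaN, no Inf.
        return math.nan
    if exp == 0:
        # Subnormal (or zero): no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e4m3(byte: int) -> float:
    return decode_fp8(byte, exp_bits=4, man_bits=3, ieee_specials=False)


def decode_e5m2(byte: int) -> float:
    return decode_fp8(byte, exp_bits=5, man_bits=2, ieee_specials=True)


print(decode_e4m3(0x7E))  # largest finite E4M3 value
print(decode_e4m3(0x7F))  # the single NaN mantissa pattern
print(decode_e5m2(0x7B))  # largest finite E5M2 value
print(decode_e5m2(0x7C))  # +Inf, as in IEEE 754
```

Under this layout the largest finite E4M3 value is 448; if the whole all-ones exponent were reserved for specials, as IEEE 754 does, the maximum would be 240.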

> reducing the number of NaN/Inf encodings in fp1-4-3 down to just one bitstring
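
As a rough check on how many encodings this saves, the sketch below counts the bit patterns spent on NaN and Inf under each convention. The helper name `count_specials` is hypothetical, and the counts include both sign bits, so "one mantissa bit-pattern" shows up as two bytes.

```python
def count_specials(exp_bits: int, man_bits: int, ieee_specials: bool) -> dict:
    """Count the 8-bit patterns that encode NaN or Inf (illustrative helper)."""
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1
    counts = {"nan": 0, "inf": 0}
    for byte in range(256):
        exp = (byte >> man_bits) & exp_max
        man = byte & man_max
        if exp != exp_max:
            continue  # only the all-ones exponent can hold special values
        if ieee_specials:
            counts["inf" if man == 0 else "nan"] += 1
        elif man == man_max:
            counts["nan"] += 1  # E4M3 convention: all-ones mantissa only
    return counts


print("E5M2 (IEEE-style):     ", count_specials(5, 2, ieee_specials=True))
print("E4M3 (as in the paper):", count_specials(4, 3, ieee_specials=False))
print("E4M3 if IEEE-style:    ", count_specials(4, 3, ieee_specials=True))
```

With IEEE-style specials, E4M3 would spend 16 of its 256 encodings on NaN/Inf; keeping only the all-ones mantissa as NaN frees 14 of them for finite values.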

> how much accuracy loss does this approach cause? They find that, across a huge array of models and tasks, the consistent answer is: not much, around 0 to 0.3% accuracy/BLEU/perplexity.
