Crate half

A crate that provides support for half-precision 16-bit floating point types.

This crate provides the f16 type, which is an implementation of the IEEE 754-2008 standard binary16 a.k.a. half floating point type. This 16-bit floating point type is intended for efficient storage where the full range and precision of a larger floating point value are not required. This is especially useful for image storage formats.
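To make the binary16 layout concrete, here is a minimal sketch of decoding raw bits (1 sign bit, 5 exponent bits with a bias of 15, and 10 stored fraction bits) into an f32. It uses only the standard library. The names f16_bits_to_f32 and pow2 are hypothetical helpers for illustration and are not part of this crate's API.

```rust
/// Exact power of two as an f32, built from its bit pattern.
/// Only valid for exponents in -126..=127 (hypothetical helper).
fn pow2(exp: i32) -> f32 {
    f32::from_bits(((exp + 127) as u32) << 23)
}

/// Decode raw IEEE 754 binary16 bits into an f32 (illustrative sketch).
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = if bits & 0x8000 != 0 { -1.0f32 } else { 1.0f32 };
    let exponent = ((bits >> 10) & 0x1F) as i32; // 5 exponent bits, bias 15
    let fraction = (bits & 0x03FF) as f32; // 10 stored fraction bits

    match exponent {
        // Zero and subnormals: no implicit leading 1, value = fraction * 2^-24.
        0 => sign * fraction * pow2(-24),
        // All-ones exponent: infinity when the fraction is zero, otherwise NaN.
        0x1F => {
            if fraction == 0.0 {
                sign * f32::INFINITY
            } else {
                f32::NAN
            }
        }
        // Normal numbers: implicit leading 1 in front of the fraction.
        e => sign * (1.0 + fraction / 1024.0) * pow2(e - 15),
    }
}
```

For example, the bit pattern 0x3C00 decodes to 1.0 and 0x7BFF decodes to 65504.0, the largest finite binary16 value.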

This crate also provides a bf16 type, an alternative 16-bit floating point format. The bfloat16 format is a truncated IEEE 754 standard binary32 float that preserves the exponent to allow the same range as f32 but with only 8 bits of precision (instead of 11 bits for f16). See the bf16 type for details.
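The "truncated binary32" relationship can be sketched with the standard library alone: the bfloat16 bit pattern is the upper 16 bits of the f32 bit pattern. The helper names below are hypothetical, and the sketch uses plain truncation for simplicity; a production conversion would typically round rather than truncate, so this should not be read as this crate's implementation.

```rust
/// Keep the sign, the 8 exponent bits, and the top 7 fraction bits of an f32.
/// Hypothetical helper: truncates instead of rounding.
fn f32_to_bf16_bits_truncate(value: f32) -> u16 {
    (value.to_bits() >> 16) as u16
}

/// Widen bfloat16 bits back to f32 by filling the low 16 bits with zeros.
fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}
```

Because the exponent field is untouched, a value such as 3.0e38, far beyond the f16 maximum of 65504, stays finite after the round trip and only loses low-order fraction bits.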

Because f16 and bf16 are primarily for efficient storage, floating point operations such as addition and multiplication are not implemented by hardware. While this crate does provide the appropriate trait implementations for basic operations, each one converts the value to f32 before performing the operation and then converts back afterward. When performing complex arithmetic, manually convert to f32 beforehand and back afterward to avoid repeated conversions for each operation.
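The advice to convert once can be illustrated with a small stand-in type. Bf16Sketch below is a hypothetical storage-only type (bfloat16 by truncation) written against the standard library only; it is not this crate's bf16, but the same pattern of widening to f32, computing, and narrowing once at the end is assumed to carry over.

```rust
/// Hypothetical 16-bit storage type: the top 16 bits of an f32 (truncating).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Bf16Sketch(u16);

impl Bf16Sketch {
    fn from_f32(value: f32) -> Self {
        Bf16Sketch((value.to_bits() >> 16) as u16)
    }

    fn to_f32(self) -> f32 {
        f32::from_bits((self.0 as u32) << 16)
    }
}

/// Narrow back to 16 bits after every addition: two conversions per step.
fn sum_per_operation(values: &[Bf16Sketch]) -> Bf16Sketch {
    let mut acc = Bf16Sketch::from_f32(0.0);
    for v in values {
        acc = Bf16Sketch::from_f32(acc.to_f32() + v.to_f32());
    }
    acc
}

/// Widen each input once, accumulate in f32, and narrow only at the end.
fn sum_in_f32(values: &[Bf16Sketch]) -> Bf16Sketch {
    let total: f32 = values.iter().map(|v| v.to_f32()).sum();
    Bf16Sketch::from_f32(total)
}
```

Summing 300 copies of 1.0 shows the difference in this sketch: the per-operation version stalls at 256.0, because 257 is not representable with an 8-bit significand, while the f32 accumulator yields 300.0. Besides saving conversions, keeping intermediates in f32 also avoids compounding the narrow format's rounding error.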

This crate also provides a slice module for zero-copy in-place conversions of u16 slices to both f16 and bf16, as well as efficient vectorized conversions of larger buffers of floating point values to and from these half formats.
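As a rough idea of what a zero-copy reinterpretation of a u16 slice involves, the following sketch uses a hypothetical HalfBits wrapper marked #[repr(transparent)], which guarantees the same size and alignment as u16. It relies only on the standard library and is an assumption about the general technique, not a description of how the slice module is implemented.

```rust
/// Hypothetical 16-bit float wrapper with the same layout as u16.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct HalfBits(u16);

/// View a slice of raw u16 bits as a slice of HalfBits without copying.
fn reinterpret(bits: &[u16]) -> &[HalfBits] {
    // SAFETY: HalfBits is repr(transparent) over u16, so both element types
    // have identical size, alignment, and valid bit patterns. The returned
    // slice borrows from `bits`, so it cannot outlive the original data.
    unsafe { std::slice::from_raw_parts(bits.as_ptr() as *const HalfBits, bits.len()) }
}
```

The returned slice points at the same memory as the input, so no per-element work is done regardless of the buffer size.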

The crate uses #[no_std] by default, so it can be used in embedded environments without the Rust std library. A std feature that enables support for the standard library is available; see the Cargo Features section below.

A prelude module is provided for easy importing of available utility traits.

Cargo Features

This crate supports a number of optional cargo features. None of these features are enabled by default, even std.

  • use-intrinsics – Use core::arch hardware intrinsics for f16 and bf16 conversions if available on the compiler target. This feature currently only works on nightly Rust until the corresponding intrinsics are stabilized.

    When this feature is enabled and the hardware supports it, the functions and traits in the slice module will use vectorized SIMD instructions for increased efficiency.

    By default, without this feature, conversions are done only in software, which will also be the fallback if the target does not have hardware support. Note that without the std feature enabled, no runtime CPU feature detection is used, so the hardware support is only compiled if the compiler target supports the CPU feature.

  • alloc – Enable use of the alloc crate when not using the std library.

    Among other functions, this enables the vec module, which contains zero-copy conversions for the Vec type. This allows fast conversion between raw Vec<u16> bits and Vec<f16> or Vec<bf16> arrays, and vice versa.

  • std – Enable features that depend on the Rust std library. This also enables the alloc feature automatically.

    Enabling the std feature also enables runtime CPU feature detection when the use-intrinsics feature is also enabled. Without this feature detection, intrinsics are only used when the compiler target supports the target feature.

  • serde – Adds support for the serde crate by implementing Serialize and Deserialize traits for both f16 and bf16.

  • num-traits – Adds support for the num-traits crate by implementing ToPrimitive, FromPrimitive, AsPrimitive, Num, Float, FloatCore, and Bounded traits for both f16 and bf16.

  • bytemuck – Adds support for the bytemuck crate by implementing Zeroable and Pod traits for both f16 and bf16.

  • zerocopy – Adds support for the zerocopy crate by implementing AsBytes and FromBytes traits for both f16 and bf16.
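The difference between compile-time target support and runtime CPU feature detection, as described for the use-intrinsics and std features above, can be sketched as follows. This assumes the x86 F16C extension as the example hardware feature and uses hypothetical function names; how this crate actually selects its code paths is not shown here.

```rust
/// True only if the binary was compiled with the F16C target feature enabled,
/// for example via `-C target-feature=+f16c`. Works without std.
fn f16c_at_compile_time() -> bool {
    cfg!(target_feature = "f16c")
}

/// True if the running CPU reports F16C. Runtime detection requires std.
fn f16c_at_runtime() -> bool {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("f16c") {
            return true;
        }
    }
    // On other architectures, fall back to the compile-time answer.
    f16c_at_compile_time()
}
```

A dispatcher built on this idea would call a hardware-accelerated conversion when the check succeeds and a software conversion otherwise.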

Modules

  • prelude – A collection of the most used items and traits in this crate for easy importing.
  • slice – Contains utility functions and traits to convert between slices of u16 bits and f16 or bf16 numbers.
  • vec – Contains utility functions and traits to convert between vectors of u16 bits and f16 or bf16 vectors.

Structs

  • bf16 – A 16-bit floating point type implementing the bfloat16 format.
  • f16 – A 16-bit floating point type implementing the IEEE 754-2008 standard binary16 a.k.a. half format.