Float16

16-bit half-precision floating-point number.

Usage

var Float16 = require( '@stdlib/number/float16/ctor' );

Float16( value )

16-bit half-precision floating-point number constructor.

var x = new Float16( 5.0 );
// returns <Float16>

Properties

Float16.name

Static property returning the constructor name.

var str = Float16.name;
// returns 'Float16'

Float16.BYTES_PER_ELEMENT

Size (in bytes) of the underlying value.

var nbytes = Float16.BYTES_PER_ELEMENT;
// returns 2

Float16.prototype.BYTES_PER_ELEMENT

Size (in bytes) of the underlying value.

var x = new Float16( 5.0 );

var nbytes = x.BYTES_PER_ELEMENT;
// returns 2

Instance

A Float16 instance has the following properties...

value

A read-only property returning the underlying value as a number.

var x = new Float16( 5.0 );

var v = x.value;
// returns 5.0

Methods

Accessor Methods

These methods do not mutate a Float16 instance and, instead, return a half-precision floating-point number representation.

Float16.prototype.toString()

Returns a string representation of a Float16 instance.

var x = new Float16( 5.0 );
var str = x.toString();
// returns '5'

x = new Float16( -3.14 );
str = x.toString();
// returns '-3.140625'

Float16.prototype.toJSON()

Returns a JSON representation of a Float16 instance. JSON.stringify() implicitly calls this method when stringifying a Float16 instance.

var x = new Float16( 5.0 );

var o = x.toJSON();
/*
  {
    "type": "Float16",
    "value": 5.0
  }
*/

To revive a Float16 number from a JSON string, see @stdlib/number/float16/reviver.

Float16.prototype.valueOf()

Converts a Float16 instance to a primitive value.

var x = new Float16( 5.0 );
var v = x.valueOf();
// returns 5.0

x = new Float16( 3.14 );
v = x.valueOf();
// returns 3.140625
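
The value `3.14` comes back as `3.140625` because it is rounded to the nearest value representable in half-precision. The following is a minimal sketch of that rounding in plain JavaScript, assuming round-to-nearest with ties to even. The `roundToFloat16` helper is hypothetical and written only for illustration; it is not part of this package and is not necessarily how the constructor is implemented.

```javascript
/**
* Rounds a number to the nearest half-precision value (ties to even).
*
* NOTE: hypothetical helper for illustration only; not part of the package.
*/
function roundToFloat16( x ) {
	var MIN_NORMAL = 6.103515625e-5; // 2^-14, smallest normal half-precision value
	var MAX = 65504.0; // largest finite half-precision value
	var sign;
	var ulp;
	var ax;
	var p;
	var q;
	var f;
	var r;

	// NaN, signed zeros, and infinities pass through unchanged:
	if ( x !== x || x === 0.0 || x === Infinity || x === -Infinity ) {
		return x;
	}
	sign = ( x < 0.0 ) ? -1.0 : 1.0;
	ax = sign * x;

	// Find the power of two `p` such that p <= ax < 2p (clamped at 2^-14):
	p = 1.0;
	while ( ax >= 2.0 * p ) {
		p *= 2.0;
	}
	while ( ax < p && p > MIN_NORMAL ) {
		p /= 2.0;
	}
	// With 10 significand bits, representable values near `ax` are spaced p/1024 apart:
	ulp = p / 1024.0;

	// Round the scaled value to an integer, resolving ties to the even integer:
	q = ax / ulp;
	f = Math.floor( q );
	r = q - f;
	if ( r > 0.5 || ( r === 0.5 && f % 2 === 1 ) ) {
		f += 1;
	}
	ax = f * ulp;

	// Values which round past the largest finite value overflow to infinity:
	if ( ax > MAX ) {
		return sign * Infinity;
	}
	return sign * ax;
}

var v = roundToFloat16( 3.14 );
// returns 3.140625

v = roundToFloat16( 5.0 );
// returns 5.0
```

Near `3.14`, representable values are spaced `2^-9` (about `0.00195`) apart, which is why only three to four decimal digits survive.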

Notes

  • The underlying value is stored as an IEEE 754 half-precision floating-point number with 1 sign bit, 5 exponent bits, and 10 significand bits.
  • A half-precision floating-point number has a range of approximately ±6.55e4 and a precision of about 3-4 decimal digits.
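
To make the bit layout concrete, here is a minimal sketch which decodes a 16-bit pattern into a number by splitting it into the sign, exponent, and significand fields described above. The `float16FromBits` helper is hypothetical and written in plain JavaScript for illustration; it is not part of this package's JavaScript API.

```javascript
/**
* Decodes a 16-bit pattern (1 sign bit, 5 exponent bits, 10 significand bits) to a number.
*
* NOTE: hypothetical helper for illustration only; not part of the package.
*/
function float16FromBits( bits ) {
	var sign;
	var frac;
	var exp;

	bits &= 0xffff; // keep only the low 16 bits
	sign = ( bits >>> 15 ) ? -1.0 : 1.0; // bit 15
	exp = ( bits >>> 10 ) & 0x1f; // bits 10-14 (biased by 15)
	frac = bits & 0x3ff; // bits 0-9

	if ( exp === 0 ) {
		// Zero and subnormal values: no implicit leading 1, fixed scale of 2^-24
		return sign * ( frac / 16777216.0 );
	}
	if ( exp === 31 ) {
		// All exponent bits set: infinity (zero fraction) or NaN (nonzero fraction)
		return ( frac === 0 ) ? sign * Infinity : NaN;
	}
	// Normal values: implicit leading 1 and scale 2^(exp-15), computed as 2^exp / 2^15
	return sign * ( 1.0 + ( frac / 1024.0 ) ) * ( ( 1 << exp ) / 32768.0 );
}

var v = float16FromBits( 0x4500 );
// returns 5.0

v = float16FromBits( 51648 );
// returns -11.5

v = float16FromBits( 0x7bff );
// returns 65504.0
```

The last call decodes the largest finite bit pattern, which is where the approximate range of ±6.55e4 comes from.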

Examples

var Float16 = require( '@stdlib/number/float16/ctor' );

var x = new Float16( 3.14 );

console.log( 'type: %s', typeof x );
// => 'type: object'

console.log( 'str: %s', x );
// => 'str: 3.140625'

console.log( 'value: %d', x.value );
// => 'value: 3.140625'

console.log( 'JSON: %s', JSON.stringify( x ) );
// => 'JSON: {"type":"Float16","value":3.140625}'

C APIs

Usage

#include "stdlib/number/float16/ctor.h"

stdlib_float16_t

An opaque type definition for a half-precision floating-point number.

stdlib_float16_t v = stdlib_float16_from_bits( 51648 );

stdlib_float16_bits_t

An opaque type definition for a union for accessing the underlying binary representation of a half-precision floating-point number.

#include <stdint.h>

stdlib_float16_t x = stdlib_float16_from_bits( 51648 );

stdlib_float16_bits_t y;
y.value = x;

uint16_t bits = y.bits;
// returns 51648

The union has the following members:

  • value: stdlib_float16_t half-precision floating-point number.
  • bits: uint16_t binary representation.

The union allows "type punning"; however, while this is (more or less) defined in C99, reading a union member other than the one last written is undefined behavior in standard C++, although some compilers support it as an extension. For more robust conversion, prefer the explicit helpers for converting to and from a binary representation.

stdlib_float16_from_bits( bits )

Converts a 16-bit binary representation to a half-precision floating-point number.

stdlib_float16_t v = stdlib_float16_from_bits( 51648 ); // => -11.5

The function accepts the following arguments:

  • bits: [in] uint16_t 16-bit integer corresponding to a binary representation.

stdlib_float16_to_bits( x )

Converts a half-precision floating-point number to a 16-bit binary representation.

#include <stdint.h>

stdlib_float16_t v = stdlib_float16_from_bits( 51648 ); // => -11.5

uint16_t bits = stdlib_float16_to_bits( v );
// returns 51648

The function accepts the following arguments:

  • x: [in] stdlib_float16_t half-precision floating-point number.

Notes

  • The stdlib_float16_t type should be treated as a storage and interchange type. Native hardware support for mathematical functions operating on half-precision floating-point numbers varies. As a consequence, for most operations, one should first promote to single-precision (i.e., float), perform the desired operation, and then downcast back to half-precision.

Examples

#include "stdlib/number/float16/ctor.h"
#include <stdint.h>
#include <stdio.h>

int main( void ) {
  const stdlib_float16_t x[] = {
    stdlib_float16_from_bits( 51648 ), // -11.5
    stdlib_float16_from_bits( 18880 )  // 11.5
  };

  int i;
  for ( i = 0; i < 2; i++ ) {
    printf( "%d\n", stdlib_float16_to_bits( x[ i ] ) );
  }
}