numbarrow.core.is_null

Overview

Arrow uses a packed validity bitmap to track which elements in an array are non-null. Each bit corresponds to one element: bit=1 means valid, bit=0 means null. Bits are packed LSB-first into uint8 bytes.

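This module provides functions that read such bitmaps, built around a Numba @njit compiled check of whether a given element index is null. Alongside is_null, it documents is_null_struct, which applies the same check across the struct and field layers of a StructArray, and unpack_booleans, which expands bit-packed data into a boolean array.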

Module

Null detection for Apache Arrow validity bitmaps.

Arrow uses a packed bitmap to track which elements in an array are valid (non-null). Each bit corresponds to one element: bit=1 means valid, bit=0 means null. Bits are packed LSB-first into uint8 bytes — element i lives at byte i // 8, bit position i % 8 within that byte.
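
To make the layout concrete, here is a minimal pure-Python sketch of LSB-first packing. The helper name pack_validity is hypothetical and is not part of numbarrow; it only illustrates the rule that element i maps to byte i // 8, bit i % 8.

```python
def pack_validity(valid_flags):
    """Pack booleans into LSB-first bytes (hypothetical helper, not in numbarrow)."""
    flags = list(valid_flags)
    packed = bytearray((len(flags) + 7) // 8)  # one byte per 8 elements, rounded up
    for i, valid in enumerate(flags):
        if valid:
            # Element i lives at byte i // 8, bit position i % 8.
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


# Ten elements; elements 1 and 9 are null, all others are valid.
flags = [True, False, True, True, True, True, True, True, True, False]
bitmap = pack_validity(flags)
print([format(b, "08b") for b in bitmap])  # ['11111101', '00000001']
```

Note that the binary strings print the most significant bit first, so element 0 is the rightmost digit of the first byte.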

numbarrow.core.is_null.is_null(index_: int, bitmap: ndarray) → bool

Check whether element index_ is null according to bitmap.

Arrow validity bitmaps store one bit per element, packed LSB-first into uint8 bytes. A set bit (1) means valid; a cleared bit (0) means null.

Parameters:
  • index_ – zero-based element index

  • bitmap – uint8 array containing the packed validity bitmap

Returns:

True if the element is null (bit is 0), False if valid (bit is 1)
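
The check described above can be sketched in plain Python as follows. This is a reference illustration of the documented behaviour, not the library's compiled implementation; the name is_null_reference is hypothetical, and it accepts any indexable sequence of byte values (bytes here, though a uint8 array should index the same way).

```python
def is_null_reference(index_, bitmap):
    """Return True if the validity bit for element index_ is 0 (illustrative only)."""
    byte = bitmap[index_ // 8]          # byte holding the element's bit
    bit = (byte >> (index_ % 8)) & 1    # LSB-first position within that byte
    return bit == 0


# Ten elements; elements 1 and 9 are null.
bitmap = bytes([0b11111101, 0b00000001])
print([i for i in range(10) if is_null_reference(i, bitmap)])  # [1, 9]
```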

numbarrow.core.is_null.is_null_struct(index_, struct_bitmap, field_bitmap)

Check whether a struct field value is null at either the struct or field layer.

Arrow StructArrays carry a validity bitmap for the struct itself (is this entire row null?) independent of each child field’s bitmap (is this particular field null within a non-null row?). A value is null if either layer marks it as null.

Parameters:
  • index_ – zero-based element index

  • struct_bitmap – uint8 packed bitmap for struct-level validity, or None

  • field_bitmap – uint8 packed bitmap for field-level validity, or None

Returns:

True if the value is null at either layer, False otherwise
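
A minimal sketch of the two-layer check, written in plain Python. The names below are hypothetical, and the handling of None is an assumption: here a missing bitmap is treated as "no nulls at that layer", which matches the Arrow convention that an array without nulls may omit its validity bitmap. The library's actual behaviour for None may differ.

```python
def _bit_is_clear(index_, bitmap):
    """True if the LSB-first bit for index_ is 0 (hypothetical helper)."""
    return ((bitmap[index_ // 8] >> (index_ % 8)) & 1) == 0


def is_null_struct_reference(index_, struct_bitmap, field_bitmap):
    """Null if either layer marks the value as null (illustrative only)."""
    # Assumption: None means the layer has no validity bitmap, so nothing is null there.
    if struct_bitmap is not None and _bit_is_clear(index_, struct_bitmap):
        return True
    if field_bitmap is not None and _bit_is_clear(index_, field_bitmap):
        return True
    return False


struct_bitmap = bytes([0b00001011])  # rows 0, 1, 3 valid; row 2 is a null struct
field_bitmap = bytes([0b00001101])   # field valid in rows 0, 2, 3; null in row 1
print([is_null_struct_reference(i, struct_bitmap, field_bitmap) for i in range(4)])
# [False, True, True, False]
```

Row 1 is null because of the field layer and row 2 because of the struct layer, even though the field bit for row 2 is set.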

numbarrow.core.is_null.unpack_booleans(offset: int, length: int, packed_data: ndarray) → ndarray

Unpack bit-packed boolean data into a boolean array.

Parameters:
  • offset – bit offset into packed_data to start reading

  • length – number of boolean values to extract

  • packed_data – uint8 array containing LSB-first packed bits

Returns:

boolean array with one element per extracted bit (length elements in total)
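
The unpacking can be illustrated with a short pure-Python sketch. The name unpack_booleans_reference is hypothetical, and for simplicity it returns a list rather than an ndarray; it assumes offset counts bits from the start of packed_data, as the parameter description states.

```python
def unpack_booleans_reference(offset, length, packed_data):
    """Expand LSB-first packed bits into a list of booleans (illustrative only)."""
    out = []
    for i in range(offset, offset + length):
        # Bit i lives at byte i // 8, position i % 8 within that byte.
        out.append(bool((packed_data[i // 8] >> (i % 8)) & 1))
    return out


packed = bytes([0b10110100, 0b00000011])
# Read 7 bits starting at bit offset 3, which crosses the byte boundary.
print(unpack_booleans_reference(3, 7, packed))
# [False, True, True, False, True, True, True]
```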