numbox.utils
numbox.utils.highlevel
Dynamically defining StructRef
Defining numba StructRef requires writing a lot of boilerplate code.
A utility for concise definition of StructRef types that supports caching
is provided in numbox.utils.highlevel.make_structref(). To use it,
define the type class in a separate module, for example type_classes.py:
from numba.experimental.structref import register
from numba.core.types import StructRef


@register
class DataStructTypeClass(StructRef):
    pass
Then in a different module main.py define:
from numba.core.types import float32, unicode_type
from numpy import isclose
from numbox.utils.highlevel import make_structref
from type_classes import DataStructTypeClass


def derive_output(struct_):
    if struct_.control == "double":
        return struct_.value * 2
    else:
        return struct_.value


data_struct = make_structref(
    "DataStruct",
    {"value": float32, "control": unicode_type},
    DataStructTypeClass,
    struct_methods={
        "derive_output": derive_output
    }
)

if __name__ == "__main__":
    data_1 = data_struct(3.14, "double")
    data_2 = data_struct(2.17, "something else")
    assert isclose(data_1.derive_output(), 6.28)
    assert isclose(data_2.derive_output(), 2.17)
- numbox.utils.highlevel.cres(sig, **kwargs)[source]
Returns a Python proxy to a FunctionType rather than the CPUDispatcher returned by njit.
- numbox.utils.highlevel.cres_if_available(lib, sig, **kwargs)[source]
Like cres(sig, **kwargs), but stubs out the wrapper if the C symbol matching func.__name__ is absent from lib.
Use for binding sets that target multiple library versions where some symbols may only exist in newer releases. Callers get a stub that raises NotImplementedError instead of a confusing LLVM link error at call time.
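The stubbing behaviour can be illustrated in plain Python. This is only a sketch of the idea, not numbox's implementation: `bind_if_available`, its `wrap` parameter, and `fake_lib` are hypothetical names, and the sketch assumes the library object exposes its symbols as attributes (as a `ctypes.CDLL` does).

```python
import functools
import types


def bind_if_available(lib, wrap):
    """Hypothetical decorator factory: apply `wrap` only if `lib` has the symbol."""
    def decorator(func):
        if hasattr(lib, func.__name__):
            # Symbol is present: build the real wrapper.
            return wrap(func)

        @functools.wraps(func)
        def stub(*args, **kwargs):
            # Symbol is absent: fail with a clear error when called.
            raise NotImplementedError(
                f"symbol {func.__name__!r} is not available in this library build"
            )
        return stub
    return decorator


# Stand-in for a loaded library that only exports `old_symbol`.
fake_lib = types.SimpleNamespace(old_symbol=object())


@bind_if_available(fake_lib, wrap=lambda f: f)
def old_symbol(x):
    return x + 1


@bind_if_available(fake_lib, wrap=lambda f: f)
def new_symbol(x):
    return x + 2
```

Here `old_symbol` is bound normally, while calling `new_symbol` raises NotImplementedError because `fake_lib` has no attribute of that name.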
- numbox.utils.highlevel.make_structref(struct_name: str, struct_fields: Iterable[str] | dict[str, Type], struct_type_class: type | Type, *, struct_methods: dict[str, Callable] | None = None, jit_options: dict | None = None)[source]
Makes structure type with struct_name and struct_fields from the StructRef type class.
A unique struct_type_class must be provided for each structref. If code that uses the created struct type is to be cached, the type class(es) must be defined in a Python module that is not executed. (The same requirement applies even when the full StructRef definition is hard-coded rather than created dynamically.)
In particular, that’s why struct_type_class cannot be incorporated into the dynamic compile / exec routine here.
A dictionary of methods to be bound to the created structref can be provided as well. Numba may inline struct methods into the caller if it deems this optimal (even if jit_options says otherwise), so changing a method's code without also touching the jitted caller can leave a stale cache when the caller is cached. This limitation is not specific to dynamic structref creation via this function; it applies equally when the structref definition is coded explicitly.
Anchor file
The generated code_txt is written to a content-addressed file under numba’s cache directory and that file – not highlevel.py – is used as the compile() anchor. See the “Cache-anchor mechanism” section in docs/numbox.utils.rst for the rationale.
numbox.utils.preprocessing
Cache-anchor mechanism
make_structref writes the generated code_txt to a
content-addressed file under numba’s cache directory and uses that
file – not highlevel.py – as the compile() anchor. The
content-addressing is what keeps numba’s per-overload cache correct
when two versions of the generated code differ only in co_consts.
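A minimal sketch of content-addressed anchoring, using only the standard library. The helper names `write_anchor` and `compile_with_anchor`, the `anchor_<digest>.py` naming scheme, and the use of a temporary directory in place of numba's cache directory are all assumptions made for illustration; numbox's actual layout may differ.

```python
import hashlib
import tempfile
from pathlib import Path


def write_anchor(code_txt, cache_dir):
    """Hypothetical helper: store code_txt in a file named after its SHA-256."""
    digest = hashlib.sha256(code_txt.encode("utf-8")).hexdigest()
    path = Path(cache_dir) / f"anchor_{digest[:16]}.py"
    if not path.exists():
        path.write_text(code_txt, encoding="utf-8")
    return path


def compile_with_anchor(code_txt, cache_dir):
    """Compile code_txt so that co_filename points at its own anchor file."""
    path = write_anchor(code_txt, cache_dir)
    return compile(code_txt, str(path), "exec")


SRC_A = "def method(x):\n    return x + 1000\n"
SRC_B = "def method(x):\n    return x + 2000\n"

cache_dir = tempfile.mkdtemp()  # stand-in for numba's cache directory
code_a = compile_with_anchor(SRC_A, cache_dir)
code_b = compile_with_anchor(SRC_B, cache_dir)
code_a_again = compile_with_anchor(SRC_A, cache_dir)
```

Identical source text maps to the same anchor file, while the two variants that differ only in a constant get distinct `co_filename` values, which is what a filename-derived cache location needs in order to keep them apart.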
Numba’s per-overload cache key
(numba.core.caching.Cache._index_key) is:
(sig, codegen.magic_tuple(), hash(co_code), hash(closure_cells))
It hashes co_code only, not co_consts. Python’s LOAD_CONST
opcode encodes an index into co_consts rather than the value
itself, so two methods differing only in a numeric literal
(return self.x + 1 vs return self.x + 1000) produce identical
co_code. With a shared anchor file both would resolve to the same
cache_subpath (numba’s
_CacheLocator.get_suitable_cache_subpath derives the cache subdir
from a hash of co_filename) and the second to compile would
silently load the first’s binary. Per-content anchors segregate the
two by cache_subpath so the collision never arises.
Structural body changes – different operators, additional
statements, renamed variables – produce different co_code and
therefore different _index_key values regardless of the anchoring
scheme; those invalidate cleanly even under a shared anchor. The
narrow failure mode protected by content-addressing is constant-only
edits (numeric or string literals, default arg values) where
co_code is identical across versions.
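The difference between constant-only and structural edits can be checked with the standard library alone. The sketch below uses the constants 1000 and 2000, chosen as an assumption so that both fall outside any small-integer inline range; `compile_method` is a hypothetical helper.

```python
def compile_method(src):
    """Exec generated source and return the function named `method`."""
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["method"]


add_1000 = compile_method("def method(x):\n    return x + 1000\n")
add_2000 = compile_method("def method(x):\n    return x + 2000\n")
sub_1000 = compile_method("def method(x):\n    return x - 1000\n")

# Constant-only edit: the bytecode is byte-for-byte identical,
# only the constants table differs.
same_bytecode = add_1000.__code__.co_code == add_2000.__code__.co_code
same_consts = add_1000.__code__.co_consts == add_2000.__code__.co_consts

# Structural edit (different operator): the bytecode itself changes.
operator_changes_bytecode = (
    add_1000.__code__.co_code != sub_1000.__code__.co_code
)
```

A key built from `co_code` alone therefore cannot tell `add_1000` from `add_2000`, but it does tell `add_1000` from `sub_1000`.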
Python 3.14’s LOAD_SMALL_INT opcode inlines small integers directly
into co_code, narrowing
the failure mode on that version to constants outside the inline
range. Earlier supported versions (3.10–3.13) collide on any
constant edit.
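To see which opcode the running interpreter emits for a given literal, the `dis` module can list the instruction names. This is a small inspection sketch; `opnames` is a hypothetical helper, and the exact opcode list depends on the Python version in use.

```python
import dis


def opnames(expr):
    """Return the opcode names emitted for a single expression."""
    code = compile(expr, "<generated>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


small_literal_ops = opnames("x + 1")
large_literal_ops = opnames("x + 1000")
```

On interpreters that have LOAD_SMALL_INT, it is expected to appear in `small_literal_ops`; `large_literal_ops` uses LOAD_CONST.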
See also numba.core.caching.Cache._index_key and
numba.core.caching._SourceFileBackedLocatorMixin.get_source_stamp
in numba’s source for the cache key construction and source-stamp
validity check.
Source-anchor machinery for dynamically-exec’d code.
Content-addressed anchors keep numba’s per-overload cache correct
when two exec’d code blocks differ only in co_consts. See the
“Cache-anchor mechanism” section in docs/numbox.utils.rst for
the rationale and references.
numbox.utils.lowlevel
- numbox.utils.lowlevel.array_data_p(arr)[source]
Return the data pointer of a numpy array as signed intp.
arr.ctypes.data is uint64 under numba; the cast aligns with the signed-pointer convention used by numbox binding signatures. Callable from Python and @njit contexts.
- numbox.utils.lowlevel.extract_struct_member(context: BaseContext, builder: IRBuilder, struct_fe_ty: StructRef, struct_obj, member_name: str, incref: bool = False)[source]
For the given struct object of the given numba (front-end) type, extract the member with the given name (must be a literal, available at compile time).
- numbox.utils.lowlevel.get_func_p_from_func_struct(builder: IRBuilder, func_struct)[source]
Extract void* function pointer from the low-level FunctionType structure
- numbox.utils.lowlevel.get_str_from_p_as_int(p)[source]
Given a pointer to a null-terminated array of characters, passed as an integer p, return a unicode string object that copies the original string’s data.
- numbox.utils.lowlevel.get_unicode_data_p(s)[source]
Given a Python unicode string, return a pointer to its data payload, a null-terminated array of characters. See https://github.com/numba/numba/blob/release0.61/numba/cpython/unicode.py#L83
- numbox.utils.lowlevel.load_at(p, ty)[source]
Load a value of type ty from raw pointer p (carried as intp).
Caller is responsible for p pointing at a live region of memory whose LLVM layout matches ty.
- numbox.utils.lowlevel.populate_structref(context, builder, signature, structref_type_, structref_, args, ordered_args_names, decref_old=False)[source]
Store args with the corresponding ordered names ordered_args_names in structref with type structref_type_ and payload at data_pointer.
Based on numba.experimental.structref::define_attributes::struct_setattr_impl
Do not call decref_old when populating a newly-created structref, as there’s nothing to decref there.
- numbox.utils.lowlevel.store_at(p, v)[source]
Store v at raw pointer p (LLVM type derived from v’s numba type).
Caller is responsible for p pointing at a writable region of memory whose LLVM layout matches v’s type, and for casting v to the intended width (e.g. store_at(p, int32(value)) to write 4 bytes).
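The width caveat for load_at and store_at can be illustrated with a pure-Python analogy built on `ctypes`. This is not numbox code and does not call load_at or store_at; it only shows, under that assumption, how the type used for a raw-pointer write determines how many bytes are touched.

```python
import ctypes

# 8-byte buffer filled with 0xFF; its address is carried as a plain integer.
buf = (ctypes.c_ubyte * 8)(*([0xFF] * 8))
p = ctypes.addressof(buf)

# Writing through a 4-byte type only overwrites the first 4 bytes.
ctypes.c_int32.from_address(p).value = 0
after_int32_store = bytes(buf)

# Writing through an 8-byte type overwrites all 8 bytes.
ctypes.c_int64.from_address(p).value = 0
after_int64_store = bytes(buf)

# A store followed by a load at the same address and width round-trips.
ctypes.c_int32.from_address(p + 4).value = 7
loaded = ctypes.c_int32.from_address(p + 4).value
```

As with the raw-pointer helpers, nothing checks that the address is valid or that the width matches the underlying memory; keeping `buf` alive and choosing the right type is the caller's job.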