"""Variadic formatted I/O — ``printf``, ``fprintf``, ``snprintf``, ``sscanf``
callable from BOTH plain Python and ``@njit`` code.
The public bindings are regular Python functions; numba dispatches to a
private ``@intrinsic`` codegen path via ``@overload`` when called inside
``@njit``. Same source runs unchanged in either mode, matching numba's
own convention for builtins like ``print`` and ``range``::

    def debug_kernel(x, label):
        printf("step %d: %s\\n", x, label)
        fflush(stdout())
        return x * 2

    debug_kernel(7, "before")        # pure Python: writes via sys.stdout
    njit(debug_kernel)(7, "before")  # jitted: writes via libc printf

Call convention
---------------
User-facing API is C-like — positional ``*args`` after the format string;
no tuple wrapper at the call site. Internally the @overload path bundles
the args into a tuple before calling the private ``_xxx_intrinsic`` (the
intrinsic itself still uses the tuple-as-args shape because numba's
``@intrinsic`` typing function doesn't accept Python-level ``*args``)::

    printf("x = %d, ratio = %.3f\\n", n, ratio)
    fprintf(stderr(), "warning: %s\\n", msg)
    snprintf(array_data_p(buf), buf.size, "[%d:%d]", lo, hi)
    sscanf(buf_p, "%d %lf", array_data_p(n_out), array_data_p(x_out))
    printf("no args here\\n")

Format string must be a literal in @njit
----------------------------------------
Required so the format string can be embedded as an IR global constant —
the same constraint a C compiler operates under when emitting a
format-checked printf call. A runtime-built ``unicode`` raises a clean
``TypingError`` at call typing time. In pure-Python mode the format
string can of course be any str.
Format string encoding: UTF-8
-----------------------------
Non-ASCII codepoints in the literal are encoded as UTF-8 byte sequences
and embedded into the IR global. printf treats every non-``%`` byte as
opaque pass-through, so the bytes flow through libc to stdout / FILE\\* /
the snprintf buffer unmodified. Modern terminals, files, and Windows
10+ consoles all expect UTF-8.
.. note::

   ``%-Ns`` width is byte-counted by printf in every libc, so non-ASCII
   output won't right-pad to a codepoint count. That's printf's contract.
   Pad in numba-side string formatting (``f"{s:<10}"``) before passing
   through ``%s`` if codepoint-counted widths matter.

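As a rough illustration of the byte-versus-codepoint distinction in the
note above, the following pure-Python sketch emulates a byte-counted
``%-8s`` and compares it with codepoint-counted padding. The helper names
``c_style_left_justify`` and ``codepoint_left_justify`` are hypothetical
and not part of this module.

```python
def c_style_left_justify(text, width):
    # Emulates how libc pads a left-justified %s: the width counts encoded bytes.
    raw = text.encode("utf-8")
    return raw + b" " * max(0, width - len(raw))


def codepoint_left_justify(text, width):
    # Pads by codepoint count first, then encodes for a plain %s.
    return text.ljust(width).encode("utf-8")


label = "héllo"  # 5 codepoints, 6 UTF-8 bytes
byte_padded = c_style_left_justify(label, 8)
cp_padded = codepoint_left_justify(label, 8)

print(len(label), len(label.encode("utf-8")))
print(byte_padded.decode("utf-8").count(" "))  # pad spaces when counting bytes
print(cp_padded.decode("utf-8").count(" "))    # pad spaces when counting codepoints
```

The byte-counted form ends up one column short on screen, which is the
effect the note describes.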
Cross-mode caveats
------------------
1. **Length modifiers in format strings.** ``%lld``, ``%hhd``, ``%zd``,
etc. are valid in C printf but rejected by Python's ``%`` operator,
which accepts only a single ``h``, ``l``, or ``L`` (as in ``%ld``,
``%lf``, ``%hd``) and ignores it. The pure-Python impls strip every
length modifier via a regex before formatting (``%lld`` → ``%d``,
``%.3lf`` → ``%.3f``). The stripped form produces identical output for
typical values because Python formats ints and floats at their natural
width regardless of the spec.
2. **C-ABI auto-promotion of integer args to 64-bit.** The @njit impl
widens every integer variadic arg to 64-bit before the libc call
(sext / zext as appropriate). Without this, a user writing
``printf("%lld", np.int32(7))`` in @njit would have libc read 8 bytes
from a 4-byte source — register garbage in the high bits. With the
widening, ``%lld`` against int32 / int16 / int8 / bool works
correctly. Diverges from C ABI (C doesn't promote int to long long)
but matches user expectations and the pure-Python ``%`` behavior.
3. **String args + ``%s``.** Pure-Python's ``%`` handles strings
natively. The @njit @overload auto-converts ``unicode_type`` args
via ``get_unicode_data_p`` so libc sees a NUL-terminated C string.
Users no longer need to call ``get_unicode_data_p`` themselves at
the call site.
4. **``%ld`` on Win64 (LLP64).** ``long`` is 4 bytes on Win64 but 8 on
LP64; ``%ld`` against int64 truncates the high 32 bits on Win64 in
@njit. Pure-Python mode hides this because Python's ``%`` ignores
length modifiers. Prefer ``%lld`` + int64 for portable 8-byte width.
5. **``snprintf`` truncation rc on Windows.** Pure-Python and Linux/macOS
@njit follow C99 semantics (return would-have-written count).
Windows @njit targets MSVCRT ``_snprintf`` (returns ``-1`` on
truncation, no NUL-term guarantee). Portable check that works on
every platform: ``(rc < 0) or (rc >= size)``.
6. **``fprintf`` to non-stdio FILE\\* in pure-Python.** The Python impl
routes ``stdout()`` / ``stderr()`` / ``stdin()`` handles to the
corresponding ``sys.*`` streams via an address cache. ``fopen``-
returned FILE\\* values aren't dereferenceable from Python without
a ctypes call, so they raise a clear error in pure-Python mode
(use ``open()`` + ``f.write()`` for Python-side file I/O).
7. **``sscanf`` is @njit-only.** Pure-Python implementations of
sscanf-style parsing are usually better served by ``int()`` /
``float()`` / ``re``; calling from pure-Python raises
``NotImplementedError``.
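The portable truncation check from caveat 5 can be sketched as a small
helper. The two simulated return conventions below (``c99_rc`` and
``msvcrt_rc``) are hypothetical stand-ins written for illustration; they
only model the return codes and do not call libc.

```python
def was_truncated(rc, size):
    # Portable check: a negative rc (MSVCRT) or a would-have-written
    # count that does not leave room for the NUL (C99).
    return (rc < 0) or (rc >= size)


def c99_rc(text, size):
    # C99 convention: always the byte count that would have been written.
    return len(text.encode("utf-8"))


def msvcrt_rc(text, size):
    # Simplified model of MSVCRT _snprintf: -1 when the output exceeds
    # size, otherwise the byte count (with no NUL when it equals size).
    n = len(text.encode("utf-8"))
    return -1 if n > size else n


size = 8
for text in ("abc", "abcdefgh", "abcdefghijkl"):
    print(text,
          was_truncated(c99_rc(text, size), size),
          was_truncated(msvcrt_rc(text, size), size))
```

Both conventions agree once passed through ``was_truncated``, including
the boundary case where the text exactly fills the buffer and leaves no
room for the terminator.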
References
----------
- `printf(3) <https://man7.org/linux/man-pages/man3/printf.3.html>`_
- `fprintf(3) <https://man7.org/linux/man-pages/man3/fprintf.3.html>`_
- `snprintf(3) <https://man7.org/linux/man-pages/man3/snprintf.3.html>`_
- `sscanf(3) <https://man7.org/linux/man-pages/man3/sscanf.3.html>`_
- `Microsoft _snprintf
<https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/snprintf-snprintf-snprintf-l-snwprintf-snwprintf-l>`_
"""
import ctypes
import re
import sys
from llvmlite import ir as llir
from numba.core import cgutils
from numba.core.cgutils import get_or_insert_function
from numba.core.errors import TypingError
from numba.core.types import (
    BaseTuple, Boolean, Float, Integer, UnicodeType, int32, intp,
    unliteral,
)
from numba.extending import intrinsic, overload
from numbox.core.bindings.utils import (
    extract_literal_str, intp_ll_type, load_lib, platform_,
)
__all__ = ["printf", "fprintf", "snprintf", "sscanf"]
load_lib("c")
# Windows MSVCRT exports "_snprintf" with non-C99 truncation semantics
# (returns -1 on truncation, no NUL guarantee). UCRT's C99-compliant
# "snprintf" is a header-only inline over __stdio_common_vsnprintf that
# isn't directly linkable in the simple C99 calling shape — declaring
# `i32 @snprintf(...)` in LLVM IR and letting the JIT linker resolve it
# crashes with an access violation. So on Windows the @njit binding
# deliberately targets MSVCRT (see the snprintf item under "Cross-mode
# caveats" in the module docstring); pure-Python uses POSIX/C99 semantics
# on every platform.
_SNPRINTF_SYMBOL = "_snprintf" if platform_ == "Windows" else "snprintf"
# C-printf length modifiers: 'hh', 'h', 'l', 'll', 'L', 'j', 'z', 't', plus
# the nonstandard 'q', 'I32', 'I64'. Python's % operator ignores a single
# 'h', 'l', or 'L' but rejects the rest with `ValueError: unsupported format
# character`. Strip them all before pure-Python formatting; the stripped
# %d / %f produces equivalent output for typical values (Python's % uses
# the value's natural width).
_LENGTH_MODIFIER_RE = re.compile(
    r'%([-+0# ]*[0-9*]*\.?[0-9*]*)(hh|ll|h|l|L|j|z|t|q|I32|I64)([diouxXfFeEgGaAcsp])'
)


def _python_fmt_compat(fmt):
    """Strip C printf length modifiers so Python's % accepts the format."""
    return _LENGTH_MODIFIER_RE.sub(r'%\1\3', fmt)
# ---------------------------------------------------------------------------
# Helpers for the @intrinsic codegen layer (private)
# ---------------------------------------------------------------------------
def _promote_for_varargs(builder, arg_ty, arg_val):
    """C-ABI promotion + opportunistic widening of all integer args to 64-bit.
- ``float32`` → ``float64`` (fpext)
- ``bool`` → ``int64`` (zext)
- any signed ``Integer`` of width < 64 → ``int64`` (sext)
- any unsigned ``Integer`` of width < 64 → ``int64`` (zext)
- 64-bit ints, doubles, pointers — pass through
Widening to 64-bit (rather than just int32 per strict C ABI) is a
deliberate choice: it makes ``%lld + int32`` work in @njit, which
matches pure-Python's behavior (Python ignores length modifiers and
uses the value's natural width). The cost is a single LLVM sext /
zext / fpext per arg — free at runtime.
Arg-type validation is done at typing time via
``_validate_writer_arg_type``; the trailing ``raise`` here is
defense-in-depth: should typing-layer validation fail to filter
everything (e.g. a future numba release changes how tuples flatten),
the codegen path stops rather than dropping garbage into libc's
variadic call.
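To illustrate why signedness matters when widening, here is a pure-Python
model of sign and zero extension on raw bit patterns. It is only a sketch
of the arithmetic: ``sext`` and ``zext`` mirror the LLVM instruction names
but are hypothetical Python helpers, not llvmlite calls or part of this
module.

```python
def zext(bits, from_width, to_width=64):
    # Zero extension: keep the low from_width bits; new high bits are 0.
    return bits & ((1 << from_width) - 1)


def sext(bits, from_width, to_width=64):
    # Sign extension: replicate the sign bit into the new high bits.
    value = bits & ((1 << from_width) - 1)
    if value & (1 << (from_width - 1)):
        value |= ((1 << to_width) - 1) & ~((1 << from_width) - 1)
    return value


def as_signed(bits, width=64):
    # Reinterpret a raw bit pattern as a two's-complement signed integer.
    return bits - (1 << width) if bits & (1 << (width - 1)) else bits


minus_one_i32 = 0xFFFFFFFF  # bit pattern shared by int32(-1) and uint32 max
print(as_signed(sext(minus_one_i32, 32)))  # value a 64-bit read sees after sext
print(as_signed(zext(minus_one_i32, 32)))  # value a 64-bit read sees after zext
```

The same 32-bit pattern widens to ``-1`` or ``4294967295`` depending on
the extension used, which is why the signed and unsigned branches differ.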
"""
    i64_ll = llir.IntType(64)
    if isinstance(arg_ty, Float):
        if arg_ty.bitwidth == 32:
            return builder.fpext(arg_val, llir.DoubleType())
        return arg_val
    if isinstance(arg_ty, Boolean):
        return builder.zext(arg_val, i64_ll)
    if isinstance(arg_ty, Integer):
        if arg_ty.bitwidth < 64:
            if arg_ty.signed:
                return builder.sext(arg_val, i64_ll)
            return builder.zext(arg_val, i64_ll)
        return arg_val
    raise TypingError(
        f"variadic arg of type {arg_ty!r} is not supported by printf-family "
        f"bindings; allowed: Float, Integer, Boolean, or intp pointer"
    )
def _validate_writer_arg_type(name, idx, ty):
    """Raise ``TypingError`` unless ``ty`` is a scalar type supported by
    ``printf`` / ``fprintf`` / ``snprintf``: Float, Integer (incl. intp
    for raw pointers), Boolean, or a unicode type (which the @overload
    layer auto-converts via ``get_unicode_data_p``).

    Rejecting unsupported types here reports the error at typing time,
    with the function name and arg index. Without this guard, numpy
    arrays, complex numbers, tuples, etc. would get as far as codegen
    before ``_promote_for_varargs`` raised on them.
    """
    if isinstance(unliteral(ty), (Float, Integer, Boolean, UnicodeType)):
        return
    raise TypingError(
        f"{name}: arg {idx} has unsupported type {ty!r}; allowed: Float, "
        f"Integer, Boolean, unicode (auto-converted to char* via "
        f"get_unicode_data_p), or intp (raw pointer / preconverted string)"
    )
_PERCENT_N_RE = re.compile(
    r'%(?:[0-9]+\$)?[-+0# ]*[*0-9]*(?:[0-9]+\$)?'
    r'(?:\.[*0-9]*(?:[0-9]+\$)?)?'
    r'(?:hh|ll|h|l|L|j|z|t|q|I32|I64)?n'
)


def _reject_percent_n_or_raise(name, fmt_str):
    """Raise ``TypingError`` if ``fmt_str`` contains a ``%n`` directive
(with or without flags / width / precision / length modifier).
``%n`` causes printf to write the byte-count-written-so-far through a
caller-supplied ``int*`` pointer. Allowing it would (a) be a memory
write through an arg that pure-Python's ``%`` operator rejects with
``ValueError`` — breaking dual-mode equivalence — and (b) be a memory-
safety hazard widely flagged by static analyzers and disabled by
glibc's ``_FORTIFY_SOURCE`` for writable format strings.
``%%n`` (a literal ``%`` followed by ``n``) is allowed: the regex
operates on the format string after stripping ``%%`` pairs to a
sentinel that the directive matcher cannot land on.
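The masking step described above can be sketched in isolation. The
pattern below is a simplified, hypothetical variant (it omits the
positional ``N$`` forms that the module's own regex handles), and
``has_percent_n`` is an illustrative name rather than part of this module.

```python
import re

# Simplified pattern: optional flags, width, precision, length modifier, then n.
PERCENT_N = re.compile(
    "%[-+0# ]*[0-9*]*(?:[.][0-9*]*)?(?:hh|ll|h|l|L|j|z|t|q)?n"
)
SENTINEL = chr(0) * 2  # two NUL characters replace each escaped percent pair


def has_percent_n(fmt):
    # Mask literal percent pairs first so an n right after them cannot match.
    masked = fmt.replace("%%", SENTINEL)
    return PERCENT_N.search(masked) is not None


print(has_percent_n("done: 100%%now"))  # escaped percent followed by text
print(has_percent_n("count=%d%n"))      # a real n directive
print(has_percent_n("%5lln"))           # n with width and length modifier
```

Masking before matching keeps the pattern itself simple: it never has to
reason about whether a ``%`` is escaped.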
"""
    stripped = fmt_str.replace('%%', '\x00\x00')
    if _PERCENT_N_RE.search(stripped):
        raise TypingError(
            f"{name}: %n directive in format string {fmt_str!r} is not "
            f"allowed (writes byte-count-written through caller pointer; "
            f"memory-safety hazard, also diverges from pure-Python behavior). "
            f"Use sscanf if you need %n's read-position semantics."
        )
def _unpack_args_tuple(builder, args_ty, args_pack):
    """Extract individual LLVM values from a tuple-of-args LLVM aggregate."""
    arg_types = tuple(args_ty)
    return [
        (arg_types[i], builder.extract_value(args_pack, i))
        for i in range(len(arg_types))
    ]
def _emit_variadic_call(builder, symbol, fmt_str, leading_vals, args_ty,
                        args_pack, *, leading_tys):
    """Emit IR for ``symbol(*leading_vals, fmt_p, *promoted_args) -> i32``.

    ``leading_tys`` is the explicit list of LLVM types for the leading
    positional args (size_t, FILE*, char*, etc.; anything that precedes
    the format string in the libc signature). Callers pass it so the
    function-type declaration documents the libc ABI contract at the
    codegen call site, rather than implicitly reading
    ``[v.type for v in leading_vals]`` and relying on numba's lowering to
    produce the right LLVM types. The explicit form is what's needed for
    size_t in snprintf: derive via ``intp_ll_type(context)`` to keep
    consistency with numba's intp lowering API.
    """
    i8p = llir.IntType(8).as_pointer()
    i32_ll = llir.IntType(32)
    mod = builder.module
    fmt_bytes = cgutils.make_bytearray((fmt_str + '\x00').encode('utf-8'))
    # cgutils.global_constant uses linkage='internal' and routes through
    # module.get_unique_name → scope.deduplicate, which auto-suffixes when
    # the same name is re-used within a module (printf_format,
    # printf_format.1, ...). Across modules, internal-linkage globals are
    # module-private; LLVM's linker further renames on merge into the
    # shared MCJIT engine. So multiple call sites (same or distinct
    # format strings) each get their own deduplicated global.
    global_fmt = cgutils.global_constant(mod, f"{symbol}_format", fmt_bytes)
    fmt_p = builder.bitcast(global_fmt, i8p)
    unpacked = _unpack_args_tuple(builder, args_ty, args_pack)
    promoted = [_promote_for_varargs(builder, t, v) for t, v in unpacked]
    fn_ty = llir.FunctionType(i32_ll, list(leading_tys) + [i8p], var_arg=True)
    fn = get_or_insert_function(mod, fn_ty, symbol)
    return builder.call(fn, list(leading_vals) + [fmt_p] + promoted)
# ---------------------------------------------------------------------------
# Private @intrinsics — the @njit codegen path
# ---------------------------------------------------------------------------
@intrinsic(prefer_literal=True)
def _printf_intrinsic(typingctx, fmt_ty, args_ty):
    """libc printf via a tuple-of-args. Internal; user code calls printf()."""
    fmt_str = extract_literal_str("printf", fmt_ty, field="format string")
    _reject_percent_n_or_raise("printf", fmt_str)
    if not isinstance(args_ty, BaseTuple):
        raise TypingError(f"printf: args must be a tuple, got {args_ty!r}")
    for i, ty in enumerate(tuple(args_ty)):
        _validate_writer_arg_type("printf", i, ty)

    def codegen(context, builder, sig, llvm_args):
        _, args_pack = llvm_args
        return _emit_variadic_call(
            builder, "printf", fmt_str, [], args_ty, args_pack,
            leading_tys=[])

    return int32(fmt_ty, args_ty), codegen
@intrinsic(prefer_literal=True)
def _fprintf_intrinsic(typingctx, fp_ty, fmt_ty, args_ty):
    """libc fprintf via a tuple-of-args. Internal; user code calls fprintf()."""
    fmt_str = extract_literal_str("fprintf", fmt_ty, field="format string")
    _reject_percent_n_or_raise("fprintf", fmt_str)
    if not isinstance(args_ty, BaseTuple):
        raise TypingError(f"fprintf: args must be a tuple, got {args_ty!r}")
    if fp_ty != intp:
        raise TypingError(
            f"fprintf: fp must be intp (FILE* as pointer-as-int), got {fp_ty!r}"
        )
    for i, ty in enumerate(tuple(args_ty)):
        _validate_writer_arg_type("fprintf", i, ty)

    def codegen(context, builder, sig, llvm_args):
        i8p = llir.IntType(8).as_pointer()
        fp_int, _, args_pack = llvm_args
        fp_ptr = builder.inttoptr(fp_int, i8p)
        return _emit_variadic_call(
            builder, "fprintf", fmt_str, [fp_ptr], args_ty, args_pack,
            leading_tys=[i8p])

    return int32(fp_ty, fmt_ty, args_ty), codegen
@intrinsic(prefer_literal=True)
def _snprintf_intrinsic(typingctx, buf_ty, size_ty, fmt_ty, args_ty):
    """libc snprintf via a tuple-of-args. Internal; user code calls snprintf()."""
    fmt_str = extract_literal_str("snprintf", fmt_ty, field="format string")
    _reject_percent_n_or_raise("snprintf", fmt_str)
    if not isinstance(args_ty, BaseTuple):
        raise TypingError(f"snprintf: args must be a tuple, got {args_ty!r}")
    if buf_ty != intp:
        raise TypingError(
            f"snprintf: buf must be intp (pointer-as-int), got {buf_ty!r}"
        )
    if size_ty != intp:
        raise TypingError(
            f"snprintf: size must be intp (size_t-as-int), got {size_ty!r}"
        )
    for i, ty in enumerate(tuple(args_ty)):
        _validate_writer_arg_type("snprintf", i, ty)

    def codegen(context, builder, sig, llvm_args):
        i8p = llir.IntType(8).as_pointer()
        buf_int, size_val, _, args_pack = llvm_args
        buf_ptr = builder.inttoptr(buf_int, i8p)
        # size_t is pointer-width on all current 64-bit platforms; derive
        # the LLVM type from numba's intp via the shared helper so the
        # libc snprintf signature stays correct under platform changes and
        # matches the typing-time `size_ty != intp` guard above.
        return _emit_variadic_call(
            builder, _SNPRINTF_SYMBOL, fmt_str,
            [buf_ptr, size_val], args_ty, args_pack,
            leading_tys=[i8p, intp_ll_type(context)])

    return int32(buf_ty, size_ty, fmt_ty, args_ty), codegen
@intrinsic(prefer_literal=True)
def _sscanf_intrinsic(typingctx, buf_ty, fmt_ty, args_ty):
    """libc sscanf via a tuple-of-args. Internal; user code calls sscanf().

    Args are intp output pointers; no default-arg promotion applies
    (pointers don't promote). See sscanf() for the caller-facing contract.
    """
    fmt_str = extract_literal_str("sscanf", fmt_ty, field="format string")
    if not isinstance(args_ty, BaseTuple):
        raise TypingError(f"sscanf: args must be a tuple, got {args_ty!r}")
    if buf_ty != intp:
        raise TypingError(
            f"sscanf: buf must be intp (pointer-as-int), got {buf_ty!r}"
        )
    for i, ty in enumerate(tuple(args_ty)):
        if ty != intp:
            raise TypingError(
                f"sscanf: args[{i}] must be intp (output pointer), got {ty!r}"
            )

    def codegen(context, builder, sig, llvm_args):
        i8p = llir.IntType(8).as_pointer()
        buf_int, _, args_pack = llvm_args
        buf_ptr = builder.inttoptr(buf_int, i8p)
        return _emit_variadic_call(
            builder, "sscanf", fmt_str, [buf_ptr], args_ty, args_pack,
            leading_tys=[i8p])

    return int32(buf_ty, fmt_ty, args_ty), codegen
# ---------------------------------------------------------------------------
# @overload helper: build an impl source string with str args auto-converted
# ---------------------------------------------------------------------------
def _build_args_tuple_expr_from_starargs(arg_tys):
    """Build a Python source fragment ``(...)`` that constructs the args
    tuple for the intrinsic call, indexing into the impl's ``*args`` and
    auto-converting any ``UnicodeType`` arg via ``get_unicode_data_p``
    so libc sees a NUL-terminated C string for ``%s``.

    Numba requires the @overload's impl signature to match the typing
    signature shape exactly: ``*args`` in typing must be ``*args`` in
    impl. So we cannot expand per-arity explicit parameters; we have
    to index into the ``args`` tuple from inside the impl.
    """
    n = len(arg_tys)
    if n == 0:
        return "()"
    parts = []
    for i, ty in enumerate(arg_tys):
        # Both numba's runtime ``unicode_type`` and the compile-time
        # ``Literal[str]`` (StringLiteral) need conversion. StringLiteral
        # is NOT a UnicodeType subclass directly (its MRO is
        # ``StringLiteral → Literal → Dummy → Type``), so we use
        # ``unliteral(ty)`` to strip any Literal wrapping before the check.
        if isinstance(unliteral(ty), UnicodeType):
            parts.append(f"get_unicode_data_p(args[{i}])")
        else:
            parts.append(f"args[{i}]")
    inner = ", ".join(parts)
    return f"({inner},)" if n == 1 else f"({inner})"
def _build_overload_impl(name, fixed_params, args_tys, intrinsic_callable,
                         get_unicode_data_p):
    """Build an impl function via exec'd source that:
- Takes ``(fixed_params..., *args)`` — matching the typing-function shape
- Bundles ``*args`` into a tuple via index expressions, auto-converting
``UnicodeType`` args via ``get_unicode_data_p``
- Calls the underlying intrinsic with the bundled tuple
"""
    args_tuple_expr = _build_args_tuple_expr_from_starargs(args_tys)
    sig_params = ", ".join(list(fixed_params) + ["*args"])
    intrinsic_args = ", ".join(list(fixed_params) + [args_tuple_expr])
    src = f"def impl({sig_params}):\n return _intr({intrinsic_args})\n"
    ns = {"_intr": intrinsic_callable, "get_unicode_data_p": get_unicode_data_p}
    exec(compile(src, f"<{name}-overload-impl>", "exec"), ns)
    return ns["impl"]


# Lazy import to avoid circular dependency at module load
def _get_unicode_data_p_lazy():
    from numbox.utils.lowlevel import get_unicode_data_p
    return get_unicode_data_p
# ---------------------------------------------------------------------------
# Pure-Python stdio-handle address cache (lazy init for fprintf routing)
# ---------------------------------------------------------------------------
_PY_STREAM_BY_FP = None
def _ensure_py_stream_cache():
    global _PY_STREAM_BY_FP
    if _PY_STREAM_BY_FP is not None:
        return
    # Defer the imports to avoid a circular dep at module load: _fmtio is
    # imported AFTER _stdio by numbox.core.bindings.__init__, but doing
    # the calls here at first use guarantees the bindings are ready.
    from numbox.core.bindings import stdout, stderr, stdin
    _PY_STREAM_BY_FP = {
        int(stdout()): sys.stdout,
        int(stderr()): sys.stderr,
        int(stdin()): sys.stdin,
    }
# ---------------------------------------------------------------------------
# Public Python-callable wrappers + @overload registrations
# ---------------------------------------------------------------------------
def _reject_percent_n_in_python(name, fmt):
    """Pure-Python equivalent of ``_reject_percent_n_or_raise``: raise
    ``ValueError`` if ``%n`` appears in ``fmt``. Python's ``%`` operator
    would itself raise on ``%n`` (it's an unsupported format character),
    but the message is generic; this gives users the same clear error
    message in both Python and @njit modes.
    """
    stripped = fmt.replace('%%', '\x00\x00')
    if _PERCENT_N_RE.search(stripped):
        raise ValueError(
            f"{name}: %n directive in format string {fmt!r} is not allowed "
            f"(writes byte-count-written through caller pointer; memory-"
            f"safety hazard). Use sscanf if you need %n's read-position "
            f"semantics."
        )
def printf(fmt, *args):
    """C-style ``printf(fmt, *args)``: dual-mode (plain Python AND @njit).
From plain Python: writes to ``sys.stdout`` via ``str.__mod__`` after
stripping C length modifiers (``%lld`` → ``%d``, etc.).
From @njit: ``@overload`` below routes to the private ``_printf_intrinsic``
after auto-converting any ``unicode_type`` args via ``get_unicode_data_p``
so libc ``%s`` receives a NUL-terminated C string. Format string must be
a literal in @njit (see module docstring for caveats).
``%n`` is rejected in both modes (see ``_reject_percent_n_or_raise``).
Returns the number of bytes written (or written-equivalent), as int32.
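As a hedged illustration of the plain-Python path, the sketch below
strips length modifiers and then formats with ``%``. The pattern is a
simplified variant written for this example (it drops the nonstandard
``q`` / ``I32`` / ``I64`` modifiers), and ``strip_length_modifiers`` is a
hypothetical name, not the module's helper.

```python
import re

# Flags/width/precision, then a length modifier, then the conversion character.
LENGTH_MODIFIER = re.compile(
    "%([-+0# ]*[0-9*]*[.]?[0-9*]*)(hh|ll|h|l|L|j|z|t)([diouxXfFeEgGcs])"
)


def strip_length_modifiers(fmt):
    # Rebuild each directive without its length modifier.
    return LENGTH_MODIFIER.sub(lambda m: "%" + m.group(1) + m.group(3), fmt)


c_fmt = "%lld items, ratio %.3lf"
py_fmt = strip_length_modifiers(c_fmt)
print(py_fmt)
print(py_fmt % (7, 0.5))
```

Flags, width, and precision survive the rewrite, so the visible output
keeps the layout the C format asked for.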
"""
    _reject_percent_n_in_python("printf", fmt)
    text = _python_fmt_compat(fmt) % args
    sys.stdout.write(text)
    sys.stdout.flush()
    return len(text.encode('utf-8'))


@overload(printf)
def _overload_printf(fmt, *args):
    fmt_str = extract_literal_str("printf", fmt, field="format string")
    _reject_percent_n_or_raise("printf", fmt_str)
    for i, ty in enumerate(args):
        _validate_writer_arg_type("printf", i, ty)
    impl = _build_overload_impl(
        "printf", ["fmt"], args, _printf_intrinsic,
        _get_unicode_data_p_lazy(),
    )
    return impl
def fprintf(fp, fmt, *args):
    """C-style ``fprintf(fp, fmt, *args)``: dual-mode.
``fp`` is a FILE\\* as ``intp`` (from ``stdout()`` / ``stderr()`` /
``stdin()`` or ``fopen()``).
From plain Python: routes ``stdout()`` / ``stderr()`` / ``stdin()``
handles to the corresponding ``sys.*`` streams via an address cache.
Arbitrary FILE\\* values (e.g. ``fopen``-returned) raise
``RuntimeError`` — use ``open()`` + ``f.write()`` for Python-side
file I/O.
From @njit: routes to ``_fprintf_intrinsic`` with str auto-conversion.
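The address-cache routing used in plain Python can be sketched with
stand-in handles. The integer values and the ``write_to_handle`` helper
below are hypothetical; the real cache is keyed by the actual FILE\*
addresses and maps to ``sys.stdout`` / ``sys.stderr`` / ``sys.stdin``.

```python
import io

# Hypothetical integer handles standing in for FILE* addresses.
FAKE_STDOUT, FAKE_STDERR = 0x1000, 0x2000
out, err = io.StringIO(), io.StringIO()
stream_by_handle = {FAKE_STDOUT: out, FAKE_STDERR: err}


def write_to_handle(handle, text):
    # Route known handles to their Python stream; refuse anything else.
    stream = stream_by_handle.get(int(handle))
    if stream is None:
        raise RuntimeError("unsupported handle {:#x}".format(int(handle)))
    stream.write(text)
    return len(text.encode("utf-8"))


write_to_handle(FAKE_STDERR, "warning: disk full")
print(repr(err.getvalue()))
```

An unknown handle raises instead of being dereferenced, mirroring the
``RuntimeError`` described above for ``fopen``-returned values.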
"""
    _reject_percent_n_in_python("fprintf", fmt)
    _ensure_py_stream_cache()
    py_stream = _PY_STREAM_BY_FP.get(int(fp))
    if py_stream is None:
        raise RuntimeError(
            f"fprintf in pure-Python mode only supports stdout / stderr / "
            f"stdin handles; got fp={int(fp):#x}. For arbitrary FILE* "
            f"(e.g. fopen) use @njit, or use Python's open() + write()."
        )
    text = _python_fmt_compat(fmt) % args
    py_stream.write(text)
    py_stream.flush()
    return len(text.encode('utf-8'))


@overload(fprintf)
def _overload_fprintf(fp, fmt, *args):
    fmt_str = extract_literal_str("fprintf", fmt, field="format string")
    _reject_percent_n_or_raise("fprintf", fmt_str)
    if fp != intp:
        raise TypingError(
            f"fprintf: fp must be intp (FILE* as pointer-as-int), got {fp!r}"
        )
    for i, ty in enumerate(args):
        _validate_writer_arg_type("fprintf", i, ty)
    impl = _build_overload_impl(
        "fprintf", ["fp", "fmt"], args, _fprintf_intrinsic,
        _get_unicode_data_p_lazy(),
    )
    return impl
def snprintf(buf_p, size, fmt, *args):
    """C-style ``snprintf(buf_p, size, fmt, *args)``: dual-mode.
``buf_p`` is an ``intp`` pointer to the destination buffer (caller-
owned). Typically ``array_data_p(numpy_array)`` — that helper works
in both modes.
Returns the number of characters that WOULD have been written if
``size`` were unlimited (excluding the trailing NUL), as int32. See
the module docstring for the Windows @njit truncation-rc divergence
(Python and Linux/macOS @njit follow C99; Windows @njit uses
MSVCRT ``_snprintf`` which returns ``-1`` on truncation).
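A minimal sketch of the plain-Python write path, using a ``ctypes``
buffer in place of ``array_data_p(numpy_array)``. The helper name
``write_c_string`` is hypothetical; it only illustrates the C99-style
contract (truncate to ``size - 1`` bytes, always NUL-terminate, return
the would-have-written count).

```python
import ctypes


def write_c_string(buf_addr, size, text):
    # Copy at most size - 1 bytes, always followed by a terminating NUL.
    data = text.encode("utf-8")
    if size > 0:
        chunk = data[:size - 1] + bytes(1)  # bytes(1) is a single NUL byte
        ctypes.memmove(buf_addr, chunk, len(chunk))
    return len(data)  # C99-style would-have-written count


buf = ctypes.create_string_buffer(8)
rc = write_c_string(ctypes.addressof(buf), len(buf), "hello world")
print(rc, buf.value, rc >= len(buf))
```

Note that byte-level truncation can cut a multi-byte UTF-8 sequence in
half, so callers that decode the buffer should be prepared for that.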
"""
    _reject_percent_n_in_python("snprintf", fmt)
    text_bytes = (_python_fmt_compat(fmt) % args).encode('utf-8')
    n_would_have = len(text_bytes)
    if size > 0:
        n_write = min(n_would_have, size - 1)
        # Slice content BEFORE appending NUL so the NUL is always at the
        # correct position even when truncating. The previous form
        # ``memmove(buf_p, text_bytes + b'\x00', n_write + 1)`` left the
        # NUL out of the copied prefix when truncating: buf got
        # n_write content bytes with no terminator.
        src = text_bytes[:n_write] + b'\x00'
        ctypes.memmove(buf_p, src, len(src))
    return n_would_have


@overload(snprintf)
def _overload_snprintf(buf_p, size, fmt, *args):
    fmt_str = extract_literal_str("snprintf", fmt, field="format string")
    _reject_percent_n_or_raise("snprintf", fmt_str)
    if buf_p != intp:
        raise TypingError(
            f"snprintf: buf must be intp (pointer-as-int), got {buf_p!r}"
        )
    if size != intp:
        raise TypingError(
            f"snprintf: size must be intp (size_t-as-int), got {size!r}"
        )
    for i, ty in enumerate(args):
        _validate_writer_arg_type("snprintf", i, ty)
    impl = _build_overload_impl(
        "snprintf", ["buf_p", "size", "fmt"], args, _snprintf_intrinsic,
        _get_unicode_data_p_lazy(),
    )
    return impl
def sscanf(buf, fmt, *args):
    """C-style ``sscanf(buf, fmt, *args)`` (@njit-only).
Args are intp output pointers (typically ``array_data_p`` of a
1-element numpy array of the right dtype). As in C, each pointee type
must match its conversion spec (``%d`` writes an ``int``, ``%lf`` a
``double``); ``_sscanf_intrinsic`` only checks that every arg is intp.
Pure-Python users: this binding raises ``NotImplementedError``. For
parsing in pure Python use ``int()``, ``float()``, or ``re``.
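For plain-Python callers, a rough counterpart of a two-field scan might
look like the sketch below. ``parse_int_and_float`` is a hypothetical
helper; unlike C ``sscanf`` it raises on malformed input instead of
returning a match count.

```python
def parse_int_and_float(text):
    # Rough counterpart of scanning an int followed by a double:
    # split on whitespace, then convert the first two fields.
    fields = text.split()
    if len(fields) < 2:
        raise ValueError("expected two whitespace-separated fields")
    return int(fields[0]), float(fields[1])


n, x = parse_int_and_float("42 3.5")
print(n, x)
```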
"""
    raise NotImplementedError(
        "sscanf is @njit-only; wrap the call in @njit, or use Python's "
        "int() / float() / re for pure-Python parsing"
    )


@overload(sscanf)
def _overload_sscanf(buf, fmt, *args):
    extract_literal_str("sscanf", fmt, field="format string")  # validates Literal[str]
    if buf != intp:
        raise TypingError(
            f"sscanf: buf must be intp (pointer-as-int), got {buf!r}"
        )
    # sscanf args are output POINTERS and must all be intp. No auto-conversion;
    # pointers don't promote. Build the impl by reusing the helper but
    # without get_unicode_data_p (intp args pass through unchanged).
    for i, ty in enumerate(args):
        if ty != intp:
            raise TypingError(
                f"sscanf: args[{i}] must be intp (output pointer), got {ty!r}"
            )
    impl = _build_overload_impl(
        "sscanf", ["buf", "fmt"], args, _sscanf_intrinsic,
        _get_unicode_data_p_lazy(),  # unused (no UnicodeType in args)
    )
    return impl