mpremote: Add smart encoding selection for fs_writefile. by andrewleech · Pull Request #11 · andrewleech/micropython

andrewleech · 2026-02-25T01:22:07Z

Summary

mpremote fs cp file transfers to device are slow because fs_writefile() uses repr() encoding, expanding each byte to \xNN (~4x wire overhead for binary data).
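To make the overhead concrete, here is a minimal host-side sketch comparing the size of a repr()-encoded payload with its base64 form. It only measures string lengths and does not touch mpremote; the 100-byte payload of 0xFF bytes is an arbitrary worst-case assumption chosen so that every byte needs a \xNN escape.

```python
import base64

# Worst case for repr(): every byte is non-printable, so each one
# becomes a 4-character \xNN escape.
payload = bytes([0xFF]) * 100

repr_wire = repr(payload)             # the text b'\xff\xff...'
b64_wire = base64.b64encode(payload)  # 4 output chars per 3 input bytes

repr_len = len(repr_wire)
b64_len = len(b64_wire)

print(repr_len, b64_len)  # 403 136
```

Printable ASCII bytes cost repr() only one character each, which is consistent with the text-like payloads in the tables below showing a smaller gap between repr and base64 than random binary does.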

This adds automatic encoding selection with a three-tier fallback:

deflate+base64 — device has deflate module and data compresses >20%
base64 — device has binascii.a2b_base64 but data doesn't compress well
repr — universal fallback (existing behaviour, unchanged)
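The three tiers can be sketched on the host as below. This is an illustration under stated assumptions, not the PR's code: `estimate_ratio` and `select_encoding` are hypothetical names, the 1024-byte sample size and window size of 9 are guesses, and the 0.8 cut-off is simply "compresses >20%" expressed as a ratio.

```python
import os
import zlib


def estimate_ratio(data, sample_size=1024, wbits=9):
    """Return compressed/original size for a leading sample (1.0 if empty)."""
    sample = data[:sample_size]
    if not sample:
        return 1.0
    comp = zlib.compressobj(wbits=-wbits)  # negative wbits selects raw deflate
    compressed = comp.compress(sample) + comp.flush()
    return len(compressed) / len(sample)


def select_encoding(data, caps, max_ratio=0.8):
    """Pick 'deflate', 'base64' or 'repr' following the three tiers."""
    if caps.get("deflate") and estimate_ratio(data) < max_ratio:
        return "deflate"
    if caps.get("base64"):
        return "base64"
    return "repr"


full = {"deflate": True, "base64": True}
print(select_encoding(bytes(4096), full))       # deflate
print(select_encoding(os.urandom(4096), full))  # base64
print(select_encoding(bytes(4096), {"deflate": False, "base64": False}))  # repr
```

Sampling only the head of the file keeps the decision cheap, at the cost of misjudging files whose start is not representative of the rest.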

The ROMFS deploy path is updated to share the new compression utilities and capability detection, replacing its inline zlib.compressobj(wbits=-9), hardcoded wbits value, and 14-line try/except capability detection. Also fixes a missing .strip() on the ROMFS base64-only encoding path.

Testing

64 transfer+readback integrity tests on STM32WB55 over 115200 baud UART with SPI flash. All verified via SHA-256 readback.

Random binary (incompressible, ratio ~1.0 — auto selects base64):

Size	repr	base64	deflate	auto	best/repr
1 KB	1.56 KB/s	2.87 KB/s	2.28 KB/s	2.81 KB/s	1.8x
5 KB	1.87 KB/s	4.19 KB/s	3.88 KB/s	4.18 KB/s	2.2x
10 KB	1.91 KB/s	4.52 KB/s	4.46 KB/s	4.49 KB/s	2.4x
50 KB	1.96 KB/s	4.77 KB/s	4.91 KB/s	4.76 KB/s	2.5x

Python source (ratio ~0.4 — auto selects deflate):

Size	repr	base64	deflate	auto	best/repr
1 KB	2.26 KB/s	2.88 KB/s	3.06 KB/s	2.75 KB/s	1.4x
5 KB	2.91 KB/s	4.13 KB/s	6.28 KB/s	5.98 KB/s	2.2x
10 KB	2.71 KB/s	4.52 KB/s	8.03 KB/s	7.58 KB/s	3.0x
50 KB	3.18 KB/s	4.79 KB/s	9.15 KB/s	9.41 KB/s	3.0x

Log data (ratio ~0.5 — auto selects deflate):

Size	repr	base64	deflate	auto	best/repr
1 KB	2.37 KB/s	2.81 KB/s	2.97 KB/s	2.68 KB/s	1.3x
5 KB	3.03 KB/s	4.23 KB/s	5.55 KB/s	5.47 KB/s	1.8x
10 KB	3.08 KB/s	4.56 KB/s	6.76 KB/s	6.84 KB/s	2.2x
50 KB	3.52 KB/s	4.82 KB/s	7.83 KB/s	7.54 KB/s	2.2x

All zeros (ratio ~0.005 — auto selects deflate):

Size	repr	base64	deflate	auto	best/repr
1 KB	1.23 KB/s	2.43 KB/s	3.67 KB/s	3.90 KB/s	3.2x
5 KB	1.50 KB/s	2.86 KB/s	13.68 KB/s	13.25 KB/s	9.1x
10 KB	1.53 KB/s	4.54 KB/s	18.55 KB/s	18.41 KB/s	12.2x
50 KB	1.54 KB/s	4.76 KB/s	23.98 KB/s	23.92 KB/s	15.6x

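Auto-selection matches or closely tracks the fastest explicit encoding in every row: its measured throughput is within roughly 10% of the best column throughout, with the largest gaps at 1 KB.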
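The best/repr column can be recomputed from the other columns. The short check below does this for the 50 KB rows only, taking "best" to mean the highest of the base64, deflate and auto figures; that reading of the column is an assumption, although it reproduces the published values.

```python
# (repr, base64, deflate, auto) throughput in KB/s for the 50 KB rows above.
rows_50kb = {
    "random binary": (1.96, 4.77, 4.91, 4.76),
    "python source": (3.18, 4.79, 9.15, 9.41),
    "log data": (3.52, 4.82, 7.83, 7.54),
    "all zeros": (1.54, 4.76, 23.98, 23.92),
}


def best_over_repr(row):
    """Speedup of the fastest non-repr column relative to repr."""
    repr_rate = row[0]
    return round(max(row[1:]) / repr_rate, 1)


speedups = {name: best_over_repr(row) for name, row in rows_50kb.items()}
for name, factor in speedups.items():
    print(f"{name}: {factor}x")
```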

Not tested on other ports or boards.

Trade-offs and Alternatives

chunk_size default changes from 256 to None (auto-sized per encoding). Callers omitting chunk_size get 256 for repr (matching prior behaviour). Explicit values are respected.
Devices without binascii.a2b_base64 fall back to repr() with no behaviour change.
Device capabilities are probed once via hasattr() and cached for the session.
An alternative would be to always use base64 without deflate, which would be simpler but miss the 2-3x additional speedup on compressible data (typical firmware files).

The Pololu Zumo 2040 Robot is supported in pico-sdk now so we should not include the header file here anymore, similarly to other boards. This is necessary for future changes from the SDK to be reflected in MicroPython builds. Signed-off-by: Paul Grayson <paul@pololu.com>

Replace the custom rosc_random_u8()/rosc_random_u32() implementation with the pico_rand API from the Pico SDK. The RP2040 datasheet notes that ROSC "does not meet the requirements of randomness for security systems because it can be compromised", and the current 8-bit LFSR conditioning is not a vetted algorithm under NIST SP 800-90B. pico_rand uses various hardware RNG sources depending on the available platform (including the RP2350 hardware TRNG) and is officially supported and maintained as part of the Pico SDK. This changes os.urandom(), the mbedTLS entropy source, the PRNG seed, and the lwIP random function to all use pico_rand, and removes the custom ROSC random functions from main.c. Signed-off-by: Michel Le Bihan <michel@lebihan.pl>

When the timeout parameter of `esp32.RMT.wait_done()` is set to a non-zero value, the underlying `rmt_tx_wait_all_done` blocks (it passes the timeout to `xQueueReceive`). Thus we should release the GIL so that other MicroPython threads are not blocked from running. Signed-off-by: Daniël van de Giessen <daniel@dvdgiessen.nl>

This commit lets the native emitter preserve the value of the index register when performing register-indexed loads or stores of halfword or word values on Thumb. The original code was optimised too aggressively for a register-starved architecture like Thumb, and the index value in the sequence to generate was assumed to be allocated somewhere safe. This is valid on other architectures, but not on Thumb. To solve this, load operations now clobber a temporary register that should be safe to use, REG_TEMP2, to store the scaled register offset. REG_TEMP2's value is only used within the scope of a single ASM API instruction. Store operations unfortunately use a register that is aliased to REG_TEMP2, since they need to have three values in registers to perform the operation. This means the index register needs to be pushed to the stack before performing the scale + store operation, and then popped from the stack. That's a 4-byte penalty on each store and a minor speed hit on generated code (plus a minor footprint increase of the firmware image). Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

This commit lets the native emitter preserve the value of the index register when performing register-indexed loads or stores for halfword or word values on RV32. The original code was optimised too aggressively to reduce the generated code's size, using compressed opcodes that alias the target register to one of the operands. In register-indexed load/store operations, the index register was assumed to be allocated somewhere safe, but it was not always the case. To solve this, now all halfword and word register-indexed operations will use REG_TEMP2 to store the scaled index register.

The size penalty on generated code varies across operation sizes and enabled extensions:
- byte operations stay the same size with or without Zba
- halfword operations will be 2 bytes larger without Zba, and will stay the same size with Zba
- word operations will be 4 bytes larger without Zba, and 2 bytes larger with Zba

There is also a minor firmware footprint increase to hold the extra logic needed for conditional register clobbering, but it shouldn't be that large anyway.

Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

Call `mp_event_handle_nowait()` in the VFS reader buffer refill path so that pending scheduled events (USB task, network poll, etc.) get processed during long-running import/parse/compile operations. Without this, importing a large Python module from the filesystem blocks for too long, causing the TinyUSB event queue to overflow. For example, on renesas-ra, running a script that imports iperf3 via mpremote run triggers an assert, most likely due to SOF interrupts not getting processed:

queue_event at lib/tinyusb/src/device/usbd.c:382
dcd_event_handler at lib/tinyusb/src/device/usbd.c:1318
dcd_event_sof at lib/tinyusb/src/device/dcd.h:237
dcd_int_handler at tinyusb/src/portable/renesas/rusb2/dcd_rusb2.c:964
<signal handler called>
disk_ioctl at extmod/vfs_fat_diskio.c:125
validate at lib/oofatfs/ff.c:3359
f_read at lib/oofatfs/ff.c:3625
file_obj_read at extmod/vfs_fat_file.c:75
mp_stream_rw at py/stream.c:60
mp_reader_vfs_readbyte at extmod/vfs_reader.c:59
next_char at py/lexer.c:174
mp_lexer_to_next at py/lexer.c:713
mp_parse at py/parse.c:1167

Signed-off-by: iabdalkader <i.abdalkader@gmail.com>

The aim of this commit is to clarify the command line options available. While they are also available from the CLI via --help, it's useful to document them and provide a few examples. Signed-off-by: Jos Verlinde <jos_verlinde@hotmail.com>

Factor out mp_os_urandom() of each port into extmod/modos.c, which then calls the port-specific function mp_hal_get_random(). Move mp_hal_get_random() to mphalport where suitable. On the MIMXRT and SAMD ports it is left in modos.c, since there are different implementations depending on the MCU family. On the ALIF, ESP32, CC3200 and RP2 ports the file modos.c was removed, since it was empty after moving mp_hal_get_random(). Tested on the cc3200, esp32, esp8266, mimxrt, nrf, rp2, samd, stm32 and unix ports. Compiled for the alif and renesas ports. Signed-off-by: robert-hh <robert@hammelrath.com>

github-actions · 2026-02-25T01:40:22Z

Code size report:

Reference:  zephyr/mpconfigport: Remove duplicate builtins.open definition. [1ab9b66]
Comparison: mpremote: Add smart encoding selection for fs_writefile. [merge of 7f0d6a6]
  mpy-cross:   +80 +0.021% 
   bare-arm:   +28 +0.050% 
minimal x86:   +42 +0.022% 
   unix x64:   +56 +0.007% standard
      stm32:   +96 +0.024% PYBV10
      esp32:  +508 +0.029% ESP32_GENERIC
     mimxrt:   +88 +0.023% TEENSY40
        rp2: +1232 +0.134% RPI_PICO_W[incl +8(bss)]
       samd:   +88 +0.032% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:   +30 +0.007% VIRT_RV32

andrewleech · 2026-02-25T02:24:01Z

/review

mpy-reviewer

There's a critical bug in _detect_encoding_capabilities: the single dict eval includes __import__('deflate') first, so on any device without the deflate module, ImportError is raised mid-expression and the whole eval fails. The except Exception path then sets both deflate and base64 to False, meaning devices without deflate permanently lose base64 encoding and fall back to repr — a regression. The deflate and base64 capabilities need to be probed independently. Also, test_compression_ratio is defined twice in compression_utils.py.

mpy-reviewer · 2026-02-25T02:29:38Z

tools/mpremote/mpremote/transport.py

+        try:
+            caps = self.eval(
+                "{"
+                "'deflate':hasattr(__import__('deflate'),'DeflateIO'),"


If deflate is not available on the device, __import__('deflate') raises ImportError, aborting the entire dict construction. The except Exception at the outer level then sets both deflate and base64 to False — even though binascii.a2b_base64 is almost certainly present. On any device without deflate this PR is a regression vs repr.

The capabilities must be probed independently. Either do two separate eval calls, or use a device-side try/except around the deflate import:

caps = self.eval(
    "{'base64':hasattr(__import__('binascii'),'a2b_base64'),"
    "'bytesio':hasattr(__import__('io'),'BytesIO')}"
)
try:
    has_dfl = self.eval("hasattr(__import__('deflate'),'DeflateIO')")
except Exception:
    has_dfl = False
caps['deflate'] = has_dfl and caps.get('base64') and caps.get('bytesio')
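The effect of probing each capability on its own can be checked on the host without a board. The sketch below is an assumption-laden stand-in, not mpremote code: `FakeTransport` and `probe_capabilities` are hypothetical names, and the fake `eval()` only simulates which modules a device exposes.

```python
import types


class FakeTransport:
    """Hypothetical stand-in for a device: eval() sees only listed modules."""

    def __init__(self, modules):
        self.modules = modules  # module name -> iterable of attribute names

    def eval(self, expr):
        def fake_import(name):
            if name not in self.modules:
                raise ImportError(name)
            attrs = {attr: object() for attr in self.modules[name]}
            return types.SimpleNamespace(**attrs)

        # The globals entry shadows the builtin __import__ for this eval.
        return eval(expr, {"__import__": fake_import})


def probe_capabilities(transport):
    """Probe each capability separately so one failure cannot hide another."""

    def probe(expr):
        try:
            return bool(transport.eval(expr))
        except Exception:  # broad only because this is a host-side sketch
            return False

    base64_ok = probe("hasattr(__import__('binascii'),'a2b_base64')")
    bytesio_ok = probe("hasattr(__import__('io'),'BytesIO')")
    deflate_ok = probe("hasattr(__import__('deflate'),'DeflateIO')")
    return {
        "base64": base64_ok,
        "deflate": deflate_ok and base64_ok and bytesio_ok,
    }


no_deflate = FakeTransport({"binascii": ["a2b_base64"], "io": ["BytesIO"]})
caps = probe_capabilities(no_deflate)
print(caps)  # {'base64': True, 'deflate': False}
```

With a single combined expression, the same simulated device would raise ImportError before the base64 check was ever evaluated.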

mpy-reviewer · 2026-02-25T02:29:38Z

tools/mpremote/mpremote/compression_utils.py

+    if not sample:
+        return 1.0
+    compressed = compress_chunk(sample)
+    return len(compressed) / len(sample)


test_compression_ratio is already defined at line 69. Remove the duplicate.

mpy-reviewer · 2026-02-25T02:29:38Z

tools/mpremote/mpremote/compression_utils.py

@@ -0,0 +1,86 @@
+#!/usr/bin/env python3


Please remove the shebang; this is a library module, not a standalone script.

mpy-reviewer · 2026-02-25T02:29:38Z

tools/mpremote/mpremote/transport.py

+            # Setup imports and file handle on device
+            if encoding == "deflate":
+                self.exec(
+                    "from binascii import a2b_base64 as _a\n"


The ROMFS path in commands.py imports under the full names (a2b_base64, BytesIO, DeflateIO, RAW). Using single-letter aliases (_a, _B, _D, _R) here is inconsistent. Please use the same names in both paths.

Test file paths which get passed to the run_tests function can be absolute or relative and with or without leading slash in the latter case, depending on the arguments to run-tests.py, but the skip_tests list with tests to skip only contains relative paths so using simple string equality comparison easily leads to false negatives. Compare the full absolute path instead such that it doesn't matter anymore in which form the tests are passed.

Note:
- use realpath to resolve symlinks plus make the comparison case insensitive on windows
- the test_file passed to run_one_test is not altered by this commit, such that when the user inputs relative paths the tests are also still displayed with relative paths
- likewise the test_file_abspath is not modified because functions like run_micropython rely on it having forward slashes

In practice this means that it used to be so that the only forms of running tests for which the skip_tests lists actually worked were:

>python ./run-tests.py
>python ./run-tests.py -d extmod

whereas it now works consistently so also for these invocations, which in the end all point to the exact same path:

>python ./run-tests.py -d ./extmod
>python ./run-tests.py -d ../tests/extmod
>python ./run-tests.py -d /full/path/to/tests/extmod

These examples used to not skip any of the tests in the extmod/ directory thereby leading to test failures.

Signed-off-by: stijn <stijn@ignitron.net>
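The comparison described in this commit message can be sketched as follows. This is a simplified illustration rather than the code in run-tests.py: `normalise` and `is_skipped` are hypothetical helpers, and relative paths are resolved against the current working directory, which is an assumption about how the skip list is anchored.

```python
import os


def normalise(path):
    """Absolute, symlink-free, case-normalised form of a path."""
    # normcase lowercases on Windows and is a no-op on POSIX.
    return os.path.normcase(os.path.realpath(path))


def is_skipped(test_file, skip_tests):
    """True if test_file names the same file as any entry in skip_tests."""
    skip_abs = {normalise(p) for p in skip_tests}
    return normalise(test_file) in skip_abs


skip = ["extmod/time_time_ns.py"]
print(is_skipped("./extmod/time_time_ns.py", skip))
print(is_skipped("extmod/../extmod/time_time_ns.py", skip))
print(is_skipped("extmod/other.py", skip))
```

A plain string comparison would treat the first two spellings as different files and fail to skip them.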

Scan the --test-dirs argument for the main tests directory being passed and if so do the same thing as if running from within that main test directory. In practice this makes the following (which used to counterintuitively try and fail to run the .py files in the tests/ directory itself)

>python micropython/tests/run-tests.py -d micropython/tests

do the same thing as

>cd micropython/tests
>python ./run-tests.py

which is logical and convenient.

Signed-off-by: stijn <stijn@ignitron.net>

Test file paths which get passed to the run_tests function can be absolute or relative and with or without leading slash in the latter case, depending on the arguments to run-tests.py, but since that path is used to:
- display which tests run
- record which tests ran in the results.json
- craft the filename for the .exp/.out file for failed tests

it is desirable to always use the same file path regardless of how the user passed the path.

In practice this means that all forms of running our own tests like:

>python ./run-tests.py -i extmod
>python ./run-tests.py -d extmod
>python ./run-tests.py -d ./extmod
>python ./run-tests.py -d ../tests/extmod
>python ./run-tests.py -d /full/path/to/tests/extmod

will now consistently all display the tests like

pass extmod/time_time_ns.py
FAIL extmod/some_failing_test.py

and produce output files like

results/extmod_some_failing_test.py.exp
results/extmod_some_failing_test.py.out

instead of displaying/using the exact path as passed.

For external tests, meaning not in the tests/ directory, we also want to be consistent so there the choice was made to always use absolute paths.

Signed-off-by: stijn <stijn@ignitron.net>

Fixes fatal crash if serial port access returns an error (for example: port is native USB-CDC and the host hard faults during the test run). Instead of crashing, have the runner mark this as a test run error and continue. It's not certain the next test will run successfully, but this provides the context of output showing what was happening when the communication error occurred. Without this change, that output is lost when the fatal exception terminates the runner process. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <angus@redyak.com.au>

andrewleech · 2026-02-28T09:05:11Z

/review

mpy-reviewer

The .strip() fix in the ROMFS base64 path is correct and the performance gains are well documented. However there is a correctness bug in _detect_encoding_capabilities that will regress base64 support on devices without deflate, and a few smaller issues below.

mpy-reviewer · 2026-02-28T09:12:49Z

tools/mpremote/mpremote/transport.py

+        try:
+            caps = self.eval(
+                "{"
+                "'deflate':hasattr(__import__('deflate'),'DeflateIO'),"


This will raise ImportError on devices that don't have the deflate module, causing the entire eval() to fail. The except Exception then sets base64 to False too, which is a regression — devices without deflate but with binascii.a2b_base64 will silently lose base64 support and fall back to repr.

The old code probed each capability with a separate try/exec block precisely to avoid this. Either do the same here, or guard the import in the eval string, e.g. with a helper that catches ImportError on the device side.

mpy-reviewer · 2026-02-28T09:12:49Z

tools/mpremote/mpremote/transport.py

+                "deflate": caps.get("deflate") and caps.get("bytesio") and caps.get("base64"),
+                "base64": caps.get("base64"),
+            }
+        except Exception:


Catching a broad Exception here silently swallows errors unrelated to a missing capability (e.g. OOM on the device, serialisation bugs). At minimum, add a comment explaining why the broad catch is intentional; ideally catch only TransportExecError.

mpy-reviewer · 2026-02-28T09:12:49Z

tools/mpremote/mpremote/commands.py

        chunk_size = max(chunk_size, rom_min_write)

    # Detect capabilities of the device to use the fastest method of transfer.
+    caps = transport._detect_encoding_capabilities()


_detect_encoding_capabilities is a private method (underscore prefix). Calling it directly from commands.py breaks the encapsulation the underscore signals. Either make it public or expose the capabilities through a higher-level API.

mpy-reviewer · 2026-02-28T09:12:49Z

tools/mpremote/mpremote/compression_utils.py

@@ -0,0 +1,86 @@
+#!/usr/bin/env python3


Please remove the shebang; this is a library module, not a standalone script.

mpy-reviewer · 2026-02-28T09:12:49Z

tools/mpremote/mpremote/compression_utils.py

+def compress_chunk(data, wbits=DEFAULT_WBITS):
+    """Compress a single chunk using raw deflate.
+
+    Each chunk is independently compressed/decompressable, which is required


"decompressable" → "decompressible"
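The property the docstring describes, that each chunk decodes without any state from its neighbours, can be demonstrated on the host with zlib. This is a sketch only: the window size of 9 is assumed (the value of DEFAULT_WBITS is not shown in the diff), `decompress_chunk` is a hypothetical host-side helper, and a real device would decode with its own deflate module instead.

```python
import zlib

WBITS = 9  # assumed window size; the PR's DEFAULT_WBITS value is not shown


def compress_chunk(data, wbits=WBITS):
    """Compress one chunk as a standalone raw-deflate stream."""
    comp = zlib.compressobj(wbits=-wbits)  # negative wbits = no zlib header
    return comp.compress(data) + comp.flush()


def decompress_chunk(blob, wbits=WBITS):
    """Inverse of compress_chunk for a single chunk."""
    return zlib.decompress(blob, wbits=-wbits)


payload = b"print('hello from the device')\n" * 40
chunks = [payload[i:i + 256] for i in range(0, len(payload), 256)]
blobs = [compress_chunk(c) for c in chunks]

# Decode the last chunk first: no state from earlier chunks is needed.
restored_last = decompress_chunk(blobs[-1])
restored_all = b"".join(decompress_chunk(b) for b in blobs)
```

Restarting the compressor for every chunk gives up some compression ratio, but it lets the receiver process each chunk as soon as it arrives.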

mpy-reviewer · 2026-02-28T09:12:49Z

tools/mpremote/mpremote/compression_utils.py

+    Returns:
+        Ratio of compressed/original size (0.0-1.0+). Lower = better compression.
+    """
+    sample = data[:sample_size] if len(data) > sample_size else data


data[:sample_size] if len(data) > sample_size else data can just be data[:sample_size] — Python slicing past the end is safe.

Automatically detect device capabilities (deflate, base64, bytes.fromhex) and select the best encoding for file transfers. Deflate+base64 is used when the device supports it and data compresses well, base64 alone as a fallback, and repr as the universal fallback. Each capability is probed independently so a missing deflate module does not suppress base64 detection. Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>

ant9000 and others added 12 commits February 24, 2026 18:18

lib/libhydrogen: Update to latest release.

c1ed5f7

Fixes a build issue on newer Zephyr versions. Signed-off-by: Antonio Galea <antonio.galea@gmail.com>

tests/micropython: Add a test for checking viper value clobbering.

43a4914

This commit introduces a test that should check whether viper load or store operations won't clobber either the buffer address or the index value being used. Signed-off-by: Alessandro Gatti <a.gatti@frob.it>

extmod/modos: Raise an error at a negative argument of os.urandom().

6b8bcb6

Fixes issue micropython#18825. Signed-off-by: robert-hh <robert@hammelrath.com>

tests/extmod/os_urandom.py: Add test for os.urandom.

6dbabc9

Signed-off-by: Damien George <damien@micropython.org>

mpy-reviewer bot reviewed Feb 25, 2026

jepler and others added 12 commits February 26, 2026 16:15

tests/micropython: Add a test for throwing incomplete exception.

2050055

This is a reproducer for micropython#17117. Signed-off-by: Jeff Epler <jepler@unpythonic.net>

py/objtype: Expose mp_native_base_init_wrapper_obj.

aaa30ab

Signed-off-by: Jeff Epler <jepler@unpythonic.net>

py/objexcept: Check for incompletely constructed exceptions.

c199ba9

This turns the reproducer into a sensible-ish crash: TypeError: exceptions must derive from BaseException Closes: micropython#17117 Signed-off-by: Jeff Epler <jepler@unpythonic.net>

tests/cpydiff: Add a section for throwing incomplete exceptions.

2631b06

Signed-off-by: Jeff Epler <jepler@unpythonic.net>

tools/codeformat.py: Use input files when formatting python code.

412ffd4

Signed-off-by: stijn <stijn@ignitron.net>

tests/run-tests.py: Add an argument for showing which tests would run.

2ab88c3

This is convenient when trying to figure out the correct values for --include/--exclude/--test-dirs/... arguments. Signed-off-by: stijn <stijn@ignitron.net>

tests/cmdline: Make tests succeed irregardless of invocation path.

7b91633

Signed-off-by: stijn <stijn@ignitron.net>

zephyr: Use nodelabel when printing device name.

104deaa

This gives a more user-friendly name when the Python object (eg Pin) is printed. If the nodelabel is unavailable then it uses `dev->name` as a fallback. Signed-off-by: Fin Maaß <f.maass@vogl-electronic.com>

mpy-reviewer bot reviewed Feb 28, 2026

andrewleech force-pushed the feature/smart-encoding-fs-writefile branch from 5599a34 to a46e4ae on February 28, 2026 19:29

andrewleech force-pushed the feature/smart-encoding-fs-writefile branch from a46e4ae to 7f0d6a6 on March 2, 2026 11:16

14 participants