Outboard DFU issue - error: esp: erasing flash failed

Hi
I’ve been testing outboard DFU, and have had success with my hardware setup.
However my firmware image now includes more libraries and is larger, and I’m seeing a failure.
(The binpack file is 622KB rather than 379KB, so I don’t think I’m hitting the 1.5MB limit)

Using Notecard FW 4.4.1.4015700 and CLI version 1.2.1

I’ve pasted the notecard trace output below at the end

The offending line seems to be
S52:54.66 loader: erasing flash for firmware.bin from 0x00010000 len 0x000e3690

As a first line of debug, I tried running a similar command with ESPtool
esptool.py -p /dev/cu.usbserial-556E0262731 erase_region 0x00010000 0x000e3690

This failed with the following error
A fatal error occurred: Size of data to erase must be a multiple of 4096

So I expect the the DFU is failing for the same reason.
Is this a bug? the notecard is trying to erase a length of 0xe3690 (=931472)

Which looks to be what is specified in the binpack (apologies, I’ve edited to show the correct binpack header now!).

/// BINPACK ///
WHEN: 2023-08-08 12:11:43 UTC
HOST: esp32
SNAP: 8192
LOAD: bootloader.bin,4096,17408,17408,15577,5ec4bbd3bb1f2dc36232cc1c8f1d3ee2
LOAD: partitions.bin,32768,3072,3072,294,118cbdbfce82eb12bbdeb7c59af2fce8
LOAD: firmware.bin,65536,931472,931472,606031,a29df4453f7258e65a38360a625ae549
/// BINPACK ///
S52:31.00 _health.qo: starting outboard DFU of esp32 firmware

S52:33.12 loader: 994ms timeout waiting for reply

S52:34.23 loader: 994ms timeout waiting for reply

S52:35.33 loader: 994ms timeout waiting for reply

S52:35.48 loader: reading source offset 0x00000000 len 0x00002000 leaving 0x00095e8a remaining

S52:37.42 binpack created 2023-08-08 12:11:43 UTC

S52:37.42 binpack compression frame length: 8192

S52:37.43 binpack will load bootloader.bin at 00001000 for 17408 bytes

S52:37.43 binpack will load partitions.bin at 00008000 for 3072 bytes

S52:37.44 binpack will load firmware.bin at 00010000 for 931472 bytes

S52:37.44 loader: resetting source offset to 0x0000013c to align image to page boundary

S52:37.44 loader: reading source offset 0x0000013c len 0x00002000 leaving 0x00095d4e remaining

S52:39.19 loader: processing bootloader.bin off 0 (15577/8192/613710 remaining)

S52:39.19 loader: erasing flash for bootloader.bin from 0x00001000 len 0x00004400

S52:39.53 loader: bootloader.bin frame compressed len 7096

S52:39.53 loader: bootloader.bin frame decompressed from 7096 to 8192

S52:39.54 loader: writing flash for bootloader.bin len 0x00000800 6144 remaining

S52:40.19 loader: writing flash for bootloader.bin len 0x00000800 4096 remaining

S52:41.14 loader: writing flash for bootloader.bin len 0x00000800 2048 remaining

S52:41.97 loader: writing flash for bootloader.bin len 0x00000800 0 remaining

S52:42.66 loader: loaded bootloader.bin 0x00002bbc off 7100 len 00007100 (8477/1092/613710 remaining)

S52:42.66 loader: processing bootloader.bin off 7100 (8477/1092/613710 remaining)

S52:42.67 loader: bootloader.bin frame compressed len 7452

S52:42.67 loader: loaded bootloader.bin 0x00003000 off 8192 len 00001092 (7385/0/613710 remaining)

S52:42.67 loader: reading source offset 0x0000213c len 0x00002000 leaving 0x00093d4e remaining

S52:44.93 loader: processing bootloader.bin off 8192 (7385/8192/605518 remaining)

S52:44.93 loader: bootloader.bin frame decompressed from 7452 to 8192

S52:44.94 loader: writing flash for bootloader.bin len 0x00000800 6144 remaining

S52:45.75 loader: writing flash for bootloader.bin len 0x00000800 4096 remaining

S52:46.99 loader: writing flash for bootloader.bin len 0x00000800 2048 remaining

S52:47.90 loader: writing flash for bootloader.bin len 0x00000800 0 remaining

S52:48.86 loader: loaded bootloader.bin 0x000048dc off 14556 len 00006364 (1021/1828/605518 remaining)

S52:48.86 loader: processing bootloader.bin off 14556 (1021/1828/605518 remaining)

S52:48.86 loader: bootloader.bin frame compressed len 1017

S52:48.87 loader: bootloader.bin frame decompressed from 1017 to 1024

S52:48.87 loader: writing flash for bootloader.bin len 0x00000400 0 remaining

S52:49.59 loader: loaded bootloader.bin 0x00004cd9 off 15577 len 00001021 (0/807/605518 remaining)

S52:49.63 loader: flash verified for bootloader.bin

S52:49.63 loader: end-to-end bin verified for bootloader.bin

S52:49.64 loader: programming completed for bootloader.bin

S52:49.64 loader: resetting source offset to 0x00003e15 to align image to page boundary

S52:49.64 loader: reading source offset 0x00003e15 len 0x00002000 leaving 0x00092075 remaining

S52:51.70 loader: processing partitions.bin off 0 (294/8192/598133 remaining)

S52:51.70 loader: erasing flash for partitions.bin from 0x00008000 len 0x00000c00

S52:51.83 loader: partitions.bin frame compressed len 290

S52:51.84 loader: partitions.bin frame decompressed from 290 to 3072

S52:51.84 loader: writing flash for partitions.bin len 0x00000800 1024 remaining

S52:52.45 loader: writing flash for partitions.bin len 0x00000400 0 remaining

S52:53.07 loader: loaded partitions.bin 0x00008126 off 294 len 00000294 (0/7898/598133 remaining)

S52:53.09 loader: flash verified for partitions.bin

S52:53.09 loader: end-to-end bin verified for partitions.bin

S52:53.10 loader: programming completed for partitions.bin

S52:53.10 loader: resetting source offset to 0x00003f3b to align image to page boundary

S52:53.10 loader: reading source offset 0x00003f3b len 0x00002000 leaving 0x00091f4f remaining

S52:54.65 loader: processing firmware.bin off 0 (606031/8192/597839 remaining)

**S52:54.66 loader: erasing flash for firmware.bin from 0x00010000 len 0x000e3690**

S52:57.74 loader: 2994ms timeout waiting for reply

S52:57.74 loader: programming error: esp: erasing flash failed (reason #2)

S52:58.46 _health.qo: outboard DFU of esp32 firmware: esp: erasing flash failed (reason #2)

S52:58.47 dfu: disabled DFU mode

S52:58.48 sync: wakeup: dfu disabled

S52:58.48 dfu: cannot update host: esp: erasing flash failed (reason #2)

S52:58.49 dfu: retry #1

FYI, if I try to flash just the fimware.bin using esptool
e.g.
esptool.py write_flash 0x10000 .pio/build/m5stack-core2/firmware.bin

I see
Flash will be erased from 0x00010000 to 0x000eefff...

Even though the firmware.bin is smaller than that.

Having a peek at how esptool does it, it has a roundup function:

print(
    "Flash will be erased from {:#010x} to {:#010x}...".format(
        address - bytes_over,
        div_roundup(write_end, esp.FLASH_SECTOR_SIZE)
        * esp.FLASH_SECTOR_SIZE
        - 1,
    )
)

here is a trace from when it works (with the smaller binpack file)

S07:50.06 loader: erasing flash for firmware.bin from 0x00010000 len 0x000749e0
0x749e0 is not a multiple of 4096 either, so perhaps my assumption was wrong.

similarly,
esptool.py erase_region 0x00010000 0x000749e0
fails with the same error as before

S07:26.34 _health.qo: starting outboard DFU of esp32 firmware
G07:28.72 accel: movements:6 free-falls:0 5s:1000000000000000000000000000000000000000000000000
S07:28.96 loader: 994ms timeout waiting for reply
S07:30.07 loader: 994ms timeout waiting for reply
S07:30.21 loader: reading source offset 0x00000000 len 0x00002000 leaving 0x0005a90d remaining
S07:32.16 binpack created 2023-07-10 15:25:19 UTC
S07:32.16 binpack compression frame length: 8192
S07:32.17 binpack will load bootloader.bin at 00001000 for 17408 bytes
S07:32.17 binpack will load partitions.bin at 00008000 for 3072 bytes
S07:32.17 binpack will load firmware.bin at 00010000 for 477664 bytes
S07:32.18 loader: resetting source offset to 0x0000013c to align image to page boundary
S07:32.18 loader: reading source offset 0x0000013c len 0x00002000 leaving 0x0005a7d1 remaining
S07:34.63 loader: processing bootloader.bin off 0 (15577/8192/370641 remaining)
S07:34.63 loader: erasing flash for bootloader.bin from 0x00001000 len 0x00004400
S07:34.96 loader: bootloader.bin frame compressed len 7096
S07:34.97 loader: bootloader.bin frame decompressed from 7096 to 8192
S07:34.97 loader: writing flash for bootloader.bin len 0x00000800 6144 remaining
S07:35.63 loader: writing flash for bootloader.bin len 0x00000800 4096 remaining
S07:36.58 loader: writing flash for bootloader.bin len 0x00000800 2048 remaining
S07:37.41 loader: writing flash for bootloader.bin len 0x00000800 0 remaining
S07:38.09 loader: loaded bootloader.bin 0x00002bbc off 7100 len 00007100 (8477/1092/370641 remaining)
S07:38.10 loader: processing bootloader.bin off 7100 (8477/1092/370641 remaining)
S07:38.10 loader: bootloader.bin frame compressed len 7452
S07:38.10 loader: loaded bootloader.bin 0x00003000 off 8192 len 00001092 (7385/0/370641 remaining)
S07:38.11 loader: reading source offset 0x0000213c len 0x00002000 leaving 0x000587d1 remaining
S07:39.95 loader: processing bootloader.bin off 8192 (7385/8192/362449 remaining)
S07:39.96 loader: bootloader.bin frame decompressed from 7452 to 8192
S07:39.96 loader: writing flash for bootloader.bin len 0x00000800 6144 remaining
S07:40.77 loader: writing flash for bootloader.bin len 0x00000800 4096 remaining
S07:42.01 loader: writing flash for bootloader.bin len 0x00000800 2048 remaining
S07:42.92 loader: writing flash for bootloader.bin len 0x00000800 0 remaining
S07:43.87 loader: loaded bootloader.bin 0x000048dc off 14556 len 00006364 (1021/1828/362449 remaining)
S07:43.88 loader: processing bootloader.bin off 14556 (1021/1828/362449 remaining)
S07:43.88 loader: bootloader.bin frame compressed len 1017
S07:43.88 loader: bootloader.bin frame decompressed from 1017 to 1024
S07:43.89 loader: writing flash for bootloader.bin len 0x00000400 0 remaining
S07:44.61 loader: loaded bootloader.bin 0x00004cd9 off 15577 len 00001021 (0/807/362449 remaining)
S07:44.64 loader: flash verified for bootloader.bin
S07:44.65 loader: end-to-end bin verified for bootloader.bin
S07:44.65 loader: programming completed for bootloader.bin
S07:44.66 loader: resetting source offset to 0x00003e15 to align image to page boundary
S07:44.66 loader: reading source offset 0x00003e15 len 0x00002000 leaving 0x00056af8 remaining
S07:46.51 loader: processing partitions.bin off 0 (294/8192/355064 remaining)
S07:46.51 loader: erasing flash for partitions.bin from 0x00008000 len 0x00000c00
S07:46.64 loader: partitions.bin frame compressed len 290
S07:46.65 loader: partitions.bin frame decompressed from 290 to 3072
S07:46.65 loader: writing flash for partitions.bin len 0x00000800 1024 remaining
S07:47.26 loader: writing flash for partitions.bin len 0x00000400 0 remaining
S07:47.87 loader: loaded partitions.bin 0x00008126 off 294 len 00000294 (0/7898/355064 remaining)
S07:47.89 loader: flash verified for partitions.bin
S07:47.90 loader: end-to-end bin verified for partitions.bin
S07:47.90 loader: programming completed for partitions.bin
S07:47.90 loader: resetting source offset to 0x00003f3b to align image to page boundary
S07:47.91 loader: reading source offset 0x00003f3b len 0x00002000 leaving 0x000569d2 remaining
S07:50.05 loader: processing firmware.bin off 0 (362962/8192/354770 remaining)
S07:50.06 loader: erasing flash for firmware.bin from 0x00010000 len 0x000749e0
S07:52.34 loader: firmware.bin frame compressed len 5879
S07:52.35 loader: firmware.bin frame decompressed from 5879 to 8192
S07:52.35 loader: writing flash for firmware.bin len 0x00000800 6144 remaining
S07:53.03 loader: writing flash for firmware.bin len 0x00000800 4096 remaining
S07:53.84 loader: writing flash for firmware.bin len 0x00000800 2048 remaining
S07:54.62 loader: writing flash for firmware.bin len 0x00000800 0 remaining
S07:55.37 loader: loaded firmware.bin 0x000116fb off 5883 len 00005883 (357079/2309/354770 remaining)
S07:55.38 loader: processing firmware.bin off 5883 (357079/2309/354770 remaining)
S07:55.38 loader: firmware.bin frame compressed len 3904
S07:55.39 loader: loaded firmware.bin 0x00012000 off 8192 len 00002309 (354770/0/354770 remaining)
S07:55.39 loader: reading source offset 0x00005f3b len 0x00002000 leaving 0x000549d2 remaining
S07:57.24 loader: processing firmware.bin off 8192 (354770/8192/346578 remaining)
S07:57.24 loader: firmware.bin frame decompressed from 3904 to 8192
S07:57.24 loader: writing flash for firmware.bin len 0x00000800 6144 remaining
S07:57.86 loader: writing flash for firmware.bin len 0x00000800 4096 remaining
S07:58.47 loader: writing flash for firmware.bin len 0x00000800 2048 remaining
S07:59.09 loader: writing flash for firmware.bin len 0x00000800 0 remaining

wondering if anyone at Blues is looking into this? or do I need to pick it apart a bit more myself?

After many hours of debug… I’ve whittled it down to (seemingly!) being caused by including certain elements that come with the LVGL library.

for example including a particular font (lv_font_montserrat_16) seems to break outboard DFU, but normal flashing of the host is totally fine…

Investigations ongoing…

I haven’t been able to pin it down to anything in particular.

I’m now using the latest developer notecard firmware (5.1.1.16026) with a wifi notecard, as outboard DFU is much faster this way, so I can iterate.

Including the montserrat_16 font I mentioned can “tip it over”, but in a more minimal example the font is OK. conversely a larger firmware without that font can also fail to flash.

I doesn’t seem to be purely down to the size though, as I intentionally created a >1MB binpack file which flashes OK via outboard DFU.

I’ve tried combining the firmware/partitions/bootloader with esptool before doing the binpack, but same result.

Also, I’ve tried it on an ESP32-S2 host, which doesn’t have the issue (exact same binpack file).
I haven’t been able to recreate the issue with the S2 chip.

This isn’t a random or intermittent error, it happens consistently with particular binpack files.
So it seems to depend on the content of the binpack file, but not the absolute size.

One thought is that it could be related to compression/decompression - some sequence present in the file that trips it up. But the fact it works on ESP32-S2 make that unlikely.

The outcome might be that I switch to an ESP32-S3 host (which I expect to work the same as S2). But frustrating that I haven’t been able to pinpoint the bug.

Would have been nice to have some input from Blues on this issue.

a development - I’ve now had it fail with ESP32-S2. It gets past the erasing stage and then has a different error
This was using a ~600KB LVGL based firmware, same one I’ve been trying to get working on ESP32

S04:10.10 loader: writing flash for firmware.bin len 0x00000800 2048 remaining
S04:10.72 loader: writing flash for firmware.bin len 0x00000800 0 remaining
S04:11.34 loader: loaded firmware.bin 0x000110c3 off 4291 len 00004291 (590535/3901/586634 remaining)
S04:11.34 loader: processing firmware.bin off 4291 (590535/3901/586634 remaining)
S04:11.35 loader: firmware.bin frame compressed len 4939
S04:11.35 loader: loaded firmware.bin 0x00012000 off 8192 len 00003901 (586634/0/586634 remaining)
S04:11.36 loader: reading source offset 0x000020a5 len 0x00002000 leaving 0x0008d38a remaining
S04:11.37 loader: processing firmware.bin off 8192 (586634/8192/578442 remaining)
S04:11.37 loader: firmware.bin frame decompressed from 4939 to 8192
S04:11.37 loader: writing flash for firmware.bin len 0x00000800 6144 remaining
S04:11.99 loader: writing flash for firmware.bin len 0x00000800 4096 remaining
S04:12.61 loader: writing flash for firmware.bin len 0x00000800 2048 remaining
S04:13.23 loader: writing flash for firmware.bin len 0x00000800 0 remaining
S04:13.85 loader: loaded firmware.bin 0x00012412 off 9234 len 00001042 (585592/7150/578442 remaining)
S04:13.85 loader: processing firmware.bin off 9234 (585592/7150/578442 remaining)
S04:13.86 loader: firmware.bin frame compressed len 5698
S04:13.87 loader: firmware.bin frame decompressed from 5698 to 8192
S04:13.87 loader: writing flash for firmware.bin len 0x00000800 6144 remaining
S04:19.38 loader: 4861ms timeout waiting for reply
S04:19.38 *** retrying after esp: flash could not be programmed (reason #2) ***
S04:21.41 loader: writing flash for firmware.bin len 0x00000800 6144 remaining
S04:24.08 loader: writing flash for firmware.bin len 0x00000800 6144 remaining
S04:24.73 loader: debug: Error: 
S04:24.73 loader: debug: INVALID_COMMAND
S04:24.73 loader: debug: 
S04:24.74 *** retrying after esp: flash could not be programmed (reason #9) ***
S04:26.76 loader: writing flash for firmware.bin len 0x00000800 6144 remaining
S04:27.41 loader: debug: Error: 
S04:27.42 loader: debug: INVALID_COMMAND
S04:27.42 loader: debug: 
S04:27.43 *** retrying after esp: flash could not be programmed (reason #9) ***
S04:29.45 loader: writing flash for firmware.bin len 0x00000800 6144 remaining
S04:30.10 loader: debug: Error: 
S04:30.10 loader: debug: INVALID_COMMAND
S04:30.11 loader: debug: 
S04:30.11 *** retrying after esp: flash could not be programmed (reason #9) ***
S04:32.13 *** giving up esp: flash could not be programmed (reason #9) ***

interesting… the S2 log I just posted. After that it retries a few times, sometimes giving a different error

S05:17.77 *** retrying after esp: flash could not be programmed (reason #2) ***
S05:19.79 loader: writing flash for firmware.bin len 0x00000800 0 remaining
S05:20.52 loader: debug: Error: 
S05:20.52 loader: debug: INVALID_CRC
S05:20.53 loader: debug: 

But it does eventually succeed on retry 4

S07:28.12 dfu: cannot update host: esp: flash could not be programmed (reason #9)
S07:28.13 dfu: retry #4
S07:28.13 sync: wakeup: card firmware update error
S12:03.89 loader: loaded firmware.bin 0x000a04c8 off 591048 len 00001224 (3778/3778/0 remaining)
S12:03.90 loader: processing firmware.bin off 591048 (3778/3778/0 remaining)
S12:03.90 loader: firmware.bin frame compressed len 3774
S12:03.91 loader: firmware.bin frame decompressed from 3774 to 4480
S12:03.91 loader: writing flash for firmware.bin len 0x00000800 2432 remaining
S12:04.72 loader: writing flash for firmware.bin len 0x00000800 384 remaining
S12:05.47 loader: writing flash for firmware.bin len 0x00000180 0 remaining
S12:06.09 loader: loaded firmware.bin 0x000a138a off 594826 len 00003778 (0/0/0 remaining)
S12:07.35 loader: flash verified for firmware.bin
S12:07.35 loader: end-to-end bin verified for firmware.bin
S12:07.36 loader: programming completed for firmware.bin
S12:07.36 loader: programming completed successfully
S12:08.10 _health.qo: outboard DFU of esp32 firmware successfully completed
S12:08.11 dfu: disabled DFU mode
S12:08.11 sync: wakeup: dfu disabled
S12:08.13 _health.qo: DFU host firmware completed: successful host update
S12:08.48 sync: wakeup: dfu SetStateToCompleted
S12:08.49 sync: wakeup: dfu completed
S12:08.49 dfu: successfully updated host

Just a follow up that I have now tried this with ESP32-S3 (in a M5stack CoreS3)
Its is working well so far, although I haven’t tried with a image larger than ~500kB.

Small stumble because the S3 bootloader needs to be at 0x0 not 0x1000, so my binpack command looks like this:

/note-cli_1.2.1_macOS_arm64/notecard -binpack esp32 0x0:.pio/build/m5stack-cores3/bootloader.bin 0x8000:.pio/build/m5stack-cores3/partitions.bin 0x10000:.pio/build/m5stack-cores3/firmware.bin

This isn’t really “solved” but this workaround will do for me.

1 Like

I’ve now seen the same error on ESP32-S3
So not solved or worked around.
Very disappointing. I’ll update if I find out more, but have burned a lot of time on this now.

Interesting…
I tried a different firmware that I had seen work (but only tried once). It failed 6 times, but on the 7th retry it worked.

Could it be a timeout issue?
e.g. if it takes more than ~3s to erase the flash it gives up.

S30:42.96 loader: erasing flash for firmware.bin from 0x00010000 len 0x000c4d00
S30:46.41 loader: 2960ms timeout waiting for reply
S30:46.41 *** loader program_begin err esp: erasing flash failed (reason #2) *** (err 0)
S30:46.42 loader: programming error: esp: erasing flash failed (reason #2)

Full Trace log:
Notecard Log Nov 2.log (164.8 KB)

To get an idea if timeout could be an issue, erasing ~1MB of flash with esptool over USB takes 3.6 seconds. Which appears to be longer than the timeout allowed.

esptool.py erase_region 0x0 0x100000

Notecard is erasing closer to ~800kB, which takes 2.6s with esptool, but possibly longer over serial like the notecard is doing?

OK, comparing 3 similar firmwares of slightly different size

The largest “always” (only tried a couple of times) gives “erasing flash failed”
The middle sized one worked on the 7th retry,
The smallest worked on the 1st retry

I’ve invited blues support to the project.

Alternatively, if there is a way you could give me a test version of the notecard firmware with a longer timeout, I’d be happy to give it a try,

I went back to this after another month…
I updated to latest Notecard firmware (5.3.1) and couldn’t recreate the issue…
but then I reverted back to 5.1.1 and it was still working.

Hardware should be almost identical. So its a mystery!