U-boot 'nand markbad' has no effect - u-boot

Situation : board with an Arm CPU that has Nand flash next to it. On power-up, U-boot bootloader starts up and copies the flash contents to RAM, then it transfers control to that code in RAM. A Linux system with some application code, composed through Buildroot, starts running. Its entire filesystem is stored as a single UBIFS file in flash, and it starts using that.
When a certain byte is set, the bootloader stays in control and starts a TFTP transfer to download and store a new flash image.
Trigger : a board came back defective. Linux kernel startup clearly shows the issue:
[ 1.931150] Creating 8 MTD partitions on "atmel_nand":
[ 1.936285] 0x000000000000-0x000000040000 : "at91bootstrap"
[ 1.945280] 0x000000040000-0x0000000c0000 : "bootloader"
[ 1.954065] 0x0000000c0000-0x000000100000 : "bootloader env"
[ 1.963262] 0x000000100000-0x000000140000 : "bootloader redundant env"
[ 1.973221] 0x000000140000-0x000000180000 : "spare"
[ 1.981552] 0x000000180000-0x000000200000 : "device tree"
[ 1.990466] 0x000000200000-0x000000800000 : "kernel"
[ 1.999210] 0x000000800000-0x000010000000 : "rootfs"
...
[ 4.016251] ubi0: attached mtd7 (name "rootfs", size 248 MiB)
[ 4.022181] ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
[ 4.029040] ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
[ 4.035941] ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096
[ 4.042960] ubi0: good PEBs: 1980, bad PEBs: 4, corrupted PEBs: 0
[ 4.049033] ubi0: user volume: 2, internal volumes: 1, max. volumes count: 128
[ 4.056359] ubi0: max/mean erase counter: 2/0, WL threshold: 4096, image sequence number: 861993884
[ 4.065476] ubi0: available PEBs: 0, total reserved PEBs: 1980, PEBs reserved for bad PEB handling: 36
[ 4.074898] ubi0: background thread "ubi_bgt0d" started, PID 77
...
[ 4.298009] UBIFS (ubi0:0): UBIFS: mounted UBI device 0, volume 0, name "rootfs", R/O mode
[ 4.306415] UBIFS (ubi0:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
[ 4.316418] UBIFS (ubi0:0): FS size: 155926528 bytes (148 MiB, 1228 LEBs), journal size 9023488 bytes (8 MiB, 72 LEBs)
[ 4.327197] UBIFS (ubi0:0): reserved for root: 0 bytes (0 KiB)
[ 4.333095] UBIFS (ubi0:0): media format: w4/r0 (latest is w5/r0), UUID AE9F77DC-04AF-433F-92BC-D3375C83B518, small LPT model
[ 4.346924] VFS: Mounted root (ubifs filesystem) readonly on device 0:15.
[ 4.356186] devtmpfs: mounted
[ 4.367038] Freeing unused kernel memory: 1024K
[ 4.371812] Run /sbin/init as init process
[ 4.568143] UBIFS (ubi0:1): background thread "ubifs_bgt0_1" started, PID 83
[ 4.644809] UBIFS (ubi0:1): recovery needed
[ 4.685823] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 235:4096, read only 126976 bytes, retry
[ 4.732212] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 235:4096, read only 126976 bytes, retry
[ 4.778705] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 235:4096, read only 126976 bytes, retry
[ 4.824159] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 235:4096, read 126976 bytes
... which causes an exception, but the kernel keeps on going, then another error is detected :
[ 5.071518] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 709:4096, read only 126976 bytes, retry
[ 5.118110] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 709:4096, read only 126976 bytes, retry
[ 5.164447] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 709:4096, read only 126976 bytes, retry
[ 5.210987] ubi0 error: ubi_io_read: error -74 (ECC error) while reading 126976 bytes from PEB 709:4096, read 126976 bytes
... but impressively, the system still comes up alive and behaves almost fine.
Why does the kernel not mark these flash blocks as bad ? Those data can't be read anyway, and at least the next image flashing might skip the bad blocks...
Investigation : so the Kernel found a defective PEB #235 (decimal) in the "rootfs" partition of the flash. Each PEB is 128KB, so the error sits somewhere beyond byte 30,801,920 (decimal). Since the "rootfs" partition only starts from byte 0x800000 of the flash, the actual damaged page must be somewhere beyond byte 39,190,528 (decimal) or 0x2560000. And sure enough, when using the nand read utility within U-boot :
U-Boot> nand read 0x20000000 0x2560000 0x1000
NAND read: device 0 offset 0x2560000, size 0x1000
4096 bytes read: OK
U-Boot> nand read 0x20000000 0x2561000 0x1000
NAND read: device 0 offset 0x2561000, size 0x1000
4096 bytes read: OK
U-Boot> nand read 0x20000000 0x2562000 0x1000
NAND read: device 0 offset 0x2562000, size 0x1000
PMECC: Too many errors
NAND read from offset 2562000 failed -5
0 bytes read: ERROR
so the damaged page sits at offset 8K within that block of flash.
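The address arithmetic above can be double-checked with a short sketch. It assumes that UBI numbers its PEBs from the start of the "rootfs" MTD partition in plain 128 KiB steps; the constant and function names are only illustrative.

```python
PEB_SIZE = 128 * 1024      # erase block size reported by UBI (131072 bytes)
ROOTFS_START = 0x800000    # start of the "rootfs" partition, from the kernel log


def peb_to_flash_offset(peb: int) -> int:
    """Absolute flash offset of the first byte of a UBI physical erase block."""
    return ROOTFS_START + peb * PEB_SIZE


def block_start(offset: int) -> int:
    """Round an arbitrary flash offset down to the start of its erase block."""
    return offset - (offset % PEB_SIZE)


for peb in (235, 709):
    print(f"PEB {peb} starts at 0x{peb_to_flash_offset(peb):x}")
# PEB 235 starts at 0x2560000
# PEB 709 starts at 0x60a0000

# The failing read at 0x2562000 lies 0x2000 (8 KiB) into its block.
print(hex(0x2562000 - block_start(0x2562000)))  # 0x2000
```

Both block starts agree with the addresses that nand bad prints later in this post.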
From various other posts, I learned that NAND flash with 2K pages organized in 128K blocks has an extra 64 "Out Of Band" bytes for every 2048 payload bytes, bringing each page to a gross size of 2112 bytes. Anyway, the entire 128K block will have to be retired, as this is the erase size. No problem, there is storage to spare, I just want to make sure that the next flashing will skip over this bad block.
Since neither the Linux kernel nor the bootloader bothered to mark the bad block, I'll do it by hand in U-boot:
U-Boot> nand markbad 2562000
block 0x02562000 successfully marked as bad
A similar investigation for the 2nd bad flash page reveals that the other error sits at flash address 0x60a1000 :
U-Boot> nand read 0 60A1000 800
NAND read: device 0 offset 0x60a1000, size 0x800
PMECC: Too many errors
NAND read from offset 60a1000 failed -5
0 bytes read: ERROR
so here too, the nand markbad utility is used to manually put a permanent mark on this block :
U-Boot> nand markbad 60a1000
block 0x060a1000 successfully marked as bad
and to verify that everything is taken into account :
U-Boot> nand bad
Device 0 bad blocks:
02560000
060a0000
Just like it should be - from the start of each 128K block, both blocks are marked.
Problem : so I learned that the 64 OOB bytes are divided in 2 bytes marker, 38 bytes error-correcting code, and 24 bytes journaling. Of all the OOB bytes accompanying each 2048 payload bytes, only the very first piece of 64 bytes, accompanying the first page of 2KB, lends its 2 bytes marker code to indicate the status of the entire 128KB block. These 2 bytes should be modified in the flash device itself so that this status is persistent. So in my U-boot session, instead of launching the Linux system, I restarted the CPU and remained in U-boot :
U-Boot> reset
resetting ...
RomBOOT
ba_offset = 0xc ...
AT91Bootstrap 3.6.0-00029-g0cd4e6a (Wed Nov 12 12:14:04 CET 2014)
NAND: ONFI flash detected
NAND: Manufacturer ID: 0x2c Chip ID: 0x32
NAND: Disable On-Die ECC
PMECC: page_size: 0x800, oob_size: 0x40, pmecc_cap: 0x4, sector_size: 0x200
NAND: Initialize PMECC params, cap: 0x4, sector: 0x200
NAND: Image: Copy 0x80000 bytes from 0x40000 to 0x26f00000
NAND: Done to load image
U-Boot 2013.10-00403-g1f9a20a (Nov 12 2014 - 12:14:27)
CPU: SAMA5D31
Crystal frequency: 12 MHz
CPU clock : 528 MHz
Master clock : 132 MHz
DRAM: 128 MiB
NAND: 256 MiB
MMC: mci: 0
In: serial
Out: serial
Err: serial
Net: macb0
Hit any key to stop autoboot: 0
U-Boot> nand info
Device 0: nand0, sector size 128 KiB
Page size 2048 b
OOB size 64 b
Erase size 131072 b
U-Boot> nand bad
Device 0 bad blocks:
U-Boot>
The bad blocks have been forgotten - the marker code was not applied persistently ?
Granted, this U-boot version seems rather old. Has the nand markbad utility been improved since then ?
Workaround : I modified the OOB bytes of the first page within the bad block myself. I read all 2112 bytes of the first page into RAM, then modified the 2 bytes marker code, and wrote the 2112 bytes back from RAM into flash. Technically, I should have erased the whole 128K flash block and then written back all 128K of contents. But my laziness has been challenged enough today. NAND flash bits can be toggled from 1 to 0 arbitrarily - it's the reverse operation that is hard, requiring an erase to restore a whole 128K block back to all-0xFF. I noticed that all the "block good" markers are encoded as 0xFFFF, so I figured that writing "0x0000" instead should suffice.
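The "1 to 0 only" behaviour can be modelled as a bitwise AND. This is a deliberately simplified sketch (it ignores ECC and any limits on partial-page programming), and the function names are made up for illustration.

```python
def program(cell: int, value: int) -> int:
    """Programming can only clear bits, so the stored result is old AND new."""
    return cell & value


def erase() -> int:
    """An erase is the only way to get the bits of a 16-bit marker back to 1."""
    return 0xFFFF


marker = erase()                   # "block good" marker, all ones
marker = program(marker, 0x0000)   # clearing bits works without an erase
print(hex(marker))                 # 0x0

marker = program(marker, 0xFFFF)   # trying to set bits again has no effect
print(hex(marker))                 # 0x0
```

This is why writing 0x0000 over an 0xFFFF marker works without erasing the block first.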
U-Boot> nand read.raw 0x20200000 0x2560000 1
NAND read: 2112 bytes read: OK
The syntax of nand read.raw is a bit quirky: unlike nand read, which expects the size in bytes as its last argument, it wants the size expressed as a number of pages. The first page is all we need, so the argument '1' does the trick. The contents, which have now been transferred to RAM, can be inspected with U-boot's md utility :
U-Boot> md 0x20200000 0x210
20200000: 23494255 00000001 00000000 01000000 UBI#............
20200010: 00080000 00100000 9cfb6033 00000000 ........3`......
...
202007e0: 00000000 00000000 00000000 00000000 ................
202007f0: 00000000 00000000 00000000 00000000 ................
20200800: ffffffff ffffffff ffffffff ffffffff ................
20200810: ffffffff ffffffff ffffffff ffffffff ................
20200820: ffffffff b0c9aa24 0008fdb8 00000000 ....$...........
20200830: 00000000 00000000 00000000 00000000 ................
Note how the md utility expects its size argument in yet a different format : this one expects it in units of words. Just to keep us alert.
The dump at address 0x20200800 clearly shows how markbad has failed its purpose: the 2 marker bytes of the bad block are still merrily on 0xFFFF.
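Since three commands use three different units, here is a small sketch of the conversions used above. The page and OOB sizes are the ones reported by nand info; the helper names are only illustrative.

```python
PAGE_SIZE = 2048                  # payload bytes per page (nand info)
OOB_SIZE = 64                     # out-of-band bytes per page (nand info)
RAW_PAGE = PAGE_SIZE + OOB_SIZE   # what read.raw transfers per page


def md_word_count(nbytes: int, word: int = 4) -> int:
    """Number of 32-bit words md must display to cover nbytes (rounded up)."""
    return -(-nbytes // word)


def oob_address(ram_base: int) -> int:
    """RAM address where the OOB area begins after a one-page raw read."""
    return ram_base + PAGE_SIZE


print(RAW_PAGE)                        # 2112
print(hex(md_word_count(RAW_PAGE)))    # 0x210
print(hex(oob_address(0x20200000)))    # 0x20200800
```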
Then to modify these bytes, another U-boot utility comes in handy :
U-Boot> mm 0x20200800
20200800: ffffffff ? 00000000
20200804: ffffffff ? q
It's a bit crude: I've changed the first 4 OOB bytes instead of just the first 2 marker bytes. Finally, to write the modified contents back into flash :
U-Boot> nand write.raw 0x20200000 0x2560000 1
NAND write: 2112 bytes written: OK
Funny enough, the nand bad diagnostic doesn't notice the block which has just been marked, even after some nand read attempts which do fail.
U-Boot> nand bad
Device 0 bad blocks:
U-Boot>
But this is no cause for alarm. The 2nd bad block was marked manually in a similar fashion, and upon another reset :
U-Boot> reset
resetting ...
RomBOOT
ba_offset = 0xc ...
AT91Bootstrap 3.6.0-00029-g0cd4e6a (Wed Nov 12 12:14:04 CET 2014)
...
U-Boot 2013.10-00403-g1f9a20a (Nov 12 2014 - 12:14:27)
...
Hit any key to stop autoboot: 0
U-Boot> nand bad
Device 0 bad blocks:
02560000
060a0000
U-Boot>
Lo and behold, the 'bad block' marking has persisted ! The next flash storage operation neatly skipped over the bad blocks, saving a consistent kernel and filesystem in the various partitions of the flash. This was the intention all along, but it seems to require gritty manual work. Is there no automated way ?

U-Boot has changed quite a bit since 2014. Patches possibly of relevance to your problem include:
dc0b69fa9f97 ("mtd: nand: mxs_nand: allow to enable BBT support")
c4adf9db5d38 ("spl: nand: sunxi: remove support for so-called 'syndrome' mode")
8d1809a96699 ("spl: nand: simple: replace readb() with chip specific read_buf()")
Please, retest with U-Boot Git HEAD. If there is still something missing, please, report it to the U-Boot developer list or even better send your patch.

Related

can't understand offset member in program header and section header

I read the man page. It says that
in ELF header:
e_phoff - This member holds the program header table's file offset in bytes.
e_shoff - This member holds the section header table's file offset in bytes.
In Program Header
p_offset This member holds the offset from the beginning of the file
at which the first byte of the segment resides.
In Section Header
sh_offset This member's value holds the byte offset from the beginning of the file to the first byte in the section
I'm confused. As I understand it, this means that the ELF header gives me the offset to all program and section headers, the program header gives me the offset to the specific segment in the file, and the section header gives me the offset to the specific section in the file. But that does not seem to be true. I found a simple ELF parser, and it gave me this result:
segment offset: 52
section offset: 6032
Program Entry point: 0x8048420
Section header list:
.interp: 0x8048154
offset: 340
.note.ABI-tag: 0x8048168
offset: 360
.note.gnu.build-id: 0x8048188
offset: 392
.gnu.hash: 0x80481ac
offset: 428
.dynsym: 0x80481e8
offset: 488
.dynstr: 0x80482b8
offset: 696
.gnu.version: 0x8048342
offset: 834
.gnu.version_r: 0x804835c
offset: 860
.rel.dyn: 0x804837c
offset: 892
.rel.plt: 0x8048394
offset: 916
.init: 0x80483ac
offset: 940
.plt: 0x80483d0
offset: 976
.plt.got: 0x8048410
offset: 1040
.text: 0x8048420
offset: 1056
.fini: 0x8048604
offset: 1540
.rodata: 0x8048618
offset: 1560
.eh_frame_hdr: 0x8048628
offset: 1576
.eh_frame: 0x8048664
offset: 1636
.init_array: 0x8049efc
offset: 3836
.fini_array: 0x8049f00
offset: 3840
.dynamic: 0x8049f04
offset: 3844
.got: 0x8049ff4
offset: 4084
.got.plt: 0x804a000
offset: 4096
.data: 0x804a018
offset: 4120
.bss: 0x804a020
offset: 4128
.comment: 0x0
offset: 4128
.symtab: 0x0
offset: 4172
.strtab: 0x0
offset: 5244
.shstrtab: 0x0
offset: 5768
Program header list
Phdr segment: 0x8048034
offset: 52
Interpreter: /lib/ld-linux.so.2
offset: 340
Text segment: 0x8048000
offset: 0
Data segment: 0x8049efc
offset: 3836
Dynamic segment: 0x8049f04
offset: 3844
Note segment: 0x8048168
offset: 360
PT_GNU_EH_FRAME: 0x8048628
offset: 1576
PT_GNU_STACK: 0x0
offset: 0
PT_GNU_RELRO: 0x8049efc
offset: 3836
As you can see, the ELF header gives a section header offset of 6032, but every section's offset is smaller than that. Actually all sections in this program have offset like 6032 + (n * sizeof(Elf32_Shdr)). In this case, I can't understand what the offset in the section header means. I thought it was an offset in the process image, but the man page talks about an offset inside the file. I have the same question about the offset in the program header. Please clarify what the section header offset and the program header offset actually mean.
The parser is too large, so I did not attach it. But if someone needs it, I will post it.
Actually all sections in this program have offset like 6032 + (n * sizeof(Elf32_Shdr)).
No, it is not the sections that have these offsets, but the entries of the section header table.
What you see is that the table is placed at a higher offset than the sections that are defined in its entries.
In your example:
At offset 1056 of the file starts the ".text" section.
At offset 6032 of the file starts the section header table. The 14th entry in your listing (at 6032 + 13 * sizeof(Elf32_Shdr), or one entry further if your parser skipped the reserved null entry at index 0) defines the ".text" section and gives its offset as 1056.
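To make the difference between "where the table entry lives" and "where the section data lives" concrete, here is a minimal sketch. It builds a synthetic, mostly empty file image rather than parsing the asker's binary. It assumes a little-endian 32-bit ELF, where Elf32_Shdr consists of ten 32-bit fields (40 bytes). The entry index and the field values other than the two offsets are made up for illustration.

```python
import struct

SHDR_FMT = "<10I"                       # Elf32_Shdr: ten little-endian 32-bit fields
SHDR_SIZE = struct.calcsize(SHDR_FMT)   # 40 bytes
SH_OFFSET_FIELD = 4                     # sh_offset is the fifth field


def read_sh_offset(image: bytes, e_shoff: int, index: int) -> int:
    """Return sh_offset from entry `index` of the section header table."""
    entry_pos = e_shoff + index * SHDR_SIZE   # where the table ENTRY lives
    fields = struct.unpack_from(SHDR_FMT, image, entry_pos)
    return fields[SH_OFFSET_FIELD]            # where the section DATA lives


def build_demo_image() -> tuple[bytes, int]:
    """Synthetic image: header table at 6032, one entry pointing at 1056."""
    e_shoff = 6032
    image = bytearray(e_shoff + 2 * SHDR_SIZE)   # entry 0 stays all zero
    # Hypothetical entry 1: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
    # sh_size, sh_link, sh_info, sh_addralign, sh_entsize
    struct.pack_into(SHDR_FMT, image, e_shoff + SHDR_SIZE,
                     0, 1, 6, 0x8048420, 1056, 484, 0, 0, 16, 0)
    return bytes(image), e_shoff


image, e_shoff = build_demo_image()
print(e_shoff + 1 * SHDR_SIZE)             # position of the entry: 6072
print(read_sh_offset(image, e_shoff, 1))   # position of the data: 1056
```

The entry sits after offset 6032, while the value stored in it points back to an earlier place in the file.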

Can GNU LD print memory usage by memory space, rather than just as a bulk percentage?

I'm working on an embedded project on an ARM mcu that has a custom linker file with several different memory spaces:
/* Memory Spaces Definitions */
MEMORY
{
    rom (rx) : ORIGIN = 0x00400000, LENGTH = 0x00200000
    data_tcm (rw) : ORIGIN = 0x20000000, LENGTH = 0x00008000
    prog_tcm (rwx) : ORIGIN = 0x00000000, LENGTH = 0x00008000
    ram (rwx) : ORIGIN = 0x20400000, LENGTH = 0x00050000
    sdram (rw) : ORIGIN = 0x70000000, LENGTH = 0x00200000
}
Specifically, I have a number of different memory devices with different characteristics (TCM, plain RAM (with a D-Cache in the way), and an external SDRAM), all mapped as part of the same address space.
I'm specifically placing different variables in the different memory spaces, depending on the requirements (am I DMA'ing into it, do I have cache-coherence issues, do I expect to overflow the D-cache, etc...).
If I exceed any one of the sections, I get a linker error. However, unless I do so, the linker only prints the memory usage as bulk percentage:
Program Memory Usage : 33608 bytes 1.6 % Full
Data Memory Usage : 2267792 bytes 91.1 % Full
Given that I have 3 actively used memory spaces, and I know for a fact that I'm using 100% of one of them (the SDRAM), it's kind of a useless output.
Is there any way to make the linker output the percentage of use for each memory space individually? Right now, I have to manually open the .map file, search for the section header, and then manually subtract the size from the total available memory specified in the .ld file.
While this is kind of a minor thing, it'd sure be nice to just have the linker do:
Program Memory Usage : 33608 bytes 1.6 % Full
Data Memory Usage : 2267792 bytes 91.1 % Full
data_dtcm : xxx bytes xx % Full
ram : xxx bytes xx % Full
sdram : xxx bytes xx % Full
This is with GCC-ARM, and therefore GCC-LD.
Arrrgh, so of course, I find the answer right after asking the question:
--print-memory-usage
Used as -Wl,--print-memory-usage, you get the following:
Memory region         Used Size  Region Size  %age Used
             rom:       31284 B         2 MB      1.49%
        data_tcm:       26224 B        32 KB     80.03%
        prog_tcm:          0 GB        32 KB      0.00%
             ram:      146744 B       320 KB     44.78%
           sdram:          2 MB         2 MB    100.00%
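The percentages can be reproduced from the two size columns. The following sketch assumes that the units are binary (1 KB = 1024 bytes), which is consistent with the figures shown; the helper names are only illustrative and not part of any linker tooling.

```python
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def to_bytes(text: str) -> int:
    """Convert a size such as '32 KB' or '26224 B' into bytes."""
    number, unit = text.split()
    return int(number) * UNITS[unit]


def percent_used(used: str, region: str) -> float:
    """Share of a memory region that is in use, in percent."""
    return 100.0 * to_bytes(used) / to_bytes(region)


for name, used, region in [
    ("rom", "31284 B", "2 MB"),
    ("data_tcm", "26224 B", "32 KB"),
    ("ram", "146744 B", "320 KB"),
]:
    print(f"{name}: {percent_used(used, region):.2f}%")
# rom: 1.49%
# data_tcm: 80.03%
# ram: 44.78%
```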

!heap -stat -h doesn't show allocations

I'm trying to determine why my application consumes 4GB of Private Bytes. So I took a full memory dump and loaded it in windbg. But analyzing it using !heap -stat -h produces weird results which don't add up:
0:000> !heap -s
(...)
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-------------------------------------------------------------------------------------
000002d0a0000000 00000002 2800804 2780508 2800700 2984 1980 177 0 6 LFH
000002d09fe10000 00008000 64 4 64 2 1 1 0 0
000002d09ff70000 00001002 1342924 1334876 1342820 13042 3342 87 0 0 LFH
Ok, I got a 2.8GB heap and a 1.34GB heap. Let's look at the allocations of the first one:
0:000> !heap -stat -h 000002d0a0000000
heap # 000002d0a0000000
group-by: TOTSIZE max-display: 20
size #blocks total ( %) (percent of total busy bytes)
651 291 - 1035e1 (16.00)
79c 1df - e3ce4 (14.06)
28 156d - 35908 (3.31)
(...)
IIUC, the first line means block size 0x651(=1617 bytes), number of blocks 0x291(=657), for total bytes of 0x1035e1(=1062369 bytes =~1MB), and that's 16% of total busy bytes. But looking at the summary, there should be ~2.8GB of busy bytes!
Another disparity:
0:000> !heap -stat -h 000002d0a0000000 -grp A
heap # 000002d0a0000000
group-by: ALLOCATIONSIZE max-display: 20
size #blocks total ( %) (percent of total busy bytes)
a160 1 - a160 (0.62)
7e50 2 - fca0 (0.97)
0:000> !heap -h 000002d0a0000000
(...)
(509 lines that note allocations with size 7e50, like this one:)
000002d0a3f48000: 11560 . 07e60 [101] - busy (7e50)
Edit: Many lines also say Internal at the end, which appears to mean HEAP_ENTRY_VIRTUAL_ALLOC - but the 509 lines with (7e50) don't.
My question: How can I get !heap -stat -h to show all the allocations, so they add up to the output of !heap -s?
At the moment I can only explain the busy percentage, but that may already be helpful. Its value is a bit misleading.
Virtual memory is memory taken from VirtualAlloc(). The C++ heap manager uses that basic mechanism to get memory from the operating system. That virtual memory can be committed (ready to use) or reserved (can be committed later). The output of !heap -s tells you the status of the heaps with respect to that virtual memory.
So we agree that any memory the C++ heap manager can use is committed memory. This coarse granular virtual memory is split into finer blocks by the C++ heap manager. The heap manager may allocate such smaller blocks and free them, depending on the need of malloc()/free() or new/delete operations.
When blocks become free, they are no longer busy. At the same time, the C++ heap manager may decide to not give the free block back to the OS, because
it can't, since other parts of the 64k virtual memory are still in use
or it doesn't want to (internal reasons we can't exactly know, e.g. performance reasons)
Since the free parts do not count as busy, the busy percentage seems to be too high when compared to the virtual memory.
Mapped to your case, this means:
you have 2.8 GB of virtual memory
in heap 000002d0a0000000, you have ~1 MB / 16% = 6.25 MB of memory in use, the rest could be in free heap blocks (it possibly isn't)
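The estimate in the last bullet can be reproduced from the first row of the !heap -stat output. This is just the arithmetic, with the hexadecimal values copied from the question; the function name is illustrative.

```python
def implied_busy_total(group_bytes: int, percent: float) -> float:
    """Total busy bytes implied by one row's byte count and its percentage."""
    return group_bytes / (percent / 100.0)


size, blocks = 0x651, 0x291          # first row: block size and block count
group = size * blocks                # bytes held by that group of blocks
print(hex(group), group)             # 0x1035e1 1062369

total = implied_busy_total(group, 16.00)
print(f"{total / 2**20:.1f} MiB")    # 6.3 MiB
```

The result is on the order of 6 MB of busy bytes, far below the 2.8 GB that !heap -s reports for the heap.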
The following example is based on this C++ code:
#include "stdafx.h"
#include <iostream>
#include <Windows.h>
#include <string>
#include <iomanip>
int main()
{
    HANDLE hHeap = HeapCreate(0, 0x1000000, 0x10000000); // no options, initial 16M, max 256M
    HeapAlloc(hHeap, HEAP_GENERATE_EXCEPTIONS, 511000); // max. allocation size for non-growing heap
    std::cout << "Debug now, handle is 0x" << std::hex << std::setfill('0') << std::setw(sizeof(HANDLE)) << hHeap << std::endl;
    std::string dummy;
    std::getline(std::cin, dummy);
    return 0;
}
The only 511kB block will be reported as 100%, although it is only ~1/32 of the 16 MB:
0:001> !heap -stat -h 009c0000
heap # 009c0000
group-by: TOTSIZE max-display: 20
size #blocks total ( %) (percent of total busy bytes)
7cc18 1 - 7cc18 (100.00)
To see the free parts as well, use !heap -h <heap> -f:
0:001> !heap -h 0x01430000 -f
Index Address Name Debugging options enabled
3: 01430000
Segment at 01430000 to 11430000 (01000000 bytes committed)
Flags: 00001000
ForceFlags: 00000000
Granularity: 8 bytes
Segment Reserve: 00100000
Segment Commit: 00002000
DeCommit Block Thres: 00000200
DeCommit Total Thres: 00002000
Total Free Size: 001f05c7
Max. Allocation Size: 7ffdefff
Lock Variable at: 01430138
Next TagIndex: 0000
Maximum TagIndex: 0000
Tag Entries: 00000000
PsuedoTag Entries: 00000000
Virtual Alloc List: 014300a0
Uncommitted ranges: 01430090
FreeList[ 00 ] at 014300c4: 01430590 . 0240e1b0
0240e1a8: 7cc20 . 21e38 [100] - free <-- no. 1
02312588: 7f000 . 7f000 [100] - free <-- no. 2
[...]
01430588: 00588 . 7f000 [100] - free <-- no. 32
Heap entries for Segment00 in Heap 01430000
address: psize . size flags state (requested size)
01430000: 00000 . 00588 [101] - busy (587)
01430588: 00588 . 7f000 [100]
[...]
02312588: 7f000 . 7f000 [100]
02391588: 7f000 . 7cc20 [101] - busy (7cc18)
0240e1a8: 7cc20 . 21e38 [100]
0242ffe0: 21e38 . 00020 [111] - busy (1d)
02430000: 0f000000 - uncommitted bytes.
0:001> ? 7cc18
Evaluate expression: 511000 = 0007cc18
Here we see that I have a heap of 256 MB (240 MB uncommitted, 0x0f000000 + 16 MB committed, 0x01000000). Summing up the items in the FreeList, I get
0:001> ? 0n31 * 7f000 + 21e38
Evaluate expression: 16264760 = 00f82e38
So almost everything (~16 MB) is considered as free and not busy by the C++ heap manager. Memory like that 16 MB is reported by !heap -s in this way in WinDbg 6.2.9200:
0:001> !heap -s
LFH Key : 0x23e41d0e
Termination on corruption : ENABLED
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-----------------------------------------------------------------------------
004d0000 00000002 1024 212 1024 6 5 1 0 0 LFH
00750000 00001002 64 20 64 9 2 1 0 0
01430000 00001000 262144 16384 262144 15883 32 1 0 0
External fragmentation 96 % (32 free blocks)
-----------------------------------------------------------------------------
IMHO there's a bug regarding reserved and committed memory: it should be 262144k virtual - 16384 committed = 245760k reserved.
Note how the list length matches the number of free blocks reported before.
Above explains the busy percentage only. The remaining question is: the free memory reported in your case doesn't match this scenario.
Usually I'd say the remaining memory is in virtual blocks, i.e. memory blocks that are larger than 512 kB (32 bit) or 1 MB (64 bit) as mentioned on MSDN for growable heaps. But that's not the case here.
There is no output about virtual blocks and the number of virtual blocks is reported as 0.
A program that generates a virtual block would be
#include "stdafx.h"
#include <iostream>
#include <Windows.h>
#include <string>
#include <iomanip>
int main()
{
    HANDLE hHeap = HeapCreate(0, 0x1000000, 0); // no options, initial 16M, growable
    HeapAlloc(hHeap, HEAP_GENERATE_EXCEPTIONS, 20*1024*1024); // 20 MB, force growing
    std::cout << "Debug now, handle is 0x" << std::hex << std::setfill('0') << std::setw(sizeof(HANDLE)) << hHeap << std::endl;
    std::string dummy;
    std::getline(std::cin, dummy);
    return 0;
}
and the !heap command would mention the virtual block:
0:001> !heap -s
LFH Key : 0x7140028b
Termination on corruption : ENABLED
Heap Flags Reserv Commit Virt Free List UCR Virt Lock Fast
(k) (k) (k) (k) length blocks cont. heap
-----------------------------------------------------------------------------
006d0000 00000002 1024 212 1024 6 5 1 0 0 LFH
001d0000 00001002 64 20 64 9 2 1 0 0
Virtual block: 01810000 - 01810000 (size 00000000)
00810000 00001002 16384 16384 16384 16382 33 1 1 0
External fragmentation 99 % (33 free blocks)
-----------------------------------------------------------------------------
In your case however, the number of virtual blocks is 0. Perhaps this is what is reported as "Internal" in your version of WinDbg. If you have not upgraded yet, try version 6.2.9200 to get the same output as I do.

Erasing Flash NOR: ioctl(MEMUNLOCK) return status?

I'm trying to erase a NOR Flash memory with Linux MTD driver in C...
I'm confused about the return status from the ioctl(MEMUNLOCK) call which returns an error even if ioctl(MEMERASE) is successful after it.
The following code displays the warning message but works (i.e. the Flash block has been erased):
int erase_MTD_Pages(int fd, size_t size, off_t offset)
{
    mtd_info_t mtd_info;
    erase_info_t ei;
    ioctl(fd, MEMGETINFO, &mtd_info);
    ei.length = mtd_info.erasesize;
    for(ei.start = offset; ei.start < (offset+size); ei.start += mtd_info.erasesize) {
        if(ioctl(fd, MEMUNLOCK, &ei) < 0)
        {
            // logPrintf(FAILURE, "[Flash] Can not unlock MTD (MEMUNLOCK, errno=%d)!\n", errno);
            // return RETURN_FILE_ERROR;
            logPrintf(WARNING, "[Flash] Can not unlock MTD (MEMUNLOCK, errno=%d)!\n", errno);
        }
        if(ioctl(fd, MEMERASE, &ei) < 0)
        {
            logPrintf(FAILURE, "[Flash] Can not erase MTD (MEMERASE, errno=%d)!\n", errno);
            return RETURN_FILE_ERROR;
        }
    }
    return RETURN_SUCCESS;
}
When I look at C code on the net, the return status from MEMUNLOCK is not always checked (e.g. from mtc.c):
ioctl(fd, MEMUNLOCK, &mtdEraseInfo);
if(ioctl(fd, MEMERASE, &mtdEraseInfo)) {
    fprintf(stderr, "Could not erase MTD device: %s\n", mtd);
    close(fd);
    exit(1);
}
flash_unlock also returns an error:
root $ cat /proc/mtd
dev: size erasesize name
mtd0: 00020000 00020000 "X-Loader-NOR"
mtd1: 000a0000 00020000 "U-Boot-NOR"
mtd2: 00040000 00020000 "Boot Env-NOR"
mtd3: 00400000 00020000 "Kernel-NOR"
mtd4: 03b00000 00020000 "File System-NOR"
root $ mtd_debug info /dev/mtd3
mtd.type = MTD_NORFLASH
mtd.flags = MTD_CAP_NORFLASH
mtd.size = 4194304 (4M)
mtd.erasesize = 131072 (128K)
mtd.writesize = 1
mtd.oobsize = 0
regions = 0
root $ flash_unlock /dev/mtd3
Could not unlock MTD device: /dev/mtd3
Am I missing something? Is it normal to get an error from MEMUNLOCK with some configurations?
Notes / Environment:
The read-only flag (MTD_WRITEABLE) is not set on the mtd3 partition (only on mtd0 and mtd1).
flash_lock also returns the same error.
TI AM3505 (ARM Cortex A8, OMAP34).
Linux 2.6.37.
Flash NOR Spansion S29GL512S12DHIV1.
Kernel log:
mtdoops: mtd device (mtddev=name/number) must be supplied
physmap platform flash device: 08000000 at 08000000
physmap-flash.0: Found 1 x16 devices at 0x0 in 16-bit bank. Manufacturer ID 0x000001 Chip ID 0x002301
Amd/Fujitsu Extended Query Table at 0x0040
Amd/Fujitsu Extended Query version 1.5.
Silicon revision: 14
Address sensitive unlock: Required
Erase Suspend: Read/write
Block protection: 1 sectors per group
Temporary block unprotect: Not supported
Block protect/unprotect scheme: 8
Number of simultaneous operations: 0
Burst mode: Not supported
Page mode: 12 word page
Vpp Supply Minimum Program/Erase Voltage: 0.0 V
Vpp Supply Maximum Program/Erase Voltage: 0.0 V
Top/Bottom Boot Block: Uniform, Top WP
number of CFI chips: 1
RedBoot partition parsing not available
Using physmap partition information
Creating 5 MTD partitions on "physmap-flash.0":
0x000000000000-0x000000020000 : "X-Loader-NOR"
0x000000020000-0x0000000c0000 : "U-Boot-NOR"
0x0000000c0000-0x000000100000 : "Boot Env-NOR"
0x000000100000-0x000000500000 : "Kernel-NOR"
0x000000500000-0x000004000000 : "File System-NOR"
For a flash chip that I worked on (drivers/mtd/devices/m25p80.c), I found that UNLOCK was not implemented. The driver's ioctl(UNLOCK) returned -EOPNOTSUPP=95. And code inspection showed mtd_unlock return status being dropped on the floor, as you have found.
These imply assumptions in the m25p80 driver that flash will just never be locked, and in the mtd drivers that it's OK for the device driver to omit UNLOCK. On the board I worked on, flash was being locked by u-boot after every write, so erase and reprogram from linux didn't work at all. I looked at u-boot driver and device datasheet, got some code to implement m25p80_lock and m25p80_unlock, it was not too difficult after I knew what was up. I did not upstream it.
It does seem like a defect for chip drivers to not implement these.
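One way to handle this more deliberately than ignoring every unlock failure is to tolerate only the "operation not supported" case. The sketch below shows that control flow in Python. The two callables are hypothetical wrappers around the MEMUNLOCK and MEMERASE ioctls that raise OSError on failure, and the sketch assumes the driver reports EOPNOTSUPP as described above, which may not hold for every driver.

```python
import errno


def unlock_then_erase(unlock, erase) -> bool:
    """Erase a block, treating a missing unlock operation as non-fatal.

    `unlock` and `erase` are hypothetical callables that raise OSError
    on failure. Returns True if the unlock step itself succeeded.
    """
    try:
        unlock()
        unlocked = True
    except OSError as exc:
        if exc.errno != errno.EOPNOTSUPP:
            raise                 # a real failure: do not hide it
        unlocked = False          # driver has no unlock operation: carry on
    erase()                       # any erase failure propagates to the caller
    return unlocked


# Usage with stand-in callables instead of real ioctl wrappers:
def fake_unlock():
    raise OSError(errno.EOPNOTSUPP, "unlock not implemented by driver")


def fake_erase():
    print("block erased")


print(unlock_then_erase(fake_unlock, fake_erase))
```

Other error codes from the unlock step still abort, so a block that really is locked is not mistaken for one whose driver simply lacks the operation.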
By the way Mousstix, very nice job providing full information in this question.
On newer kernels (tested on 4.1.18) there is a device-tree option named "use-advanced-sector-protection;". When this is set, I was able to erase/write to protected flash regions.
It is also documented in the Kernel: Documentation/devicetree/bindings/mtd/mtd-physmap.txt

List File In C (.LST)

After compiling some code, the compiler generates a bunch of files. I have statistics, symbols, call tree, errors, list, debug and exe. I have figured out what each means, except for the list file. What is the function of the list file? Is it for the user or the computer/embedded system itself?
The exact contents of the list file varies slightly by tool and chip being used.
The major part of the file will be the translation of the C source code into assembly instructions, as performed by the compiler. This is useful for debugging the code and for checking the efficiency of the compiler when translating certain source code constructs. In the example below, each C line is given a line number and the generated assembler is listed after it. (This example is for the AVR32 processor.)
171 /**********************************************************
172 * Test for a receive interrupt
173 **********************************************************/
174 if ( USART_CHANNEL[ Channel ] -> CSR.rxrdy )
000008 F8051502 LSL R5,R12,0x2
00000C ........ MOV R7,LWRD(USART_CHANNEL)
000010 EA17.... ORH R7,HWRD(USART_CHANNEL)
000014 EE0C0027 ADD R7,R7,R12<<0x2
000018 6E0C LD.w R12,R7[0x0]
00001A ........ MOV R6,LWRD(Serial_Receive_Queue)
00001E EA16.... ORH R6,HWRD(Serial_Receive_Queue)
000022 785B LD.w R11,R12[0x14]
000024 A19B LSR R11,0x1
000026 C0B2 BRCC ??USART_Process_Interrupt_1:C
The HEX values that are shown as "...." above are addresses that are not known at compile time, they are symbols that will be resolved at link time.
The list file will also typically give some statistics regarding the code size, RAM requirements and the stack usage for the module being compiled. Again, this example is from the IAR toolset for the AVR32:
Maximum stack usage in bytes:
Function CSTACK
-------- ------
Serial_Ports_Initialise 36
-> gpio_enable_module 36
-> usart_init_rs232 36
-> Indirect call 36
-> Indirect call 36
-> Indirect call 36
-> Indirect call 36
Serial_Transmit_With_Length 20
-> xQueueGenericSend 20
-> vTaskDelay 20
USART0_INT_Handler 0
-> USART_Process_Interrupt 0
USART1_INT_Handler 0
-> USART_Process_Interrupt 0
USART2_INT_Handler 0
-> USART_Process_Interrupt 0
USART_Process_Interrupt 32
-> xQueueGenericSendFromISR 32
-> xQueueReceiveFromISR 32
Segment part sizes:
Function/Label Bytes
-------------- -----
Serial_Receive_Queue 24
Serial_Transmit_Queue
USART_CHANNEL 12
USART0_INT_Handler 8
USART1_INT_Handler 8
USART2_INT_Handler 12
USART_Process_Interrupt 112
Serial_Ports_Initialise 172
USART_Channel_In_Use 56
USART_GPIO_MAP
USART_OPTIONS
Serial_Transmit_With_Length 116
?<Initializer for USART_CHANNEL> 12
??USART1_INT_Handler??handle 4
Others 24
400 bytes in segment CODE32
56 bytes in segment DATA32_C
12 bytes in segment DATA32_I
12 bytes in segment DATA32_ID
24 bytes in segment DATA32_Z
28 bytes in segment EVSEG
4 bytes in segment HTAB
24 bytes in segment INITTAB
400 bytes of CODE memory
100 bytes of CONST memory (+ 24 bytes shared)
36 bytes of DATA memory
Errors: none
Warnings: 1
Any error messages or warnings that were generated will also be inserted at the relevant line of code.
The list file can therefore be used to see the assembler-level code produced by the compiler, and as an aid to estimating stack and memory usage, although stack usage is a highly intractable problem in any embedded system.
From experience, the list file is not particularly useful when using a source level debugging tool - generally this shows the relevant disassembled code directly.
A list file (.LST) contains a block of C code [commented out by a sequence of period characters] followed by the assembly code for that block.
For example:
.................... return FALSE;
0046: MOVLW 00
0047: MOVWF 21
0048: GOTO 049
