Smallest method of turning a string into an integer(and vice-versa) - c

I am looking for an extremely small way of turning a string like "123" into an integer like 123 and vice-versa.
I will be working in a freestanding environment. This is NOT a premature optimization. I am creating code that must fit in 512 bytes, so every byte does actually count. I will take both x86 assembly(16 bit) and C code though(as that is pretty easy to convert)
It does not need to do any sanity checks or anything..
I thought I had seen a very small C implementation implemented recursively, but I can't seem to find anything for size optimization..
So can anyone find me(or create) a very small atoi/itoa implementation? (it only needs to work with base 10 though)
Edit: (the answer) (edited again because the first code was actually wrong)
in case someone else comes upon this, this is the code I ended up creating. It could fit in 21 bytes!
;ds:bx is the input string. ax is the returned integer
_strtoint:
xor ax,ax
.loop1:
imul ax, 10 ;ax serves as our temp var
mov cl,[bx]
mov ch,0
add ax,cx
sub ax,'0'
inc bx
cmp byte [bx],0
jnz .loop1
ret
Ok, last edit I swear!
Version weighing in at 42 bytes with negative number support.. so if anyone wants to use these they can..
;ds:bx is the input string. ax is the returned integer
_strtoint:
cmp byte [bx],'-'
je .negate
;rewrite to negate DX(just throw it away)
mov byte [.rewrite+1],0xDA
jmp .continue
.negate:
mov byte [.rewrite+1],0xD8
inc bx
.continue:
xor ax,ax
.loop1:
imul ax, 10 ;ax serves as our temp var
mov dl,[bx]
mov dh,0
add ax,dx
sub ax,'0'
inc bx
cmp byte [bx],0
jnz .loop1
;popa
.rewrite:
neg ax ;this instruction gets rewritten to conditionally negate ax or dx
ret

With no error checking, 'cause that's for wussies who have more than 512B to play with:
#include <ctype.h>
// alternative:
// #define isdigit(C) ((C) >= '0' && (C) <= '9')
unsigned long myatol(const char *s) {
unsigned long n = 0;
while (isdigit(*s)) n = 10 * n + *s++ - '0';
return n;
}
gcc -O2 compiles this into 47 bytes, but the external reference to __ctype_b_loc is probably more than you can afford...

I don't have an assembler on my laptop to check the size, but offhand, it seems like this should be shorter:
; input: zero-terminated string in DS:SI
; result: AX
atoi proc
xor cx, cx
mov ax, '0'
@@:
imul cx, 10
sub al, '0'
add cx, ax
lodsb
test al, al ; lodsb leaves the flags untouched, so test for the terminator explicitly
jnz @b
xchg ax, cx
ret
atoi endp

Write it yourself. Note that subtracting '0' from a digit character gives that digit's numeric value. So you loop over the digits, and for each one you multiply the value so far by 10, subtract '0' from the current character, and add the result. Codable in assembly in no time flat.
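The loop described above can be sketched in C as follows, together with the reverse direction that the question also asks for. This is a minimal sketch under the question's own assumptions (base 10, no validation, no libc calls); the names str_to_int and int_to_str are made up for illustration, and the size of the compiled code will depend on the compiler and flags.

```c
/* Freestanding-friendly sketch: no libc calls, base 10 only, no validation. */

/* Convert a NUL-terminated string of decimal digits to an int. */
int str_to_int(const char *s)
{
    int n = 0;
    while (*s)
        n = n * 10 + (*s++ - '0');    /* shift earlier digits up, add the new one */
    return n;
}

/* Write the decimal form of n into buf and return buf.
   buf must have room for every digit plus the terminator. */
char *int_to_str(unsigned n, char *buf)
{
    char *p = buf;
    char *q;
    do {                              /* emit digits least-significant first */
        *p++ = (char)('0' + n % 10);
        n /= 10;
    } while (n);
    *p-- = '\0';                      /* terminate, then point p at the last digit */
    for (q = buf; q < p; q++, p--) {  /* reverse the digits in place */
        char t = *q;
        *q = *p;
        *p = t;
    }
    return buf;
}
```

For negative input, the caller could skip a leading '-' and negate the result afterwards, as the 42-byte assembly version in the question does.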

atoi(p)
register char *p;
{
register int n;
register int f;
n = 0;
f = 0;
for(;;p++) {
switch(*p) {
case ' ':
case '\t':
continue;
case '-':
f++;
case '+':
p++;
}
break;
}
while(*p >= '0' && *p <= '9')
n = n*10 + *p++ - '0';
return(f? -n: n);
}

And here is another one without any checking. It assumes a null terminated string. As a bonus, it checks for a negative sign. This takes 593 bytes with a Microsoft compiler (cl /O1).
int myatoi( char* a )
{
int res = 0;
int neg = 0;
if ( *a == '-' )
{
neg = 1;
a++;
}
while ( *a )
{
res = res * 10 + ( *a - '0' );
a++;
}
if ( neg )
res *= -1;
return res;
}

Are any of the sizes smaller if you use -Os (optimize for space) instead of -O2 ?

You could try packing the string into BCD(0x1234) and then using x87 fbld and fist instructions for a 1980s solution but I am not sure that will be smaller at all as I don't remember there being any packing instruction.

How in the world are you people getting the executables so small?! This code generates a 316 byte .o file when compiled with gcc -Os -m32 -c -o atoi.o atoi.c and a 8488 byte executable when compiled and linked (with an empty int main(){} added) with gcc -Os -m32 -o atoi atoi.c. This is on Mac OS X Snow Leopard...
int myatoi(char *s)
{
short retval=0;
for(;*s!=0;s++) retval=retval*10+(*s-'0');
return retval;
}

Related

What is the best way to get integer's negative sign and store it as char?

How to get an integer's sign and store it in a char? One way is:
int n = -5;
char c;
if(n<0)
c = '-';
else
c = '+';
Or:
char c = n < 0 ? '-' : '+';
But is there a way to do it without conditionals?
There's the most efficient and portable way, but it doesn't win any beauty awards.
We can assume that the MSB of a signed integer is always set if it is negative. This is a 100% portable assumption even when taking exotic signedness formats in account (one's complement, signed magnitude). Therefore the fastest way is to simply mask out the MSB from the integer.
The MSB of any integer is found at location CHAR_BIT * sizeof(n) - 1. On a typical 32 bit mainstream system, this would for example be 8 * 4 - 1 = 31.
So we can write a function like this:
_Bool is_signed (int n)
{
const unsigned int sign_bit_n = CHAR_BIT * sizeof(n) - 1;
return (_Bool) ((unsigned int)n >> sign_bit_n);
}
On x86-64 gcc 9.1 (-O3), this results in very efficient code:
is_signed:
mov eax, edi
shr eax, 31
ret
The advantage of this method is also that, unlike code such as x < 0, it won't risk getting translated into "branch if negative" instructions when ported.
Complete example:
#include <limits.h>
#include <stdio.h>
_Bool is_signed (int n)
{
const unsigned int sign_bit_n = CHAR_BIT * sizeof(n) - 1;
return (_Bool) ((unsigned int)n >> sign_bit_n);
}
int main (void)
{
int n = -1;
const char SIGNS[] = {' ', '-'};
char sign = SIGNS[is_signed(n)];
putchar(sign);
}
Disassembly (x86-64 gcc 9.1 (-O3)):
is_signed:
mov eax, edi
shr eax, 31
ret
main:
sub rsp, 8
mov rsi, QWORD PTR stdout[rip]
mov edi, 45
call _IO_putc
xor eax, eax
add rsp, 8
ret
This creates branchless code with gcc/clang on x86-64:
void storeneg(int X, char *C)
{
*C='+';
*C += (X<0)*('-'-'+');
}
https://gcc.godbolt.org/z/yua1go
char c = 43 + !!signbit((double)n) * 2;
char 43 is '+'
char 45 is '-'
signbit() returns a nonzero value for a negative argument; the !! normalizes that to 1
signbit is declared in math.h in C (as a macro for floating-point arguments, hence the cast) and in cmath in C++
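Combining the sign-bit shift with a table lookup gives the '+'/'-' pair the question asked for. This is only a sketch: sign_char is a made-up name, and it assumes int has no padding bits, so that the top bit of the unsigned representation is the sign bit.

```c
#include <limits.h>

/* Returns '-' for negative n and '+' otherwise, without an explicit branch.
   Assumption: int has no padding bits. */
char sign_char(int n)
{
    /* 1 if the sign bit is set, 0 otherwise */
    unsigned idx = (unsigned)n >> (CHAR_BIT * sizeof n - 1);
    return "+-"[idx];
}
```

Whether the generated code is actually branch-free still depends on the compiler and target, so it is worth checking the disassembly.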

ARMCC 5 optimization of strtol and strtod

I have a board based on STM32L4 MCU (Ultra Low Power Cortex-M4) for GNSS tracking purposes. I don't use RTOS, so I use a custom scheduler. Compiler and environment is KEIL uVision 5 (compiler 5.05 and 5.06, behavior doesn't change)
The MCU speaks with GNSS module via plain UART and the protocol is a mix of NMEA and AT. GNSS position is given as plain text that must be converted to a pair of float/double coordinates.
To get the double/float value from text, I use strtod (or strtof).
Note that string operations are made in a separate buffer, different from the UART RX one.
The typical string for a latitude on the UART is
4256.45783
which means 42° 56.45783'
to get absolute position in degrees, I use the following formula
42 + 56.45783 / 60
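As a sketch of that formula, assuming the whole ddmm.mmmmm field has already been converted to a double (the function name nmea_to_degrees is hypothetical and not part of the original code):

```c
/* Convert an NMEA-style ddmm.mmmmm value to decimal degrees.
   Assumes a non-negative input; hemisphere handling is left to the caller. */
double nmea_to_degrees(double ddmm)
{
    int degrees = (int)(ddmm / 100.0);          /* 4256.45783 -> 42 */
    double minutes = ddmm - degrees * 100.0;    /* 4256.45783 -> 56.45783 */
    return degrees + minutes / 60.0;            /* 42 + 56.45783 / 60 */
}
```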
When there is no optimization the code works fine and the position is converted right. When I turn on level 1 optimization (or higher), if I use standard C library I can convert the integer part (42 in the example) and when it comes to convert 56.45783, I get only 56 (so the integer part of minutes until the dot).
If I get rid of standard library and I use a custom strtod function downloaded from ANSI C source library I simply get 0 with ERANGE error.
In other parts of the code I use strtol, which has a strange behavior when L1 optimization is turned ON: when the first digit is 9 and conversion base is 10 it simply skips that 9 going on with the other digits.
So if the buffer contains 92, only 2 gets parsed. To work around this I simply prepended a + sign to the number, and the result is always OK (as far as I can tell). This workaround doesn't work with strtod.
Note that I tried to use static, volatile and on-stack variables, behavior doesn't change.
EDIT: I simplified the code in order to get where it goes wrong, as per comments hereafter
C code is like this:
void GnssStringToLatLonDegMin(const char* str, LatLong_t* struc)
{
double dbl = 0.0;
dbl = strtod("56.45783",NULL);
if(struc != NULL)
{
struc->Axis = (float)((dbl / 60.0) + 42.0);
}
}
Level 0 optimization:
559: void GnssStringToLatLonDegMin(const char* str, LatLong_t* struc)
0x08011FEE BDF8 POP {r3-r7,pc}
560: {
0x08011FF0 B570 PUSH {r4-r6,lr}
0x08011FF2 4605 MOV r5,r0
0x08011FF4 ED2D8B06 VPUSH.64 {d8-d10}
0x08011FF8 460C MOV r4,r1
561: double dbl = 0.0;
0x08011FFA ED9F0BF8 VLDR d0,[pc,#0x3E0]
0x08011FFE EEB08A40 VMOV.F32 s16,s0
0x08012002 EEF08A60 VMOV.F32 s17,s1
562: dbl = strtod("56.45783",NULL);
0x08012006 2100 MOVS r1,#0x00
0x08012008 A0F6 ADR r0,{pc}+4 ; #0x080123E4
0x0801200A F7FDFED1 BL.W __hardfp_strtod (0x0800FDB0)
0x0801200E EEB08A40 VMOV.F32 s16,s0
0x08012012 EEF08A60 VMOV.F32 s17,s1
563: if(struc != NULL)
564: {
0x08012016 B1A4 CBZ r4,0x08012042
565: struc->Axis = (float)((dbl / 60.0) + 42.0);
566: }
0x08012018 ED9F0BF5 VLDR d0,[pc,#0x3D4]
0x0801201C EC510B18 VMOV r0,r1,d8
0x08012020 EC532B10 VMOV r2,r3,d0
0x08012024 F7FEF880 BL.W __aeabi_ddiv (0x08010128)
0x08012028 EC410B1A VMOV d10,r0,r1
0x0801202C ED9F0BF2 VLDR d0,[pc,#0x3C8]
0x08012030 EC532B10 VMOV r2,r3,d0
0x08012034 F7FDFFBC BL.W __aeabi_dadd (0x0800FFB0)
0x08012038 EC410B19 VMOV d9,r0,r1
0x0801203C F7FDFF86 BL.W __aeabi_d2f (0x0800FF4C)
0x08012040 6020 STR r0,[r4,#0x00]
567: }
LEVEL 1 optimization
557: void GnssStringToLatLonDegMin(const char* str, LatLong_t* struc)
0x08011FEE BDF8 POP {r3-r7,pc}
558: {
559: double dbl = 0.0;
0x08011FF0 B510 PUSH {r4,lr}
0x08011FF2 460C MOV r4,r1
560: dbl = strtod("56.45783",NULL);
0x08011FF4 2100 MOVS r1,#0x00
0x08011FF6 A0F7 ADR r0,{pc}+2 ; #0x080123D4
0x08011FF8 F7FDFEDA BL.W __hardfp_strtod (0x0800FDB0)
561: if(struc != NULL)
562: {
0x08011FFC 2C00 CMP r4,#0x00
0x08011FFE D010 BEQ 0x08012022
563: struc->Axis = (float)((dbl / 60.0) + 42.0);
564: }
0x08012000 ED9F1BF7 VLDR d1,[pc,#0x3DC]
0x08012004 EC510B10 VMOV r0,r1,d0
0x08012008 EC532B11 VMOV r2,r3,d1
0x0801200C F7FEF88C BL.W __aeabi_ddiv (0x08010128)
0x08012010 ED9F1BF5 VLDR d1,[pc,#0x3D4]
0x08012014 EC532B11 VMOV r2,r3,d1
0x08012018 F7FDFFCA BL.W __aeabi_dadd (0x0800FFB0)
0x0801201C F7FDFF96 BL.W __aeabi_d2f (0x0800FF4C)
0x08012020 6020 STR r0,[r4,#0x00]
565: }
I looked at the disassembly of __hardfp_strtod and __strtod_int called by these functions and, as they are incorporated as binaries, they don't change with respect of optimization level.
With optimization enabled, strtod didn't work.
Thanks to @old_timer, I ended up writing my own strtod function, which works even with the optimization level set to 2.
double simple_strtod(const char* str)
{
int8 inc;
double result = 0.0;
char * c_tmp;
c_tmp = strchr(str, '.');
if(c_tmp != NULL)
{
c_tmp++;
inc = -1;
while(*c_tmp != 0 && inc > -9)
{
result += (*c_tmp - '0') * pow(10.0, inc);
c_tmp++; inc--;
}
inc = 0;
c_tmp = strchr(str, '.');
c_tmp--;
do
{
result += (*c_tmp - '0') * pow(10.0,inc);
c_tmp--; inc++;
}while(c_tmp >= str);
}
return result;
}
It can be further optimized by not calling 'pow' and using something more clever, but even as written it works perfectly.
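One possible pow-free variant is sketched below. It is not the author's code: the name simple_strtod_nopow is made up, it handles only unsigned input of the form digits[.digits], and unlike the version above it does not cap the number of fractional digits. The fractional digits are accumulated as a whole number and divided once at the end, which avoids repeated multiplication by an inexact 0.1.

```c
/* Parse an unsigned decimal string such as "56.45783" without calling pow(). */
double simple_strtod_nopow(const char *str)
{
    double whole = 0.0;   /* value of the digits before the dot */
    double frac = 0.0;    /* digits after the dot, read as an integer */
    double div = 1.0;     /* 10^(number of fractional digits) */

    while (*str >= '0' && *str <= '9')
        whole = whole * 10.0 + (*str++ - '0');

    if (*str == '.') {
        str++;
        while (*str >= '0' && *str <= '9') {
            frac = frac * 10.0 + (*str++ - '0');
            div *= 10.0;
        }
    }
    return whole + frac / div;
}
```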

divide and store quotient and remainder in different arrays

The standard div() function returns a div_t struct, for example:
/* div example */
#include <stdio.h> /* printf */
#include <stdlib.h> /* div, div_t */
int main ()
{
div_t divresult;
divresult = div (38,5);
printf ("38 div 5 => %d, remainder %d.\n", divresult.quot, divresult.rem);
return 0;
}
My case is a bit different; I have this
#define NUM_ELTS 21433
int main ()
{
unsigned int quotients[NUM_ELTS];
unsigned int remainders[NUM_ELTS];
int i;
for(i=0;i<NUM_ELTS;i++) {
divide_single_instruction(&quotients[i],&remainders[i]);
}
}
I know that the division instruction in assembly produces both results in a single instruction, so I need to do the same here to save CPU cycles, which basically means moving the quotient from EAX and the remainder from EDX into the memory locations where my arrays are stored. How can this be done without including asm {} or SSE intrinsics in my C code? It has to be portable.
Since you're writing to the arrays in-place (replacing numerator and denominator with quotient and remainder) you should store the results to temporary variables before writing to the arrays.
void foo (unsigned *num, unsigned *den, int n) {
int i;
for(i=0;i<n;i++) {
unsigned q = num[i]/den[i], r = num[i]%den[i];
num[i] = q, den[i] = r;
}
}
produces this main loop assembly
.L5:
movl (%rdi,%rcx,4), %eax
xorl %edx, %edx
divl (%rsi,%rcx,4)
movl %eax, (%rdi,%rcx,4)
movl %edx, (%rsi,%rcx,4)
addq $1, %rcx
cmpl %ecx, %r8d
jg .L5
There are some more complicated cases where it helps to save the quotient and remainder when they are first used. For example in testing for primes by trial division you often see a loop like this
for (p = 3; p <= n/p; p += 2)
if (!(n % p)) return 0;
It turns out that GCC does not use the remainder from the first division and therefore it does the division instruction twice which is unnecessary. To fix this you can save the remainder when the first division is done like this:
for (p = 3, q=n/p, r=n%p; p <= q; p += 2, q = n/p, r=n%p)
if (!r) return 0;
This speeds up the result by a factor of two.
So in general GCC does a good job particularly if you save the quotient and remainder when they are first calculated.
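Put into a complete function, the trial-division loop with the saved quotient and remainder might look like this. It is a sketch only: the name is_prime and the handling of small and even inputs are additions for illustration, and whether the compiler merges / and % into one division instruction should still be confirmed in the disassembly.

```c
/* Returns 1 if n is prime, 0 otherwise, using trial division by odd numbers. */
int is_prime(unsigned n)
{
    unsigned p, q, r;

    if (n < 2) return 0;
    if (n < 4) return 1;          /* 2 and 3 are prime */
    if (n % 2 == 0) return 0;     /* remaining even numbers are composite */

    /* q and r are computed next to each other so one division can serve both */
    for (p = 3, q = n / p, r = n % p; p <= q; p += 2, q = n / p, r = n % p)
        if (!r) return 0;
    return 1;
}
```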
The general rule here is to trust your compiler to do something fast. You can always disassemble the code and check that the compiler is doing something sane. It's important to realise that a good compiler knows a lot about the machine, often more than you or me.
Also let's assume you have a good reason for needing to "count cycles".
For your example code I agree that the x86 "idiv" instruction is the obvious choice. Let's see what my compiler (MS visual C 2013) will do if I just write out the most naive code I can
struct divresult {
int quot;
int rem;
};
struct divresult divrem(int num, int den)
{
return (struct divresult) { num / den, num % den };
}
int main()
{
struct divresult res = divrem(5, 2);
printf("%d, %d", res.quot, res.rem);
}
And the compiler gives us:
struct divresult res = divrem(5, 2);
printf("%d, %d", res.quot, res.rem);
01121000 push 1
01121002 push 2
01121004 push 1123018h
01121009 call dword ptr ds:[1122090h] ;;; this is printf()
Wow, I was outsmarted by the compiler. Visual C knows how division works so it just precalculated the result and inserted constants. It didn't even bother to include my function in the final code. We have to read in the integers from console to force it to actually do the calculation:
int main()
{
int num, den;
scanf("%d, %d", &num, &den);
struct divresult res = divrem(num, den);
printf("%d, %d", res.quot, res.rem);
}
Now we get:
struct divresult res = divrem(num, den);
01071023 mov eax,dword ptr [num]
01071026 cdq
01071027 idiv eax,dword ptr [den]
printf("%d, %d", res.quot, res.rem);
0107102A push edx
0107102B push eax
0107102C push 1073020h
01071031 call dword ptr ds:[1072090h] ;;; printf()
So you see, the compiler (or this compiler at least) already does what you want, or something even more clever.
From this we learn to trust the compiler and only second-guess it when we know it isn't doing a good enough job already.

INT 13 Extension Read in C

I can use the extended read functions of BIOS INT 13h from assembly without problems,
using the code below:
; *************************************************************************
; Setup DISK ADDRESS PACKET
; *************************************************************************
jmp strtRead
DAPACK :
db 010h ; Packet Size
db 0 ; Always 0
blkcnt:
dw 1 ; Sectors Count
db_add :
dw 07e00h ; Transfer Offset
dw 0 ; Transfer Segment
d_lba :
dd 1 ; Starting LBA(0 - n)
dd 0 ; Bios 48 bit LBA
; *************************************************************************
; Start Reading Sectors using INT13 Func 42
; *************************************************************************
strtRead:
mov si, OFFSET DAPACK; Load DPACK offset to SI
mov ah, 042h ; Function 42h
mov dl, 080h ; Drive ID
int 013h; Call INT13h
I want to convert this into a C-callable function, but I have no idea how to pass parameters such as the drive ID, sector count, and buffer segment:offset from C to asm.
I am using MSVC and MASM and working with nothing except BIOS functions.
Can anyone help?
Update:
I have tried the function below, but nothing is ever loaded into the buffer:
void read_sector()
{
static unsigned char currentMBR[512] = { 0 };
struct disk_packet //needed for int13 42h
{
byte size_pack; //size of packet must be 16 or 16+
byte reserved1; //reserved
byte no_of_blocks; //nof blocks for transfer
byte reserved2; //reserved
word offset; //offset address
word segment; //segment address
dword lba1;
dword lba2;
} disk_pack;
disk_pack.size_pack = 16; //set size to 16
disk_pack.no_of_blocks = 1; //1 block ie read one sector
disk_pack.reserved1 = 0; //reserved byte
disk_pack.reserved2 = 0; //reserved byte
disk_pack.segment = 0; //segment of buffer
disk_pack.offset = (word)&currentMBR[0]; //offset of buffer
disk_pack.lba1 = 0; //lba first 32 bits
disk_pack.lba2 = 0; //last 32 bit address
_asm
{
mov dl, 080h;
mov[disk_pack.segment], ds;
mov si, disk_pack;
mov ah, 42h;
int 13h
; jc NoError; //No error, ignore error code
; mov bError, ah; // Error, get the error code
NoError:
}
}
Sorry to post this as "answer"; I want to post this as "comment" but it is too long...
Different compilers have a different syntax of inline assembly. This means that the correct syntax of the following lines:
mov[disk_pack.segment], ds;
mov si, disk_pack;
... depends on the compiler used. Unfortunately I do not use 16-bit C compilers so I cannot help you in this point.
The next thing I see in your program is the following one:
disk_pack.segment = 0; //segment of buffer
disk_pack.offset = (word)&currentMBR[0]; //offset of buffer
With a 99% chance this will lead to a problem. Instead I would do the following:
struct disk_packet //needed for int13 42h
{
byte size_pack;
byte reserved;
word no_of_blocks; // note that this is 16-bit!
void far *data; // <- This line is the main change!
dword lba1;
dword lba2;
} disk_pack;
...
disk_pack.size_pack = 16;
disk_pack.no_of_blocks = 1;
disk_pack.reserved = 0;
disk_pack.data = &currentMBR[0]; // also note the change here
disk_pack.lba1 = 0;
disk_pack.lba2 = 0;
...
Note that some compilers name the keyword "_far" or "__far" instead of "far".
A third problem is that some (buggy) BIOSes require ES to be equal to the segment value from the disk_pack and a fourth one is that many compilers require the inline assembly code not to modify any registers (AX, CX and DX is normally OK).
These two could be solved the following way:
push ds;
push es;
push si;
mov dl, 080h;
// TODO here: Set ds:si to disk_pack in a compiler-specific way
mov es,[si+6];
mov ah, 42h;
int 13h;
...
pop si;
pop es;
pop ds;
In my opinion the "#pragma pack" should not be necessary because all elements in the structure are properly aligned.
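The alignment claim can be checked at compile time. The sketch below assumes a C11 compiler with <stdint.h>, which a 16-bit compiler for BIOS code may not provide (it would need its own fixed-width typedefs). The struct and field names are hypothetical, the far pointer is written out as separate offset and segment fields, and the two 32-bit LBA halves are merged into one 64-bit field.

```c
#include <stddef.h>
#include <stdint.h>

/* Disk address packet for INT 13h function 42h, laid out without #pragma pack. */
struct disk_address_packet {
    uint8_t  size;            /* packet size in bytes: 16 */
    uint8_t  reserved;        /* always 0 */
    uint16_t sector_count;    /* number of sectors to transfer */
    uint16_t buffer_offset;   /* offset part of the transfer buffer address */
    uint16_t buffer_segment;  /* segment part of the transfer buffer address */
    uint64_t lba;             /* starting sector */
};

/* Compilation fails here if the compiler inserted any padding. */
_Static_assert(sizeof(struct disk_address_packet) == 16, "packet must be 16 bytes");
_Static_assert(offsetof(struct disk_address_packet, lba) == 8, "lba must start at byte 8");

/* Build a packet for the given buffer address, sector count and starting sector. */
struct disk_address_packet make_dap(uint16_t segment, uint16_t offset,
                                    uint16_t count, uint64_t lba)
{
    struct disk_address_packet p;
    p.size = 16;
    p.reserved = 0;
    p.sector_count = count;
    p.buffer_offset = offset;
    p.buffer_segment = segment;
    p.lba = lba;
    return p;
}
```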

Peculiar instruction sequence generated from straightforward C "if" lack condition

I am trying to debug some simple C code under gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 for x86-64. The code is built with CFLAGS += -std=c99 -g -Wall -O0
#include <errno.h>
#include <stdio.h>
#include <string.h>
#pragma pack(1)
int main (int argc, char **argv)
{
FILE *f = fopen ("the_file", "r"); /* error checking removed for clarity */
struct {
short len;
short itm [4];
char nul;
} f00f;
int n = fread (&f00f, 1, sizeof f00f, f);
if (f00f.nul ||
f00f.len != 0x900 ||
f00f.itm [0] != 0xf00f ||
f00f.itm [1] != 0xf00f ||
f00f.itm [2] != 0xf00f ||
f00f.itm [3] != 0xf00f)
{
fprintf (stderr, "bitfile_hdr F00F data err:\n"
"\tNUL: 0x%x\n"
"\tlen: 0x%hx should be 0x900\n"
"\tf00f: 0x%hx\n"
"\tf00f: 0x%hx\n"
"\tf00f: 0x%hx\n"
"\tf00f: 0x%hx\n"
, f00f.nul, f00f.len,
f00f.itm[0], f00f.itm[1], f00f.itm[2], f00f.itm[3]
);
return 1;
}
return 0;
}
The data matches what the test expects, and—weirdly—the error message displays the correct data:
$ ./bit_parse
bitfile_hdr F00F data err:
NUL: 0x0
len: 0x900 should be 0x900
f00f: 0xf00f
f00f: 0xf00f
f00f: 0xf00f
f00f: 0xf00f
Running it under gdb and examining the structure also shows correct data.
(gdb) p /x f00f
$1 = {len = 0x900, itm = {0xf00f, 0xf00f, 0xf00f, 0xf00f}, nul = 0x0}
Since that didn't make sense, I examined the instructions from inside gdb to reveal coding pathologies. The instructions corresponding to the non-functioning if are:
0x0000000000400736 <+210>: movzwl -0x38(%rbp),%eax
0x000000000040073a <+214>: movswl %ax,%r8d
0x000000000040073e <+218>: movzwl -0x3a(%rbp),%eax
0x0000000000400742 <+222>: movswl %ax,%edi
0x0000000000400745 <+225>: movzwl -0x3c(%rbp),%eax
0x0000000000400749 <+229>: movswl %ax,%r9d
0x000000000040074d <+233>: movzwl -0x3e(%rbp),%eax
0x0000000000400751 <+237>: movswl %ax,%r10d
0x0000000000400755 <+241>: movzwl -0x40(%rbp),%eax
0x0000000000400759 <+245>: movswl %ax,%ecx
0x000000000040075c <+248>: movzbl -0x36(%rbp),%eax
0x0000000000400760 <+252>: movsbl %al,%edx
0x0000000000400763 <+255>: mov $0x4008d8,%esi
0x0000000000400768 <+260>: mov 0x2008d1(%rip),%rax # 0x601040 <stderr@@GLIBC_2.2.5>
0x000000000040076f <+267>: mov %r8d,0x8(%rsp)
0x0000000000400774 <+272>: mov %edi,(%rsp)
0x0000000000400777 <+275>: mov %r10d,%r8d
0x000000000040077a <+278>: mov %rax,%rdi
0x000000000040077d <+281>: mov $0x0,%eax
0x0000000000400782 <+286>: callq 0x400550 <fprintf@plt>
0x0000000000400787 <+291>: mov $0x6,%eax
0x000000000040078c <+296>: add $0x50,%rsp
0x0000000000400790 <+300>: pop %rbx
0x0000000000400791 <+301>: pop %r12
0x0000000000400793 <+303>: pop %rbp
0x0000000000400794 <+304>: retq
It is really hard to see how this could implement a conditional.
Anyone see why this (mis)behaves as it does?
Probably on your platform, short is 16-bit wide. Therefore no short can equal 0xf00f and the condition f00f.itm [0] != 0xf00f is always true. The compiler optimized accordingly.
You may have meant unsigned short in the definition of struct f00f, but this is only one way to fix it, of course. You could also compare f00f.itm [0] to (short)0xf00f, but if you meant f00f.itm[i] to be compared to 0xf00f, you definitely should have used unsigned short in the definition.
short val = 0xf00f; assigns the value -4081 to val.
You get hit by integer promotion rules.
f00f.itm [0] != 0xf00f
converts the short in f00f.itm [0] to an int, and that's -4081. 0xf00f as an int is 61455, and those two are not equal. Since the value is converted to an unsigned short when you print out the values (by using %hx), the issue isn't visible in the output.
Use unsigned values in your struct since you seem to treat the values as unsigned:
struct {
unsigned short len;
unsigned short itm [4];
char nul;
} f00f;
This sample program might make you understand what's going on a bit better:
#include <stdio.h>
int main(int argc,char *arga[])
{
short x = 0xf00f;
int y = 0xf00f;
printf("x = 0x%hx y = 0x%x\n", x, y);
printf("x = %d y = %d\n", x, y);
printf("x==y: %d\n", x == y);
return 0;
}
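The fixes mentioned in the answers can be compared side by side. This is a sketch assuming a 16-bit short and a wider int, as on the x86-64 system in the question; the function names are made up, and a compiler may warn that the first comparison is always false, which is exactly the point.

```c
/* Each function returns 1 when v "matches" the 16-bit pattern 0xf00f. */

/* Original comparison: v is promoted to int (-4081) and can never equal 61455. */
int matches_signed(short v)
{
    return v == 0xf00f;
}

/* Fix 1: convert the constant to short first, so both sides are -4081. */
int matches_cast(short v)
{
    return v == (short)0xf00f;
}

/* Fix 2: store the value as unsigned short, which promotes to 61455. */
int matches_unsigned(unsigned short v)
{
    return v == 0xf00f;
}
```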

Resources