Can someone write assembly code for the C program above that assembles into less than 100 bytes of machine code? - c

I want to overflow the array buffer[100], and I will be supplying the input from a Python script run in a bash shell on FreeBSD. I need machine code that I can pass as a string to overflow buffer[100] and make the program print its hostname to stdout.
Here is the C code I tried; it prints the hostname on the console:
#include <stdio.h>
#include <unistd.h> /* declares gethostname() */
int main()
{
    char buff[256];
    gethostname(buff, sizeof(buff));
    printf("%s", buff);
    return 0;
}
Here is the assembly that gcc generated. It is longer than I need: the machine code for the text section of the C program is longer than 100 bytes, and I need machine code for the program above that is shorter than 100 bytes.
.type main, @function
main:
pushl %ebp # save the base pointer
movl %esp, %ebp # take a snapshot of the stack pointer
subl $264, %esp
addl $-8, %esp
pushl $256
leal -256(%ebp), %eax
pushl %eax
call gethostname
addl $16, %esp
addl $-8, %esp
leal -256(%ebp), %eax
pushl %eax
pushl $.LC0
call printf
addl $16, %esp
xorl %eax, %eax
jmp .L6
.p2align 2, 0x90
.L6:
leave
ret
.Lfe1:
.size main, .Lfe1-main
.ident "GCC: (GNU) c 2.95.4 20020320 [FreeBSD]"
Someone has already done this on another computer and gave me ready-made machine code, 37 bytes long, which he passes to the buffer in the format below using a Perl script. I tried his code and it works, but he won't tell me how he produced it.
"\x41\xc1\x30\x58\x6e\x61\x6d\x65\x23\x23\xc3\xbc\xa3\x83\xf4\x69\x36\xw3\xde\x4f\x2f\x5f\x2f\x39\x33\x60\x24\x32\xb4\xab\x21\xc1\x80\x24\xe0\xdb\xd0"
I know he did it on a different machine, so I cannot expect to get identical code, but since we are both using exactly the same C function, the size of the machine code should be almost the same, if not exactly the same. His machine code is 37 bytes, which he passes on the shell to overflow the gets() call in a binary on FreeBSD 2.95 and print the hostname to stdout. I want to do the same thing. What I am really asking about is the procedure for obtaining that code.
OK, I tried the methods suggested in the posts here, but for gethostname() alone I got 130 characters of machine code, and that did not include the machine code for printf(). Since I need to print the hostname to the console, that should be included too, but it will make the machine code even longer. The code has to fit in an array of 100 bytes, so it must be less than 100 bytes.
Can someone write assembly code for the C program above that assembles into less than 100 bytes of machine code?

To get the machine code, you need to compile the program then disassemble. Using gcc for example do something like this:
gcc -o hello hello.c
objdump -D hello
The dump will show the machine code in bytes and the disassembly of that machine code.
Here is a simple, related example. You have to understand the difference between an object file and an executable file, but it should still demonstrate what I mean (the disassembly shown happens to be for an ARM target):
unsigned int myfun ( unsigned int x )
{
return(x+5);
}
gcc -O2 -c -o hello.o hello.c
objdump -D hello.o
Disassembly of section .text:
00000000 <myfun>:
0: e2800005 add r0, r0, #5
4: e12fff1e bx lr

FreeBSD is an operating system, not a compiler or assembler.
You want to assemble the assembly source into machine code, so you should use an assembler.
You can typically use GCC, since it's smart enough to know that for a filename ending in .s, it should run the assembler.
If you already have the code in an object file, you can use objdump to read out the code segment of the file.

The 37 bytes posted are complete junk.
If run under any version of Windows (Windows 2000 or later), I believe the "outsb" and "insd" instructions (in a userland program) will cause a fault, because userland programs are not allowed to do port-level I/O directly.
Since machine code does not end in a "vacuum", I added some \x90 bytes (NOPs) after the posted code. That merely affects the operand of the last rcl instruction, which in the given code is cut off; i.e. the posted code is not only rubbish, it also ends prematurely.
But microprocessors have no intelligence of their own, so they will (try to) execute whatever junk you feed them. The code starts with "inc ecx", a pointless move since we do not know what value ecx held before. Likewise, "shl dword ptr [eax],$58" is a "good" way to randomly corrupt memory (since the value of eax is also unknown).
And one of the bytes is NOT even a valid byte (each should be written as two hexadecimal digits).
The invalid "byte" is \xw3.
I replaced that invalid byte with \x90 (a NOP, if it is at the start of an instruction), and got:
00451B51 41 inc ecx
00451B52 C13058 shl dword ptr [eax],$58
00451B55 6E outsb
00451B56 61 popad
00451B57 6D insd
00451B58 652323 and esp,gs:[ebx]
00451B5B C3 ret
// code below is NEVER executed, since the line above does a RET.
00451B5C BCA383F469 mov esp,$69f483a3
00451B61 3690 nop // 36, w3 ????
00451B63 DE4F2F fimul word ptr [edi+$2f]
00451B66 5F pop edi
00451B67 2F das
00451B68 3933 cmp [ebx],esi
00451B6A 60 pushad
00451B6B 2432 and al,$32
00451B6D B4AB mov ah,$ab
00451B6F 21C1 and ecx,eax
00451B71 8024E0DB and byte ptr [eax],$db
00451B75 D09090909090 rcl [eax-$6f6f6f70],1

You get a nice hexdump of the text section of your object file with objdump -s -j .text.
Edited some more details:
You need to find out what the address of the function in your object code is. This is what objdump -t is for. In this case I am looking for the function main in a program "hello".
> objdump -t hello|grep main
> 0000000000400410 g F .text 000000000000002f main
Now I create a hexdump with objdump -s -j .text hello:
400410 4881ec08 010000be 00010000 31c04889 H...........1.H.
400420 e7e8daff ffff4889 e6bff405 400031c0 ......H.....@.1.
400430 e8abffff ff31c048 81c40801 0000c390 .....1.H........
400440 31ed4989 d15e4889 e24883e4 f0505449 1.I..^H..H...PTI
400450 c7c0e005 400048c7 c1500540 0048c7c7 ....@.H..P.@.H..
...
The first column holds the addresses. It starts with 400410, the address of the main function, but this may not always be the case. The next 4 columns are 16 bytes of machine code in hex, and the last column shows the same 16 bytes as ASCII. Because many bytes have no printable ASCII representation, there are a lot of dots. You need to use the 4 hexadecimal columns: \x48 \x81 \xec....
I did this on a Linux system, but on FreeBSD you can do exactly the same; only the resulting machine code will be different.

Related

Disassembling simple C function

I'm trying to understand the underlying assembly for a simple C function.
program1.c
void function() {
char buffer[1];
}
=>
push %ebp
mov %esp, %ebp
sub $0x10, %esp
leave
ret
Not sure how it's arriving at 0x10 here? Isn't a character 1 byte, which is 8 bits, so it should be 0x08?
program2.c
void function() {
char buffer[4];
}
=>
push %ebp
mov %esp, %ebp
sub $0x18, %esp
mov ...
mov ...
[a bunch of random instructions]
Not sure how it's arriving at 0x18 here either? Also, why are there so many additional instructions after the SUB instruction? All I did was change the length of the array from 1 to 4.
gcc uses -mpreferred-stack-boundary=4 by default for x86 32 and 64bit ABIs, so it keeps %esp 16B-aligned.
I was able to reproduce your output with gcc 4.8.2 -O0 -m32 on the Godbolt Compiler Explorer
void f1() { char buffer[1]; }
pushl %ebp
movl %esp, %ebp # make a stack frame (`enter` is super slow, so gcc doesn't use it)
subl $16, %esp
leave # `leave` is not terrible compared to mov/pop
ret
You must be using a version of gcc with -fstack-protector enabled by default. Newer gcc isn't usually configured to do that, so you don't get the same sentinel value and check written to the stack. (Try a newer gcc in that godbolt link)
void f4() { char buffer[4]; }
pushl %ebp #
movl %esp, %ebp # make a stack frame
subl $24, %esp # IDK why it reserves 24, rather than 16 or 32B, but prob. has something to do with aligning the stack for the possible call to __stack_chk_fail
movl %gs:20, %eax # load a value from thread-local storage
movl %eax, -12(%ebp) # store it on the stack
xorl %eax, %eax # tmp59
movl -12(%ebp), %eax # D.1377, tmp60
xorl %gs:20, %eax # check that the sentinel value matches what we stored
je .L3 #,
call __stack_chk_fail #
.L3:
leave
ret
Apparently gcc considers char buffer[4] a "vulnerable object", but not char buffer[1]. Without -fstack-protector, there'd be little to no difference in the asm even at -O0.
Isn't a character 1 byte, which is 8 bits, so it should be 0x08?
These values are not bits, they are bytes.
Not sure how it's arriving at 0x10 here?
These lines:
push %ebp
mov %esp, %ebp
sub $0x10, %esp
allocate space on the stack: 16 bytes of memory are reserved for the execution of this function.
Those bytes are needed to store information like:
A 4-byte memory address for the instruction that the ret instruction will jump to
The local variables of the function
Data structure alignment
Other stuff I can't remember right now :)
In your example, 16 bytes were allocated. 4 of them are for the address of the next instruction that will be called, so we have 12 bytes left. 1 byte is for the char array of size 1, which is probably optimized by the compiler to a single char. The last 11 bytes probably store some of the stuff I can't remember, plus the padding added by the compiler.
Not sure how it's arriving at 0x18 here either?
Each of the additional bytes in your second example increased the stack size by 2 bytes: 1 byte for the char, and 1 likely for memory alignment purposes.
Also, why are there so many additional instructions after the SUB instruction?
Please update the question with the instructions.
This code is just setting up the stack frame. This is used as scratch space for local variables, and will have some kind of alignment requirement.
You haven't mentioned your platform, so I can't tell you exactly what the requirements are for your system, but obviously both values are at least 8-byte aligned (so the size of your local variables is rounded up so %esp is still a multiple of 8).
Search for "c function prolog epilog" or "c function call stack" to find more resources in this area.
Edit - Peter Cordes' answer explains the discrepancy and the mysterious extra instructions.
And for completeness, although Fábio already answered this part:
Not sure how it's arriving at 0x10 here? Isn't a character 1 byte, which is 8 bits, so it should be 0x08?
On x86, %esp is the stack pointer, and pointers store addresses, and these are addresses of bytes. Sub-byte addressing is rarely used (cf. Peter's comment). If you want to examine individual bits inside a byte, you'd usually use bitwise (&,|,~,^) operations on the value, but not change the address.
(You could equally argue that sub-cache-line addressing is a convenient fiction, but we're rapidly getting off-topic).
Whenever you allocate memory, your operating system almost never actually gives you exactly that amount, unless you use a function like pvalloc, which gives you a page-aligned amount of bytes (usually 4K). Instead, your operating system assumes that you might need more in the future, so goes ahead and gives you a bit more.
To disable this behavior, use a lower-level system call that doesn't do buffering, like sbrk(). These lecture notes are an excellent resource:
http://web.eecs.utk.edu/~plank/plank/classes/cs360/360/notes/Malloc1/lecture.html

segmentation fault with .text .data and main (main in .data section)

I'm just trying to load the value of myarray[0] to eax:
.text
.data
# define an array of 3 words
array_words: .word 1, 2, 3
.globl main
main:
# assign array_words[0] to eax
mov $0, %edi
lea array_words(,%edi,4), %eax
But when I run this, I keep getting seg fault.
Could someone please point out what I did wrong here?
It seems the label main is in the .data section.
It leads to a segmentation fault on systems that don't allow executing code in the .data section. (Most modern systems map .data with read + write but not exec permission.)
Program code should be in the .text section. (Read + exec)
Surprisingly, on GNU/Linux systems, hand-written asm often results in an executable .data unless you're careful to avoid that, so this is often not the real problem: See Why data and stack segments are executable? But putting code in .text where it belongs can make some debugging tools work better.
Also you need to ret from main or call exit (or make an _exit system call) so execution doesn't fall off the end of main into whatever bytes come next. See What happens if there is no exit system call in an assembly program?
You need to properly terminate your program, e.g. on Linux x86_64 by calling the sys_exit system call:
...
main:
# assign array_words[0] to eax
mov $0, %edi
lea array_words(,%edi,4), %eax
mov $60, %rax # System-call "sys_exit"
mov $0, %rdi # exit code 0
syscall
Otherwise execution continues into whatever memory follows your last instruction, which most likely holds invalid instructions (or is not even a valid memory location).

some clang-generated assembly not working in real mode (.COM, tiny memory model)

First, this is kind of a follow-up to Custom memory allocator for real-mode DOS .COM (freestanding) — how to debug?. But to have it self-contained, here's the background:
clang (and gcc, too) has an -m16 switch so that long instructions of the i386 instruction set are prefixed for execution in "16bit" real mode. This can be exploited to create 32bit-realmode DOS .COM executables using the GNU linker, as described in this blog post (still limited to the tiny memory model, of course, meaning everything lives in one 64KB segment). Wanting to play with this, I created a minimal runtime that seems to work quite nicely.
Then I tried to build my recently-created curses-based game with this runtime, and well, it crashed. The first thing I encountered was a classical heisenbug: printing the offending wrong value made it correct. I found a workaround, only to face the next crash. So the first thing to blame I had in mind was my custom malloc() implementation, see the other question. But as nobody spotted something really wrong with it so far, I decided to give my heisenbug a second look. It manifests in the following code snippet (note this worked flawlessly when compiling for other platforms):
typedef struct
{
Item it; /* this is an enum value ... */
Food *f; /* and this is an opaque pointer */
} Slot;
typedef struct board
{
Screen *screen;
int w, h;
Slot slots[1]; /* 1 element for C89 compatibility */
} Board;
[... *snip* ...]
size = sizeof(Board) + (size_t)(w*h-1) * sizeof(Slot);
self = malloc(size);
memset(self, 0, size);
sizeof(Slot) is 8 (with clang and i386 architecture), sizeof(Board) is 20, and w and h are the dimensions of the game board, which when running in DOS are 80 and 24 (because one line is reserved for the title/status bar). To debug what's going on here, I made my malloc() output its parameter, and it was called with the value 12 (sizeof(Board) + (-1) * sizeof(Slot)?)
Printing out w and h showed the correct values, still malloc() got 12. Printing out size showed the correctly calculated size and this time, malloc() got the correct value, too. So, classical heisenbug.
The workaround I found looks like this:
size = sizeof(Board);
for (int i = 0; i < w*h-1; ++i) size += sizeof(Slot);
Weirdly enough, this worked. Next logical step: compare the generated assembly. Here I have to admit I'm totally new to x86; my only assembly experience was with the good old 6502. So, in the following snippets, I'll add my assumptions and thoughts as comments; please correct me here.
First the "broken" original version (w, h are in %esi, %edi):
movl %esi, %eax
imull %edi, %eax # ok, calculate the product w*h
leal 12(,%eax,8), %eax # multiply by 8 (sizeof(Slot)) and add
# 12 as an offset. Looks good because
# 12 = sizeof(Board) - sizeof(Slot)...
movzwl %ax, %ebp # just use 16bit because my size_t for
# realmode is "unsigned short"
movl %ebp, (%esp)
calll malloc
Now, to me, this looks good, but my malloc() sees 12, as mentioned. The workaround with the loop compiles to the following assembly:
movl %edi, %ecx
imull %esi, %ecx # ok, w*h again.
leal -1(%ecx), %edx # edx = ecx-1? loop-end condition?
movw $20, %ax # sizeof(Board)
testl %edx, %edx # I guess that sets just some flags in
# order to check whether (w*h-1) is <= 0?
jle .LBB0_5
leal 65548(,%ecx,8), %eax # This seems to be the loop body
# condensed to a single instruction.
# 65548 = 65536 (0x10000) + 12. So
# there is our offset of 12 again (for
# 16bit). The rest is the same ...
.LBB0_5:
movzwl %ax, %ebp # use bottom 16 bits
movl %ebp, (%esp)
calll malloc
As described before, this second variant works as expected. My question after all this long text is as simple as ... WHY? Is there something special about realmode I'm missing here?
For reference: this commit contains both code versions. Just type make -f libdos.mk for a version with the workaround (crashing later). To compile the code leading to the bug, remove the -DDOSREAL from the CFLAGS in libdos.mk first.
Update: given the comments, I tried to debug this myself a bit deeper. Using dosbox' debugger is somewhat cumbersome, but I finally got it to break at the position of this bug. So, the following assembly code intended by clang:
movl %esi, %eax
imull %edi, %eax
leal 12(,%eax,8), %eax
movzwl %ax, %ebp
movl %ebp, (%esp)
calll malloc
ends up as this (note intel syntax used by dosbox' disassembler):
0193:2839 6689F0 mov eax,esi
0193:283C 660FAFC7 imul eax,edi
0193:2840 668D060C00 lea eax,[000C] ds:[000C]=0000F000
0193:2845 660FB7E8 movzx ebp,ax
0193:2849 6766892C24 mov [esp],ebp ss:[FFB2]=00007B5C
0193:284E 66E8401D0000 call 4594 ($+1d40)
I think this lea instruction looks suspicious, and indeed, after it, the wrong value is in ax. So, I tried to feed the same assembly source to the GNU assembler, using .code16 with the following result (disassembly by objdump, I think it is not entirely correct because it might misinterpret the size prefix bytes):
00000000 <.text>:
0: 66 89 f0 mov %si,%ax
3: 66 0f af c7 imul %di,%ax
7: 67 66 8d 04 lea (%si),%ax
b: c5 0c 00 lds (%eax,%eax,1),%ecx
e: 00 00 add %al,(%eax)
10: 66 0f b7 e8 movzww %ax,%bp
14: 67 66 89 2c mov %bp,(%si)
The only difference is this lea instruction. Here it starts with 67, meaning "address is 32bit" in 16bit real mode. My guess is that this is actually needed because lea is meant to operate on addresses and is just "abused" by the optimizer to do data calculation here. Are my assumptions correct? If so, could this be a bug in clang's internal assembler for -m16? Maybe someone can explain where this 668D060C00 emitted by clang comes from and what it might mean? 66 means "data is 32bit" and 8D probably is the opcode itself --- but what about the rest?
Your objdump output is bogus. It looks like it's disassembling with the assumption of 32bit address and operand sizes, rather than 16. So it thinks lea ends sooner than it does, and disassembles some of the address bytes into lds / add. And then miraculously gets back into sync, and sees a movzww that zero extends from 16b to 16b... Pretty funny.
I'm inclined to trust your DOSBOX disassembly output. It perfectly explains your observed behaviour (malloc always called with an arg of 12). You are correct that the culprit is
lea eax,[000C] ; eax = 0x0C = 12. Intel/MASM/NASM syntax
leal 12, %eax #or AT&T syntax:
It looks like a bug in whatever assembled your DOSBOX binary (clang -m16 I think you said), since it assembled leal 12(,%eax,8), %eax into that.
leal 12(,%eax,8), %eax # AT&T
lea eax, [12 + eax*8] ; Intel/MASM/NASM syntax
I could probably dig through some instruction encoding tables / docs and figure out exactly how that lea should have been assembled into machine code. It should be the same as the 32bit-mode encoding, but with 67 66 prefixes (address size and operand size, respectively). (And no, the order of those prefixes doesn't matter, 66 67 would work, too.)
Your DOSBOX and objdump outputs don't even have the same binary, so yes, they did come out differently. (objdump is misinterpreting the operand-size prefix in previous instructions, but that didn't affect the insn length until LEA.)
Your GNU as .code16 binary has 67 66 8D 04 C5, then the 32bit 0x0000000C displacement (little-endian). This is LEA with both prefixes. I assume that's the correct encoding of leal 12(,%eax,8), %eax for 16bit mode.
Your DOSBOX disassembly has just 66 8D 06, with a 16bit 0x0C absolute address. (Missing the 32bit address size prefix, and using a different addressing mode.) I'm not an x86 binary expert; I haven't had problems with disassemblers / instruction encoding before. (And I usually only look at 64bit asm.) So I'd have to look up the encodings for the different addressing modes.
My go-to source for x86 instructions is Intel's Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z (linked from https://stackoverflow.com/tags/x86/info, BTW).
It says: (section 2.1.1)
The operand-size override prefix allows a program to switch between
16- and 32-bit operand sizes. Either size can be the default; use of
the prefix selects the non-default size.
So that's easy, everything is pretty much the same as normal 32bit protected mode, except 16bit operand-size is the default.
The LEA insn description has a table describing exactly what happens with various combinations of 16, 32, and 64bit address (67H prefix) and operand sizes (66H prefix). In all cases, it truncates or zero-extends the result when there's a size mismatch, but it's an Intel insn ref manual so it has to lay out every case separately. (This is helpful for more complex instruction behaviour.)
And yes, "abusing" lea by using it on non-address data is a common and useful optimization. You can do a non-destructive add of 2 registers, placing the result in a 3rd. And at the same time add a constant, and scale one of the inputs by 2, 4, or 8. So it can do things that would take up to 4 other instructions. (mov / shl / add r,r / add r,i). Also, it doesn't affect flags, which is a bonus if you want to preserve flags for another jump or especially cmov.

Assembler / GAS / Linux x86_64 - error while reading a file

I am writing a simple program in assembler on Linux x86_64 (GAS syntax). I have to read a number that is encoded in binary and saved in a text file. So, I have my text file "data.txt" (it's in the same directory as my source file), and below is the most important fragment of my code:
SYS_WRITE = 4
EXIT_SUCCESS = 0
SYS_READ = 3
SYS_OPEN = 5
.data
BIN_LEN = 24
.comm BIN, BIN_LEN
BIN: .space BIN_LEN, 0
.text
PATH: .ascii "data.txt\0"
.global _start
_start:
mov $SYS_OPEN, %eax # open
mov $PATH, %ebx # path
mov $0, %ecx # read only
mov $0666, %edx # mode
int $0x80 # call (open file)
mov $SYS_READ, %eax # reading
mov $3, %ebx # descriptor
mov $BIN, %ecx # buffer
mov $BIN_LEN, %edx # buffer size
int $0x80 # call (read line from file)
After calling the second syscall, the %eax register should contain the number of read bytes.
In my file "data.txt" I have "10101", but when I debug my program with gdb, it shows -11 in %eax, so there was some kind of error. Yet I am sure that "10101" was loaded into the buffer (BIN), because when I display the buffer's contents, the number from the file is there. I need the number of bytes read for the rest of the algorithm. I have no idea why %eax contains an error code instead of the number of bytes loaded into the buffer. I wonder if it may be connected with making the syscall with 32-bit registers, but in all other cases it works properly.
Please, help me.
I entered your code and compiled it on my x64 running fedora 20 using the as and ld 32 bit options to assemble and link it, and it ran perfectly, placing 0x18 into the %eax reg after syscall. If you solved the problem I would like to know what caused it and how you fixed it.
cheers

Where the values of variables are stored in C

In the following code segment:
int func()
{
int a=7;
return a;
}
Is the code segment where the value 7 is stored in the executable, or is it in the data segment? Does the answer depend on the operating system or the compiler?
Each executable format has some sections. One of them is text, which contains the assembled binary code. Another is the heap, where malloc-ed data is found, and another is the stack, where local variables are stored. There are several others, but they don't matter now. These three are common everywhere.
Now, local data like your a resides on the stack. In the executable file, the value is stored in the text section.
I've added a main to your code (returning 0), compiled with -g, then ran objdump -CDgS a.out and searched for 0x42424242 (I've replaced your 7 with a value that has less chance of randomly occurring in code).
00000000004004ec <func>:
int func()
{
4004ec: 55 push %rbp
4004ed: 48 89 e5 mov %rsp,%rbp
int a=0x42424242;
4004f0: c7 45 fc 42 42 42 42 movl $0x42424242,-0x4(%rbp)
return a;
4004f7: 8b 45 fc mov -0x4(%rbp),%eax
}
4004fa: 5d pop %rbp
4004fb: c3 retq
As you see, c7 45 fc 42 42 42 42 means that the value is stored in the generated file. Indeed, this is the case when looking at the binary via xxd:
$ xxd a.out | grep 4242
00004f0: c745 fc42 4242 428b 45fc 5dc3 5548 89e5 .E.BBBB.E.].UH..
You can recognize the above assembly line in the xxd snippet.
Since a is implicitly auto (i.e. is not extern or static) it is stored in the call stack frame.
In fact, the compiler may optimize that: probably, in your case, when optimizing, it will stay in a register (or be constant propagated and constant folded): no need to allocate a call stack slot for your a
This is of course compiler, target platform, and operating system dependent. For the GCC compiler, understand the Gimple internal representation (thru -fdump-tree-all, or using the MELT probe) and look at the generated assembler code (use -fverbose-asm -S -O)
See also this answer which gives a lot of references.
GCC 4.8 on Linux/x86-64 compiles (with gcc -S -fverbose-asm -O) your function into:
.globl func
.type func, @function
func:
.LFB0:
.cfi_startproc
movl $7, %eax #,
ret
.cfi_endproc
.LFE0:
.size func, .-func
So you see that in your particular case no additional space is used for 7; it is stored directly in %eax, which is the register (defined in the ABI conventions) that holds the function's return value.
The value 7 is stored in the machine code, inside the movl machine instruction. When func is executed, that 7 is loaded into register %eax containing the returned result of func.
As the example code stands, the variable "a" goes on the call stack, the place that stores local variables along with function-call information such as the saved program counter (the return address).

Resources