Data-only static libraries with GCC

How can I make static libraries with only binary data, that is without any object code, and make that data available to a C program? Here's the build process and simplified code I'm trying to make work:
./datafile:
abcdefghij
Makefile:
libdatafile.a:
ar [magic] datafile
main: libdatafile.a
gcc main.c libdatafile.a -o main
main.c:
#include <stdio.h>
#include <string.h>
#define TEXTPTR [more magic]
int main(){
    char mystring[11];
    memset(mystring, '\0', 11);
    memcpy(mystring, TEXTPTR, 10); /* destination first, then source */
    puts(mystring);
    puts(mystring);
    return 0;
}
The output I'm expecting from running main is, of course:
abcdefghijabcdefghij
My question is: what should [magic] and [more magic] be?

You can convert a binary file to a .o file using objcopy; the generated file then defines symbols for the start address, end address and size of the binary data.
objcopy -I binary -O elf32-little data data.o
The data can be referenced from a program via
extern char const _binary_data_start[];
extern char const _binary_data_end[];
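The data lives between those two addresses. Note that the symbols must be declared as arrays; declaring them as pointers does not work, because each symbol is the address of the data itself rather than a variable that holds an address.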
The "elf32-little" part needs to be adapted according to your target platform. There are many other options for fine control over the processing.
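As a rough sketch of how a program might consume the embedded bytes: the region between the two symbols is not NUL-terminated, so it has to be copied with an explicit length. The helper name copy_blob below is hypothetical, and the declarations shown in the comment assume the input file was literally named data.

```c
#include <stddef.h>
#include <string.h>

/* With the generated object file linked in, the symbols would be declared
 * as arrays (names assume the input file was called "data"):
 *   extern char const _binary_data_start[];
 *   extern char const _binary_data_end[];
 */

/* Hypothetical helper: copy the bytes in [start, end) into dst and
 * NUL-terminate the result. Returns the number of bytes copied, which is
 * at most dst_size - 1; returns 0 if dst has no room at all. */
size_t copy_blob(char *dst, size_t dst_size,
                 char const *start, char const *end)
{
    size_t len;

    if (dst_size == 0)
        return 0;
    len = (size_t)(end - start);
    if (len > dst_size - 1)
        len = dst_size - 1;      /* truncate rather than overflow */
    memcpy(dst, start, len);
    dst[len] = '\0';
    return len;
}
```

In the question's main.c this could be called as copy_blob(mystring, sizeof mystring, _binary_data_start, _binary_data_end). objcopy derives the symbol names from the input file name, so a file called datafile yields _binary_datafile_start and _binary_datafile_end. The resulting object file can then be archived with ar rcs libdatafile.a data.o.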

Put the data in global variables.
char const text[] = "abcdefghij";
Don't forget to declare text in a header. If the data is currently in a file, the FreeBSD file2c tool can convert it to C source code for you (manpage).
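To illustrate what such a conversion amounts to, here is a minimal sketch that renders a byte buffer as a C array definition. It is written in the spirit of a file-to-C converter but does not reproduce file2c's actual output format; the function name emit_c_array and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write `len` bytes of `data` into `out` as a C array
 * definition named `name`. Returns the number of characters written
 * (excluding the terminating NUL), or -1 if `out` is too small. */
int emit_c_array(char *out, size_t out_size, const char *name,
                 const unsigned char *data, size_t len)
{
    size_t used = 0;
    int n;

    n = snprintf(out, out_size, "char const %s[] = {", name);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    used += (size_t)n;

    for (size_t i = 0; i < len; i++) {
        /* Each byte is printed in decimal, comma-separated. */
        n = snprintf(out + used, out_size - used, "%s%u",
                     i ? "," : "", (unsigned)data[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, out_size - used, "};\n");
    if (n < 0 || (size_t)n >= out_size - used)
        return -1;
    used += (size_t)n;
    return (int)used;
}
```

The generated text can be saved as a .c file and compiled like any other source file; the matching header would then contain a declaration such as extern char const text[];.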

Related

Text file linked to ELF file - _binary_file_size information is garbage

I'm trying to revive some old code that links text files (.glsl etc.) into an executable. With my current computer & Kubuntu OS, compiling in 64 bits, I can't read size information anymore. I found a simple example that fails for me in the same way at "How do I add contents of text file as a section in an ELF file?". It is further simplified below.
myfile.txt:
Annon edhellon, edro hi ammen
Fennas nogothrim, lasto beth lammen
Objectified with, as in the example,
objcopy --input binary --output elf64-x86-64 --binary-architecture i386:x86-64 --rename-section .data=.rodata,CONTENTS,ALLOC,LOAD,READONLY,DATA myfile.txt myfile.o
I also tried ld -r -b binary -o myfile.o myfile.txt with the same result.
This is my main.c,
#include <stdlib.h>
#include <stdio.h>
/* These are external references to the symbols created by OBJCOPY */
extern char _binary_myfile_txt_start[];
extern char _binary_myfile_txt_end[];
extern char _binary_myfile_txt_size[];
int main() {
char *data_start = _binary_myfile_txt_start;
char *data_end = _binary_myfile_txt_end;
size_t data_size = (size_t)_binary_myfile_txt_size;
printf ("data_start %p\n", data_start);
printf ("data_end %p\n", data_end);
printf ("data_size %zu\n", data_size);
}
compiled with
gcc main.c myfile.o
When I run the code, the result is as follows:
data_start 0x55cd23b88032
data_end 0x55cd23b88074
data_size 94339555942466
The start and end pointers work, but data_size is nonsense. I'd expect it to be 66, as shown by wc. I've tried many obvious things but nothing seems to work.
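One workaround that does not rely on _binary_myfile_txt_size is to compute the size from the two addresses that do work. In the output above, 0x55cd23b88074 - 0x55cd23b88032 = 0x42 = 66, which is the expected size. A plausible explanation, stated here as an assumption rather than something verified on this system, is that the executable is built as position-independent, so the load base is added to the absolute _size symbol when its "address" is taken. The helper name blob_size below is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes in the region [start, end).
 * Both pointers are relocated by the same amount at load time, so their
 * difference is unaffected by where the program is loaded. */
size_t blob_size(const char *start, const char *end)
{
    return (size_t)(end - start);
}
```

In main.c this would replace the cast with size_t data_size = blob_size(_binary_myfile_txt_start, _binary_myfile_txt_end);. If position independence is indeed the cause, linking with gcc -no-pie main.c myfile.o may also make the _size symbol read as expected.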

how to use generated code from matlab

I want to use the C-coder in Matlab. This translates an m-code to C-code.
I use a simple function that adds 5 numbers.
When the code is generated there are a lot of C- and H-files.
Of course you could just pick out the code you need and import it into your own code, but that's not the point of this exercise, and it will no longer be feasible once the Matlab code gets more complex.
Matlab delivers a main.c file and a .mk file.
/* Include Files */
#include "rt_nonfinite.h"
#include "som.h"
#include "main.h"
#include "som_terminate.h"
#include "som_initialize.h"
//Declare all the functions
int main(int argc, const char * const argv[]){
(void)argc;
(void)argv;
float x1=10;
float x2=20;
float x3=30;
float x4=40;
float x5=50;
float result;
/* Initialize the application.
You do not need to do this more than one time. */
som_initialize();
main_som();
result=som(x1,x2,x3,x4,x5);
printf("%f", result);
som_terminate();
return 0;
}
When I run this on a raspberry-pi with
gcc -o test1 main.c
It gives me undefined references to all the functions...
Any ideas what went wrong?
You have to build it with the generated makefile (the mk file) so it links with the correct Matlab libraries - that's where those functions are defined:
$ make -f test.mk
You also need to compile the other C files along with your main.c. If main.c is in the same directory as the generated code, you should be able to just do:
gcc -o test1 *.c
If the generated code is in another directory, then you can do something like:
gcc -o test1 /path/to/code/*.c -I/path/to/code main.c

Why isn't my char* passing correctly?

Problem statement (using a contrived example):
Working as expected ('b' is printed to screen):
void Foo(const char* bar);
void main()
{
const char bar[4] = "bar";
Foo(bar);
}
void Foo(const char* bar)
{
// Pointer to first text cell of video memory
char* memory = (char*) 0xb8000;
*memory = bar[0];
}
Not working as expected (\0 is printed to screen):
void Foo(const char* bar);
void main()
{
Foo("bar");
}
void Foo(const char* bar)
{
// Pointer to first text cell of video memory
char* memory = (char*) 0xb8000;
*memory = bar[0];
}
In other words, if I pass the const char* directly, it doesn't pass correctly. The const char* I get in Foo points to zeroed out memory somehow. What am I doing wrong?
Background info (as requested):
I am developing an operating system for fun, using a guide I found here. The guide generally assumes you are on a unix-based machine, but I'm developing on a PC, so I'm using MinGW so that I have access to gcc, ld, etc.
In the guide, I am currently on page 54, where you have just bootstrapped your custom kernel. Rather than simply displaying an 'X' as the guide teaches, I decided to use my existing knowledge of C/C++ to attempt to write my own rudimentary print string function. The function is supposed to take a const char* and write it, char by char, into video memory.
Three files are currently involved in the project:
The boot sector - compiled through NASM to a .bin file
The kernel entry routine - compiled without linking through NASM to a .o, linked against the kernel
The kernel - compiled through gcc, linked along with the kernel entry routine through the ld command, which produces a .bin which is appended to the .bin file produced by the boot sector
Once the combined .bin file is generated, I am converting it to .VDI (VirtualBox Disk Image) and running it in a VM I have set up.
Additional info:
I just noticed that when VirtualBox is converting the .bin file to .vdi, it is reporting different sizes for the two examples. I had a hunch that maybe the string was getting omitted entirely from the compiled product. Sure enough, when I look at .bin for the first example in a hex editor, I can find the text "bar", but I can't when I look at a hex dump for the .bin of the second example.
This leads me to believe that the compilation process I'm using has a flaw in it somewhere. Here are the commands I'm using:
nasm boot_sector.asm -f bin -o boot_sector.bin
nasm kernel_entry.asm -f elf -o kernel_entry.o
gcc -ffreestanding -c kernel.c -o kernel.o
ld -T NUL -o kernel.tmp -Ttext 0x1000 kernel_entry.o kernel.o
objcopy -O binary -j .text kernel.tmp kernel.bin
copy /b boot_sector.bin+kernel.bin os_image.bin
os_image.bin is what is converted to the .vdi file which is used in the vm.
With your first example, the compiler will (or at least, can) put the data to initialize the automatic array right in the code (.text section - moves with immediate values are used when I try this out).
With your second example, the string literal is put in the .rodata section, and the code will contain a reference to that section.
Your objcopy command only copies the .text section, so the string will be missing in the final binary. You should add the .rodata section, or remove the -j .text entirely.
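The difference between the two cases can be sketched as follows. Where the bytes actually end up depends on the compiler, target and optimization level, so the comments describe typical behaviour only; the function names are hypothetical.

```c
/* Initialized automatic array: the array lives on the stack and is filled
 * in when the function runs. A compiler may encode the initial bytes as
 * immediate operands inside the function's code (.text). */
char first_from_array(void)
{
    const char bar[4] = "bar";
    return bar[0];
}

/* String literal: an object with static storage duration. It is typically
 * placed in a read-only data section, and the code only holds its address. */
char first_from_literal(void)
{
    const char *bar = "bar";
    return bar[0];
}
```

Running objdump -h kernel.o lists the sections present in the object file, which shows where the literal ended up. The exact section name depends on the object format: .rodata is usual for ELF, while PE/COFF toolchains commonly use .rdata.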

making small c project

Hello everybody,
I wrote a very simple C program, which is the following:
#include<stdio.h>
int main()
{
int a,b,s,m,d;
system("clear");
a = 20; /* a and b are already declared above */
b = 40;
s=sum(a,b);
m=mul(a,b);
d=div(a,b);
printf("\n the sum of given no. = %d\nThe product of given no. = %d\nThe division of given no = %d",s,m,d);
return 0;
}
The name of the file is exp.c.
Then I wrote the following code:
#include<stdio.h>
int sum( int x ,int y)
{
int z;
z=x+y;
return z;
}
I saved it as sum.c.
Then I wrote the following code:
#include<stdio.h>
int mul( int z ,int u)
{
int v ;
v=z*u;
return v;
}
I saved it as mul.c.
Then I wrote the following code:
#include<stdio.h>
int div (int a, int b)
{
int f;
f=a/b;
return f;
}
I saved it as div.c.
Now my problem is that I want to use all the files as a single project.
I want exp.c to use the functions defined in mul.c, div.c and sum.c.
I want to know how to do this:
How do I make a library from mul.c, div.c and sum.c?
How do I link this library with exp.c?
Can anybody explain the detailed process of making a project?
I'm using Ubuntu as my operating system. Please help me.
The easiest way is to not make a library, but just compile them all together into a single executable:
$ gcc -o myprogram exp.c sum.c mul.c div.c
This has the drawback that you will re-compile all the code all the time, so as the files grow large, the penalty (build time) goes up since even changing just div.c (for example) will force you to re-compile sum.c and mul.c too.
The next step is to compile them separately, and leave the object files around. For this, we can use a Makefile like so:
myprogram: exp.o sum.o mul.o div.o
	gcc -o myprogram exp.o sum.o mul.o div.o
exp.o: exp.c
sum.o: sum.c
mul.o: mul.c
div.o: div.c
This will leave the object files around, and when you type make, the make tool will compare the timestamps of the object files to those of the C files and only re-compile what changed. The .o files are built by make's built-in rules. Note that for the above to work, the link command must be indented with a physical TAB, not spaces.
There are a few steps you need to do for this:
1. Declare the functions in your main file. When you compile your main file (exp.c), the compiler will report an error because it does not know what kind of functions sum, mul etc. are. So you have to declare them in this file, e.g. int sum(int x, int y);. A more general (and clearer) approach is to put the declarations of the functions that will be accessed from other files into a header file and then include that header.
2. Compile each file. This can be done with a simple gcc -c mul.c etc. This creates mul.o, an object file containing machine code.
3. Link them. Once every file is compiled, you need to put them together into one executable. This is done with gcc -o outputname mul.o sum.o ...
Note that steps 2 and 3 can also be combined; I just wanted to explain the steps clearly. This is usually done via a Makefile to speed things up a bit.
Firstly, you will need to declare each of your functions in a corresponding header file (you don't have to use header files, but it's the most common way of doing this). For instance, div.h might look like:
#ifndef DIV_H_
#define DIV_H_
int div(int a, int b);
#endif
You will then need to #include the header files in the source files where the corresponding functions are used.
Then, to compile and link:
gcc -o my_prog exp.c sum.c mul.c div.c
As others have suggested, you may want to read up on Make, as it helps simplify the build process once your project gets more complicated.
You need to declare the functions in the file where they are used. The common way to do this is to put the declarations in a header file, let's say funcs.h:
#ifndef FUNCS_H
#define FUNCS_H
int sum( int, int );
int mul( int, int );
int div( int, int );
#endif
Now #include this in your main source file. Then to build the executable:
gcc exp.c sum.c div.c mul.c
To create a library, you need to compile the files separately:
gcc -c sum.c div.c mul.c
and then run ar to build the library:
ar rvs mylib.a sum.o div.o mul.o
And then use it from gcc:
gcc exp.c mylib.a
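As a compact illustration of how the declarations and definitions fit together, the sketch below puts the contents of funcs.h and the three source files into a single listing; in the real project each part would live in its own file. The division function is renamed to divide here (an assumption made for this sketch only) because <stdlib.h> already declares a div function with a different signature, which can clash as soon as that header is included.

```c
/* --- would go in funcs.h --- */
#ifndef FUNCS_H
#define FUNCS_H
int sum(int, int);
int mul(int, int);
int divide(int, int);   /* hypothetical name, avoids clashing with div() from <stdlib.h> */
#endif

/* --- would go in sum.c --- */
int sum(int x, int y)
{
    return x + y;
}

/* --- would go in mul.c --- */
int mul(int x, int y)
{
    return x * y;
}

/* --- would go in the file defining the division --- */
int divide(int a, int b)
{
    return a / b;        /* integer division: the result is truncated */
}
```

With the header in place, exp.c only needs #include "funcs.h" before calling sum(a, b), mul(a, b) and divide(a, b); the definitions are supplied at link time, either from the object files or from mylib.a.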
A good practice for organizing the code is to put all the function prototypes in a .h file and the implementations in a matching .c file, using include guards to avoid multiple inclusion.
Example module.h file:
#ifndef MODULE_NAME
#define MODULE_NAME
void module_func();
#endif
Example module.c :
#include "module.h"
void module_func(){
//implementation
}
Read up on make; it will answer your questions about building, compilation, etc.
You should have a .h file that contains your function prototypes. Older compilers accepted calls without prototypes because your functions return int, but implicit function declarations were removed from the language in C99, so get in the habit now; it won't come easily later.

Embedding binary blobs using gcc mingw

I am trying to embed binary blobs into an exe file. I am using mingw gcc.
I make the object file like this:
ld -r -b binary -o binary.o input.txt
I then look at the objdump output to get the symbols:
objdump -x binary.o
And it gives symbols named:
_binary_input_txt_start
_binary_input_txt_end
_binary_input_txt_size
I then try and access them in my C program:
#include <stdlib.h>
#include <stdio.h>
extern char _binary_input_txt_start[];
int main (int argc, char *argv[])
{
char *p;
p = _binary_input_txt_start;
return 0;
}
Then I compile like this:
gcc -o test.exe test.c binary.o
But I always get:
undefined reference to _binary_input_txt_start
Does anyone know what I am doing wrong?
In your C program remove the leading underscore:
#include <stdlib.h>
#include <stdio.h>
extern char binary_input_txt_start[];
int main (int argc, char *argv[])
{
char *p;
p = binary_input_txt_start;
return 0;
}
C compilers often (always?) seem to prepend an underscore to extern names. I'm not entirely sure why that is - I assume that there's some truth to this wikipedia article's claim that
It was common practice for C compilers to prepend a leading underscore to all external scope program identifiers to avert clashes with contributions from runtime language support
But it strikes me that if underscores were prepended to all externs, then you're not really partitioning the namespace very much. Anyway, that's a question for another day, and the fact is that the underscores do get added.
From ld man page:
--leading-underscore
--no-leading-underscore
For most targets default symbol-prefix is an underscore and is defined in target's description. By this option it is possible to disable/enable the default underscore symbol-prefix.
so
ld -r -b binary -o binary.o input.txt --leading-underscore
should be the solution.
I tested it on Linux (Ubuntu 10.10).
Resource file:
input.txt
gcc (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5 [generates ELF executable, for Linux]
Generates symbol _binary__input_txt_start.
Accepts symbol _binary__input_txt_start (with underscore).
i586-mingw32msvc-gcc (GCC) 4.2.1-sjlj (mingw32-2) [generates PE executable, for Windows]
Generates symbol _binary__input_txt_start.
Accepts symbol binary__input_txt_start (without underscore).
Apparently this feature is not present in OSX's ld, so you have to do it totally differently with a custom gcc flag that they added, and you can't reference the data directly, but must do some runtime initialization to get the address.
So it might be more portable to make yourself an assembler source file which includes the binary at build time, a la this answer.
