Address space for shared libraries loaded multiple times in the same process

First off, I've already found a few references which might answer my question. While I plan on reading them soon (i.e. after work), I'm still asking here in case the answer is trivial and does not require too much supplementary knowledge.
Here is the situation: I am writing a shared library (let's call it libA.so) which needs to maintain a coherent internal state (as in, static variables declared in the .c file) within the same process.
This library will be used by program P (i.e. P is compiled with -lA). If I understand everything so far, the address space for P will look something like this:
______________
| Program P |
| < |
| variables, |
| functions |
| from P |
| > |
| |
| < |
| libA: |
| variables, |
| functions |
| loaded (ie |
| *copied*) |
| from shared |
| object |
| > |
| < |
| stuff from |
| other |
| libraries |
| > |
|______________|
Now P will sometimes call dlopen("libQ.so", ...). libQ.so also uses libA.so (i.e. was compiled with -lA). Since everything happens within the same process, I need libA to somehow hold the same state whether the calls come from P or Q.
What I do not know is how this will translate in memory. Will it look like this:
______________
| Program P |
| < |
| P stuff |
| > |
| |
| < |
| libA stuff, |
| loaded by P |
| > | => A's code and variables are duplicated
| |
| < |
| libQ stuff |
| < |
| libA stuff,|
| loaded by Q|
| > |
| > |
|______________|
... or like this?
______________
| Program P |
| < |
| P stuff |
| *libA |
| *libQ |
| > |
| |
| < |
| libA stuff, |
| loaded by P |
| > | => A's code is loaded once, Q holds some sort of pointer to it
| |
| < |
| libQ stuff |
| *libA |
| > |
|______________|
In the second case, keeping a consistent state for a single process is trivial; in the first case, it will require some more effort (e.g. some shared memory segment, using the process id as the second argument to ftok()).
Of course, since I have limited knowledge of how linking and loading work, the diagrams above may be completely wrong. For all I know, the shared libraries could sit at a fixed place in memory, with every process accessing the same data and code. The behaviour could also depend on how A and/or P and/or Q were compiled. And this behaviour is probably not platform independent.

The code segment of a shared library exists in memory in a single instance per system. Yet, it can be mapped to different virtual addresses for different processes, so different processes see the same function at different addresses (that's why the code that goes to a shared library must be compiled as PIC).
The data segment of a shared library is created in one copy per process, and initialized to whatever initial values were specified in the library.
This means that the callers of a library do not need to know if it is shared or not: all callers in one process see the same copy of the functions and the same copy of external variables defined in the library.
Different processes execute the same code, but have their individual copies of data, one copy per process.

Related

Is there any gcc compiler warning which could have caught this memory bug?

I haven't programmed C for quite some time and my pointer-fu had degraded. I made a very elementary mistake and it took me well over an hour this morning to find what I'd done. The bug is minimally reproduced here: https://godbolt.org/z/3MdzarP67 (I am aware the program is absurd memory-management wise, just showing what happens).
The first call to realloc() breaks because, of course, the pointer it's given points to stack memory; valgrind made this quite obvious.
I have a rule with myself: any time I track down a bug, if there is a warning that could have caught it, I enable it on my projects. Often this is not the case, since many bugs come from logic errors the compiler can't be expected to check.
However, here I am a bit surprised. We malloc() and then immediately reassign that pointer, which leaves the allocated memory inaccessible. It's obvious the returned pointer does not live outside the scope of that if block and is never free()'d. Maybe it's too much to expect the compiler to analyze the calls and realize we're attempting to realloc() stack memory, but I am surprised that I can't find anything to yell at me about leaking the pointer returned by malloc(). Even clang's static analyzer scan-build doesn't pick up on it; I've tried various relevant options.
The best I could find was -fsanitize=address which at least prints out some cluing information during the crash instead of:
mremap_chunk(): invalid pointer
on Godbolt, or
realloc(): invalid old size
Aborted (core dumped)
on my machine, both of which are somewhat cryptic (although yes they do show plainly that there is some memory issue occurring). Still, this compiles without issues.
Since Godbolt links don't live forever here is the critical section of the code:
void add_foo_to_bar(struct Bar** b, Foo* f) {
    if ((*b)->foos == NULL) {
        (*b)->foos = (Foo*)malloc(sizeof(Foo));
        // uncomment for correction
        //(*b)->foos[(*b)->n_foos] = *f;
        // obvious bug here: we leak memory by losing the pointer returned
        // by malloc, and store a pointer to a stack address (&f1) instead
        // comment out the line below for correction
        (*b)->foos = f; // completely wrong
        (*b)->n_foos++;
    } else {
        (*b)->foos = (Foo*)realloc((*b)->foos, ((*b)->n_foos + 1) * sizeof(Foo));
        (*b)->foos[(*b)->n_foos] = *f;
        (*b)->n_foos++;
    }
}
The error occurs because f is a pointer to stack memory (intentionally), and we obviously can't realloc() a pointer that was never obtained from malloc().
Try -fanalyzer if your compiler is recent enough. When running it I get:
../main.c:30:28: warning: ‘realloc’ of ‘&f1’ which points to memory not on the heap [CWE-590] [-Wanalyzer-free-of-non-heap]
30 | (*b)->foos = (Foo*)realloc((*b)->foos, ((*b)->n_foos + 1) * sizeof(Foo));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
‘main’: events 1-2
|
| 37 | int main() {
| | ^~~~
| | |
| | (1) entry to ‘main’
|......
| 45 | add_foo_to_bar(&b, &f1);
| | ~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (2) calling ‘add_foo_to_bar’ from ‘main’
|
+--> ‘add_foo_to_bar’: events 3-5
|
| 19 | void add_foo_to_bar(struct Bar** b, Foo* f) {
| | ^~~~~~~~~~~~~~
| | |
| | (3) entry to ‘add_foo_to_bar’
| 20 | if ((*b)->foos == NULL) {
| | ~
| | |
| | (4) following ‘true’ branch...
| 21 | (*b)->foos = (Foo*)malloc(sizeof(Foo));
| | ~~~~
| | |
| | (5) ...to here
|
<------+
|
‘main’: events 6-7
|
| 45 | add_foo_to_bar(&b, &f1);
| | ^~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (6) returning to ‘main’ from ‘add_foo_to_bar’
| 46 | add_foo_to_bar(&b, &f2);
| | ~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (7) calling ‘add_foo_to_bar’ from ‘main’
|
+--> ‘add_foo_to_bar’: events 8-11
|
| 19 | void add_foo_to_bar(struct Bar** b, Foo* f) {
| | ^~~~~~~~~~~~~~
| | |
| | (8) entry to ‘add_foo_to_bar’
| 20 | if ((*b)->foos == NULL) {
| | ~
| | |
| | (9) following ‘false’ branch...
|......
| 30 | (*b)->foos = (Foo*)realloc((*b)->foos, ((*b)->n_foos + 1) * sizeof(Foo));
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | |
| | | (10) ...to here
| | (11) call to ‘realloc’ here
|
No, but runtime testing can save you.
If you can spare the execution overhead, I have seen many applications add an extra layer to memory allocation to track the allocations made and find leaks/errors.
Usually they replace malloc() and free() with macros that record __FILE__ and __LINE__.
One example can be seen here (check the Heap.c and Heap.h files)
https://github.com/eclipse/paho.mqtt.c/tree/master/src
Googling "memory heap debugger" will probably turn up other examples. Or you could roll your own.

memory map address to function call

TLDR:
Is it possible to map an address space to a function call in a kernel module?
Something similar to mmap, but mmap is for user space and only calls into the handler when you first touch a new page, not on every access.
+---+---+---+---+--------------------+
| | | | | |
+-------+---+---+--------------------+
|
+-------------------------------------------------+
|
+---------------------v----------------+
| void my_driver_function(int offset); |
+--------------------------------------+
EDIT:
here the Long Story
Old world
In our old world we had N devices that were controlled by N independent device drivers. The registers of each device were memory-mapped at some location, and every driver simply ioremap()ed its registers and controlled the hardware directly.
+----------------+ +--------------+ +------------+ +--------------+
| Driver A | | Driver B | | Driver C | | Driver E |
+----------------+ +--------------+ +------------+ +--------------+
|ioremap |ioremap |ioremap |ioremap
| | | |
+----+-----------+ +--------------+ +------------+ +--------------+
| Device A | | Device B | | Device C | | Device E |
| | | | | | | |
+----------------+ +--------------+ +------------+ +--------------+
New World
In our new world we merged the device hardware, but the drivers are still separate. Due to hardware limitations, it is now necessary to mutex all access to the one new device. There are also some new constraints that were not there in the old world (alignment, byte order, timing, ...). But because the drivers are independent, they do not know about code or accesses happening in another driver, which leads to violations of these constraints.
+----------------+ +--------------+ +------------+ +--------------+
| Driver A | | Driver B | | Driver C | | Driver E |
+----------------+ +--------------+ +------------+ +--------------+
|ioremap |ioremap |ioremap |ioremap
| | | |
+----+----------------------------------------+-----------------+---------+
| Device A/B/C/D/E |
| |
+-------------------------------------------------------------------------+
the idea
Because we do not want to rework all the drivers (go through all the code, hunt for every pointer that may be pointing at a hardware register, and guard all those accesses with mutexes), my idea was to add a virtual device memory: a memory area where each access is routed to a function. This function would then perform the locking, tracking and so on, and access the hardware.
+----------------+ +--------------+ +------------+ +--------------+
| Driver A | | Driver B | | Driver C | | Driver E |
+----------------+ +--------------+ +------------+ +--------------+
|ioremap |ioremap |ioremap |ioremap
| | | |
+----+----------------------------------------+-----------------+---------+
| Virtual Device A/B/C/D/E |
| |
+----------------------+--------------------------------------------------+
|
|my_mapper_function(...)
| /* do (un)locking, check constraints, ... */
|
+----------------------+--------------------------------------------------+
| Device A/B/C/D/E |
| |
+-------------------------------------------------------------------------+
the question
Is there a mechanism in the Linux kernel that allows every access to a specific memory region to be routed through a function? Similar to what mmap does, but actually quite different, because you cannot hook an arbitrary function into mmap, and mmap does not route every access through the function anyway, only the accesses that cross a page border.
I am thinking of moving drivers A/B/C... to user space using UIO, with all of these drivers using the same kernel-space code to control the related devices.
User Space | Driver A | Driver B | Driver C |
|================================|
Kernel Space | KObject ABC |
|================================|
Hardware | Device A|B|C |

What is the significance of sbss2/sdata2?

I am working with a PPC microcontroller (specifically the e200z4) using a GCC-based compiler. The PPC EABI supports small-data allocation: a variable goes there if its size is below a defined threshold (8 bytes in my case). I understand that:
.sdata is for small initialized data and is modifiable (it will be located in the RAM section).
.sbss is like .sdata and also located in RAM, but it is for uninitialized or zero-initialized variables.
These two sections can be accessed with a single instruction, via a 16-bit signed offset from a base register.
What I don't know, from reading the PPC EABI specification, is the significance of .sbss2 and .sdata2. Will they be small variables in RAM or flash, and how do they differ from .sdata and .sbss?
From the EABI
External variables that are scalars of 8 or fewer bytes, whose values might be changed by
the program, and whose values will not be changed outside of the program, shall be accessed as .sdata or .sbss entries...
When the object file is not to be part of a shared object file, external variables that are scalars
of 8 or fewer bytes, whose values cannot be changed by the program, and whose values will
not be changed outside of the program, shall be accessed as .sdata2 or .sbss2 entries...
The special section .sdata2 is intended to hold initialized read-only small data that contribute to
the program memory image. The section can, however, be used to hold writable data. The special
section .sbss2 is intended to hold writable small data that contribute to the program memory image and whose initial values are 0.
My previous e200 projects were set up like this:
ROM
+----------+
| |
| .text | code
| |
- +----------+
^ | |
| | .sdata2 | constant small initialized data (max 32k)
| | |
max 64k +----------+ <~~ _SDA2_BASE_ (r2)
| | |
| | .sbss2 | constant small not (or zero) initialized data (max 32k)
v | | ALWAYS EMPTY!
- +----------+
RAM
+----------+
| |
| .data | normal initialized data
| |
- +----------+
^ | |
| | .sdata | normal small initialized data (max 32k)
| | |
max 64k +----------+ <~~ _SDA_BASE_ (r13)
| | |
| | .sbss | normal small not (or zero) initialized data (max 32k)
v | |
- +----------+
| |
| .bss | normal not (or zero) initialized data
| |
+----------+

Issues with ER model design

I am trying to design a model for our future database of our toys and certain measurements that have to be done post-production. I am having trouble grasping how to model this. I have tried multiple ways, but none of them seem optimal, and in the end I've always lost the connectivity between entities.
What I need to achieve is some kind of meaningful relationship between the following:
A toy (with some trivial properties).
A series of toys (multiple toys can be related to one series and a toy can only belong to one series).
Measurement steps. There are currently 6 of these steps. Each step has its own input parameters and these vary in type as well as in number (eg. only 3 parameters for measurement step 1 and 10 parameters for measurement step 2).
With each series, a sequence of these measurement steps is defined. Duplicates of tests are allowed (eg. measurement step 1 > measurement step 4 > measurement step 1 is a valid sequence). The sequence along with the parameters must be stored somewhere for future reference.
Each toy goes through the sequence of measurements that is defined by its series. All of the results must be stored somewhere (for each individual toy).
If I split the measurement steps into their own tables I can't reference them conditionally (as foreign keys) to some other table.
If I try to serialize part of the data I lose the ability to make connections between individual measurement steps, measurement results (at least with queries) etc.
I know people here generally hate/don't answer these kinds of "discussion-like" questions, but I'd ask you to at least point out what good practice is in a system where I need to store this locally on a machine but need a database to hold the data: should I move towards serialized data and only model relationships where they are easy to express, or keep trying to normalize as much as possible?
If the measurement steps share most of their attributes (or are of the same type, like what you called PARAMETERS), and I understood your definitions correctly, I would make something like this.
It could be a starting point.
+----------------------------+ +------------------------------+
| TOYS | | TOY_SERIES |
+-----+----------------------+ +---------+--------------------+
| PK | ID_TOY | +----------+ PK, FK1 | ID_S +--------+
| | | | +------------------------------+ |
| FK1 | ID_S +---------+ | | ... | |
+----------------------------+ | | | |
| | ... | | | | |
| | | | | | |
+-----+----------------------+ +---------+--------------------+ |
|
|
|
|
+------------------------------+ |
| BR_SER_MEAS | |
+---------+--------------------+ |
| PK, FK1 | ID_S +--------+
| | |
| PK, FK2 | ID_M +--------+
| | | |
| PK | ID_SEQ | |
| | | |
+---------+--------------------+ |
|
|
+------------------------------+ |
| MEASURE_STEPS | |
+------------------------------+ |
| PK ID_M +--------+
+------------------------------+
| PARAM_01 |
| ... |
| PARAM_10 |
| |
| |
+------------------------------+

How can we validate tabular data in robot framework?

In Cucumber, we can directly validate database table content in tabular format by writing the values in the format below:
| Type | Code | Amount |
| A | HIGH | 27.72 |
| B | LOW | 9.28 |
| C | LOW | 4.43 |
Do we have something similar in Robot Framework? I need to run a query on the DB, and the output looks like the table given above.
No, there is nothing built in to do exactly what you describe. However, it's fairly straightforward to write a keyword that takes a table of data and compares it to another table of data.
For example, you could write a keyword that takes the result of the query and then rows of information (though, the rows must all have exactly the same number of columns):
| | ${ResultOfQuery}= | <do the database query>
| | Database should contain | ${ResultOfQuery}
| | ... | #Type | Code | Amount
| | ... | A | HIGH | 27.72
| | ... | B | LOW | 9.28
| | ... | C | LOW | 4.43
Then it's just a matter of iterating over all of the arguments three at a time, and checking if the data has that value. It would look something like this:
*** Keywords ***
| Database should contain
| | [Arguments] | ${actual} | @{expected}
| | :FOR | ${type} | ${code} | ${amount} | IN | @{expected}
| | | <verify that the values are in ${actual}>
Even easier might be to write a python-based keyword, which makes it a bit easier to iterate over datasets.
