Copying a 2D array of char * to user space from kernel space? - c

In kernel space, I have the following:
char * myData[MAX_BUF_SIZE][2];
I need to define a kernel method that copies this data into user space, so how would I go about defining this method? I've got the following, but I'm not quite sure what I'm doing.
asmlinkage int sys_get_my_data(char __user ***data, int rowLen, int bufferSize) {
if (rowLen < 1 || bufferSize < 1 || rowLen > MAX_BUF_SIZE || bufferSize
> MAX_BUF_SIZE) {
return -1;
}
if( copy_to_user( data, myData, rowLen * bufferSize * dataCounter * 2) )
{
printk( KERN_EMERG "Copy to user failure for get all minifiles\n" );
return -1;
}
return 0;
}
Help?

Per your comment, these char * values point to nul-terminated strings.
Now, you can't just go copying that whole fileDataMap memory block to userspace - that'll just give userspace a bunch of char * values that point into kernel space, so it won't actually be able to use them. You need to copy the strings themselves to userspace, not just the pointers (this is a "deep copy").
Now, there are a few ways you can go about this. The easiest is to simply pack all the strings, one after another, into a big char array in userspace. It's then up to userspace to scan through the block, reconstructing the pointers:
asmlinkage int sys_get_my_data(char __user *data, size_t bufferSize)
{
size_t i;
for (i = 0; i < MAX_BUF_SIZE; i++) {
size_t s0_len = strlen(fileDataMap[i][0]) + 1;
size_t s1_len = strlen(fileDataMap[i][1]) + 1;
if (s0_len + s1_len > bufferSize) {
return -ENOSPC;
}
if (copy_to_user(data, fileDataMap[i][0], s0_len)) {
return -EFAULT; /* the user buffer could not be written */
}
data += s0_len;
bufferSize -= s0_len;
if (copy_to_user(data, fileDataMap[i][1], s1_len)) {
return -EFAULT;
}
data += s1_len;
bufferSize -= s1_len;
}
return 0;
}
This will only work if there are always MAX_BUF_SIZE string-pairs, because userspace will need to know how many strings it is expecting to receive in order to be able to safely scan through them. If that's not the case, you'll have to return that information somehow - perhaps the return value of the syscall could be the number of string-pairs?
If you want the kernel to reconstruct the pointer table in userspace, you'll have to copy the strings as above, and then fill out the pointer table - userspace will have to pass two buffers, one for the strings themselves and one for the pointers.
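On the user-space side, rebuilding the pointers from the packed block could look roughly like the sketch below. The name rebuild_pairs is hypothetical, and the code assumes the caller already knows how many string pairs to expect and how many bytes of the buffer are valid. It does not invoke the syscall itself.

```c
#include <stddef.h>
#include <string.h>

/* Walk a block of back-to-back NUL-terminated strings and record where
 * each one starts. Returns 0 on success, or -1 if the block ends before
 * all expected strings were found. */
static int rebuild_pairs(char *block, size_t block_len,
                         char *pairs[][2], size_t num_pairs)
{
    char *p = block;
    char *end = block + block_len;

    for (size_t i = 0; i < num_pairs; i++) {
        for (int j = 0; j < 2; j++) {
            /* Search for the terminator inside the remaining bytes only. */
            char *nul = memchr(p, '\0', (size_t)(end - p));
            if (nul == NULL)
                return -1;
            pairs[i][j] = p;
            p = nul + 1;
        }
    }
    return 0;
}
```

Bounding the search with memchr rather than calling strlen means a truncated or malformed block is reported instead of being read past its end.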

Related

Splitting a string and store to the heap algorithm question

For the code below, I was wondering: if I want to split the string but still retain the original string, is this the best method?
Should the caller provide the char ** array, or should the function split make an additional malloc call and manage the memory for the char ** itself?
Also, is this the most efficient approach, or could the code be optimized further?
I have not debugged the code yet, and I am still undecided whether the caller or the function should manage the char ** pointer.
#include <stdio.h>
#include <stdlib.h>
size_t split(const char * restrict string, const char splitChar, char ** restrict parts, const size_t maxParts){
size_t size = 100;
size_t partSize = 0;
size_t len = 0;
size_t newPart = 1;
char * tempMem;
/*
* We just reserve a large block of memory.
* Reaching the split character marks the boundary of a new part.
*/
char * mem = (char*) malloc( sizeof(char) * size );
if ( mem == NULL ) return 0;
for ( size_t i = 0; string[i] != 0; i++ ) {
// If it is a split char we at a new part
if ( string[i] == splitChar) {
// If the last character was not the split character
// Then mem[len] = 0 and increase the len by 1.
if (newPart == 0) mem[len++] = 0;
newPart = 1;
continue;
} else {
// If this is a new part
// and not a split character
// we make a new pointer
if ( newPart == 1 ){
// if reach maxpart we break.
// It is okay here, to not worry about memory
if ( partSize == maxParts ) break;
parts[partSize++] = &mem[len];
newPart = 0;
}
mem[len++] = string[i];
if ( len == size ){
// if ran out of memory realloc.
tempMem = (char*)realloc(mem, sizeof(char) * (size << 1) );
// if fail quit loop
if ( tempMem == NULL ) {
// If we can't get more memory the last part could be corrupted
// We have to return.
// Otherwise the code below can seg.
// There maybe a better way than this.
return partSize--;
}
size = size << 1;
mem = tempMem;
}
}
}
// If we got here and still in a newPart that is fine no need
// an additional character.
if ( newPart != 1 ) mem[len++] = 0;
// realloc to give back the unneed memory
if ( len < size ) {
tempMem = (char*) realloc(mem, sizeof(char) * len );
// If the resizing did not fail but yielded a different
// memory block;
if ( tempMem != NULL && tempMem != mem ){
for ( size_t i = 0; i < partSize; i++ ){
parts[i] = tempMem + (parts[i] - mem);
}
}
}
return partSize;
}
int main(){
char * tStr = "This is a super long string just to test the str str adfasfas something split";
char * parts[10];
size_t len = split(tStr, ' ', parts, 10);
for (size_t i = 0; i < len; i++ ){
printf("%zu: %s\n", i, parts[i]);
}
}
What is "best" is very subjective, as well as use case dependent.
I personally would keep the parameters as input only, define a struct to contain the split result, and probably return it by value. The struct would probably contain pointers to allocated memory, so I would also create a helper function to free that memory. The parts might be stored as a list of strings (copying the string data) or as index and length pairs into the original string (no string copies needed, but the original string needs to remain valid).
But there are dozens of very different ways to do this in C, and all are a bit klunky. You need to choose your flavor of klunkiness based on your use case.
About being "more optimized": unless you are coding for a very small embedded device or something similar, always prefer code that is robust, clear, easy to use and hard to misuse over code that is micro-optimized. The useful kind of optimization turns, for example, O(n^2) into O(n log n). Turning O(3n) into O(2n) in a single function is almost always completely irrelevant (you are not going to do string splitting in a game engine's inner rendering loop...).
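As one possible shape for the "struct returned by value" idea, here is a minimal sketch using index and length pairs. All names (part_span, split_result, split_spans, split_free) are made up for illustration, and empty parts between consecutive separators are skipped, as in the question's code.

```c
#include <stdlib.h>

/* One part of the split: where it starts in the original string and its length. */
typedef struct {
    size_t start;
    size_t len;
} part_span;

/* Result returned by value; `spans` is heap-allocated and owned by the caller. */
typedef struct {
    part_span *spans;
    size_t count;
} split_result;

/* Split `s` on `sep`, skipping empty parts. On allocation failure (and for
 * input without any parts) the result has spans == NULL and count == 0. */
static split_result split_spans(const char *s, char sep)
{
    split_result r = { NULL, 0 };
    size_t cap = 0;
    size_t i = 0;

    while (s[i] != '\0') {
        while (s[i] == sep)                 /* skip separators */
            i++;
        if (s[i] == '\0')
            break;
        size_t start = i;
        while (s[i] != '\0' && s[i] != sep) /* find the end of this part */
            i++;
        if (r.count == cap) {               /* grow the span array geometrically */
            size_t new_cap = cap ? cap * 2 : 8;
            part_span *tmp = realloc(r.spans, new_cap * sizeof *tmp);
            if (tmp == NULL) {
                free(r.spans);
                r.spans = NULL;
                r.count = 0;
                return r;
            }
            r.spans = tmp;
            cap = new_cap;
        }
        r.spans[r.count].start = start;
        r.spans[r.count].len = i - start;
        r.count++;
    }
    return r;
}

/* Helper that releases whatever split_spans allocated. */
static void split_free(split_result *r)
{
    free(r->spans);
    r->spans = NULL;
    r->count = 0;
}
```

A part can be printed without copying via printf("%.*s\n", (int)r.spans[i].len, s + r.spans[i].start), as long as the original string s is still valid.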

Why are multiple calls to calloc crashing my app?

I have a function that initializes a struct that contains nested structs and arrays.
While initializing the struct I have multiple calls to calloc.
Refer to the code below:
typedef struct
{
int length;
uint8_t *buffer;
} buffer_a;
typedef struct
{
int length;
uint8_t *buffer;
int *second_buffer_size;
uint8_t **second_buffer;
} buffer_b;
typedef struct
{
int max_length;
buffer_a *buffer_in;
buffer_b *buffer_out;
} state_struct;
state_struct *init(int size, int elements) {
size_t struct_size = sizeof(state_struct);
state_struct *s = (state_struct*) calloc(struct_size, struct_size);
log("Building state with length %d", size);
s->max_length = size;
size_t buffer_in_size = s->max_length * sizeof(buffer_a);
s->buffer_in = (buffer_a*) calloc(buffer_in_size, buffer_in_size);
size_t buffer_out_size = s->max_length * sizeof(buffer_b);
s->buffer_out = (buffer_b*) calloc(buffer_out_size, buffer_out_size);
log("Allocated memory for both buffers structs");
for (int i = 0; i < s->max_length; ++i) {
size_t buf_size = elements * sizeof(uint8_t);
s->buffer_in[i].buffer = (uint8_t*) calloc(buf_size, buf_size);
s->buffer_in[i].length = -1;
log(s, "Allocated memory for in buffer");
s->buffer_out[i].buffer = (uint8_t*) calloc(buf_size, buf_size);
s->buffer_out[i].length = -1;
log(s, "Allocated memory for out buffer");
size_t inner_size = elements * elements * sizeof(uint8_t);
size_t inner_second_buffer_size = elements * sizeof(int);
s->buffer_out[i].second_buffer = (uint8_t**) calloc(inner_size, inner_size);
s->buffer_out[i].second_buffer_size = (int*) calloc(inner_second_buffer_size, inner_second_buffer_size);
log(s, "Allocated memory for inner buffer");
}
return s;
}
Logs just before the for loop are printed but the program crashes and the first log statement inside the loop does not get printed out.
Why is this happening?
So this may not be an answer to your question, but here goes:
When I ran this code (on Ubuntu, gcc 7.4) with all the log calls replaced by printf, it finished successfully. I suspect the problem is in the way you use the log function. You say it works up until the first log call inside the loop. You didn't say what log does, or whether it is a function or just a macro wrapper for printf, but you call it differently inside the loop: the first argument is a state_struct * rather than a format string.
Also, the way you call calloc is semantically incorrect. The first parameter should be the number of elements to allocate and the second the size of each element (the count is presumably 1 for the single struct). Passing the byte size for both, as the code does, requests size * size bytes.
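To make the calloc point concrete, here is a reduced sketch of the question's init with count and size passed separately. The function name init_fixed is made up, only the buffer_in part of the state is shown, and the error handling is my own addition rather than something from the original code.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int length;
    uint8_t *buffer;
} buffer_a;

typedef struct {
    int max_length;
    buffer_a *buffer_in;
} state_struct;

/* calloc(count, size): `count` elements of `size` bytes each, zero-filled. */
static state_struct *init_fixed(int size, int elements)
{
    if (size <= 0 || elements <= 0)
        return NULL;

    state_struct *s = calloc(1, sizeof *s);                    /* one struct */
    if (s == NULL)
        return NULL;
    s->max_length = size;

    s->buffer_in = calloc((size_t)size, sizeof *s->buffer_in); /* `size` structs */
    if (s->buffer_in == NULL) {
        free(s);
        return NULL;
    }
    for (int i = 0; i < size; ++i) {
        s->buffer_in[i].buffer = calloc((size_t)elements, sizeof(uint8_t));
        if (s->buffer_in[i].buffer == NULL) {
            /* Undo the allocations made so far. */
            while (i-- > 0)
                free(s->buffer_in[i].buffer);
            free(s->buffer_in);
            free(s);
            return NULL;
        }
        s->buffer_in[i].length = -1;
    }
    return s;
}
```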

Is this appender, with realloc function safe?

Just finished putting this function together from some man documentation. It takes a char * and appends a const char * to it; if the char * buffer is too small, it reallocates it to something a little bigger and then appends. It's been a long time since I used C, so just checking in.
// append with realloc
int append(char *orig_str, const char *append_str) {
int result = 0; // fail by default
// is there enough space to append our data?
int req_space = strlen(orig_str) + strlen(append_str);
if (req_space > strlen(orig_str)) {
// just reallocate enough + 4096
int new_size = req_space;
char *new_str = realloc(orig_str, req_space * sizeof(char));
// resize success..
if(new_str != NULL) {
orig_str = new_str;
result = 1; // success
} else {
// the resize failed..
fprintf(stderr, "Couldn't reallocate memory\n");
}
} else {
result = 1;
}
// finally, append the data
if (result) {
strncat(orig_str, append_str, strlen(append_str));
}
// return 0 if Ok
return result;
}
This is not usable because you never tell the caller where the memory is that you got back from realloc.
You will need to either return a pointer, or pass orig_str by reference.
Also (as pointed out in comments) you need to do realloc(orig_str, req_space + 1); to allow space for the null terminator.
Your code has some inefficient logic; compare with this fixed version:
bool append(char **p_orig_str, const char *append_str)
{
// no action required if appending an empty string
if ( append_str[0] == 0 )
return true;
size_t orig_len = strlen(*p_orig_str);
size_t req_space = orig_len + strlen(append_str) + 1;
char *new_str = realloc(*p_orig_str, req_space);
// resize success..
if(new_str == NULL)
{
fprintf(stderr, "Couldn't reallocate memory\n");
return false;
}
*p_orig_str = new_str;
strcpy(new_str + orig_len, append_str);
return true;
}
This logic doesn't make any sense:
// is there enough space to append our data?
int req_space = strlen(orig_str) + strlen(append_str);
if (req_space > strlen(orig_str)) {
As long as append_str has non-zero length, you're always going to have to re-allocate.
The main problem is that you're trying to track the size of your buffers with strlen. If your string is NUL-terminated (as it should be), your perceived buffer size is always going to be the exact length of the data in it, ignoring any extra.
If you want to work with buffers like this, you need to track the size in a separate size_t, or keep some sort of descriptor like this:
struct buffer {
void *buf;
size_t alloc_size;
size_t used_amt; /* Omit if strings are NUL-terminated */
};
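A minimal sketch of an append built on such a descriptor might look like this. The names strbuf and strbuf_append are hypothetical, and the doubling growth policy and initial size of 16 are just one reasonable choice, not something the question requires.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string buffer that records its capacity separately. */
struct strbuf {
    char *buf;          /* NUL-terminated contents, or NULL when nothing is allocated */
    size_t alloc_size;  /* bytes allocated for buf */
    size_t used_amt;    /* bytes in use, not counting the terminator */
};

/* Append `s`, reallocating only when the tracked capacity is too small. */
static bool strbuf_append(struct strbuf *b, const char *s)
{
    size_t add = strlen(s);
    size_t need = b->used_amt + add + 1;      /* +1 for the terminator */

    if (need > b->alloc_size) {
        size_t new_size = b->alloc_size ? b->alloc_size : 16;
        while (new_size < need)
            new_size *= 2;                    /* grow geometrically */
        char *tmp = realloc(b->buf, new_size);
        if (tmp == NULL)
            return false;                     /* b is left unchanged */
        b->buf = tmp;
        b->alloc_size = new_size;
    }
    memcpy(b->buf + b->used_amt, s, add + 1); /* copies the terminator too */
    b->used_amt += add;
    return true;
}
```

Starting from struct strbuf b = { NULL, 0, 0 }, repeated appends only hit realloc when the capacity is actually exhausted, and the caller frees b.buf when done.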

Most memory-efficient way to read & store list of strings in C

I'd like to know what's the most memory efficient way to read & store a list of strings in C.
Each string may have a different length, so pre-allocating a big 2D array would be wasteful.
I also want to avoid a separate malloc for each string, as there may be many strings.
The strings will be read from a large buffer into this list data-structure I'm asking about.
Is it possible to store all strings separately with a single allocation of exactly the right size?
One idea I have is to store them contiguously in a buffer, then have a char * array pointing to the different parts in the buffer, which will have '\0's in it to delimit. I'm hoping there's a better way though.
struct list {
char *index[32];
char buf[];
};
The data-structure and strings will be strictly read-only.
Here's a mildly efficient format, assuming you know the length of all the strings in advance:
|| total size | string 1 | string 2 | ........ | string N | len(string N) | ... | len(string 2) | len(string 1) ||
You can store the lengths either in fixed-width integers or in variable-width integers, but the point is that you can jump to the end and scan all the lengths relatively efficiently, and from the length sum you can compute the offset of the string. You know when you reached the last string when there is no remaining space.
You can create your single buffer and store them contiguously, expanding the buffer as needed by using realloc(). But then you would need a second array to store string positions and maybe realloc() it as well, so I might simply create a dynamically allocated array and malloc() each string separately.
Find the number and total-length of all strings:
int num = 0;
int len = 0;
char* string = GetNextString(input);
while (string)
{
num += 1;
len += strlen(string);
string = GetNextString(input);
}
Rewind(input);
Then, allocate the following two buffers:
int* indexes = malloc(num*sizeof(int));
char* strings = malloc((num+len)*sizeof(char));
Finally, fill these two buffers:
int index = 0;
for (int i=0; i<num; i++)
{
indexes[i] = index;
string = GetNextString(input);
strcpy(strings+index,string);
index += strlen(string)+1;
}
After that, you can simply use &strings[indexes[i]] (or strings + indexes[i]) to access the ith string.
The most time- and memory-efficient way is a two-pass solution. In the first pass you calculate the total size of all strings, then you allocate one memory block of that size. In the second pass you read all strings using large buffers.
You can create a pointer array for the strings and calculate the difference between adjacent pointers to get the string sizes. This way you save the null byte that would otherwise serve as the end marker.
Here is a complete example:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct StringMap
{
char *data;
char **ptr;
long cPos;
} StringMap;
void initStringMap(StringMap *stringMap, long numberOfStrings, long totalCharacters)
{
stringMap->data = (char*)malloc(sizeof(char)*(totalCharacters+1));
stringMap->ptr = (char**)malloc(sizeof(char*)*(numberOfStrings+2));
memset(stringMap->ptr, 0, sizeof(char*)*(numberOfStrings+1));
stringMap->ptr[0] = stringMap->data;
stringMap->ptr[1] = stringMap->data;
stringMap->cPos = 0;
}
void extendString(StringMap *stringMap, char *str, size_t size)
{
memcpy(stringMap->ptr[stringMap->cPos+1], str, size);
stringMap->ptr[stringMap->cPos+1] += size;
}
void endString(StringMap *stringMap)
{
stringMap->cPos++;
stringMap->ptr[stringMap->cPos+1] = stringMap->ptr[stringMap->cPos];
}
long numberOfStringsInStringMap(StringMap *stringMap)
{
return stringMap->cPos;
}
size_t stringSizeInStringMap(StringMap *stringMap, long index)
{
return stringMap->ptr[index+1] - stringMap->ptr[index];
}
char* stringinStringMap(StringMap *stringMap, long index)
{
return stringMap->ptr[index];
}
void freeStringMap(StringMap *stringMap)
{
free(stringMap->data);
free(stringMap->ptr);
}
int main()
{
// The interesting values
long numberOfStrings = 0;
long totalCharacters = 0;
// Scan the input for required information
FILE *fd = fopen("/path/to/large/textfile.txt", "r");
if (fd == NULL) {
perror("fopen");
return 1;
}
int bufferSize = 4096;
char *readBuffer = (char*)malloc(sizeof(char)*bufferSize);
int currentStringLength = 0;
size_t readBytes; // fread returns a size_t
while ((readBytes = fread(readBuffer, sizeof(char), bufferSize, fd))>0) {
for (int i = 0; i < readBytes; ++i) {
const char c = readBuffer[i];
if (c != '\n') {
++currentStringLength;
} else {
++numberOfStrings;
totalCharacters += currentStringLength;
currentStringLength = 0;
}
}
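}
// Count a final line that is not terminated by '\n'
if (currentStringLength > 0) {
++numberOfStrings;
totalCharacters += currentStringLength;
}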
// Display the found results
printf("Found %ld strings with total of %ld bytes\n", numberOfStrings, totalCharacters);
// Allocate the memory for the resource
StringMap stringMap;
initStringMap(&stringMap, numberOfStrings, totalCharacters);
// read all strings
rewind(fd);
while ((readBytes = fread(readBuffer, sizeof(char), bufferSize, fd))>0) {
char *stringStart = readBuffer;
for (int i = 0; i < readBytes; ++i) {
const char c = readBuffer[i];
if (c == '\n') {
extendString(&stringMap, stringStart, &readBuffer[i]-stringStart);
endString(&stringMap);
stringStart = &readBuffer[i+1];
}
}
if (stringStart < &readBuffer[readBytes]) {
extendString(&stringMap, stringStart, &readBuffer[readBytes]-stringStart);
}
}
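// Close a final line that was not terminated by '\n'
if (stringSizeInStringMap(&stringMap, numberOfStringsInStringMap(&stringMap)) > 0) {
endString(&stringMap);
}
fclose(fd);
free(readBuffer);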
// Ok read the list
numberOfStrings = numberOfStringsInStringMap(&stringMap);
printf("Number of strings in map: %ld\n", numberOfStrings);
for (long i = 0; i < numberOfStrings; ++i) {
size_t stringSize = stringSizeInStringMap(&stringMap, i);
char *buffer = (char*)malloc(stringSize+1);
memcpy(buffer, stringinStringMap(&stringMap, i), stringSize);
buffer[stringSize] = '\0';
printf("string %05ld size=%8zu : %s\n", i, stringSize, buffer);
free(buffer);
}
// free the resource
freeStringMap(&stringMap);
}
This example reads a very large text file, splits it into lines and creates an array with one string per line. It needs only two malloc calls for the result: one for the pointer array and one for the string block.
If it's strictly read-only as you've described, you can store the entire list of strings and their offsets in a single chunk of memory and read the whole thing with a single read.
The first sizeof(long) bytes store the number of strings, n. The next n longs store the offset of each string from the start of the string buffer, which starts at position (n+1)*sizeof(long). You don't have to store the trailing zero for each string, but if you do, you can access each string with &str_buffer[offset[i]]. If you don't store the trailing '\0', you have to copy the string into a temporary buffer and append the terminator yourself.
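The layout described above could be built and read roughly as follows. This is only a sketch: pack_strings, packed_count and packed_get are invented names, the trailing '\0' of each string is kept so strings can be used in place, and no bounds checking is done on the index.

```c
#include <stdlib.h>
#include <string.h>

/* Build one block laid out as: count | offset[0..count-1] | string bytes.
 * Offsets are relative to the start of the string bytes. The caller frees
 * the returned block. */
static void *pack_strings(const char *const *strs, long count, size_t *out_size)
{
    size_t header = ((size_t)count + 1) * sizeof(long);
    size_t total = header;
    for (long i = 0; i < count; i++)
        total += strlen(strs[i]) + 1;       /* keep each '\0' */

    char *block = malloc(total);
    if (block == NULL)
        return NULL;

    long *head = (long *)block;             /* malloc memory is suitably aligned */
    head[0] = count;
    size_t off = 0;
    for (long i = 0; i < count; i++) {
        size_t n = strlen(strs[i]) + 1;
        head[1 + i] = (long)off;
        memcpy(block + header + off, strs[i], n);
        off += n;
    }
    if (out_size != NULL)
        *out_size = total;
    return block;
}

/* Number of strings stored in the block. */
static long packed_count(const void *block)
{
    return ((const long *)block)[0];
}

/* Return the i-th string without copying it. */
static const char *packed_get(const void *block, long i)
{
    const long *head = block;
    const char *str_buffer = (const char *)block + ((size_t)head[0] + 1) * sizeof(long);
    return &str_buffer[head[1 + i]];
}
```

Because the block contains offsets rather than pointers, it can be written to a file and read back with a single read, provided the reader uses the same size and byte order for long.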

Initializing an infinite number of char **

I'm making a raytracing engine in C using the minilibX library.
I want to be able to read in a .conf file the configuration for the scene to display:
For example:
(Az#Az 117)cat universe.conf
#randomcomment
obj:eye:x:y:z
light:sun:100
light:moon:test
The number of objects can range from 1 to arbitrarily many.
For now, I'm reading the file, copying each line one by one into a char **tab, and mallocing according to the number of objects found, like this:
void open_file(int fd, struct s_img *m)
{
int i;
char *s;
int curs_obj;
int curs_light;
i = 0;
curs_light = 0;
curs_obj = 0;
while (s = get_next_line(fd))
{
i = i + 1;
if (s[0] == 'l')
{
m->lights[curs_light] = s;
curs_light = curs_light + 1;
}
else if (s[0] == 'o')
{
m->objs[curs_obj] = s;
curs_obj = curs_obj + 1;
}
else if (s[0] != '#')
{
show_error(i, s);
stop_parsing(m);
}
}
}
Now, I want to store the fields of each tab[i] in a new char **tab, one for each object, using ':' as the separator.
So I need to initialize and malloc an undetermined number of char **tab. How can I do that?
(PS: I hope my code and my English are good enough for you to understand. I'm using only very basic functions, like read, write, open, malloc..., and I'm rebuilding everything else, like printf, get_line, and so on.)
You can't allocate an indeterminate amount of memory; malloc doesn't support it. What you can do is to allocate enough memory for now and revise that later:
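size_t buffer = 10; /* number of char * slots currently allocated */
char **tab = malloc(buffer * sizeof *tab);
//...
if (indexOfObjectToCreate >= buffer) {
/* grow before writing to tab[indexOfObjectToCreate] */
char **tmp = realloc(tab, 2 * buffer * sizeof *tab);
if (tmp != NULL) {
tab = tmp;
buffer *= 2;
}
/* on failure tab is still valid and must not be written past buffer slots */
}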
I'd use an alternative approach (as this is c, not c++) and allocate simply large buffers as we go by:
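char *my_malloc(size_t n) {
static size_t space_left = 0;
static char *base = NULL;
/* BIG_N is the block size; it must be at least as large as any n requested */
if (base==NULL || space_left < n) base=malloc(space_left=BIG_N);
space_left -= n; /* account for the bytes handed out */
base +=n; return base-n;
}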
Disclaimer: I've omitted the garbage collection stuff and testing return values and all safety measures to keep the routine short.
Another way to approach this is to read the file into a large enough malloc'ed array (you can find the file size with ftell), scan the buffer, replace delimiters, line feeds etc. with ASCII zero characters, and remember the starting locations of the keywords.
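The "replace delimiters with zero characters and remember the starting locations" idea can be sketched for a single line like this. The function name split_in_place is made up, and the caller is assumed to supply a fixed-size array for the field pointers. It uses no library functions, which fits the constraint in the question.

```c
#include <stddef.h>

/* Split `line` in place on `sep`: each separator is overwritten with '\0'
 * and the start of every field is stored in `fields`. Returns the number
 * of fields found (at most `max_fields`). Once `max_fields` is reached,
 * the rest of the line stays in the last field unchanged. */
static size_t split_in_place(char *line, char sep, char **fields, size_t max_fields)
{
    size_t n = 0;
    char *p = line;

    if (max_fields == 0)
        return 0;
    fields[n++] = p;                /* the first field starts at the line start */
    while (*p != '\0') {
        if (*p == sep) {
            if (n == max_fields)
                break;
            *p = '\0';              /* terminate the previous field */
            fields[n++] = p + 1;    /* the next field starts after the separator */
        }
        p++;
    }
    return n;
}
```

For a line such as obj:eye:x:y:z this gives five fields that all point into the original line, so no extra allocation per field is needed.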
