Collecting data in a recursive function in C

I am writing a Huffman coding algorithm and I have a problem with collecting data in a recursive function. I have a recursive function that generates codes from the tree, but I would like to store them in an array (this would allow me to process the data later). I wrote this function:
void save_code(HuffNode** array, int pos, HuffNode *node, char * table, int depth)
{
if(node->left == NULL){
printf("%d - %c > ", pos, node->sign);
array[pos]->sign = node->sign;
strcpy(array[pos]->code, table);
puts(table);
// save to global table
}
else {
table[depth] = '0';
save_code(array, pos + 1, node->left, table, depth + 1);
table[depth] = '1';
save_code(array, pos + 1 , node->right, table, depth + 1);
}
}
My biggest problem is with the variable pos. I thought that if I could increment pos (as in a for loop), I would be able to save each code in array at position pos. The whole program is here: https://github.com/mtczerwinski/algorithms/blob/master/huffman/huffman.c
Edit:
I asked myself whether a global variable could solve the problem. After a few moments of coding, the answer is yes.
int pos = 0; // global variable
void save_code(HuffNode** array, HuffNode *node, char * table, int depth) {
if(node->left == NULL){
array[pos]->sign = node->sign;
strcpy(array[pos]->code, table);
pos++;
}
else {
table[depth] = '0';
save_code(array , node->left, table, depth + 1);
table[depth] = '1';
save_code(array, node->right, table, depth + 1);
}
}
I would like to ask how to collect data in a recursive function between calls. What other ways are there to solve a problem like this one?

Pass it by pointer:
void save_code(..., int *pos)
{
// ...
// use and modify (*pos) as you desire
// ...
save_code(..., pos);
// ...
}
This is a good approach, except that it doesn't look too pretty - you have an additional parameter for each recursive call and you have to use *pos instead of pos.
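A minimal sketch of the pointer approach applied to a Huffman-style tree might look like this. The Node and CodeEntry types, MAX_CODE and collect_codes are made-up names for illustration, not the types from the linked program. The sketch assumes out[] has room for every leaf and that no code is longer than MAX_CODE - 1 characters.

```c
#include <stddef.h>
#include <string.h>

#define MAX_CODE 32

/* Hypothetical types for illustration; the original HuffNode may differ. */
typedef struct Node {
    char sign;
    struct Node *left, *right;
} Node;

typedef struct {
    char sign;
    char code[MAX_CODE];
} CodeEntry;

/* Writes one entry per leaf into out[], advancing *pos as it goes. */
void collect_codes(const Node *node, char *path, int depth,
                   CodeEntry *out, int *pos)
{
    if (node->left == NULL) {            /* leaf: store its code */
        path[depth] = '\0';              /* terminate before copying */
        out[*pos].sign = node->sign;
        strcpy(out[*pos].code, path);
        (*pos)++;                        /* the update is seen by every caller */
        return;
    }
    path[depth] = '0';
    collect_codes(node->left, path, depth + 1, out, pos);
    path[depth] = '1';
    collect_codes(node->right, path, depth + 1, out, pos);
}
```

The caller declares int pos = 0; and passes &pos. After the call, pos holds the number of entries written.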
Pass and return it:
int save_code(..., int pos)
{
// ...
// use and modify pos as you desire
// ...
pos = save_code(..., pos);
// ...
return pos;
}
I wouldn't really recommend this (at least not above pass by pointer) as you'd return and pass a value, which seems unnecessary, since you only need to do one of those.
You also can't use this approach with multiple values, but it's easy to fix by using a struct, although if the function already returns something, this gets quite a bit messier.
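For comparison, here is a rough sketch of the pass-and-return variant on the same kind of tree. Node and collect_signs are hypothetical names, and the sketch only stores the leaf signs to keep it short. It assumes out[] is large enough for all leaves.

```c
#include <stddef.h>

/* Hypothetical node type for illustration. */
typedef struct Node {
    char sign;
    struct Node *left, *right;
} Node;

/* Stores leaf signs in out[] starting at pos; returns the next free index. */
int collect_signs(const Node *node, char *out, int pos)
{
    if (node->left == NULL) {
        out[pos] = node->sign;
        return pos + 1;
    }
    /* Each call hands back the updated position, which feeds the next call. */
    pos = collect_signs(node->left, out, pos);
    pos = collect_signs(node->right, out, pos);
    return pos;
}
```

The top-level call is int count = collect_signs(root, out, 0); and the return value doubles as the number of leaves found.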
And, for completeness, global variable:
int pos = 0; // global variable
void save_code(...)
{
// ...
// use and modify pos as you desire
// ...
save_code(...);
// ...
}
This has the disadvantage of having a pos variable floating around in global scope, but this could, in many cases, fairly easily be fixed by making it static so it's limited to one file, or, in the OOP world (e.g. in C++), one could hide this as a private class member.
Using a global variable would be a problem with multi-threading (i.e. multiple calls to the function executing at the same time).
I chose to favour brevity above completeness with regard to my code samples - I hope they're readable enough.

Related

How to preserve pointers between multiple method calls in C

I'm writing a parser for propositional logic (doesn't matter what that is, main point is I'm parsing a simple language) and initially started out with functions of the following form:
int formula() {
int store = step;
if(compound())
return TRUE;
else {
if(atom())
return TRUE;
else if(negation() && formula())
return TRUE;
else {
step = store;
return FALSE;
}
}
}
int compound() {
int store = step;
if(open() && formula() && binary_operator() && formula() && close())
return TRUE;
else {
step = store;
return FALSE;
}
}
The functions not shown above are base cases; the ones shown are the important parts. Formulas can have sub-formulas, and these sub-formulas in turn can be compound formulas, which contain sub-formulas, and so on.
Instead of ints though, I'm trying to return char sequences of 1s and 0s (true and false). If you return a sequence, it means that the input can generate a sequence (it must be valid). Otherwise, return null.
The issue is that every time I've tried the pointers keep getting lost - I understand this is to do with the stack(?) and the pointer sort of 'dies' when the function returns whatever. I've not tried arrays because I have been told that arrays work best statically, whereas the size of these arrays would be dynamic (size is determined by number of variables, which is only found at runtime).
Is there any way this approach can be done? I can't malloc anything because I won't be able to free it - the sequence of 1s and 0s needs to be returned before I'd be able to free it. Maybe pass structs with a sequence field, although I'm not sure if that suffers from the same issue.
Any help much appreciated. This is a program using C99. Any advice on clarifications welcome!
I'm not entirely following what you want to do, but there is not a clear reason why you couldn't use malloc. The pointer returned by malloc can be freed by another function later. Consider the following valid code:
#include <stdlib.h>
char* foo(size_t* length)
{
*length = 3;
char* seq = malloc(*length);
seq[0] = 1;
seq[1] = 0;
seq[2] = 1;
return seq;
}
int main()
{
size_t length;
char* seq = foo(&length);
/* use seq */
free(seq);
}
You can also do it without malloc if you know an upper bound for your sequence. By passing a pointer to space you allocated on the stack from main(), you won't lose the data when the function exits:
void foo(char* seq, size_t total_size, size_t* used_size)
{
*used_size = 3;
seq[0] = 1;
seq[1] = 0;
seq[2] = 1;
}
int main()
{
size_t used_size;
char seq[100];
foo(seq, sizeof(seq), &used_size);
/* use seq */
}
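The example above never looks at total_size. A small helper along these lines could enforce the upper bound when values are appended one at a time, for instance from inside recursive calls. The name append_bit and its return convention are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Appends one value to seq if there is room; returns false when the buffer is full. */
bool append_bit(char *seq, size_t total_size, size_t *used_size, char bit)
{
    if (*used_size >= total_size)
        return false;                /* caller decides how to handle overflow */
    seq[(*used_size)++] = bit;
    return true;
}
```

Because the buffer and the counter both live in the caller, every level of the recursion can append to the same sequence and nothing is lost when a call returns.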

How to stock multiple returns from a recursive function

I am working on generating all combinations (permutations) of a code such as ABCD; for this one there are 1 * 2 * 3 * 4 = 24 combinations.
I have this function:
static char *combi_switch(char *code, int i)
{
char *combi;
int j;
int k;
int l;
int s;
combi = (char *)malloc(sizeof(char) * ft_strlen(code) + 1);
ft_strcpy(combi, code);
k = i;
l = i;
j = ft_strlen(code) - 1;
if (i == j)
{
printf("%s\n", combi);
return (combi);
}
while (l <= j)
{
s = combi[i];
combi_switch(combi, k + 1);
while (i < j)
{
combi[i] = combi[i + 1];
i++;
}
i = k;
combi[j] = s;
l++;
}
free(combi);
return (NULL);
}
It is called by this one:
char *combi_mix(char *code)
{
combi_switch(code, 0);
return (NULL);
}
ft_strlen and ft_strcpy behave the same as their libc counterparts.
So with these functions, if code = "ABCD", printf shows the 24 combinations that are generated.
I want to store all of the results, maybe in a char ** or a linked list.
Is there a way to store all the combinations that I printf?
Is there a problem with using "while" loops in recursive functions?
This is one of the last functions of my project so thank you so much if you can help me!
No, there's no special problem with any kind of control construct in any kind of function. Use while or whatever. Now that we've got that out of the way, let's concentrate on the important question. How to accumulate the results of your function instead of printing them? It doesn't matter what the function actually computes, it's only important that it's recursive and each invocation prints something. We want to collect instead of printing.
First, a function should return something. Your current function returns a char* but it is never used. Your new function should return a value you are after, that is, a collection.
typedef struct {
/* whatever */
} string_collection;
We don't specify what sits inside of the collection. It might be a linked list, or a dynamic array together with its length, or whatever. You decide what kind of collection you want.
Now you need a couple of functions:
string_collection* create_empty_collection();
void add_element (string_collection* c, const char* s);
void move_elements (string_collection* c1,
string_collection* c2); // moves all elements from c2 to c1, leaving c2 empty
void destroy_collection (string_collection* c);
These functions modify their arguments. These are only example signatures. You may go for fully immutable interface if you wish:
string_collection* add_element (const string_collection* c, const char* s);
string_collection* concatenate (const string_collection* c1,
const string_collection* c2); //etc
In this variant, you create a brand new collection without touching existing ones. Each style has its place; use whatever works for you.
Now it's simple to modify the function:
string_collection* your_function (whatever parameters)
{
// First, need a collection to return
string_collection* coll = create_empty_collection();
// whatever
// whatever
// ATTN: old code was: printf ("%s", something), now do this:
add_element (coll, something);
// whatever
// whatever
// ATTN: old code was: your_function(whatever parameters), now do this:
string_collection* new_coll = your_function(whatever parameters);
move_elements (coll, new_coll);
destroy_collection (new_coll);
// whatever
// whatever
// ATTN: old code was: return something, now do this:
return coll;
}
When you call your function, you now do:
string_collection* coll = your_function (whatever parameters);
// do something with the collection
destroy_collection (coll);
Here we have just learned to accumulate recursive function results. Awesome!
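To make this concrete, here is one possible sketch of string_collection as a growable array of copied strings, plus a recursive permutation generator that fills it. Two things differ from the outline above and are my own assumptions: add_element returns an int status instead of void, and permute passes the same collection down the recursion rather than merging returned collections with move_elements. The permute function itself is a plain swap-based generator, not the asker's algorithm.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char **items;
    size_t count;
    size_t capacity;
} string_collection;

string_collection *create_empty_collection(void)
{
    string_collection *c = malloc(sizeof *c);
    if (c != NULL) {
        c->items = NULL;
        c->count = 0;
        c->capacity = 0;
    }
    return c;
}

/* Stores a private copy of s; returns 0 on success, -1 on allocation failure. */
int add_element(string_collection *c, const char *s)
{
    if (c->count == c->capacity) {
        size_t new_cap = c->capacity ? c->capacity * 2 : 8;
        char **grown = realloc(c->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        c->items = grown;
        c->capacity = new_cap;
    }
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, s);
    c->items[c->count++] = copy;
    return 0;
}

void destroy_collection(string_collection *c)
{
    if (c == NULL)
        return;
    for (size_t i = 0; i < c->count; i++)
        free(c->items[i]);
    free(c->items);
    free(c);
}

/* Adds every permutation of buf[start..] to out by swapping in place. */
void permute(char *buf, size_t start, string_collection *out)
{
    size_t len = strlen(buf);
    if (start + 1 >= len) {
        add_element(out, buf);       /* was: printf("%s\n", buf) */
        return;
    }
    for (size_t i = start; i < len; i++) {
        char tmp = buf[start]; buf[start] = buf[i]; buf[i] = tmp;
        permute(buf, start + 1, out);
        tmp = buf[start]; buf[start] = buf[i]; buf[i] = tmp;   /* undo the swap */
    }
}
```

A caller would create the collection, call permute(buf, 0, coll) on a writable copy of the code, read coll->items[0 .. coll->count - 1], and finish with destroy_collection(coll).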
On a related note, your function mallocs a string each time it's called, but there's no free in sight. This is bad (a memory leak). Please add
free (combi);
where appropriate. In your case this means before any return statement. (It's a good practice to have a single return statement in the end of the function, instead of multiple statements scattered throughout the body; this is one reason for that).
You can simplify the program using the logic below:
char str[]="ABCD";
int i,j,k,l,count=0;
char temp;
l=strlen(str);
j=0;
k=1;
for(i=0;i<factorial(l);i++)
{
if(j==l)
{
j=0;
}
if(k==l)
{
k=0;
}
temp=str[j];
str[j]=str[k];
str[k]=temp;
printf("%s\n",str);
j++;
k++;
}

Returning Two Pointers to Dynamic Arrays

I am having a bunch of problems with pointers and dynamic arrays here.
I have a function that does a bunch of things, such as removing an element from a dynamic array, which leads me to reallocate memory for one of those dynamic arrays. The problem is that I call functions within functions, and I can't return all my values properly to main.
Since I can't return 2 values, how can I do this?
structure1* register(structure1 *registerArray,structure2 *waitingList, int counter){
//Bunch of code in here
registerArray = realloc(inspecao, (counter)+1);
waitingList = eliminate(waitingList, 5, counter); //Doesn't matter what it does really
return registerArray;
}
structure1* eliminate(structure1 *arrayToEliminateFrom, int positionToEliminate, int *counter){
//The code for this doesn't matter
//All I do is eliminate an element and reallocate it
arrayToEliminateFrom = realloc(arrayToEliminateFrom, (*counter-1)*sizeof(structure1));
return arrayToEliminateFrom;
}
As you can see, I don't know how to return the pointer to the waitingList dynamic array to main. How can I do this?
I have searched everywhere.
Help
Okay, here are two ways to do it.
The first is, based upon your comment, what you think your instructor would want:
void eliminate(structure1 **arrayToEliminateFrom, int positionToEliminate,
               int *count);
void
xregister(structure1 **registerArray, int *arrayCount,
          structure1 **waitingList, int *waitCount)
{
    // Bunch of code in here
    *arrayCount += 1;
    *registerArray = realloc(*registerArray, *arrayCount * sizeof(structure1));
    // Doesn't matter what it does really
    eliminate(waitingList, 5, waitCount);
}
void
eliminate(structure1 **arrayToEliminateFrom, int positionToEliminate,
          int *count)
{
    // The code for this doesn't matter
    *count -= 1;
    // All I do is eliminate an element and reallocate it
    *arrayToEliminateFrom = realloc(*arrayToEliminateFrom,
        *count * sizeof(structure1));
}
Here is what Roberto and I were suggesting. Actually, mine's a general variable length array approach that can be fully generalized with some slight field changes. In a way, since you're already using a struct, I can't see why your instructor would object to this as it's a standard way to do it. Less cumbersome and cleaner.
typedef struct vector {
    int vec_count;
    structure1 *vec_base;
} vector;
void eliminate(vector *arrayToEliminateFrom, int positionToEliminate);
void
xregister(vector *registerArray, vector *waitingList)
{
    // Bunch of code in here
    registerArray->vec_count += 1;
    registerArray->vec_base = realloc(registerArray->vec_base,
        registerArray->vec_count * sizeof(structure1));
    // Doesn't matter what it does really
    eliminate(waitingList, 5);
}
void
eliminate(vector *arrayToEliminateFrom, int positionToEliminate)
{
    // The code for this doesn't matter
    arrayToEliminateFrom->vec_count -= 1;
    // All I do is eliminate an element and reallocate it
    arrayToEliminateFrom->vec_base = realloc(arrayToEliminateFrom->vec_base,
        arrayToEliminateFrom->vec_count * sizeof(structure1));
}
Here's an even more compact way:
typedef struct vector {
    int vec_count;
    structure1 *vec_base;
} vector;
void eliminate(vector *arrayToEliminateFrom, int positionToEliminate);
void
vecgrow(vector *vec, int inc)
{
    vec->vec_count += inc;
    vec->vec_base = realloc(vec->vec_base, vec->vec_count * sizeof(structure1));
}
void
xregister(vector *registerArray, vector *waitingList)
{
    // Bunch of code in here
    vecgrow(registerArray, 1);
    // Doesn't matter what it does really
    eliminate(waitingList, 5);
}
void
eliminate(vector *arrayToEliminateFrom, int positionToEliminate)
{
    // The code for this doesn't matter
    vecgrow(arrayToEliminateFrom, -1);
}
You should try creating a higher-level structure that contains both pointers, and pass and return that structure between your functions. A function can return only one object, but that object can be a structure containing several others.
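A rough sketch of that idea, with made-up names throughout: item stands in for structure1, lists bundles both arrays with their counts, and promote_last is a hypothetical operation that touches both arrays and returns the bundle by value.

```c
#include <stdlib.h>

/* Hypothetical element type standing in for structure1. */
typedef struct { int id; } item;

/* Both dynamic arrays and their lengths travel together. */
typedef struct {
    item *registered;
    size_t registered_count;
    item *waiting;
    size_t waiting_count;
} lists;

/* Moves the last waiting item to the end of the registered array.
   Returns the updated struct; on failure the input is returned unchanged. */
lists promote_last(lists l)
{
    if (l.waiting_count == 0)
        return l;
    item *grown = realloc(l.registered,
                          (l.registered_count + 1) * sizeof *grown);
    if (grown == NULL)
        return l;
    l.registered = grown;
    l.waiting_count -= 1;
    l.registered[l.registered_count] = l.waiting[l.waiting_count];
    l.registered_count += 1;
    return l;
}
```

The caller writes l = promote_last(l); so both pointers come back in a single return value. Passing a lists * instead would avoid copying the struct, at the cost of the extra indirection.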

Writing a get next token function

Given a C-string: how would I be able to write a function that will get the next token in the string, and a function that will peek the next token and return that without using global variables?
What I'm trying to do is have a static variable that will hold the string, and when called, it would just increment a pointer, and it will reset that static variable throwing out the token that has been retrieved. The problem is: how would I be able to differentiate between the first call (when it will actually store the string) and the other calls, when I am just retrieving it?
Any thoughts on this?
EDIT:
Here's what I have now that "works" but I want to make sure that it should actually work and its not just a coincidence of a pointer being null:
char next_token(char *line) {
static char *p;
if (p == NULL)
p = line;
else {
char next_token = p[0];
p++;
return next_token;
}
}
The code in your edit is wrong. You are handling the NULL case incorrectly.
I initially answered in terms of emulating strtok which seemed to be what you wanted, but you have clarified that you want single characters.
The if-condition should be:
if (line != NULL) p = line;
And you presumably remove the else so that code executes every time... Unless you don't want a result on the first call (you should at least return a value though).
You call like this:
char token = next_token(line);
while( 0 != (token = next_token(NULL)) ) {
// etc
}
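Putting the pieces of this answer together, a complete version of the function might look like the sketch below. It assumes a token is a single character and that 0 signals the end of the string; unlike the calling pattern shown above, the first call here already returns the first character.

```c
#include <stddef.h>

/* Returns the next character of the current line, or 0 at the end.
   Pass a non-NULL line to start over; pass NULL to continue. */
char next_token(const char *line)
{
    static const char *p = NULL;     /* position kept between calls */
    if (line != NULL)
        p = line;
    if (p == NULL || *p == '\0')
        return 0;
    return *p++;
}
```

Because the position lives in a function-static variable, only one string can be tokenized at a time and the function is not safe to call from several threads.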
typedef struct {
char* raw;
// whatever you need to keep track
} parser_t;
void parser_init(parser_t* p, char* s)
{
// init your parser
}
bool parser_get_token(parser_t* p, char* token)
{
// return the token in "token" or return a bool error ( or an enum of various errors)
}
bool parser_peek_token(parser_t* p, char* token)
{
// same deal, but don't update where you are...
}
You have a couple of choices. One would be to use an interface roughly like strtok does, where passing a non-null pointer initializes the static variable, and passing a null pointer retrieves a token. This, however, is fairly ugly, clumsy, error-prone, and problematic in the presence of multithreading.
Another possibility would be to use a file-level static variable with separate functions (both in that file) to initialize the static variable, and to retrieve the next token from the string. This is marginally cleaner, but still has most of the same problems.
A third would be to make it act (for one example) like a file -- the user calls parse_open (for example), passing in the string to parse. You return an opaque handle to them. They then pass that back to (say) get_token each time they want another token.
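The third option could be sketched as follows, again treating each character as a token. parse_open and get_token are the names suggested above; peek_token, parse_close and the parser type are assumptions added for this sketch. In a real library the struct definition would sit in the .c file so callers only ever see the pointer.

```c
#include <stdlib.h>

typedef struct parser parser;        /* opaque to callers */
struct parser { const char *cur; };  /* would be private to the implementation file */

parser *parse_open(const char *s)
{
    parser *p = malloc(sizeof *p);
    if (p != NULL)
        p->cur = s;
    return p;
}

/* Returns the next character without consuming it (0 at end of input). */
char peek_token(const parser *p)
{
    return *p->cur;
}

/* Returns the next character and advances past it (0 at end of input). */
char get_token(parser *p)
{
    char c = *p->cur;
    if (c != '\0')
        p->cur++;
    return c;
}

void parse_close(parser *p)
{
    free(p);
}
```

Since all state lives in the handle, several strings can be parsed at once and no static or global variable is needed.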
Basically, there are three ways for a function to pass information back to its caller:
via a global variable
via the return value
via a pointer argument
And, similarly there are ways for the function to maintain state between calls:
via a global or (function-)static variable
by supplying it as a function parameter and returning it after every call
via a pointer argument.
A nice coding convention for a lexer/tokeniser is to use the return value to communicate the number of characters consumed (and maybe use an extra pointer argument to pass the parser state between calls).
This is wakkerbot's parser:
STATIC size_t tokenize(char *string, int *sp);
Usage:
STATIC void make_words(char * src, struct sentence * target)
{
size_t len, pos, chunk;
STRING word ;
int state = 0; /* FIXME: this could be made static to allow for multi-line strings */
target->size = 0;
len = strlen(src);
if (!len) return;
for(pos=0; pos < len ; ) {
chunk = tokenize(src+pos, &state);
if (!chunk) { /* maybe we should reset state here ... */ pos++; }
if (chunk > STRLEN_MAX) {
warn( "Make_words", "Truncated too long string(%u) at %s\n", (unsigned) chunk, src+pos);
chunk = STRLEN_MAX;
}
word.length = chunk;
word.word = src+pos;
if (word_is_usable(word)) add_word_to_sentence(target, word);
if (pos+chunk >= len) break;
pos += chunk;
}
...
}
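To illustrate the "return the number of characters consumed" convention in isolation, here is a much simpler stand-in for tokenize. It is not wakkerbot's implementation: token_length is a hypothetical name, it carries no state argument, and its token rules (a run of alphanumerics, a run of whitespace, or any single other character) are made up for the example.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the length of the token at the start of s (0 at end of string). */
size_t token_length(const char *s)
{
    size_t n = 0;
    if (s[0] == '\0')
        return 0;
    if (isalnum((unsigned char)s[0])) {
        while (isalnum((unsigned char)s[n]))
            n++;
        return n;
    }
    if (isspace((unsigned char)s[0])) {
        while (isspace((unsigned char)s[n]))
            n++;
        return n;
    }
    return 1;                        /* punctuation and everything else */
}
```

The caller keeps its own position and advances it by the returned length, just as make_words does with pos += chunk, so the tokenizer itself needs no static or global variable.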

How to read from buffer with feedback, so doesn't buffer overflow?

I have this code
#define BUFFER_LEN (2048)
static float buffer[BUFFER_LEN];
int readcount;
while ((readcount = sf_read_float(handle, buffer, BUFFER_LEN))) {
// alsa play
}
which reads up to BUFFER_LEN floats into buffer and returns the number of floats it actually read. "handle" identifies the open sound file and keeps track of how much of it has been read.
E.g. if the file contains 5 floats and BUFFER_LEN is 3, readcount would first be 3, then 2, and then 0, at which point the while loop would exit.
I would like to have a function that does the same.
Update
After a lot of coding, I think this is the solution.
#include <stdio.h>
int copy_buffer(double* src, int src_length, int* src_pos,
float* dest, int dest_length) {
int copy_length = 0;
if (src_length - *src_pos > dest_length) {
copy_length = dest_length;
printf("copy_length1 %i\n", copy_length);
} else {
copy_length = src_length - *src_pos;
printf("copy_length2 %i\n", copy_length);
}
for (int i = 0; i < copy_length; i++) {
dest[i] = (float) src[*src_pos + i];
}
// remember where to continue next time the copy_buffer() is called
*src_pos += copy_length;
return copy_length;
}
int main() {
double src[] = {1,2,3,4,5};
int src_length = 5;
float dest[] = {0,0};
int dest_length = 2;
int read;
int src_pos = 0;
read = copy_buffer(src, src_length, &src_pos, dest, dest_length);
printf("read %i\n", read);
printf("src_pos %i\n", src_pos);
for (int i = 0; i < src_length; i++) {
printf("src %f\n", src[i]);
}
for (int i = 0; i < dest_length; i++) {
printf("dest %f\n", dest[i]);
}
return 0;
}
Next time copy_buffer() is called, dest contains 3,4. Running copy_buffer() again only copies the value "5". So I think it works now.
Although it is not very pretty that I have int src_pos = 0; outside of copy_buffer().
It would be a lot better if I could give copy_buffer() a unique handle instead of &src_pos, just like sndfile does.
Does anyone know how that could be done?
If you would like to create unique handles, you can do so with malloc() and a struct:
#include <stdint.h>
#include <stdlib.h>
typedef intptr_t HANDLE_TYPE;
HANDLE_TYPE init_buffer_traverse(double * src, size_t src_len);
int copy_buffer(HANDLE_TYPE h_traverse, double * dest, size_t dest_len);
void close_handle_buffer_traverse(HANDLE_TYPE h);
typedef struct
{
double * source;
size_t source_length;
size_t position;
} TRAVERSAL;
#define INVALID_HANDLE 0
/*
* Returns a new traversal handle, or 0 (INVALID_HANDLE) on failure.
*
* Allocates memory to contain the traversal state.
* Resets traversal state to beginning of source buffer.
*/
HANDLE_TYPE init_buffer_traverse(double *src, size_t src_len)
{
TRAVERSAL * trav = malloc(sizeof(TRAVERSAL));
if (NULL == trav)
return INVALID_HANDLE;
trav->source = src;
trav->source_length = src_len;
trav->position = 0;
return (HANDLE_TYPE)trav;
}
/*
* Returns the system resources (memory) associated with the traversal handle.
*/
void close_handle_buffer_traverse(HANDLE_TYPE h)
{
if (INVALID_HANDLE != h)
free((TRAVERSAL *)h);
}
int copy_buffer(HANDLE_TYPE h, double * dest, size_t dest_len)
{
TRAVERSAL * trav = NULL;
if (INVALID_HANDLE == h)
return -1;
trav = (TRAVERSAL *)h;
size_t copy_length = trav->source_length - trav->position;
if (dest_len < copy_length)
copy_length = dest_len;
for (size_t i = 0; i < copy_length; i++)
dest[i] = trav->source[trav->position + i];
// remember where to continue next time the copy_buffer() is called
trav->position += copy_length;
return (int)copy_length;
}
This sort of style is what some C coders used before C++ came into being. The style involves a data structure which contains all the data elements of our 'class'. Most of the API for the class takes, as its first argument, a pointer to one of these structs. This pointer is similar to the this pointer. In our example this parameter was named trav.
The exceptions in the API are those functions which allocate the handle type; these are similar to constructors and have the handle type as a return value. In our case, init_buffer_traverse might as well have been called construct_traversal_handle.
There are many other methods than this method for implementing an "opaque handle" value. In fact, some coders would manipulate the bits (via an XOR, for example) in order to obscure the true nature of the handles. (This obscurity does not provide security where such is needed.)
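One of those other methods is to skip the integer cast and hand out a pointer to an incomplete struct type. The sketch below assumes a header that contains only the typedef and the three prototypes; the names traversal, traversal_open, traversal_read and traversal_close are invented here and are not part of sndfile or of the code above.

```c
#include <stdlib.h>

/* What a header would expose: the type name only, no fields. */
typedef struct traversal traversal;

/* What the implementation file would contain. */
struct traversal {
    const double *source;
    size_t length;
    size_t position;
};

traversal *traversal_open(const double *src, size_t len)
{
    traversal *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->source = src;
    t->length = len;
    t->position = 0;
    return t;
}

/* Copies up to dest_len values and returns how many were copied. */
size_t traversal_read(traversal *t, double *dest, size_t dest_len)
{
    size_t n = t->length - t->position;
    if (n > dest_len)
        n = dest_len;
    for (size_t i = 0; i < n; i++)
        dest[i] = t->source[t->position + i];
    t->position += n;                /* remember where to continue */
    return n;
}

void traversal_close(traversal *t)
{
    free(t);
}
```

Code that includes only the header cannot dereference the pointer or take sizeof the struct, so the compiler enforces the opacity while the handle stays type-checked.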
In the example given, I'm not sure (didn't look at sndlib) whether it would make most sense for the destination buffer pointer and length to be held in the handle structure or not. If so, that would make it a "copy buffer" handle rather than a "traversal" handle and you would want to change all the terminology from this answer.
These handles are only valid for the lifetime of the current process, so they are not appropriate for handles which must survive restarts of the handle server. For that, use an ISAM database and the column ID as handle. The database approach is much slower than the in-memory/pointer approach but for persistent handles, you can't use in-memory values, anyway.
On the other hand, it sounds like you are implementing a library which will be running within a single process lifetime. In which case, the answer I've written should be usable, after modifying to your requirements.
Addendum
You asked for some clarification of the similarity with C++ that I mention above. To be specific, some equivalent (to the above C code) C++ code might be:
class TRAVERSAL
{
    double * source;
    size_t source_length;
    size_t position;
public:
    TRAVERSAL(double *src, size_t src_len)
    {
        source = src;
        source_length = src_len;
        position = 0;
    }
    int copy_buffer(double * dest, size_t dest_len)
    {
        size_t copy_length = source_length - position;
        if (dest_len < copy_length)
            copy_length = dest_len;
        for (size_t i = 0; i < copy_length; i++)
            dest[i] = source[position + i];
        // remember where to continue next time the copy_buffer() is called
        position += copy_length;
        return (int)copy_length;
    }
};
There are some apparent differences. The C++ version is a little bit less verbose-seeming. Some of this is illusory; the equivalent of close_handle_buffer_traverse is now to delete the C++ object. Of course delete is not part of the class implementation of TRAVERSAL, delete comes with the language.
In the C++ version, there is no "opaque" handle.
The C version is more explicit and perhaps makes more apparent what operations are being performed by the hardware in response to the program execution.
The C version is more amenable to using the cast to HANDLE_TYPE in order to create an "opaque ID" rather than a pointer type. The C++ version could be "wrapped" in an API which accomplished the same thing while adding another layer. In the current example, users of this class will maintain a copy of a TRAVERSAL *, which is not quite "opaque."
In the function copy_buffer(), the C++ version need not mention the trav pointer because instead it implicitly dereferences the compiler-supplied this pointer.
sizeof(TRAVERSAL) should be the same for both the C and C++ examples -- with no vtable, also assuming run-time-type-identification for C++ is turned off, the C++ class contains only the same memory layout as the C struct in our first example.
It is less common to use the "opaque ID" style in C++, because the penalty for "transparency" is lower in C++. The data members of class TRAVERSAL are private and so the TRAVERSAL * cannot be accidentally used to break our API contract with the API user.
Please note that both the opaque ID and the class pointer are vulnerable to abuse from a malicious API user -- either the opaque ID or class pointer could be cast directly to, e.g., double **, allowing the holder of the ID to change the source member directly via memory. Of course, you must trust the API caller already, because in this case the API calling code is in the same address space. In an example of a network file server, there could be security implications if "opaque ID" based on a memory address is exposed to the outside.
I would not normally make the digression into trustedness of the API user, but I want to clarify that the C++ keyword private has no "enforcement powers," it only specifies an agreement between programmers, which the compiler respects also unless told otherwise by the human.
Finally, the C++ class pointer can be converted to an opaque ID as follows:
typedef intptr_t HANDLE_TYPE;
HANDLE_TYPE init_buffer_traverse(double *src, size_t src_len)
{
return (HANDLE_TYPE)(new TRAVERSAL(src, src_len));
}
int copy_buffer(HANDLE_TYPE h_traverse, double * dest, size_t dest_len)
{
return ((TRAVERSAL *)h_traverse)->copy_buffer(dest, dest_len);
}
void close_handle_buffer_traverse(HANDLE_TYPE h)
{
delete ((TRAVERSAL *)h);
}
And now our brevity of "equivalent" C++ may be further questioned.
What I wrote about the old style of C programming which relates to C++ was not meant to say that C++ is better for this task. I only mean that encapsulation of data and hiding of implementation details could be done in C via a style that is almost isomorphic to a C++ style. This can be good to know if you find yourself programming in C but unfortunately having learned C++ first.
PS
I just noticed that our implementation to date had used:
dest[i] = (float)source[position + i];
when copying the bytes. Because both dest and source are double * (that is, they both point to double values), there is no need for a cast here. Also, casting from double to float may lose digits of precision in the floating-point representation. So this is best removed and restated as:
dest[i] = source[position + i];
I started to look at it, but you could probably do it just as well: libsndfile is open source, so one could look at how sf_read_float() works and create a function that does the same thing from a buffer. http://www.mega-nerd.com/libsndfile/ has a download link.
