Given a C-string: how would I be able to write a function that will get the next token in the string, and a function that will peek the next token and return that without using global variables?
What I'm trying to do is have a static variable that will hold the string, and when called, it would just increment a pointer, and it will reset that static variable throwing out the token that has been retrieved. The problem is: how would I be able to differentiate between the first call (when it will actually store the string) and the other calls, when I am just retrieving it?
Any thoughts on this?
EDIT:
Here's what I have now that "works", but I want to make sure that it should actually work and it's not just a coincidence of a pointer being null:
char next_token(char *line) {
static char *p;
if (p == NULL)
p = line;
else {
char next_token = p[0];
p++;
return next_token;
}
}
The code in your edit is wrong. You are handling the NULL case incorrectly.
I initially answered in terms of emulating strtok which seemed to be what you wanted, but you have clarified that you want single characters.
The if-condition should be:
if (line != NULL) p = line;
And you presumably remove the else so that code executes every time... Unless you don't want a result on the first call (you should at least return a value though).
You call like this:
char token = next_token(line);
while( 0 != (token = next_token(NULL)) ) {
// etc
}
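Putting the corrected condition together, a minimal sketch of the strtok-style version could look like the following. The const qualifier and the end-of-string check are my additions, not part of the code in the question:

```c
#include <stddef.h>

/* Sketch: a non-NULL argument (re)starts scanning at that string;
   NULL continues from where the previous call stopped.
   Returns '\0' once the end of the string is reached. */
char next_token(const char *line)
{
    static const char *p = NULL;   /* scan position kept between calls */

    if (line != NULL)
        p = line;                  /* a new string was supplied */
    if (p == NULL || *p == '\0')
        return '\0';               /* nothing stored yet, or end of string */
    return *p++;                   /* hand back one character and advance */
}
```

Because the end check does not advance the pointer, calling the function again after it has returned '\0' keeps returning '\0' instead of reading past the string.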
typedef struct {
char* raw;
// whatever you need to keep track
} parser_t;
void parser_init(parser_t* p, char* s)
{
// init your parser
}
bool parser_get_token(parser_t* p, char* token)
{
// return the token in "token" or return a bool error ( or an enum of various errors)
}
bool parser_peek_token(parser_t* p, char* token)
{
// same deal, but don't update where you are...
}
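To make the skeleton above concrete, here is one possible way to fill it in for single-character tokens. The pos field and the const qualifiers are assumptions of mine; the original only fixes the function names and the raw member:

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *raw;   /* the string being parsed (not owned) */
    size_t pos;        /* index of the next unread character (assumed field) */
} parser_t;

void parser_init(parser_t *p, const char *s)
{
    p->raw = s;
    p->pos = 0;
}

/* Copies the next character into *token without consuming it.
   Returns false when the input is exhausted. */
bool parser_peek_token(const parser_t *p, char *token)
{
    if (p->raw == NULL || p->raw[p->pos] == '\0')
        return false;
    *token = p->raw[p->pos];
    return true;
}

/* Same as peek, but also advances past the character. */
bool parser_get_token(parser_t *p, char *token)
{
    if (!parser_peek_token(p, token))
        return false;
    p->pos++;
    return true;
}
```

Since all state lives in the parser_t that the caller owns, several strings can be parsed at the same time without any static or global variable.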
You have a couple of choices. One would be to use an interface roughly like strtok does, where passing a non-null pointer initializes the static variable, and passing a null pointer retrieves a token. This, however, is fairly ugly, clumsy, error-prone, and problematic in the presence of multithreading.
Another possibility would be to use a file-level static variable with separate functions (both in that file) to initialize the static variable, and to retrieve the next token from the string. This is marginally cleaner, but still has most of the same problems.
A third would be to make it act (for one example) like a file -- the user calls parse_open (for example), passing in the string to parse. You return an opaque handle to them. They then pass that back to (say) get_token each time they want another token.
Basically, there are three ways for a function to pass information back to its caller:
via a global variable
via the return value
via a pointer argument
Similarly, there are three ways for the function to maintain state between calls:
via a global or (function-)static variable
by supplying it as a function parameter and returning it after every call
via a pointer argument.
A nice coding convention for a lexer/tokeniser is to use the return value to communicate the number of characters consumed (and maybe use an extra pointer argument to pass the parser state to and from calls).
This is wakkerbot's parser:
STATIC size_t tokenize(char *string, int *sp);
Usage:
STATIC void make_words(char * src, struct sentence * target)
{
size_t len, pos, chunk;
STRING word ;
int state = 0; /* FIXME: this could be made static to allow for multi-line strings */
target->size = 0;
len = strlen(src);
if (!len) return;
for(pos=0; pos < len ; ) {
chunk = tokenize(src+pos, &state);
if (!chunk) { /* maybe we should reset state here ... */ pos++; }
if (chunk > STRLEN_MAX) {
warn( "Make_words", "Truncated too long string(%u) at %s\n", (unsigned) chunk, src+pos);
chunk = STRLEN_MAX;
}
word.length = chunk;
word.word = src+pos;
if (word_is_usable(word)) add_word_to_sentence(target, word);
if (pos+chunk >= len) break;
pos += chunk;
}
...
}
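The body of tokenize() is not shown, so the following is only a sketch of the "return the number of characters consumed" convention using a hypothetical token_length() helper. It is not wakkerbot's tokenizer, and its token rules (a run of alphanumerics, or any single other character) are chosen purely for illustration:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns the length of the token starting at s.
   A token is a run of alphanumerics or a single other character;
   0 means the end of the string. */
size_t token_length(const char *s)
{
    size_t n = 0;

    if (s[0] == '\0')
        return 0;
    if (!isalnum((unsigned char)s[0]))
        return 1;                          /* punctuation or whitespace */
    while (isalnum((unsigned char)s[n]))
        n++;
    return n;
}

/* Counts tokens by repeatedly advancing by the consumed length. */
size_t count_tokens(const char *s)
{
    size_t pos = 0, chunk, count = 0;

    while ((chunk = token_length(s + pos)) != 0) {
        pos += chunk;
        count++;
    }
    return count;
}
```

The caller keeps the position, so the tokenizer itself needs no static state at all.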
This function is to split string based on \n and see if the row number is selected. If the row number matched, this string should be copied and used by other function:
void selectDeparment(char* departments, int selectedNum, char* selectedDepartment){
char* copyOfDepartments = malloc(strlen(departments)+1);
strcpy(copyOfDepartments,departments);
char* sav1 = NULL;
char* token = strtok_s(copyOfDepartments,"\n",&sav1);
int counter = 0;
while(token != NULL){
if(counter == selectedNum){
selectedDepartment = malloc(strlen(token)+1);
strcpy(selectedDepartment,token);
}
++counter;
token = strtok_s(NULL, "\n", &sav1);
}
}
This function is called in main like:
char* selectedDepartment;
selectDeparment(recordsPtr[0], 1, selectedDepartment);
printf(selectedDepartment);
recordsPtr[0] contains four strings with \n at the end:
aDeparment
anotherDepartment
newDepartment
otherDepartment
In C, we are encouraged to use a pointer to get a value from a function instead of returning a string from a function. However, the printf in the main function gives random output.
I believe there is some confusion in the way you are using pointers here. Let me clarify.
In the main function, the character pointer selectedDepartment holds an address. But when void selectDeparment(char* departments, int selectedNum, char* selectedDepartment) is called, the function receives its own copy of that pointer. Any changes made to selectedDepartment inside the function are therefore local to the called function and do not affect the original pointer in main.
Thus one clear way to solve this problem will be to pass a pointer to the character pointer defined in the main function. This will then give the correct/expected results.
Here is the modified version of the function -
void selectDeparment(char* departments, int selectedNum, char** selectedDepartment){
char* copyOfDepartments = malloc(strlen(departments)+1);
strcpy(copyOfDepartments,departments);
char* sav1 = NULL;
char* token = strtok_s(copyOfDepartments,"\n",&sav1);
int counter = 0;
while(token != NULL){
if(counter == selectedNum){
(*selectedDepartment) = malloc(strlen(token)+1);
strcpy(*selectedDepartment,token);
}
++counter;
token = strtok_s(NULL, "\n", &sav1);
}
free(copyOfDepartments); // release the scratch copy made for strtok_s
}
And this is how it is called from the main function -
int main() {
char* recordsPtr[] = {"aDeparment\nanotherDepartment\nnewDepartment\notherDepartment"};
char* selectedDepartment;
selectDeparment(recordsPtr[0], 1, &selectedDepartment);
printf("%s\n", selectedDepartment);
}
I think you are getting confused about the "A Pointer To What?" you are supposed to return. In your selectDeparment() function, if I understand what is needed, you simply need to return a pointer to the correct department within recordsPTR. You do not need to allocate or tokenize to do that. You already have the index for the department. So simply change the return type to char * and return departments[selectedNum];.
For example, you can whittle-down your example to:
#include <stdio.h>
char *selectDeparment (char **departments, int selectedNum){
return departments[selectedNum];
}
int main (void) {
char *selectedDepartment = NULL;
char *recordsPTR[] = { "aDepartment\n",
"anotherDepartment\n",
"newDepartment\n",
"otherDepartment\n" };
selectedDepartment = selectDeparment (recordsPTR, 1);
fputs (selectedDepartment, stdout);
}
Note: the '*' generally goes with the variable name and not the type. Why? Because:
int* a, b, c;
certainly does NOT declare three-pointers to int,
int *a, b, c;
makes clear that you have declared a single-pointer to int and two integers.
Example Use/Output
Running the example above you would have:
$ ./bin/selectedDept
anotherDepartment
You will want to add array bounds protection to ensure the index passed does not attempt to read past the array bounds. That is left to you.
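One way to add that protection, as a sketch: have the caller pass the number of elements and return NULL for an out-of-range index. The function name selectDepartmentChecked and the count parameter are hypothetical additions, not part of the code above:

```c
#include <stddef.h>

/* Sketch: returns departments[selectedNum], or NULL when the array pointer
   is NULL or the index is outside [0, count). */
const char *selectDepartmentChecked(char *const *departments, size_t count,
                                    size_t selectedNum)
{
    if (departments == NULL || selectedNum >= count)
        return NULL;
    return departments[selectedNum];
}
```

In main() the count can be computed as sizeof recordsPTR / sizeof recordsPTR[0], and the result should be checked against NULL before it is passed to fputs.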
If You Must Use void
If you must use a void type function, then you can pass the Address Of the pointer to the function so the function receives the original address for the pointer in main(). You can then assign the correct department to the original pointer address so the change is visible back in main(). When you pass the Address Of the pointer, it will require one additional level of indirection, e.g.
#include <stdio.h>
void selectDeparment (char **departments, int selectedNum, char **selectedDeparment) {
*selectedDeparment = departments[selectedNum];
}
int main (void) {
char *selectedDepartment = NULL;
char *recordsPTR[] = { "aDepartment\n",
"anotherDepartment\n",
"newDepartment\n",
"otherDepartment\n" };
selectDeparment (recordsPTR, 1, &selectedDepartment);
fputs (selectedDepartment, stdout);
}
(same result, same comment on adding array bounds protection)
Look this over and let me know if I filled in the missing pieces correctly. If not, just drop a comment and I'm happy to help further.
So my first question would be: does fgets overwrite other char* values?
Otherwise, I'm not really sure how I have messed up my mallocs. Below is the code where the value is changing. First line is where the variable is being created.
data[dataIndex++] = createVariable(varName, 1, value, -1, line, NULL);
The code where the variable is being created
Variable *createVariable(char *name, int type, int val, int len, int line, char *string)
{
Variable *var = malloc(sizeof(Variable));
var->name = name;
var->setting = type;
var->num = val;
var->length = len;
var->line = line;
var->string = string;
return var;
}
What data looks like and how it was created.
Variable **data;
data = malloc(4 * sizeof(Variable *));
Forgot to add this, but below is my fgets code
if (fgets(line, MAX_LINE_LENGTH, in) == NULL)
{
break;
}
The problem is this line in your createVariable function:
var->name = name;
What this does is copy the pointer given as the first argument to the name field in the var structure; it doesn't make a (separate) copy of the data that is pointed to! So, assuming you call createVariable many times with the same variable as the first argument, then every object created will have the same address in its name field, and any modifications you make to any of them (via fgets) will change all of them.
To get round this, you need to allocate new memory for the name field, each time you call the createVariable function, then copy the string data to it. The simplest way to do this is using the strdup function:
Variable *createVariable(char *name, int type, int val, int len, int line, char *string)
{
Variable *var = malloc(sizeof(Variable));
var->name = strdup(name);
//...
var->string = string ? strdup(string) : NULL; // the call shown in the question passes NULL
//...
But note, you will now need to be sure to free that memory from each object when you (eventually) delete it. Something like this:
void deleteVariable(Variable** var)
{
free((*var)->name); // free the name memory
free((*var)->string); // free the string memory
free(*var); // free the actual structure
*var = NULL; // set the pointer to NULL - to prevent multiple frees
}
EDIT: Just re-read your question, and noticed that you are making the same mistake with the string field! The same fix needs to be applied to that!
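For completeness, here is a self-contained sketch of the whole function with both fields copied. The Variable layout is reconstructed from the assignments in the question and may not match the real definition, and copy_string is a hypothetical helper that stands in for strdup (a POSIX function) so the sketch compiles under strict ISO C and tolerates a NULL argument:

```c
#include <stdlib.h>
#include <string.h>

/* Assumed layout, reconstructed from the fields createVariable() assigns. */
typedef struct {
    char *name;
    int setting;
    int num;
    int length;
    int line;
    char *string;
} Variable;

/* Hypothetical helper: returns a heap copy of s, or NULL if s is NULL
   or the allocation fails. */
static char *copy_string(const char *s)
{
    if (s == NULL)
        return NULL;
    size_t n = strlen(s) + 1;          /* +1 for the terminator */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

Variable *createVariable(const char *name, int type, int val, int len,
                         int line, const char *string)
{
    Variable *var = malloc(sizeof *var);
    if (var == NULL)
        return NULL;
    var->name = copy_string(name);     /* private copy, not the caller's buffer */
    var->setting = type;
    var->num = val;
    var->length = len;
    var->line = line;
    var->string = copy_string(string); /* stays NULL when no string is given */
    return var;
}
```

With this version, a later fgets into the buffer that was passed as name no longer changes the name stored in any existing Variable.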
I can use the following function to iterate through some text and grab it line by line:
int nextline(char * text, unsigned int * start_at, char * buffer) {
/*
it will return the length of the line if there is a line, or -1 otherwise.
it will fill the character buffer with the line
and return where the pointer has 'finished' for that line so it can be used again
*/
int i;
char c;
if (*start_at > strlen(text)) return -1;
for (i=0; (c = * (text + *start_at + i)); i++) {
buffer[i] = c;
if (c == '\0') break;
if (c == '\n') {
buffer[i+1] = '\0';
break;
}
}
* start_at = * start_at + i + 1;
return i;
}
However, this function requires passing in an offset, for example:
char * longtext = "This is what I went to do\nWhen I came over\nto the place and thought that\nhere we go again";
char buffer[60];
unsigned int line_length, start_at=0;
for (int i=1; (line_length = nextline(longtext, &start_at, buffer)) != -1; i++)
printf("Line %2d. %s\n", i, buffer);
How would I write an equivalent function where it "remembers" where the cursor is and I don't need to keep passing it back into the function?
If you don't want to pass an offset, you could implement this by using a static variable inside of the function to keep track of the current offset. Doing so has disadvantages, however.
Suppose you were to use a static variable for the offset. You then process a complete string. Now what happens when you want to process another string? You need to somehow tell the function to "start over". Also, suppose you wanted to process two separate strings alternately, i.e. you call the function first on one string, then the other, then back to the first. The internal state wouldn't be able to manage that.
The best way to manage these types of issues is to do exactly what you're doing now: passing the address of a variable to keep track of the state. That way it's up to the calling function to keep track of the current state while your nextline function is stateless.
There are a number of older functions in the standard library that use internal state and that were superseded by newer functions that don't. One notable example is strtok. This function uses internal state to tokenize a string. The POSIX function strtok_r was created later; it receives an additional parameter for the state.
So keep your function the way it is. It's generally considered better design to not depend on internal state.
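If the offset bookkeeping feels awkward, a variant of the same caller-owned-state idea is to pass a cursor pointer instead of an index. The following is a sketch with a hypothetical next_line function, not a drop-in replacement: unlike the original, it takes the buffer size, never writes past the buffer, and leaves the newline out of the copied line:

```c
#include <stddef.h>

/* Sketch: copies the line starting at *cursor into buffer (at most size-1
   characters, newline excluded), advances *cursor past the line, and returns
   the number of characters copied, or -1 when no input remains. */
int next_line(const char **cursor, char *buffer, size_t size)
{
    const char *p = *cursor;
    size_t i = 0;

    if (p == NULL || *p == '\0' || size == 0)
        return -1;
    while (*p != '\0' && *p != '\n') {
        if (i + 1 < size)
            buffer[i++] = *p;          /* over-long lines are truncated */
        p++;
    }
    buffer[i] = '\0';
    if (*p == '\n')
        p++;                           /* skip the line terminator */
    *cursor = p;
    return (int)i;
}
```

The caller starts with const char *cur = longtext; and loops while next_line(&cur, buffer, sizeof buffer) != -1. Two strings can be processed alternately simply by keeping two cursors.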
I am working on porting a C FIFO queue implementation into my code and I don't feel comfortable including functions that I don't understand. What does this function do? I see that it returns an integer value, but I don't know what it means. What is the (*iter) parameter? I don't see this type declared in the header or implementation file.
You can find the header here. You can find the implementation here.
The function in question is copied below:
int fifo_iter(fifo_t *f, int (*iter)(void *data, void *arg), void *arg)
{
fifonode_t *fn;
int rc;
int ret = 0;
for (fn = f->f_head; fn; fn = fn->fn_next) {
if ((rc = iter(fn->fn_data, arg)) < 0)
return (-1);
ret += rc;
}
return (ret);
}
int (*iter)(void *data, void *arg) is a function pointer. When you call fifo_iter, you pass it a callback function in the second parameter. This function must have a signature like:
int my_callback(void* data, void* arg)
This function will then be called for every item in the fifo_t. It will be passed the fn_data member in the data argument. Whatever you pass for arg to fifo_iter will also be passed to iter as its arg argument. This is a common way to just pass generic "context" information through to the callback, without having to resort to ugly, thread-unsafe global variables.
So you can use it like this:
int my_callback(void* data, void* arg) {
printf("my_callback(%p, %p)\n", data, arg);
return 0; // always continue
}
void test(void) {
fifo_t myfifo; // Needs to be initialized and populated...
fifo_iter(&myfifo, my_callback, NULL);
}
Additionally, we see that it uses the return value of iter in a special way. First of all, if iter ever returns a negative value, the iteration stops immediately, and fifo_iter returns -1. This can be an "early failure". Otherwise, it accumulates the (non-negative) return values into ret, and then returns that value.
Expanding my example. This assumes that the fifo fn_data members point to strings. This will count the total number of capital and lowercase letters in all strings in the FIFO, and also return the total length of all strings.
// The context that we'll maintain during the iteration
struct my_context {
int caps;
int lowers;
};
// The callback function, called for every string in the FIFO.
int my_callback(void* data, void* arg) {
const char* str = data; // Node data is a string
struct my_context *ctx = arg; // Arg is my context
// If any string in the FIFO has a !, abort immediately.
if (strchr(str, '!'))
return -1;
// Update the context to include the counts for this string
ctx->caps += count_capital_letters(str);
ctx->lowers += count_lowercase_letters(str);
// fifo_iter will accumulate this length
return strlen(str);
}
// Test driver function
void test(void) {
fifo_t myfifo;
struct my_context ctx;
int total;
// Assuming these functions exist:
fifo_init(&myfifo);
fifo_append(&myfifo, "Stack");
fifo_append(&myfifo, "Overflow");
// Initialize the context
ctx.caps = 0;
ctx.lowers = 0;
// Iterate over myfifo, passing it a pointer to the context
total = fifo_iter(&myfifo, my_callback, &ctx);
if (total < 0) {
// Any string had a '!'
printf("Iteration failed!\n");
return;
}
printf("total=%d caps=%d lowers=%d \n", total, ctx.caps, ctx.lowers);
}
If you take a gander through the Linux kernel source code, you'll see this construct all over the place.
For a simple FIFO like we have here, it may not seem worth it. But when you're dealing with more complex data structures like hash lists, and RCU lists, it makes more sense to maintain the iteration logic in just one place, and utilize callbacks to handle the data in whatever way you need to.
It is iterating over the structure, calling the argument function iter() on each element; if an element returns negative, the iteration stops and -1 is returned.
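The examples above depend on the real fifo_t and on helper functions that are not shown. As a sketch that compiles on its own, the same contract can be reproduced with a hypothetical minimal list; struct node, list_iter and track_max are illustrative names, not part of the FIFO library:

```c
#include <stddef.h>

/* Hypothetical stand-in for fifonode_t, reduced to what the iteration needs. */
struct node {
    void *data;
    struct node *next;
};

/* Same contract as fifo_iter: stop with -1 on a negative callback result,
   otherwise return the sum of the callback results. */
int list_iter(struct node *head, int (*iter)(void *data, void *arg), void *arg)
{
    int ret = 0;

    for (struct node *n = head; n != NULL; n = n->next) {
        int rc = iter(n->data, arg);
        if (rc < 0)
            return -1;
        ret += rc;
    }
    return ret;
}

/* Example callback: data points to an int, arg to a running maximum.
   Negative values abort the walk; each accepted value counts as 1. */
int track_max(void *data, void *arg)
{
    int value = *(int *)data;
    int *max = arg;

    if (value < 0)
        return -1;
    if (value > *max)
        *max = value;
    return 1;
}
```

Here the return value of list_iter is the number of elements visited, while the maximum travels back to the caller through arg, which shows the two separate channels the callback has.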
I am using some C with an embedded device and currently testing some code to read file details from an SD card. I am using a proprietary API but I will try to remove that wherever possible.
Rather than explaining, I will try to let my code speak for itself:
char* getImage() {
int numFiles = SD.numFiles(); // number of files on SD card
for(int i=0; i<numFiles;i++) {
// lists the first file name in root of SD
char *temp = SD.ls(i, 1, NAMES);
if(strstr(temp, ".jpg") && !strstr(temp, "_")) {
return temp;
}
}
return NULL;
}
void loop()
{
// list SD contents
USB.println(SD.ls());
const char * image = getImage();
if(image != NULL) {
USB.println("Found an image!");
USB.println(image);
int byte_start = 0;
USB.print("Image Size: ");
USB.println(SD.getFileSize(image));
USB.println(SD.getFileSize("img.jpg"));
}
The two lines at the bottom are the troublesome ones. If I pass a literal string then I get the file size perfectly. However, if I pass the string (as represented by the image variable) then I am given a glorious -1. Any ideas why?
For clarity, the print out of image does display the correct file name.
EDIT: I know it is frowned upon in C to return a char * and better to modify a variable passed to the function. I have used this approach as well and an example of the code is below, with the same result:
char * image = NULL;
getSDImage(&image, sizeof(image));
void getSDImage(char ** a, int length) {
int numFiles = SD.numFiles();
for(int i=0; i<numFiles;i++) {
char *temp = SD.ls(i, 1, NAMES);
if(strstr(temp, ".jpg") && !strstr(temp, "_")) {
*a = (char*)malloc(sizeof(char) * strlen(temp));
strcpy(*a, temp);
}
}
}
EDIT 2: The link to the entry is here: SD.ls and the link for the file size function: SD.getFileSize
From the return, it seems like the issue is with the file size function as the return is -1 (not 0) and because a result is returned when listing the root of the SD.
Thanks!
UPDATE: I have added a check for a null terminated string (it appears that this was an issue) and this has been addressed in the getSDImage function, with the following:
void getSDImage(char ** a, int length) {
int numFiles = SD.numFiles();
for(int i=0; i<numFiles;i++) {
char *temp = SD.ls(i, 1, NAMES);
if(strstr(temp, ".jpg") && !strstr(temp, "_")) {
*a = (char*)malloc(sizeof(char) * strlen(temp));
strncpy(*a, temp, strlen(temp)-1);
*a[strlen(*a)-1] = '\0';
}
}
}
This seems to work and my results on standard output are fine; the size is now shown not as the error-indicating -1 but as -16760. I thought I should post the update in case anyone has any ideas, but my assumption is that this is something to do with the filename string.
There are several things that could be wrong with your code:
1) You might be passing "invisible" characters such as whitespaces. Please make sure that the string you are passing is exactly the same, i.e. print character by character including null termination and see if they are the same.
2) The value that is returned by one API call and later used by another API call may not be as expected. I would advise that (if possible) you look at the API source code. If you can compile the API itself then it should be easy to find the problem (check what getFileSize() receives in its parameters). Based on the API documentation you have sent, check the value stored in buffer[DOS_BUFFER_SIZE] after you get the -1 result.
EDIT (after looking at the API source code):
On line 00657 (func find_file_in_dir) you have:
if(strcmp(dir_entry->long_name, name) == 0)
This seems to be the only reason why you would get a different reply when using a string literal as opposed to the name obtained from your function. So it is very likely that you are not passing the same values (i.e. you are either passing invisible chars or you are missing the string termination).
As a final note: check the content of buffer[DOS_BUFFER_SIZE] before each call to the SD API.
I hope this helps.
Kind regards,
Bo
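The character-by-character check suggested in point 1 could be sketched as follows. Both helpers are hypothetical, and the sketch assumes standard stdio is available; on the device the fprintf calls would have to be replaced by whatever print routine the board provides:

```c
#include <ctype.h>
#include <stdio.h>

/* Prints every byte of s in hex, including the terminating 0, so that
   a stray '\r', '\n' or space becomes visible. */
void dump_bytes(FILE *out, const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    do {
        fprintf(out, "%02x ", (unsigned)*p);
    } while (*p++ != '\0');
    fputc('\n', out);
}

/* Returns the index of the first byte that is not a visible character
   (space, control character, ...), or -1 if there is none. */
int first_invisible(const char *s)
{
    for (int i = 0; s[i] != '\0'; i++)
        if (!isgraph((unsigned char)s[i]))
            return i;
    return -1;
}
```

Dumping both the literal "img.jpg" and the string returned by your function, then comparing the two lines, shows immediately whether they differ in a trailing byte.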
This:
if(strstr(temp, ".jpg") && !strstr(temp, "_")) {
*a = (char*)malloc(sizeof(char) * strlen(temp));
strcpy(*a, temp);
}
is broken: it's not allocating room for the terminator, which causes a buffer overflow.
You should use:
*a = malloc(strlen(temp) + 1);
There's no need to cast the return value of malloc() in C, and sizeof (char) is always 1.
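As a sketch, the allocation and copy can be wrapped in a small helper that also checks the result of malloc(). The name copy_name and its return convention are my own choices, not part of the SD API:

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: stores a heap copy of src in *out. Returns 0 on success,
   -1 on bad arguments or allocation failure. The caller frees *out. */
int copy_name(char **out, const char *src)
{
    if (out == NULL || src == NULL)
        return -1;
    size_t len = strlen(src);
    char *copy = malloc(len + 1);      /* +1 for the '\0' terminator */
    if (copy == NULL)
        return -1;
    memcpy(copy, src, len + 1);        /* copies the terminator too */
    *out = copy;
    return 0;
}
```

A related pitfall when working through a char **: the expression *a[i] parses as *(a[i]), so indexing into the string that a points to has to be written (*a)[i].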