C - how does sprintf() work? - c

I'm new to coding in C and was writing a program which needs to create custom filenames that increment from 0. To make the filenames, I use sprintf() to concatenate the counter and file extension like so:
int main(void)
{
// do stuff
int count = 0;
while (condition == true)
{
char filename[7];
sprintf(filename, "0%02d.txt", count); // count goes up to a max of 50
count++;
//check condition
}
return 0;
}
However, every time
sprintf(filename, "0%02d.txt", count);
runs, count gets reset to 0.
My question is, what does sprintf() do with count? Why does count change after being passed to sprintf()?
Any help would be much appreciated.
EDIT: Sorry, I haven't been too clear with the code in my question - I'm writing the program for an exercise on an online course, and count goes up to a max of 50. I've now changed my code to reflect that. Also, thanks for telling me about %04d, I was using a complicated if statement to determine how many zeroes to add to my filename to make it 3-digit.

Despite the title of the question, this has nothing to do with sprintf(), which probably works as expected, and everything to do with count.
If count is a global variable (i.e. outside any functions), then it should keep its value between function calls. So that is probably not the case.
If it is a local variable (declared inside the function), then it can have any value, since locals lose their value when the function ends and are not initialized automatically when the function is run again. It may always be 0, but under different circumstances it can just as well be something else. In other words, the value is indeterminate.
To have a local variable keep its value between function calls, make it static.
static int count = 0;
But note that when you stop and run the program again, it will start as 0 again. That means you would possibly overwrite 000.txt, then 001.txt, etc.
If you really want to avoid duplicate file names, you will have to be more sophisticated, and see which files are already there, determine the highest number, and increment that by one. So you don't use a variable, you check the files that already exist. That is far more work, but the only reliable way to avoid overwriting existing files with such numbered file names.
FWIW, I would use something like "00%04d.txt" as format string, so you get files 000000.txt, 000001.txt, etc. which look better in an alphabetically sorted file listing than 000.txt, 001.txt, 0010.txt, 0011.txt, 002.txt, etc. They are also easier to parse for their number.
As Weather Vane noticed, be sure to make your buffer a little larger, e.g.
char filename[20];
A buffer that is too small is a problem. One that is too large is not, unless it is so huge that it uses up the stack. That risk is very small with 20 chars.

I think it is likely the sprintf. The string produced by "0%02d.txt" (for example "000.txt") is 7 chars, which fills filename[7] completely. The terminating null then goes in the next location, which is likely count on the stack. On a little-endian machine (x86), that likely means the bottom byte of count gets zeroed out in every sprintf(), and since count never exceeds 50, zeroing the bottom byte resets it to 0.
As other folks said, make the filename buffer larger.
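To illustrate the buffer-size point, here is a minimal sketch using snprintf(), which never writes past the size it is given and reports how many characters the full result needs. The helper name make_filename and its return convention are assumptions made for this example, not part of the original code.

```c
#include <stdio.h>

/* Hypothetical helper: formats "0NN.txt" into buf, which holds `size` bytes.
   Returns 0 on success, -1 if the result (plus its terminator) did not fit. */
int make_filename(char *buf, size_t size, int count)
{
    /* snprintf returns the length the full string would have,
       not counting the terminating null. */
    int needed = snprintf(buf, size, "0%02d.txt", count);

    if (needed < 0 || (size_t)needed >= size)
        return -1;   /* encoding error or truncated output */
    return 0;
}
```

With char filename[8] (7 characters plus the terminator) a call such as make_filename(filename, sizeof filename, count) succeeds for counts up to 99; with the original char filename[7] it reports failure instead of overwriting a neighbouring variable.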

Related

Reading the same input multiple times - C

I want to ask whether it is possible to read the same input (stdin) multiple times. I am about to get a really big number, containing thousands of digits, so I am unable to store it in a variable (and also I cannot use folders!). My idea is to put the digits into an int array, but I don't know how big the array should be, because the number of digits in the input may vary. I have to write a general solution.
So my question is how to solve this, and how to find out the number of digits (so I can initialize the array) before I copy the digits into the array. I tried using scanf() multiple times, or scanf() and getchar(), but it is not working. See my code:
int main(){
int c;
int amountOfDigits=5;
while(scanf("%1d",&c)!=' '){//finding out number of digits with scanf
if(isdigit(c)==0){
break;
}
amountOfDigits++;
}
int digits[amountOfDigits];//now i know lenght of array, and initialize it
for(int i=0;i<amountOfDigits;i++){//putting digits into array
digits[i]=getchar();
}
for(int i=0;i<amountOfDigits;i++){//printing array
printf("%d",digits[i]);
}
printf("\n");
return 0;
}
is it possible to read the same input (stdin) multiple times?
(I am guessing you are a student beginning to learn programming, and you are using Linux; adapt my answer if not)
For your homework, you don't need to read the same input several times. In some cases it is possible (when the standard input is a genuine, seekable file, that is, when you use some redirection in your command). In other cases (e.g. when the standard input is a pipe, as in a command pipeline, or a here document in your shell command) it is not possible to read stdin several times (but you don't need to). In general, don't expect stdin to be seekable with fseek or rewind (it usually is not).
(I am not going to do your homework, but here are useful hints)
so I am unable to store it in a variable (and also I cannot use folders!)
You could do several things:
(since you mentioned folders....) you might use some more sophisticated ways of storing data on the disk (but in your particular case, I don't recommend that ...). These ways could be some direct-access file (ugly), or some indexed file à la gdbm, or some database à la sqlite or even some RDBMS server like PostgreSQL.
In your case, you don't need any of these; I'm mentioning it since you mentioned "folders" and you meant "directories"!
you really should use some heap-allocated memory, so read about C dynamic memory allocation and read carefully the documentation of each of the standard memory management functions like malloc, realloc, free. Your program should probably use all three of these functions (don't forget that malloc & realloc could fail).
Read this and that answers. Both are surprisingly relevant.
You probably should keep somehow:
a pointer to heap allocated int-s (actually, you could use char-s)
the allocated size of that pointer
the used length of that thing, that is the actual number of useful digits.
You certainly don't want to grow your array by calling realloc on every loop iteration (that is inefficient). In practice, you would adopt some growing scheme like newsize = 3*oldsize/2 + 10 to avoid reallocating memory at each step (of your input loop).
you should thank your teacher for such a useful exercise, but you should not expect StackOverflow to do your homework!
Be also aware of arbitrary-precision arithmetic (called bignums or bigints). It is actually hard to code efficiently, so in real-life you would use some library like GMPlib.
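The bookkeeping described above (a heap pointer, its allocated size, and its used length, grown with newsize = 3*oldsize/2 + 10) might be sketched as follows. The function name read_digits and its interface are hypothetical, chosen only for this illustration; it stores the digits as chars and stops at the first non-digit.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads leading decimal digits from `in` into a heap
   buffer that grows as needed. Returns a null-terminated buffer that the
   caller must free, and stores the digit count in *len.
   Returns NULL if an allocation fails. */
char *read_digits(FILE *in, size_t *len)
{
    size_t cap = 0;    /* allocated size */
    size_t used = 0;   /* number of digits stored so far */
    char *buf = NULL;
    int c;

    while ((c = getc(in)) != EOF && isdigit(c)) {
        if (used + 1 >= cap) {                 /* keep room for the '\0' */
            size_t newcap = 3 * cap / 2 + 10;  /* growth scheme from the text */
            char *tmp = realloc(buf, newcap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = newcap;
        }
        buf[used++] = (char)c;
    }

    if (buf == NULL) {                         /* no digits were read */
        buf = malloc(1);
        if (buf == NULL)
            return NULL;
    }
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

Calling read_digits(stdin, &n) reads the input once, so there is no need to know the number of digits in advance or to rewind stdin.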

Flex - Function that compares strings in C

I'm programming with flex in C, and I want to compare strings in a file (in this case my symbol table) with yytext. If yytext matches a string in the table, the function should exit; if there is no match in the table, the function should write the string to the symbol table.
This is my function:
search (char *x){
int c;
int n = 0;
char *cdn;
while ((c = fgetc(comp)) != EOF){
fscanf(comp, "%s", cdn);
if (strcmp(cdn, yytext) == 0){
n++; //if n>0 when it finishes searching the file then there's a copy on the file
}else{}
return 0;
}
if (n==0){
fprintf(comp, "%d\t %s\n", pos++, yytext); //will write if there's no copy in the table
}else return 0;}
The input for the function is yytext, yytext will have for example "a".
After running this, the program doesn't write anything and it needs to be closed manually. (More like program.extension has stopped working.)
Can someone help me with this?
First of all, putting your symbol table into a file is a very debatable design choice:
symbols are checked very frequently, so file accesses will slow down your compiler a lot.
symbols will be associated with a lot of grammatical information (for instance, they may represent a variable with an associated type), so storing only the names will not be enough to make later stages of your compiler work.
If you store all symbol information in a file, you will have to re-read the entire file and convert each bit of information into a memory representation each time you want to access a given symbol.
This is not only inefficient, it will also force you to write tons of unnecessary and complicated code.
Now for your search function.
Regardless of the current bugs, what your function does is not a search, though you need to search your file to make it work.
What your function does is create a unique list of yytext values. The "search" you're performing inside it simply makes sure an already present value is not duplicated.
The very first thing to do would be to give it a less misleading name, or modify it so that it does what its name implies.
Now for the bugs
If for some reason you still want to use a file, I suppose you will put each name on a single line.
So why not use fgets(), that will take care of the line endings for you?
Whatever method you are using to read each name, you will have to provide a buffer with actual storage space for the string, not just an uninitialized pointer.
If your input string is yytext, your x parameter will never be used.
Lastly, your search function (which inserts the current yytext value into an unsorted list), has no reason to return anything (except an error code if your disk gets full and you can't add new names to the list).
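As an alternative to the file, an in-memory table could look roughly like the sketch below. Everything here (the fixed-size array, MAX_SYMBOLS, and the name symbol_intern) is an assumption made for illustration; a real compiler would likely store more than the name and use a hash table instead of a linear scan.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_SYMBOLS 1024              /* arbitrary limit for this sketch */

static char *symbols[MAX_SYMBOLS];    /* owned copies of the names */
static int symbol_count = 0;

/* Hypothetical helper: returns the index of `name` in the table, adding it
   first if it is not present. Returns -1 if the table is full or memory
   runs out. */
int symbol_intern(const char *name)
{
    for (int i = 0; i < symbol_count; i++)
        if (strcmp(symbols[i], name) == 0)
            return i;                 /* already known: no duplicate added */

    if (symbol_count == MAX_SYMBOLS)
        return -1;

    /* Copy the name into storage the table owns. */
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);

    symbols[symbol_count] = copy;
    return symbol_count++;
}
```

In a flex action this would be called as symbol_intern(yytext). The copy matters because flex reuses the yytext buffer for the next token, so storing the yytext pointer itself would not preserve the name.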

Getting character attributes

Using WinAPI to get the attribute of a character located in y line and x column of the screen console.
This is what I am trying to do after a call to GetConsoleScreenBufferInfo(GetStdHandle(STD_OUTPUT_HANDLE), &nativeData); where the console cursor is set to the specified location. This won't work. It will return the last used attribute change instead.
How do I obtain the attributes used on all the characters on their locations?
EDIT:
The code I used to test ReadConsoleOutput() : http://hastebin.com/atohetisin.pl
It throws garbage values.
I see several problems off the top of my head:
No error checking. You must check the return value for ReadConsoleOutput and other functions, as documented. If the function fails, you must call GetLastError() to get the error code. If you don't check for errors, you're flying blind.
You don't allocate a buffer to receive the data in. (Granted, the documentation confusingly implies that it allocates the buffer for you, but that's obviously wrong since there's no way for it to return a pointer to it. Also, the sample code clearly shows that you have to allocate the buffer yourself. I've added a note.)
It looks as if you had intended to read the characters you had written, but you are writing to (10,5) and reading from (0,0).
You're passing newpos, which is set to (10,5), as dwBufferCoord when you call ReadConsoleOutput, but you specified a buffer size of (2,1). It doesn't make sense for the target coordinates to be outside the buffer.
Taking those last two points together I think perhaps you have dwBufferCoord and lpReadRegion confused, though I'm not sure what you meant the coordinates (200,50) to do.
You're interpreting CHAR_INFO as an integer in the final printf statement. The first element of CHAR_INFO is the character itself, not the attribute. You probably wanted to say chiBuffer[0].Attributes rather than just chiBuffer[0]. (Of course, this is moot at the moment, since chiBuffer points to a random memory address.)
If you do want to retrieve the character, you'll first need to work out whether the console is in Unicode or ASCII mode, and retrieve UnicodeChar or AsciiChar accordingly.

Value in C variable disappears

I have a C program that is giving me trouble. It is a plugin for the X-Plane flight simulator. You can view the whole code here. The basic function is pulling some information from curl, running and recording information about a flight, and then finally compiling a report of the flight.
The problem is that there is one variable that is misbehaving. I declare it at the beginning of the code because it is needed across multiple functions.
char fltno[9];
The content is set using a plugin function to get the value from a text box. It takes the value from the FltNoText widget, 8 characters long, and assigns it to the fltno variable.
XPGetWidgetDescriptor( FltNoText, fltno, 8);
I build some messages using the following method that include the fltno variable.
messg = malloc(snprintf(NULL, 0, "Message stuff %s", fltno) + 1);
sprintf(messg, "Message stuff %s", fltno);
This works just fine throughout the running of the program. Right up until the last time it is needed. This is the section that begins with:
if (inParam1 == (long)SendButton)
This will run at the end. When running this section, the fltno variable returns no text. There are many other variables which I am using in the same way, and they all seem to work fine. I ran the program over a short flight and it worked fine, but the two times I have run it on a longer flight, the variable has returned blank in that section.
Let me know if I need to explain more. You can probably tell I haven't written much C so suggestions are appreciated.
More Info
It was suggested that the adjacently declared variables could be overflowing into fltno.
The Ppass[64] variable is copied from Ppass_def, which is set in the code and is definitely smaller than 64.
strcpy(Ppass, Ppass_def);
The tailnum[41] variable is read from a plugin function.
XPLMGetDatab( tailnum_ref, tailnum, 0, 40 );
Both of these variables are set at the beginning of the program, so it doesn't make sense that fltno only misbehaves at the very end.
Update 1
Thanks for the comments. As suggested by rodrigo, I replaced the longest malloc/sprintf instance with calls to some functions. Not sure if what I did is what you had in mind, but it seems to work at least as well as it did before. You can see the new code on the Github link.
I also did more testing and narrowed down a bit where the problem could be. The fltno variable is fine when run the last time at line 1193, but by the end of the case at line 1278, it is blank. I will try to do more testing later to narrow it down further.
Update 2
After more testing I narrowed it down to line 1292. Before that line the fltno is fine, after that it becomes blank.
strcpy(INlon, dlon);
The dlon variable is declared within the if statement where the switch is.
char dlon[12];
This variable is set every time that section runs, so it's strange that there are no problems until the end. It's set from a function that takes the decimal degrees latitude or longitude and returns a string like "N33 45.9622".
strcpy(dlon, degdm(lon, 1));
The INlon variable is declared at the beginning of the program, and this is the only time it is set.
char INlon[12];
Any ideas about why this part messes up the fltno variable?
Update 3
Thanks for the suggestions about strncpy and an alternative to the double sprintf calls. I changed a few other things and it seems to work now, I will do a full test later to be sure.
A key part that I really should have caught before now is that the latitude/longitude buffers were only 12 bytes long, which is too short. When the longitude is over 100 degrees, the string could be "W100 22.5678", which is 12 characters. This left no room for the terminating NUL and was the source of some of my problems.
char INlat[13];
I noticed this when I was getting something like "W100 22.5678ABC1234" from those variables, where "ABC1234" is the fltno variable.
I used a pointer for dlat to avoid size problems, and use strncpy to make sure to not spill over to other variables.
strncpy(INlat, dlat, sizeof(INlat));
Finally, I found the asprintf function to replace the double instances of sprintf. I understand this won't work on all systems but it's a simple fix for now.
char * purl = NULL;
asprintf(&purl, "DATA1=%s&DATA2=%s", DATA1v1, DATA2);
So thanks for your help in fixing my code (assuming that it's working now). I feel like I knew enough to be able to fix it (once I was pointed in the right direction) but I'm still not sure exactly what was going on. If anyone wants to post an explanation of what was going wrong (as far as you can tell) I would be happy to accept an answer.
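A likely explanation, assuming fltno happens to sit directly after INlon in memory: strcpy() of a 12-character string into char INlon[12] writes the terminating NUL one byte past the end, onto fltno[0], which makes fltno read as an empty string. This only happens once the longitude passes 100 degrees, which would fit the short flights working. Note also that strncpy(dst, src, sizeof dst) does not add a terminator when the source fills the buffer. The sketch below shows one bounded copy that always terminates; copy_bounded is a hypothetical name, not something from the plugin code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dstsize bytes) and
   always NUL-terminates. Returns 1 if the whole string fit, 0 if it had
   to be truncated. */
int copy_bounded(char *dst, size_t dstsize, const char *src)
{
    /* snprintf writes at most dstsize - 1 characters plus the terminator
       and returns the length the full string would have had. */
    int n = snprintf(dst, dstsize, "%s", src);
    return n >= 0 && (size_t)n < dstsize;
}
```

With a 13-byte buffer, copy_bounded(INlat, sizeof INlat, dlat) holds "W100 22.5678" in full; with a 12-byte buffer it truncates and reports it rather than spilling into the next variable.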

Buffer overflow that overwrites local variables

I'm doing a buffer overflow exercise where the source code is given. The exercise allows you to change the number of argument vectors you feed into the program so you can get around the null problem making it easy.
However the exercise also mentions that it is possible to use just 1 argument vector to compromise this code. I'm curious to see how this can be done. Any ideas on how to approach this would be greatly appreciated.
The problem here is that length needs to be overwritten in order for the overflow to take place and the return address to be compromised. To my knowledge, you can't really use NULLs in the string since they are being passed in via execve arguments. So the length ends up being a very large number, as you have to write some non-zero number, causing the entire stack to go boom; it's the same case with the return address. Am I missing something obvious? Does strlen need to be exploited? I saw some references to arithmetic overflow of signed numbers, but I'm not sure if turning the local variables negative does anything.
The code is posted below and returns to a main function which then ends the program and runs on a little endian system with all stack protection turned off as this is an introductory exercise for infosec:
int TrickyOverflowSeq ( char *in )
{
char to_be_exploited[128];
int c;
int limit;
limit = strlen(in);
if (limit > 144)
limit = 144;
for (c = 0; c <= limit; c++)
to_be_exploited[c] = in[c];
return(0);
}
I don't know where the in argument comes from, but since your buffer is only 128 bytes, and you cap the max length to 144, you need only pass in a string longer than 128 bytes to cause a buffer overrun when copying into to_be_exploited. Any malicious code would be in the input buffer from positions 129 to 144.
Whether or not that will properly set up a return to a different location depends on many factors.
However the exercise also mentions that it is possible to use just 1 argument vector to compromise this code. I'm curious to see how this can be done.
...
The problem here is that length needs to be overwritten in order for the overflow to take place and the return address to be compromised.
It seems pretty straightforward to me. That magic number 144 makes sense if sizeof(int) == 8, which it would if you are building for 64-bit.
So assuming a stack layout where to_be_exploited comes before c and limit, you can simply pass in a very long string with junk in the bytes starting at offset 136 (i.e., 128 + sizeof(int)), and then carefully crafted junk in the bytes starting with offset 144. This will overwrite limit starting with that byte, thus disabling the length check. Then the carefully crafted junk overwrites the return address.
You could put almost anything into the 8 bytes starting at offset 136 and have them make a number that is large enough to disable the security check. Just make sure you don't end up with a negative number. For example, the string "HAHAHAHA" would evaluate, as an integer, to 5206522089439316033. This number is larger than 144... actually, it's too large as you want this function to stop copying once your string is copied. So you just need to figure out how long your attack string actually is and put the correct bytes for that length into that position, and the attack will be copied in.
Note that normal string-handling functions in C use a NUL byte as a terminator, and stop copying. This function doesn't do that; it just trusts limit. So you could put any junk you want in the input string to exploit this function. However, if normal C library functions need to copy the input data, you might end up needing to avoid NUL bytes.
Of course nobody should put code this silly into production.
EDIT: I wrote the above in a hurry. Now that I have more time, I re-read your question and I think I better understand what you wanted to have explained.
You are wondering how a string can correctly clobber limit with a correct length without having strlen() chop it off short. This is impossible on a big-endian computer, but perfectly possible on a little-endian computer.
On a little-endian computer, the first byte is the least significant byte. See the Wikipedia entry:
http://en.wikipedia.org/wiki/Endianness
Any number that is not ridiculously large must have zero in its most significant bytes. On a big-endian computer that means the first several bytes will all be zero, will act like a NUL, and will cause strlen() to chop the string before the function can clobber limit. However, on a little-endian computer, the important bytes you want copied will all come before the NUL bytes.
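The byte-order argument can be checked with a small sketch. It uses 32-bit values for brevity (an assumption of this example; the same reasoning applies to wider integers), and the function names are made up for illustration. The encoding is done explicitly, so the result does not depend on the machine it runs on.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes value into out[0..3], least significant byte first (little-endian). */
void encode_le32(uint32_t value, unsigned char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(value >> (8 * i));
}

/* Writes value into out[0..3], most significant byte first (big-endian). */
void encode_be32(uint32_t value, unsigned char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(value >> (8 * (3 - i)));
}

/* Counts the bytes a NUL-terminated string function would see
   before the first zero byte (at most 4). */
size_t bytes_before_nul(const unsigned char bytes[4])
{
    size_t n = 0;
    while (n < 4 && bytes[n] != 0)
        n++;
    return n;
}
```

For a small length such as 200, the little-endian bytes are C8 00 00 00, so the one significant byte comes before the first NUL. The big-endian bytes are 00 00 00 C8, so a string would end before the significant byte is reached.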
In the early days of the Internet, it was common for big-endian computers (often bought from Sun Microsystems) to run Internet server apps. These days, commodity x86 server hardware is most common, and x86 is little-endian. In practice, anyone deploying such exploitable code as the TrickyOverflowSeq() function will get 0wned.
If you don't think this answer is thorough enough, please post a comment explaining what part you think I need to cover better and I'll update the answer.
I am aware that this is quite an old post, however I stumbled on your question because I found myself in the same situation with exactly the same questions as the ones you ask in your post and in the comments.
A few minutes later, I solved the problem. I don't know how much of it I should "spoil" here, since AFAIK this is a typical problem in many Computer Security courses. I can say however that the solution can indeed be achieved with exactly one argument... and with a couple of environment variables. Additional hint: environment variables are stored after function arguments on the stack (that is, at higher addresses than the function arguments).
