Semantically identical codes give different results [closed] - c

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 4 years ago.
I couldn't wrap my head around this one; I hope the title isn't too misleading. Why does write behave differently with respect to its third argument, count, in the two snippets of code below? It would seem that calling a function instead of passing a precomputed number to write is a bad thing, but that doesn't seem like it should be a big deal.
Wrong version:
int main(int argc, char const *argv[])
{
char format[50];
char formattedTime[50];
time_t t;
if (read(STDIN_FILENO, format, 50) < 0)
fatalError("read() error");
time(&t);
strftime(formattedTime, 50, format, localtime(&t));
if (write(STDOUT_FILENO, formattedTime, strlen(formattedTime) + 1) != strlen(formattedTime) + 1)
fatalError("write() error");
return 0;
}
Right version:
int main(int argc, char const *argv[])
{
char format[50]; // desired format
char formattedTime[50]; // formatted time
time_t t; // current time
// read the desired time format from input
if (read(STDIN_FILENO, format, 50) < 0)
fatalError("read() error");
// record the current time
time(&t);
strftime(formattedTime, 50, format, localtime(&t));
int n;
n = strlen(formattedTime) + 1;
// write to output
if (write(STDOUT_FILENO, formattedTime, n) != n)
fatalError("write() error");
return 0;
}
Right output:
%a %b %d
Wed Jan 16
Wrong output:
%a %b %d
Wed Jan 16
0
Why would calculating n just a step before the call to write make all the difference?
EDIT:
I hope this provides all the required information. The gibberish is different every time, but the point remains.

If you really see that behavior, it probably means the null terminator is missing from formattedTime. By chance, n sits just after it on the stack and its presence supplies a null byte, or something equivalent happens because of other data saved on the stack.

The read function is primarily intended to read binary data, not strings. As such, it reads only the characters you enter (i.e. a sequence of characters followed by a newline) without adding a terminating null byte. As a result, format is not a properly terminated string, so strftime can read past the bytes that were written, into bytes that were never initialized and possibly past the end of the array. This invokes undefined behavior.
The "right version" seems to work because you got "lucky". That's one of the ways undefined behavior can manifest itself. Other people could see the opposite results of what you see.
You need to capture how many bytes were read and manually add a terminating null byte to the array:
int rval;
if ((rval=read(STDIN_FILENO, format, 49)) < 0)
fatalError("read() error");
format[rval] = 0;
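The fix above can be wrapped in a small helper. This is only a sketch: read_cstring is a hypothetical name (not a POSIX function), and it assumes you are happy for over-long input to be cut off at size - 1 bytes.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read at most size - 1 bytes from fd into buf and
   always null-terminate the result.
   Returns the number of bytes read, or -1 if read() failed. */
ssize_t read_cstring(int fd, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    ssize_t n = read(fd, buf, size - 1);  /* keep one byte for '\0' */
    if (n < 0)
        return -1;

    buf[n] = '\0';  /* read() never adds the terminator itself */
    return n;
}
```

With this, read_cstring(STDIN_FILENO, format, sizeof format) would replace the bare read() call, and format is safe to hand to strftime afterwards.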

Related

C reading from file: Reading only first char of line [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 years ago.
I have to get node ids from DIMES ASNodes.csv (http://netdimes.org/new/?q=node/65) files.
File looks like this:
6067,UNKNOWN,2007-02-03 10:03:53.0,2007-01-02 02:54:13.0,12,6,0
29287,UNKNOWN,2007-02-03 21:11:07.0,2007-01-02 07:33:35.0,1,0,0
...
So far I have come up with this code, but it doesn't work quite right. Although it prints out all the numbers I need, it also prints each node id twice and sometimes prints zeroes in between. Thanks for any ideas.
void loadNodes(const char* filename)
{
FILE* nodes = fopen(filename, "r");
unsigned int id = 0;
char line[64];
while (fgets(line, sizeof(line), nodes) != NULL) {
sscanf(line, "%u%*[^\n]", &id);
printf("id = %u\n", id);
}
fclose(nodes);
}
I think the trouble is that your lines have 63 characters plus a newline, which means that the fgets() reads up to, but not including, the newline (and you process that and get the correct number), then the next fgets() reads the newline that was left behind on the previous input (and you process that — it is surprising that you get zeros rather than a repeat of the previous number).
Here's your code converted into an MCVE (How to create a Minimal, Complete, and Verifiable Example?) main() program that reads from standard input (which saves me from having to validate, open and close files):
#include <stdio.h>
int main(void)
{
unsigned id = 0;
char line[64];
while (fgets(line, sizeof(line), stdin) != NULL)
{
printf("Line: [%s]\n", line);
sscanf(line,"%u", &id);
printf("id = %u\n", id);
}
return 0;
}
Note the diagnostic printing of the line just read. The code should really check the return value from sscanf(). (There was no virtue in skipping the trailing debris, so I removed that from the format string.)
Given the data file (data):
6067,UNKNOWN,2007-02-03 10:03:53.0,2007-01-02 02:54:13.0,12,6,0
29287,UNKNOWN,2007-02-03 21:11:07.0,2007-01-02 07:33:35.0,1,0,0
The output I get from so.37103830 < data is:
Line: [6067,UNKNOWN,2007-02-03 10:03:53.0,2007-01-02 02:54:13.0,12,6,0]
id = 6067
Line: [
]
id = 6067
Line: [29287,UNKNOWN,2007-02-03 21:11:07.0,2007-01-02 07:33:35.0,1,0,0]
id = 29287
Line: [
]
id = 29287
Avoiding the problem
The simplest fix is to use a longer buffer length; I normally use 4096 when I don't care about what happens if a really long line is read, but you might decide that 128 or 256 is sufficient.
Otherwise, I use POSIX getline() which will read arbitrarily long lines (subject to not running out of memory).
With a longer line length, I get the output:
Line: [6067,UNKNOWN,2007-02-03 10:03:53.0,2007-01-02 02:54:13.0,12,6,0
]
id = 6067
Line: [29287,UNKNOWN,2007-02-03 21:11:07.0,2007-01-02 07:33:35.0,1,0,0
]
id = 29287
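For what it's worth, here is a rough sketch of the getline() route combined with a checked sscanf(). The function name read_node_ids and the idea of collecting the ids into a caller-supplied array are my own assumptions for illustration, not something from the question.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: store the leading unsigned field of each line read
   from `in` into ids[], up to max entries. Lines that do not start with a
   number are skipped. Returns the number of ids stored. */
size_t read_node_ids(FILE *in, unsigned *ids, size_t max)
{
    assert(in != NULL && ids != NULL);

    char *line = NULL;  /* getline() allocates and grows this buffer */
    size_t cap = 0;
    size_t count = 0;

    while (count < max && getline(&line, &cap, in) != -1) {
        unsigned id;
        /* Accept the line only if the conversion actually succeeded. */
        if (sscanf(line, "%u", &id) == 1)
            ids[count++] = id;
    }

    free(line);
    return count;
}
```

Because getline() always returns a whole line, there is no leftover newline to be picked up as a separate "line", and the sscanf() check means an empty line can never repeat the previous id.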
Assuming you only need the first column from the file (since you mention node ids), you could use:
unsigned int node_id;
char str[100];
while(scanf("%u,%99[^\n]",&node_id, str) == 2) { /* width 99 keeps scanf from overflowing str[100] */
printf("%u\n",node_id);
}

C- leading zero without printf [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 6 years ago.
Ethernet starter kit(PIC32MX9795F512L)
language: C
MPLAB IDE 8.92
Compiler: XC32 v1.3
Hello, I want to add leading zeros to my variables. In the end I want to use them in an array.
For example: c=10*a+b. When c=5 it should be 05. I can't use any printf function, or am I wrong?
You can use printf() to simply print a formatted number to standard output:
int c = 5;
fprintf(stdout, "c [%02d]\n", c);
If you can't use printf(), another option is to store the padded value in a char * or string. You can instead use sprintf() to write the formatted string to a char * buffer.
For example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (int argc, char *argv[]) {
char* c_str = NULL;
int c_int = 5;
int c_str_length = 3; /* two bytes for "0", "5", and one byte for the nul terminator */
c_str = malloc(c_str_length);
if (!c_str) {
fprintf(stderr, "Error: Could not allocate space for string!\n");
return EXIT_FAILURE;
}
int n = sprintf(c_str, "%02d", c_int);
if (n != c_str_length - 1) { /* sprintf() returns the character count without the nul terminator */
fprintf(stderr, "Error: Something went wrong in writing the formatted string!\n");
free(c_str);
return EXIT_FAILURE;
}
fprintf(stdout, "c_str: [%s]\n", c_str);
free(c_str);
return EXIT_SUCCESS;
}
If you go this route, you can see how you could do some error checking along the way. You'll need to think about string length (hint: log10()), or use a static char [] array in place of a char * of sufficiently long length.
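If snprintf() is available in your toolchain's C library (an assumption on my part for XC32), the length bookkeeping gets simpler, because snprintf() reports how much room the output needed. A minimal sketch, with format_padded being a made-up name:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write `value` into buf, zero-padded to at least
   `width` characters (width is assumed to be non-negative).
   Returns 0 on success, -1 on error or if buf is too small. */
int format_padded(char *buf, size_t size, int value, int width)
{
    assert(buf != NULL && size > 0);

    /* "%0*d" takes the field width from the argument list. */
    int n = snprintf(buf, size, "%0*d", width, value);
    if (n < 0 || (size_t)n >= size)
        return -1;  /* encoding error, or the output was truncated */

    return 0;
}
```

For example, format_padded(buf, sizeof buf, 5, 2) leaves "05" in buf, and a value that already has two digits is written unchanged.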
It is quite easy to add a leading zero, provided you take care of negative values too. You said you want to write to an array, so I used sprintf but if you want to output directly, you can use printf in a similar way.
char cstr[24];
int c = 10 * a + b;
if (c > 0) {
sprintf(cstr, "0%d", c);
} else if (c < 0) {
sprintf(cstr, "-0%d", -c);
} else {
//sprintf(cstr, "00");
sprintf(cstr, "0"); // depending on your needs
}

C Programming: Using function to return size of array [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 years ago.
I have been trying to create a function in C that reads in doubles that are then stored into an array. I want to be able to return the size of the array so that I can use it in main. So the purpose of this function is to ask the user to input values into an array and type in ^d (ctrl+d) or EOF to end the loop.
#include <stdio.h>
//Prototype Declaration
int getdata(double[], int);
int main(int argc, char* argv[]) {
double array[20];
int count = 0, max = 20;
//Calls the getdata function
count = getdata(array, max);
printf("%s%lf%s","Array1: " ,array[1], "\n");
printf("%s%d%s", "Count is: ", count, "\n");
return 0;
}
----- (I'm linking modules so these are in different files) -------
#include <stdio.h>
//Define getdata.c
int getdata(double values[], int limit){
printf("%s","Please enter your values into the array.\n");
int count = 0;
double n;
while ((count < limit) && (scanf("%lf",&n) != EOF)) {
values[count] = n;
count++;
}
return count;
}
What happens, though, is that if I break the loop early, the array values pass by fine but the count would just print out 20. I am assuming that the program automatically fills in the rest of the empty indexes with something and continues to increment the count value after I type in EOF. What can I do so that the program can correctly get the amount of values inputted? I do not code much in C. I mostly do my work in C++. I am familiar with passing by reference but using C.
You need to be aware that scanf() won't swallow input that can't be converted. If there is a problem then it leaves the (faulty) input in the stream for you to fail to convert on the next iteration of the loop. So it's repeatedly trying to match the same invalid thing as a double.
I suggest using fgets() plus sscanf() as a better solution than using scanf(), or (at minimum) check the return value from scanf() to ensure you got the expected number of fields read (which is 1 in this case). I bet it is returning 0 rather than EOF, so your loop continues until count matches limit.
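As a minimal sketch of the "check the return value" option: the loop below stops as soon as a conversion fails, whether that is because of EOF or because of input that is not a number. I have made it read from a FILE * argument (getdata_from is a made-up name) only so that it can be tried without a terminal; with stdin it behaves like the original getdata().

```c
#include <assert.h>
#include <stdio.h>

/* Variant of getdata() that reads from an arbitrary stream.
   Returns how many doubles were actually stored in values[]. */
int getdata_from(FILE *in, double values[], int limit)
{
    assert(in != NULL && values != NULL);

    int count = 0;
    double n;

    /* fscanf() returns 1 only when a double was converted.
       0 means unconvertible input, EOF means end of input. */
    while (count < limit && fscanf(in, "%lf", &n) == 1) {
        values[count] = n;
        count++;
    }
    return count;
}
```

Comparing against 1 instead of EOF is the whole change; everything else is the code from the question.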

Check if user input into an array is too long?

I am getting the user to input 4 numbers. They can be input: 1 2 3 4 or 1234 or 1 2 34 , etc. I am currently using
int array[4];
scanf("%1x%1x%1x%1x", &array[0], &array[1], &array[2], &array[3]);
However, I want to display an error if the user inputs too many numbers: 12345 or 1 2 3 4 5 or 1 2 345 , etc.
How can I do this?
I am very new to C, so please explain as much as possible.
//
Thanks for your help.
What I have now tried to do is:
char line[101];
printf("Please input");
fgets(line, 101, stdin);
if (strlen(line)>5)
{
printf("Input is too large");
}
else
{
array[0]=line[0]-'0'; array[1]=line[1]-'0'; array[2]=line[2]-'0'; array[3]=line[3]-'0';
printf("%d%d%d%d", array[0], array[1], array[2], array[3]);
}
Is this a sensible and acceptable way? It compiles and appears to work on Visual Studios. Will it compile and run on C?
OP is on the right track, but the code needs adjusting to deal with errors.
The current approach, using scanf(), can detect problems but cannot recover from them well. Instead, use a fgets()/sscanf() combination.
char line[101];
if (fgets(line, sizeof line, stdin) == NULL) HandleEOForIOError();
unsigned arr[4];
char ch; /* %c stores into a char, so this must not be an int */
int cnt = sscanf(line, "%1x%1x%1x%1x %c", &arr[0], &arr[1], &arr[2],&arr[3],&ch);
if (cnt == 4) JustRight();
if (cnt < 4) Handle_TooFew();
if (cnt > 4) Handle_TooMany(); // cnt == 5
ch catches any lurking non-whitespace char after the 4 numbers.
Use %1u if looking for 1 decimal digit into an unsigned.
Use %1d if looking for 1 decimal digit into an int.
OP's second approach, array[0]=line[0]-'0'; ..., is not bad, but has some shortcomings. It does not check for errors (non-numeric input), and unlike the first approach it does not handle hexadecimal digits. Further, it does not allow for leading or interspersed spaces.
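Putting the sentinel idea into one function, here is a sketch that uses %1u for decimal digits. The name parse_four_digits and the 0 / 1 / -1 return convention are assumptions of mine, chosen only to make the three outcomes easy to tell apart.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse exactly four single decimal digits from line,
   with optional whitespace between them.
   Returns 0 if exactly four were found, 1 if extra non-whitespace input
   follows them, and -1 if fewer than four digits could be read. */
int parse_four_digits(const char *line, unsigned out[4])
{
    assert(line != NULL && out != NULL);

    char extra;  /* catches any non-whitespace character after the digits */
    int cnt = sscanf(line, "%1u%1u%1u%1u %c",
                     &out[0], &out[1], &out[2], &out[3], &extra);

    if (cnt == 4)
        return 0;
    if (cnt == 5)
        return 1;
    return -1;  /* also covers EOF from an empty line */
}
```

With this, "1 2 3 4", "1234" and "1 2 34" are all accepted, while "12345" and "1 2 3 4 5" are reported as too long.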
Your question might be operating system specific. I am assuming it could be Linux.
You could first read an entire line with getline(3) (or readline(3), or even fgets(3) if you accept an upper limit on the input line size), then parse that line (e.g. with sscanf(3) and the %n format specifier). Don't forget to test the result of sscanf (the number of items read).
So perhaps something like
int a=0,b=0,c=0,d=0;
char* line=NULL;
size_t linesize=0;
int lastpos= -1;
ssize_t linelen=getline(&line,&linesize,stdin);
if (linelen<0) { perror("getline"); exit(EXIT_FAILURE); };
int nbscanned=sscanf(line," %1d%1d%1d%1d %n", &a,&b,&c,&d,&lastpos);
if (nbscanned>=4 && lastpos==linelen) {
// be happy
do_something_with(a,b,c,d);
}
else {
// be unhappy
fprintf(stderr, "wrong input line %s\n", line);
exit(EXIT_FAILURE);
}
free(line); line=NULL;
And once you have the entire line, you could parse it by other means like successive calls of strtol(3).
Then, the issue is what happens if the stdin has more than one line. I cannot guess what you want in that case. Maybe feof(3) is relevant.
I believe that my solution might not be Linux specific, but I don't know. It probably should work on Posix 2008 compliant operating systems.
Be careful about the result of sscanf when having a %n conversion specification. The man page tells that standards might be contradictory on that corner case.
If your operating system is not POSIX compliant (e.g. Windows), then you should find another way. If you are willing to limit the line size to e.g. 128, you might code
char line[128];
memset (line, 0, sizeof(line));
fgets(line, sizeof(line), stdin);
ssize_t linelen = strlen(line);
Then append the sscanf call and the code following it from the previous (i.e. first) code chunk, but without the last line calling free(line).
What you are trying to get is 4 digits with or without spaces between them. For that, you can take a string as input and then check that string character by character and count the number of digits(and spaces and other characters) in the string and perform the desired action/ display the required message.
You can't do that with scanf. The problem is that there are ways to make scanf look for something after the 4 numbers, but all of them will just sit there and wait for more user input if the user does NOT enter more. So you'd need to read the whole line with fgets() (not gets(), which cannot limit the input length and was removed from the language in C11) and parse the string yourself.
It would probably be easier for you to change your program, so that you ask for one number at a time - then you ask 4 times, and you're done with it, so something along these lines, in pseudo code:
i = 0
while i < 4
ask for number
scanf number and save in array at index i
increment i
E.g.:
#include <stdio.h>
int main(void){
int array[4], ch;
size_t i, size = sizeof(array)/sizeof(*array);//4
i = 0;
while(i < size){
if(1!=scanf("%1x", &array[i])){
//printf("invalid input");
scanf("%*[^0123456789abcdefABCDEF]");//or "%*[^0-9A-Fa-f]"
} else {
++i;
}
}
if('\n' != (ch = getchar())){
printf("Extra input !\n");
scanf("%*[^\n]");//remove extra input
}
for(i=0;i<size;++i){
printf("%x", array[i]);
}
printf("\n");
return 0;
}

c string construction

I'm trying to parse argv back into a single string to use for a system call.
The string shows up fine through the printf call (even without the \0 terminator), but using it as a parameter to system creates all sorts of undefined behaviour.
How can I ensure the string is properly terminated?
Is there a better and more reliable way to go about parsing char[][] into char[]?
#include <stdio.h>
int main(int argc,char *argv[]){
char cmd[255]="tcc\\tcc.exe ";
char**ptr=argv+1;
while(*ptr){
strcat(cmd,*ptr);
++ptr;
if(*ptr)cmd[strlen(cmd)]=' ';
}
printf("cmd: ***%s***\n",cmd);
system(cmd);
}
I just discovered and corrected another flaw in this code: system commands need (escaped) backslashes for file paths.
This instruction:
if(*ptr)
cmd[strlen(cmd)]=' ';
else
cmd[strlen(cmd)]='\0';
will break cmd, because it will overwrite its zero termination. Try instead:
l = strlen(cmd);
if (*ptr) {
cmd[l++] = ' ';
}
cmd[l] = 0x0;
This will append a space, and zero terminate the string. Actually, since it is already zero terminated, you could do better:
if (*ptr) {
int l = strlen(cmd);
cmd[l++] = ' ';
cmd[l ] = 0x0;
}
Update
A better alternative could be this:
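#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
char cmd[255]="tcc/tcc.exe";
char **ptr;
for (ptr = argv+1; *ptr; ptr++)
{
/* strncat() appends at most n characters plus a terminating nul, so leave room for the nul */
strncat(cmd, " ", sizeof(cmd)-strlen(cmd)-1);
strncat(cmd, *ptr, sizeof(cmd)-strlen(cmd)-1);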
}
printf("String: '%s'.\n", cmd);
return 0;
}
We use strncat() to check that we're not overrunning the cmd buffer, and the space gets applied in advance. This way there's no extra space at the end of the string.
It is true that strncat() is a mite slower than directly assigning cmd[], but factoring the safety and debugging time, I think it's worthwhile.
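Another way to get the same safety, sketched here under the assumption that snprintf() is available, is to let snprintf() do both the appending and the bounds check. join_args is a made-up name, and unlike strncat() this version reports an error instead of silently truncating.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: build "prefix arg1 arg2 ..." in buf from a
   NULL-terminated argument list (such as argv + 1).
   Returns the resulting string length, or -1 if buf is too small. */
int join_args(char *buf, size_t size, const char *prefix, char *const args[])
{
    assert(buf != NULL && prefix != NULL && args != NULL);

    int n = snprintf(buf, size, "%s", prefix);
    if (n < 0 || (size_t)n >= size)
        return -1;
    size_t used = (size_t)n;

    for (size_t i = 0; args[i] != NULL; i++) {
        /* The separator is written in front of each argument, so there
           is never a trailing space to clean up. */
        n = snprintf(buf + used, size - used, " %s", args[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;  /* would not fit, including the terminator */
        used += (size_t)n;
    }
    return (int)used;
}
```

In main() this would be called as join_args(cmd, sizeof cmd, "tcc/tcc.exe", argv + 1), checking for -1 before passing cmd to system().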
Update 2
OK, so let's try to do this fast. We keep track of what cmd's length ought to be in a variable, and copy the string with memcpy(), which is slightly faster than strcpy() and neither checks the string length nor copies the extra zero at the end of the string.
(This saves something - remember that strcat() has to implicitly calculate the strlen of both its arguments. Here we save that).
int main(int argc, char *argv[])
{
#define MAXCMD 255
char cmd[MAXCMD]="tcc/tcc.exe";
int cmdlen = strlen(cmd);
char **ptr=argv+1;
for (ptr = argv+1; *ptr; ptr++)
{
/* How many bytes do we have to copy? */
int l = strlen(*ptr);
/* STILL, this check HAS to be done, or the program is going to crash */
if (cmdlen + 1 + l + 1 < MAXCMD)
{
/* No danger of crashing */
cmd[cmdlen++] = ' ';
memcpy(cmd + cmdlen, *ptr, l);
cmdlen += l;
}
else
{
printf("Buffer too small!\n");
}
}
cmd[cmdlen] = 0x0;
printf("String: '%s'.\n", cmd);
return 0;
}
Update 3 - not really recommended, but fun
It is possible to try and be smarter than the compiler's usually built-in strlen and memcpy instructions (file under: "Bad ideas"), and do without strlen() altogether. This translates into a smaller inner loop, and when strlen and memcpy are implemented with library calls, much faster performances (look ma, no stack frames!).
int main(int argc, char *argv[])
{
#define MAXCMD 254
char cmd[MAXCMD+1]="tcc/tcc.exe";
int cmdlen = 11; // We know initial length of "tcc/tcc.exe"!
char **ptr;
for (ptr = argv+1; *ptr; ptr++)
{
cmd[cmdlen++] = ' ';
while(**ptr) {
cmd[cmdlen++] = *(*ptr)++;
if (MAXCMD == cmdlen)
{
fprintf(stderr, "BUFFER OVERFLOW!\n");
return -1;
}
}
}
cmd[cmdlen] = 0x0;
printf("String: '%s'.\n", cmd);
return 0;
}
Discussion - not so fun
Shamelessly cribbed from many a lecture I received from professors I thought shortsighted, until they were proved right each and every time.
The problem here is to exactly circumscribe what are we doing - what's the forest this particular tree is part of.
We're building a command line that will be fed to an exec() call, which means that the OS will have to build another process environment and allocate and track resources. Let's step back a bit: an operation will be run that takes about one millisecond, and we're feeding it a loop that might take ten microseconds instead of twenty.
The 20:10 (that's 50%!) improvement we have on the inner loop translates into 1020:1010 (that's about 1%) on just the overall process startup operation. Let's imagine the process takes half a second - five hundred milliseconds - to complete, and we're looking at 500020:500010 or a 0.002% improvement, in accord with the never-sufficiently-remembered http://en.wikipedia.org/wiki/Amdahl%27s_law .
Or let's put it another way. One year hence, we will have run this program, say, one billion times. Those 10 microseconds saved now translate to a whopping 10,000 seconds, or around two hours and three quarters. We're starting to talk big, except that to obtain this result we've expended sixteen hours coding, checking and debugging :-)
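The arithmetic is easy to replay. The function below is only an illustration of the ratios quoted above (fraction_saved is a made-up name, and all times are assumed to be in microseconds): it gives the share of the total run time that disappears when one part shrinks and the rest stays the same.

```c
#include <assert.h>

/* Fraction of the total run time saved when one part of the work drops
   from old_us to new_us while the remaining fixed_us is untouched. */
double fraction_saved(double fixed_us, double old_us, double new_us)
{
    assert(fixed_us >= 0.0 && old_us > 0.0 && new_us >= 0.0);
    return (old_us - new_us) / (fixed_us + old_us);
}
```

fraction_saved(0, 20, 10) is the 50% of the loop taken alone, fraction_saved(1000, 20, 10) is the roughly 1% once process startup is counted, and fraction_saved(500000, 20, 10) is the roughly 0.002% for the half-second process.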
The double-strncat() solution (which is actually the slowest) supplies code that is easier to read and understand, and modify. And reuse. The fastest solution, above, implicitly relies on the separator being one character, and this fact is not immediately apparent. Which means that reusing the fastest solution with ", " as separator (let's say we need this for CSV or SQL) will now introduce a subtle bug.
When designing an algorithm or piece of code, it is wise to factor not only tightness of code and local ("keyhole") performances, but also things like:
how that single piece affects the performances of the whole. It makes no sense to spend 10% of development time on less than 10% of the overall goal.
how easy it is for the compiler to interpret it (and optimize it with no further effort on our part, maybe even optimize specifically for different platforms -- all at no cost!)
how easy will it be for us to understand it days, weeks, or months down the line.
how non-specific and robust the code is, allowing to reuse it somewhere else (DRY).
how clear its intent is - allowing to reengineer it later, or replace with a different implementation of the same intent (DRAW).
A subtle bug
This is in answer to WilliamMorris's question, so I'll use his code, but mine has the same problem (actually, mine is - not completely unintentionally - much worse).
This is the original functionality from William's code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
#define CMD "tcc/tcc.exe"
char cmd[255] = CMD;
char *s = cmd + sizeof CMD - 1;
const char *end = cmd + sizeof cmd - 1;
// Cycle syntax modified to pre-C99 - no consequences on our code
int i;
for (i = 1; i < argc; ++i) {
size_t len = strlen(argv[i]);
if (s + len >= end) {
fprintf(stderr, "Buffer overrun!\n");
exit(1);
}
// Here (will) be dragons
//*s++ = '.';
//*s++ = '.';
//*s++ = '.';
*s++ = ' ';
memcpy(s, argv[i], len);
s += len;
}
*s = '\0';
// Get also string length, which should be at most 254
printf("%s: string length is %d\n", cmd, (int)strlen(cmd));
return 0;
}
The buffer overrun check verifies that the string written so far, plus the string that has yet to be written, together do not exceed the buffer. The length of the separator itself is not counted, but things will work out somehow:
size_t len = strlen(argv[i]);
if (s + len >= end) {
fprintf(stderr, "Buffer overrun!\n");
exit(1);
}
Now we add on the separator in the most expeditious way - by repeating the poke:
*s++ = ',';
*s++ = ' ';
Now if s + len is equal to end - 1, the check will pass. We now add two bytes. The total length will be s + len + 2, which is equal to end plus one:
tcc/tcc.exe, It, was, the, best, of, times, it, was, the, worst, of,
times, it, was, the, age, of, wisdom, it, was, the, age, of,
foolishness, it, was, the, epoch, of, belief, it, was, the, epoch, of,
incredulity, it, was, the, season, of, Light, it, was: string length is 254
tcc/tcc.exe, It, was, the, best, of, times, it, was, the, worst, of,
times, it, was, the, age, of, wisdom, it, was, the, age, of,
foolishness, it, was, the, epoch, of, belief, it, was, the, epoch, of,
incredulity, it, was, the, season, of, Light, it, ouch: string length
is 255
With a longer separator, such as "... ", the problem is even more evident:
tcc/tcc.exe... It... was... the... best... of... times... it... was...
the... worst... of... times... it... was... the... age... of...
wisdom... it... was... the... age... of... foolishness... it... was...
the... epoch... of... belief... it... was... longer: string length is
257
In my version, the fact that the check requires an exact match leads to catastrophic results, since once the buffer is overrun, the match will always fail and result in a massive memory overwrite.
If we modify my version with
if (cmdlen >= MAXCMD)
we will get a code that always intercepts buffer overruns, but still does not prevent them up to the delimiter's length minus two; i.e., a hypothetical delimiter 20 bytes long could overwrite 18 bytes past cmd's buffer before being caught.
I would point out that this is not to say that my code had a catastrophic bug (and so, once fixed, it'll live happy ever after); the point was that the code was structured in such a way that, for the sake of squeezing a little speed, a dangerous bug could easily go unnoticed, or the same bug could easily be introduced upon reuse of what looked like "safe and tested" code. This is a situation that one would be well advised to avoid.
(I'll come clean now, and confess that I myself rarely did... and too often still don't).
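For completeness, one way to structure the append step so that the separator length can never be forgotten is to make the separator a parameter and include it in the bounds check. This is a sketch under my own naming (append_with_sep), not the code from either answer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: buf holds a nul-terminated string of length *len
   inside a buffer of `size` bytes. Appends sep followed by word only if
   both fit together with the terminator.
   Returns 0 on success, -1 (leaving buf and *len untouched) otherwise. */
int append_with_sep(char *buf, size_t size, size_t *len,
                    const char *sep, const char *word)
{
    assert(buf != NULL && len != NULL && sep != NULL && word != NULL);

    size_t seplen = strlen(sep);
    size_t wordlen = strlen(word);

    /* The separator is counted explicitly, whatever its length. */
    if (*len >= size || seplen + wordlen >= size - *len)
        return -1;

    memcpy(buf + *len, sep, seplen);
    memcpy(buf + *len + seplen, word, wordlen);
    *len += seplen + wordlen;
    buf[*len] = '\0';
    return 0;
}
```

Switching from " " to ", " or "... " is then a change to one argument, and the check keeps holding without anybody having to remember to update it.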
This might be a bit more complicated than you want, but it avoids buffer overflows and it also exits if the buffer is too small. Note that continuing to loop once there is not enough space in the buffer for argv[N] can result in trailing strings (argv[N+1] etc.) that are shorter than argv[N] being added to the string even though argv[N] was omitted...
Note that I'm using memcpy, because by that point I already know how long argv[i] is.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv)
{
#define CMD "tcc/tcc.exe"
char cmd[255] = CMD;
char *s = cmd + sizeof CMD - 1;
const char *end = cmd + sizeof cmd - 1;
for (int i = 1; i < argc; ++i) {
size_t len = strlen(argv[i]);
if (s + len >= end) {
fprintf(stderr, "Buffer overrun!\n");
exit(1);
}
*s++ = ' ';
memcpy(s, argv[i], len);
s += len;
}
*s = '\0';
printf("%s\n", cmd);
return 0;
}
