Program only works if dummy char array is declared [C]

The following code will print to the file correctly if char finalstr[2048]; is declared, however if I remove it (since it's not used anywhere) the program prints garbage ascii instead. This makes me believe it's something related to memory, however I have no clue.
#include <stdio.h>
#include <stdlib.h>
int main()
{
FILE *fp;
FILE *fp2;
char str[2048];
char finalstr[2048];
fp = fopen("f_in.txt", "r");
fp2 = fopen("f_out.txt", "w");
while(fgets(str,2047,fp))//read line by line until end of file
{
int i;
for(i=0;i<=strlen(str);i++)//go trough the string cell by cell
{
if(str[i]>47 && str[i]<58 && str[i+1]>47 && str[i+1]<58)//from 0 to 9
{
char temp[2];//to hold temporary two digit string number
temp[0]=str[i];
i++;
temp[1]=str[i];
if(atoi(temp)<27)//if it's an upper case letter
fprintf(fp2,"%c",atoi(temp)+64);
else//if it's lowercase, skip the special characters between Z and a
fprintf(fp2,"%c",atoi(temp)+70);
}
else fprintf(fp2,"%c",str[i]);
}
}
fclose(fp);
fclose(fp2);
}
Input
20343545 3545 27 494140303144324738 343150 404739283144: ffabcd. 094540' 46 3546?
01404146343144 283127474635324738 404739283144 09 453131 3545 abcdefYXWVUTSRQP
2044474546 3931. 09 37404149 27 384146!
Output if finalstr[] is declared
This is a wonderful hex number: ffabcd. Isn' t it?
Another beautiful number I see is abcdefYXWVUTSRQP
Trust me. I know a lot!
Output if finalstr[] is not declared
?99? 9? 9 ?9999?9?9 99? 9?999?: ffabcd. ??9' ? 9??
((((.(( (((((.((. ((.((( ( ((( .( abcdefYXWVUTSRQP
øòòøò øò. ø òòòò ø òòò!
I did notice that the first if() statement could cause an overflow, however replacing <= with < had no effect on the end result.
I really wonder what the explanation behind this is, and whether it's C specific or if it would have happened in C++ too.

The main problem is with the temporary string you're using. It's not long enough to store a null terminating character, so you have an unterminated string.
Make the array 3 bytes long and add the terminator:
char temp[3];//to hold temporary two digit string number
temp[0]=str[i];
i++;
temp[1]=str[i];
temp[2]=0;
Also, you're looking too far off of the end of the array in your for loop. Use < instead of <=:
for(i=0;i<strlen(str);i++)//go through the string cell by cell
Finally, make sure you #include <string.h> so that you have a proper declaration for strlen.

atoi(temp) causes undefined behaviour. The atoi function expects a pointer to null-terminated string as argument, however you provided a pointer to two characters without a terminator.
The atoi function will read off the end of your array. Your dummy array influences this because it changes what junk is present after the temp array.
BTW you could use (str[i] - '0') * 10 + (str[i+1] - '0') instead of atoi.

To my understanding, the problem is that the program fills the array potentially up to its full capacity by
fgets(str,2047,fp)
which means that the condition
i <= strlen(str)
works as expected only if the location after str is terminated with a zero; this might be the case when declaring finalstr.

Related

What is the point of assigning the size of a string?

For instance, if I store ABCDE using scanf, the later printf call prints ABCDE as output. So what is the point of specifying the size of the array (here 4)?
#include <stdio.h>
int main() {
int c[4];
printf("Enter your name:");
scanf("%s",c);
printf("Your Name is:%s",c);
return 0;
}
I'll start with, don't use int array to store strings!
int c[4] allocates an array of 4 integers. An int is typically 4 bytes, so usually this would be 16 bytes (but might be 8 or 32 or something else on some platforms).
Then, you use this allocation first to read characters with scanf. If you enter ABCDE, it uses up 6 characters (there is an extra 0 byte at the end of the string marking the end, which needs space too), which happens to fit into the memory reserved for array of 4 integers. Now you could be really unlucky and have a platform where int has a so called "trap representation", which would cause your program to crash. But, if you are not writing the code for some very exotic device, there won't be. Now it just so happens, that this code is going to work, for the same reason memcpy is going to work: char type is special in C, and allows copying bytes to and from different types.
Same special treatment happens, when you print the int[4] array with printf using %s format. It works, because char is special.
This also demonstrates how very unsafe scanf and printf are. They happily accept c you give them, and assume it is a char array with valid size and data.
But, don't do this. If you want to store a string, use char array. Correct code for this would be:
#include <stdio.h>
int main() {
char c[16]; // fits 15 characters plus terminating 0
printf("Enter your name:");
int items = scanf("%15s",c); // note: added maximum characters
// scanf returns number of items read successfully, *always* check that!
if (items != 1) {
return 1; // exit with error, maybe add printing error message
}
printf("Your Name is: %s\n",c); // note added newline, just as an example
return 0;
}
The size of an array must be defined while declaring a C String variable because it is used to calculate how many characters are going to be stored inside the string variable and thus how much memory will be reserved for your string. If you exceed that amount the result is undefined behavior.
You have used int c, not char c. In C, a char is exactly 1 byte, while an int is typically 4 bytes, so int c[4] usually reserves 16 bytes. That's why you didn't face any issues.
(Simplifying a fair amount)
When you initialize that array of length 4, C goes and finds a free spot in memory that has enough consecutive space to store 4 integers. But if you try to set c[4] to something, C will write that thing in the memory just after your array. Who knows what’s there? That might not be free, so you might be overwriting something important (generally bad). Also, if you do some stuff, and then come back, something else might’ve used that memory slot (properly) and overwritten your data, replacing it with bizarre, unrelated, and useless (to you) data.
In C, the last character of a string is '\0'.
The program below prints every byte of the string, so you can see that last character.
scanf("%s", c); adds the terminating '\0' for you.
If you read the characters with another function, such as getc or getch, you have to add the terminator yourself.
#include<stdio.h>
#include<string.h>
int main(){
char c[4+1]; // You should add +1 for the '\0' character.
char *p;
int len;
printf("Enter your name:");
scanf("%4s", c); // at most 4 characters, so c cannot overflow
len = strlen(c);
printf("Your Name is:%s (%d)\n", c, len);
p = c;
do {
printf("%x\n", *(p++));
} while((len--)+1);
return 0;
}
Enter your name:1234
Your Name is:1234 (4)
31
32
33
34
0 --> last character added by scanf("%s");
ffffffae --> garbage

Problem reading two strings with getchar() and then printing those strings in C

This is my code for two functions in C:
// Begin
void readTrain(Train_t *train){
printf("Name des Zugs:");
char name[STR];
getlinee(name, STR);
strcpy(train->name, name);
printf("Name des Drivers:");
char namedriver[STR];
getlinee(namedriver, STR);
strcpy(train->driver, namedriver);
}
void getlinee(char *str, long num){
char c;
int i = 0;
while(((c=getchar())!='\n') && (i<num)){
*str = c;
str++;
i++;
}
printf("i is %d\n", i);
*str = '\0';
fflush(stdin);
}
// End
With the function void getlinee(char *str, long num) I want to read user input into the first string, char name[STR], and then into the second, char namedriver[STR]. The maximum string size is STR (30 characters). If I enter more than 30 characters for the first string ("Name des Zugs", stored in name), then enter the second string (stored in namedriver), and then print the FIRST string, I get not only the first 30 characters of my input but also the second string attached to them. I simply do not know why. As long as the first string stays within the 30-character limit, everything works.
Here is my output when the first input is longer than 30 characters. The problem is in row 5, "Zugname": why is the second string printed as well when I am only printing the first one?
Name des Zugs:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
i is 30
Name des Drivers:xxxxxxxx
i is 8
Zugname: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaxxxxxxxx
Drivername: xxxxxxxx
I think your issue is that your train->name is not properly terminated with '\0', as a consequence when you call printf("%s", train->name) the function keeps reading memory until it finds '\0'. In your case I guess your structure looks like:
struct Train_t {
//...
char name[STR];
char driver[STR];
//...
};
In the getlinee() function, you write '\0' after the last character. In particular, if the input is more than 30 characters long, you copy the first 30 characters, then write '\0' as the 31st character (name[30]). This is the first buffer overflow.
So where is this '\0' actually written? Well, at name[30], even though you're not supposed to write there. Then, if you have the structure above, strcpy(train->name, name); actually copies a 31-byte string: 30 chars into train->name, and the '\0' overflows into train->driver[0]. This is the second buffer overflow.
After this, you overwrite the train->driver buffer, so the '\0' disappears and your data in memory basically looks like:
train->name = "aaa...aaa" // no '\0' at the end so printf won't stop reading here
train->driver = "xxx\0" // but there
You have an off-by-one error on your array sizes -- you have arrays of STR chars, and you read up to STR characters into them, but then you store a NUL terminator, requiring (up to) STR + 1 bytes total. So whenever you have a max size input, you run off the end of your array(s) and get undefined behavior.
Pass STR - 1 as the second argument to getlinee for the easiest fix.
Key issues
Size test in wrong order and off-by-one. ((c=getchar())!='\n') && (i<num) --> (i+1<num) && ((c=getchar())!='\n'). Else no room for the null character. Bad form to consume an excess character here.
getlinee() should be declared before first use. Tip: Enable all compiler warnings to save time.
Other
Use int c; not char c; to well distinguish the typical 257 different possible results from getchar().
fflush(stdin); is undefined behavior. Better code would consume excess characters in a line with other code.
void getlinee(char *str, long num) better with size_t num. size_t is the right size type for array sizing and indexing.
int i should be the same type as num.
Better code would also test for EOF.
while((i+1<num) && ((c=getchar())!='\n') && (c != EOF)){
A better design would return something from getlinee() to indicate success and identify troubles like end-of-file with nothing read, input error, too long a line and parameter trouble like str == NULL, num <= 0.
I believe you have a struct similar to this:
typedef struct train_s
{
//...
char name[STR];
char driver[STR];
//...
} Train_t;
When you attempt to write a '\0' to a string that is longer than STR (30 in this case), you actually write a '\0' to name[STR], which you don't have, since the last element of name with length STR has an index of STR-1 (29 in this case), so you are trying to write a '\0' outside your array.
And, since two strings in this struct are stored one after another, you are writing a '\0' to driver[0], which you immediately overwrite, hence when printing out name, printf doesn't find a '\0' until it reaches the end of driver, so it prints both.
Fixing this should be easy.
Just change:
while(((c=getchar())!='\n') && (i<num))
to:
while(((c=getchar())!='\n') && (i<num - 1))
Or, as I would do it, add 1 to array size:
char name[STR + 1];
char driver[STR + 1];

Separating an input file into two different 2d Arrays

I'm currently working on an assignment that requires me to take an input file, separate it and store the contents into two different arrays.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char DataMem[32][3];
int RegMem[32][10];
char line[100][21]; //Holds the value for each line in the input file
int i = 0;
int j = 0;
while(fgets(line[i], 20, stdin) != NULL)
{
line[i++];
if(line[i] == " ")
DataMem[j] = line[i];
//printf("%s", line[3]);
}
return 0;
}
Suppose the input file looks something like:
95864312
68957425
-136985475
36547566
24957986
1
45
98
where the first values before the 1, are stored into the array named line, and the lines following the blank line need to be put into the array DataMem.
Can anyone point me into the direction as to how to do this? I can fill in the array line correctly, however I am having a hard time stopping the fill at that point and subsequently filling the rest of the file into the array datamem.
Thank you
Several issues. One (as Some programmer dude says) is that you do i++ before the test, so line[i] == " " looks at the next entry, which has not been filled yet and is uninitialized memory.
You also have a problem in that you cannot compare strings using ==; you must use strcmp(). The issue is that in an expression like this a C string is represented by a pointer to its first character, so
char *foo = "Hello";
char *bar = "Hello";
if (foo == bar)
printf("This will never print out\n");
is not going to do what you expect--you will never see the printout, as the way this will (likely) compile is that foo will be set to point to an address in memory (say 0x1000) that has 'H' in 0x1000, 'e' in 0x1001, etc, up to 'o' in 0x1004, and '\0' in 0x1005. The bar variable will point to some other address (say 0x2000) which will hold the string "Hello" also. So while both strings are the same, the if test is actually testing if (0x1000 == 0x2000), which will fail--you need to do if (!strcmp(foo, bar)) to actually test the contents.
[Note: this example is actually flawed in that most modern compilers will simply create one instance of the "Hello" string in read-only memory and point both variables at it, so the if test would actually work in this case. But you shouldn't rely on that and it definitely does not hold for the general case.]
Your test on line[i] looks wrong also, as I suspect you want to copy strings that start with a space, not strings that are exactly " ", so I suspect you actually want to test on line[i][0]. But I can't be sure without knowing your assignment.
Your declaration of DataMem is incorrect also--you have declared it as 32 3-char entries, but you are assigning line[i] to it, and line[i] is an array that decays to a pointer (C does not allow assigning to an array such as DataMem[j] at all). You either need to declare all instances big enough to hold the entire string you want (presumably the same 21 bytes that line can hold) and copy into it, or you need to declare it as an array of pointers (char *DataMem[32]). There is a key difference you need to understand: if you copy the string, then if you modify DataMem's view of the string, line's view of the string is unchanged. If you simply copy the pointer then changing one string changes both (because they are both pointing at exactly the same memory). Obviously, copying the string is slower and takes more memory (well, except for very short strings).
The magic numbers are bad as well. Instead of 20 and 21, for example, I would do #define MAX_STRING_LEN 20 and use it in the code. (Good job remembering to declare the array big enough to hold the terminating NUL by the way. However fgets() already is aware of the need and will read in up to one fewer characters so there is room for the NUL. You should be passing in 21 not 20). Also, I would pass sizeof(line[i]) as the argument to fgets(), not MAX_STRING_LEN (and certainly not 20). That way if the size of line[i] ever changes the code will still be correct; if you pass in the same dimension that you used to declare the variable someone might change it without realizing they need to change it here too.
Finally, you need to bounds-check inside your loop. What happens if the input is longer than the 100 entries you declared for line[]? Without a test you run the risk of writing beyond your variable boundary (which tends to lead to really hard-to-find bugs). A very useful macro is
#define NELEM(x) (sizeof(x) / sizeof(*(x)))
which you could use to do the test:
if (i >= NELEM(line)) {
printf("Data overflow\n");
exit(1);
}
so you don't need to embed the 100 (or the #define you replace it with) inside your code. (The #define works by taking the size of the entire data structure and dividing it by the size of the first element in it. And all the parentheses are actually required).
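#include <stdio.h>
#define NELEM(x) (sizeof(x) / sizeof(*(x)))
int main(void)
{
    char DataMem[32][21]; //Wide enough for a whole line; 3 chars cannot even hold "45\n" plus the terminator
    char line[100][21]; //Holds the value for each line in the input file
    //First block: fill line[] until the blank separator line (or until line[] is full)
    for(size_t i = 0; i < NELEM(line) && fgets(line[i], sizeof(line[i]), stdin) != NULL ; ++i)
    {
        if('\r' == line[i][0] || '\n' == line[i][0]) {
            break;
        }
        printf("line[%zu] = %s",i, line[i]);
    }
    //Second block: everything after the blank line goes into DataMem[]
    for(size_t i = 0; i < NELEM(DataMem) && fgets(DataMem[i], sizeof(DataMem[i]), stdin) != NULL ; ++i)
    {
        printf("DataMem[%zu] = %s",i, DataMem[i]);
    }
    return 0;
}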

CS50 IDE: printf returns extra characters

I am having problems with the printf function in the CS50 IDE. When I am using printf to print out a string (salt in this code), extra characters are being output that were not present in the original argument (argv).
Posted below is my code. Any help would be appreciated. Thank you.
#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdlib.h>
int main(int argc, string argv[])
{
// ensuring that only 1 command-line argument is inputted
if (argc != 2)
{
return 1;
}
char salt[2];
for (int i = 0; i < 2; i++)
{
char c = argv[1][i];
salt[i] = c;
}
printf("the first 2 characters of the argument is %s\n", salt);
}
You are missing a string terminator in salt.
Somehow the computer needs to know where your string ends in memory. It does so by reading until it encounters a NUL byte, which is a byte with value zero.
Your array salt has exactly 2 bytes of space, and after them, random garbage exists which just happens to be next in memory after your array. Since you don't have a string terminator, the computer will read this garbage as well until it encounters a NUL byte.
All you need to do is include such a byte in your array, like so:
char salt[3] = {0};
This will make salt one byte longer, and the {0} is a shorthand for {0, 0, 0} which will initialize the contents of the array with all zeros. (Alternatively, you could use char salt[3]; and later manually set the last byte to zero using salt[2] = 0;.)
In your case, salt is at least one element short of being a string: unless argv[1] is only one character long, salt does not contain a null terminator.
You need to allocate space to hold the null-terminator and actually put one there to be able to use salt as string, as expected for the argument to %s conversion specifier in case of printf().
Otherwise, the string related functions and operations, which essentially rely on the fact that there will be a null terminator to mark the end of the char array (i.e., mark the end of valid memory that can be accessed), will try to access past the valid memory which causes undefined behavior. Once you hit UB, nothing is guaranteed.
So, considering the fact that you want to use
"....the first 2 characters of the argument....."
you need to make salt a 3-element char array, and make sure that salt[2] contains a null-terminator, like '\0'.

C number of line in the file (UNIX)

I'm trying to find the total number of lines in a text file, but it's not working (the final line count is 0 - see below). Here's the code:
#define BUFFER_SIZE 1
int lineNumber = 0;
int columnNumber = 0;
char *byteCurrent;
while (read(openFile, &byteCurrent, BUFFER_SIZE) > 0)
{
if (byteCurrent[0] != '\0') columnNumber++;
if (byteCurrent[0] == '\n') lineNumber++;
printf("%c", byteCurrent);
}
You have many problems with this code. The first is that byteCurrent is an uninitialized pointer. read() never writes through it, because you pass &byteCurrent, which is a pointer to the pointer variable itself (type char **), so the byte read from the file overwrites part of the pointer. byteCurrent[0] then dereferences that clobbered pointer, which is undefined behavior.
That's just one problem, another is that there is no string terminator in a file. If you get a zero when reading (which is what '\0' is) it's because there is an actual zero in the file, not because you get to the end of something. This leads columnNumber to count the number of characters in the file and not any column number.
The solution to the first problem is to use a plain char variable:
char byteCurrent;
The solution to the second problem I don't know, because I don't know what your columnNumber variable is supposed to count.
