Proper use of strcat() in C - c

I have an archive file that looks like this:
!<arch>
file1.txt/ 1350248044 45503 13036 100660 28 `
hello
this is sample file 1
Now in here, the number 28 in the header is the file1.txt size. To get that number, I use:
int curr_char;
char file_size[10];
int int_file_size;
curr_char = fgetc(arch_file);
while(curr_char != ' '){
strcat(file_size, &curr_char);
curr_char = fgetc(arch_file);
}
// Convert the characters to the corresponding integer value using atoi()
int_file_size = atoi(file_size);
However, values in the file_size array change every time I run my code. Sometimes it's correct, but mostly not. Here are some examples of what I get for file_size:
?28`U
2U8U
28 <--- Correct!
pAi?28
I believe the problem is with my strcat() function, but not sure. Any help would be appreciated.

You shouldn't read the file character wise. There are higher level functions doing this. As larsmans already pointed out, you can use fscanf() for this task:
fscanf(arch_file, "%d", &int_file_size);

&curr_char is an int*, so you're copying over the bits of an int as if they represented a string.
You should be using scanf.

The expression &curr_char points to a single character (well, actually an integer as that's how you declared it). strcat looks for a string, and string as you should know are terminated by a '\0' character. So what strcat does in your case is use the &curr_char pointer as the address of a string and looks for the terminator. Since that is not found weird stuff will happen.
One way of solving this is to make curr_char an array, initialized to zero (the string terminator character) and read into the first entry:
char curr_char[2] = { '\0' }; /* Will make all character in array be zero */
...
curr_char[0] = fgetc(...);
There is also another problem, and that is that you are trying to concatenate into a string that is not initialized. When running your program, the array file_size can contain any data, it's not automatically zeroed out. This leads to the weird characters before the number. This is solved partially the same way as the above problem, by initializing the array:
char file_size[10] = { '\0' };

Related

How to determine how many chars in a character array in C?

I'm using C with no access to libraries or anything like that since this is kernel code for an operating system. So I can not use sizeOf or any built in function like that. name[] is a character array that holds the name of a file, but the file name can be up to 6 characters long, and I want to determine how long the file name actually is.
Right now, my code looks like this:
int length = 0;
while(name[length] != 0x0)
{
length++;
}
I also tried it with the 0x0 replaced with '\0' but it still didn't work.
Ideally, it would iterate through the char array and stop once it reaches the end of the file name, but I'm pretty sure that it keeps going past it.
Probably the array doesn't contain the zero char ('\0') immediately after the last char of your string.
You can solve this directly by adding the '\0' while you are filling-up an array, or by initializing the whole array by '\0' before filling-up.

How to check if there is a `\0` character in a filename using C?

I'd like to write a function like this:
int validate_file_name(char *filename)
{
//...
}
which will:
return 1 if there was no \0 character in the filename,
0 otherwise.
I thought it may be achieved using a simple for(size_t i = 0; i < strlen(filename); i++), but I don't know how to determine how much characters I've got to check?
I can't use strlen() because it will terminate on the first occurrence of a \0 character.
How should I approach this problem?
Clarification:
I am trying to apply these guidelines to a filename I receive. If you should avoid putting a \0 in a filename, how could you validate this if you've got no size parameter.
Moreover, there are strings with multiple \0 characters, like here: http://www.gnu.org/software/libc/manual/html_mono/libc.html#Argz-and-Envz-Vectors. Still, I had no idea that it is impossible to determine their length if it is not explicitly provided.
Conclusion:
There is no way you can determine the length of string which is not NULL-terminated. Unless you know the length of course or you deploy some dirty hacks: Checking if a pointer is allocated memory or not.
You are trying to solve a problem that does not need to be solved.
A file name is a string. In C, a "string" is by definition "a contiguous sequence of characters terminated by and including the first null
character".
It is impossible to have a string or a file name with a null character embedded in it.
It's possible to have a sequence of characters with an embedded null character. For example:
char buf[] = "foo\0bar.txt";
buf is an array of 12 characters; the characters at positions 3 and 11 are both null characters. If you treat buf as a string, for example by calling
fopen(buf, "r")
it will be treated as a string with a length of 3 (the length of a string does not include the terminating null character).
If you're working with character arrays that may or may not contain strings, then it makes sense to do what you're asking. You would need to keep track of the size of the buffer separately from the address of the initial character, either by passing an additional argument or by wrapping the pointer and the length in a structure.
But if you're dealing with file names, it's almost certainly best just to deal with strings and assume that whatever char* value is passed to your function points to a valid string. If it doesn't (if there is no null character anywhere in the array), that's the caller's fault, and not something you can reasonably check.
(Incidentally, Unix/Linux file systems explicitly forbid null characters in file names. The / character is also forbidden, because it's used as a directory name delimiter. Windows file systems have even stricter rules.)
One last point: NULL is (a macro that expands to) a null pointer constant. Please don't use the term NULL to refer to the null character '\0'.
The answer is that you can't write a function that does that if you don't know the length of the string.
To determine the length of the string strlen() searches for the '\0' character which if is not present will cause undefined behavior.
If you knew the length of the string then,
for (int i = 0 ; i < length ; ++i)
{
if (string[i] != '\0')
continue;
return 1;
}
return 0;
would work, if you don't know the length of the string then the condition would be
for (int i = 0 ; string[i] != '\0' ; ++i)
which obviously means that then searching for the '\0' makes no sense because it's presence is what makes all other string related functions to work properly.
If the string is not NULL-terminated, what else it is terminated by? And if you don't know that, what is it length? If you know the answer to these problems, you know the answer to your question.

second memcpy() attaches previous memcpy() array to it

I have a little problem here with memcpy()
When I write this
char ipA[15], ipB[15];
size_t b = 15;
memcpy(ipA,line+15,b);
It copies b bytes from array line starting at 15th element (fine, this is what i want)
memcpy(ipB,line+31,b);
This copies b bytes from line starting at 31st element, but it also attaches to it the result for previous command i.e ipA.
Why? ipB size is 15, so it shouldnt have enough space to copy anything else. whats happening here?
result for ipA is 192.168.123.123
result for ipB becomes 205.123.123.122 192.168.123.123
Where am I wrong? I dont actually know alot about memory allocation in C.
It looks like you're not null-terminating the string in ipA. The compiler has put the two variables next to one another in memory, so string operations assume that the first null terminator is sometime after the second array (whenever the next 0 occurs in memory).
Try:
char ipA[16], ipB[16];
size_t b = 15;
memcpy(ipA,line+15,b);
ipA[15] = '\0';
memcpy(ipB,line+31,b);
ipB[15] = '\0';
printf("ipA: %s\nipB: %s\n", ipA, ipB)
This should confirm whether this is the problem. Obviously you could make the code a bit more elegant than my test code above. As an alternative to manually terminating, you could use printf("%.*s\n", b, ipA); or similar to force printf to print the correct number of characters.
Are you checking the content of the arrays by doing printf("%s", ipA) ? If so, you'll end up with the described effect since your array is interpreted as a C string which is not null terminated. Do this instead: printf("%.*s", sizeof(ipA), ipA)
Character strings in C require a terminating mark. It is the char value 0.
As your two character strings are contiguous in memory, if you don't terminate the first character string, then when reading it, you will continue until memory contains the end-of-string character.

Looping until a specific string is found

I have a simple question. I want to write a program in C that scans the lines of a specific file, and if the only phrase on the line is "Atoms", I want it to stop scanning and report which line it was on. This is what I have and is not compiling because apparently I'm comparing an integer to a pointer: (of course "string.h" is included.
char dm;
int test;
test = fscanf(inp,"%s", &dm);
while (test != EOF) {
if (dm=="Amit") {
printf("Found \"Atoms\" on line %d", j);
break;
}
j++;
}
the file was already opened with:
inp = fopen( .. )
And checked to make sure it opens correctly...
I would like to use a different approach though, and was wondering if it could work. Instead of scanning individual strings, could I scan entire lines as such:
// char tt[200];
//
// fgets(tt, 200, inp);
and do something like:
if (tt[] == "Atoms") break;
Thanks!
Amit
Without paying too much attention to your actual code here, the most important mistake your making is that the == operator will NOT compare two strings.
In C, a string is an array of characters, which is simply a pointer. So doing if("abcde" == some_string) will never be true unless they point to the same string!
You want to use a method like "strcmp(char *a, char *b)" which will return 0 if two strings are equal and something else if they're not. "strncmp(char *a, char *b, size_t n)" will compare the first "n" characters in a and b, and return 0 if they're equal, which is good for looking at the beginning of strings (to see if a string starts with a certain set of characters)
You also should NOT be passing a character as the pointer for %s in your fscanf! This will cause it to completely destroy your stack it tries to put many characters into ch, which only has space for a single character! As James says, you want to do something like char ch[BUFSIZE] where BUFSIZE is 1 larger than you ever expect a single line to be, then do "fscanf(inp, "%s", ch);"
Hope that helps!
please be aware that dm is a single char, while you need a char *
more: if (dm=="Amit") is wrong, change it in
if (strcmp(dm, "Amit") == 0)
In the line using fscanf, you are casting a string to the address of a char. Using the %s in fscanf should set the string to a pointer, not an address:
char *dm;
test = fscanf(inp,"%s", dm);
The * symbol declares an indirection, namely, the variable pointed to by dm. The fscanf line will declare dm as a reference to the string captured with the %s delimiter. It will point to the address of the first char in the string.
What kit said is correct too, the strcmp command should be used, not the == compare, as == will just compare the addresses of the strings.
Edit: What kit says below is correct. All pointers should be allocated memory before they are used, or else should be cast to a pre-allocated memory space. You can allocate memory like this:
dm = (char*)malloc(sizeof(char) * STRING_LENGTH);
where STRING_LENGTH is a maximum length of a possible string. This memory allocation only has to be done once.
The problem is you've declared 'dm' as a char, not a malloc'd char* or char[BUFSIZE]
http://www.cplusplus.com/reference/clibrary/cstdio/fscanf/
You'll also probably report incorrect line numbers, you'll need to scan the read-in buffer for '\n' occurences, and handle the case where your desired string lies across buffer boundaries.

Questions on C strings

I am new to C and I am very much confused with the C strings. Following are my questions.
Finding last character from a string
How can I find out the last character from a string? I came with something like,
char *str = "hello";
printf("%c", str[strlen(str) - 1]);
return 0;
Is this the way to go? I somehow think that, this is not the correct way because strlen has to iterate over the characters to get the length. So this operation will have a O(n) complexity.
Converting char to char*
I have a string and need to append a char to it. How can i do that? strcat accepts only char*. I tried the following,
char delimiter = ',';
char text[6];
strcpy(text, "hello");
strcat(text, delimiter);
Using strcat with variables that has local scope
Please consider the following code,
void foo(char *output)
{
char *delimiter = ',';
strcpy(output, "hello");
strcat(output, delimiter);
}
In the above code,delimiter is a local variable which gets destroyed after foo returned. Is it OK to append it to variable output?
How strcat handles null terminating character?
If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?
Is there a good beginner level article which explains how strings work in C and how can I perform the usual string manipulations?
Any help would be great!
Last character: your approach is correct. If you will need to do this a lot on large strings, your data structure containing strings should store lengths with them. If not, it doesn't matter that it's O(n).
Appending a character: you have several bugs. For one thing, your buffer is too small to hold another character. As for how to call strcat, you can either put the character in a string (an array with 2 entries, the second being 0), or you can just manually use the length to write the character to the end.
Your worry about 2 nul terminators is unfounded. While it occupies memory contiguous with the string and is necessary, the nul byte at the end is NOT "part of the string" in the sense of length, etc. It's purely a marker of the end. strcat will overwrite the old nul and put a new one at the very end, after the concatenated string. Again, you need to make sure your buffer is large enough before you call strcat!
O(n) is the best you can do, because of the way C strings work.
char delimiter[] = ",";. This makes delimiter a character array holding a comma and a NUL Also, text needs to have length 7. hello is 5, then you have the comma, and a NUL.
If you define delimiter correctly, that's fine (as is, you're assigning a character to a pointer, which is wrong). The contents of output won't depend on delimiter later on.
It will overwrite the first NUL.
You're on the right track. I highly recommend you read K&R C 2nd Edition. It will help you with strings, pointers, and more. And don't forget man pages and documentation. They will answer questions like the one on strcat quite clearly. Two good sites are The Open Group and cplusplus.com.
A "C string" is in reality a simple array of chars, with str[0] containing the first character, str[1] the second and so on. After the last character, the array contains one more element, which holds a zero. This zero by convention signifies the end of the string. For example, those two lines are equivalent:
char str[] = "foo"; //str is 4 bytes
char str[] = {'f', 'o', 'o', 0};
And now for your questions:
Finding last character from a string
Your way is the right one. There is no faster way to know where the string ends than scanning through it to find the final zero.
Converting char to char*
As said before, a "string" is simply an array of chars, with a zero terminator added to the end. So if you want a string of one character, you declare an array of two chars - your character and the final zero, like this:
char str[2];
str[0] = ',';
str[1] = 0;
Or simply:
char str[2] = {',', 0};
Using strcat with variables that has local scope
strcat() simply copies the contents of the source array to the destination array, at the offset of the null character in the destination array. So it is irrelevant what happens to the source after the operation. But you DO need to worry if the destination array is big enough to hold the data - otherwise strcat() will overwrite whatever data sits in memory right after the array! The needed size is strlen(str1) + strlen(str2) + 1.
How strcat handles null terminating character?
The final zero is expected to terminate both input strings, and is appended to the output string.
Finding last character from a string
I propose a thought experiment: if it were generally possible to find the last character
of a string in better than O(n) time, then could you not also implement strlen
in better than O(n) time?
Converting char to char*
You temporarily can store the char in an array-of-char, and that will decay into
a pointer-to-char:
char delimiterBuf[2] = "";
delimiterBuf[0] = delimiter;
...
strcat(text, delimiterBuf);
If you're just using character literals, though, you can simply use string literals instead.
Using strcat with variables that has local scope
The variable itself isn't referenced outside the scope. When the function returns,
that local variable has already been evaluated and its contents have already been
copied.
How strcat handles null terminating character?
"Strings" in a C are NUL-terminated sequences of characters. Both inputs to
strcat must be NUL-terminated, and the result will be NUL-terminated. It
wouldn't be useful for strcat to write an extra NUL-byte to the result if it
doesn't need to.
(And if you're wondering what if the input strings have multiple trailing
NUL bytes already, I propose another thought experiment: how would strcat know
how many trailing NUL-bytes there are in a string?)
BTW, since you tagged this with "best-practices", I'll also recommend that you take care not to write past the end of your destination buffers. Typically this means avoiding strcat and strcpy (unless you've already checked that the input strings won't overflow the destination) and using safer versions (e.g. strncat. Note that strncpy has its own pitfalls, so that's a poor substitute. There also are safer versions that are non-standard, such as strlcpy/strlcat and strcpy_s/strcat_s.)
Similarly, functions like your foo function always should take an additional argument specifying what the size of the destination buffer is (and documentation should make it explicitly clear whether that size accounts for a NUL terminator or not).
How can I find out the last character
from a string?
Your technique with str[strlen(str) - 1] is fine. As pointed out, you should avoid repeated, unnecessary calls to strlen and store the results.
I somehow think that, this is not the
correct way because strlen has to
iterate over the characters to get the
length. So this operation will have a
O(n) complexity.
Repeated calls to strlen can be a bane of C programs. However, you should avoid premature optimization. If a profiler actually demonstrates a hotspot where strlen is expensive, then you can do something like this for your literal string case:
const char test[] = "foo";
sizeof test // 4
Of course if you create 'test' on the stack, it incurs a little overhead (incrementing/decrementing stack pointer), but no linear time operation involved.
Literal strings are generally not going to be so gigantic. For other cases like reading a large string from a file, you can store the length of the string in advance as but one example to avoid recomputing the length of the string. This can also be helpful as it'll tell you in advance how much memory to allocate for your character buffer.
I have a string and need to append a
char to it. How can i do that? strcat
accepts only char*.
If you have a char and cannot make a string out of it (char* c = "a"), then I believe you can use strncat (need verification on this):
char ch = 'a';
strncat(str, &ch, 1);
In the above code,delimiter is a local
variable which gets destroyed after
foo returned. Is it OK to append it to
variable output?
Yes: functions like strcat and strcpy make deep copies of the source string. They don't leave shallow pointers behind, so it's fine for the local data to be destroyed after these operations are performed.
If I am concatenating two null
terminated strings, will strcat
append two null terminating characters
to the resultant string?
No, strcat will basically overwrite the null terminator on the dest string and write past it, then append a new null terminator when it's finished.
How can I find out the last character from a string?
Your approach is almost correct. The only way to find the end of a C string is to iterate throught the characters, looking for the nul.
There is a bug in your answer though (in the general case). If strlen(str) is zero, you access the character before the start of the string.
I have a string and need to append a char to it. How can i do that?
Your approach is wrong. A C string is just an array of C characters with the last one being '\0'. So in theory, you can append a character like this:
char delimiter = ',';
char text[7];
strcpy(text, "hello");
int textSize = strlen(text);
text[textSize] = delimiter;
text[textSize + 1] = '\0';
However, if I leave it like that I'll get zillions of down votes because there are three places where I have a potential buffer overflow (if I didn't know that my initial string was "hello"). Before doing the copy, you need to put in a check that text is big enough to contain all the characters from the string plus one for the delimiter plus one for the terminating nul.
... delimiter is a local variable which gets destroyed after foo returned. Is it OK to append it to variable output?
Yes that's fine. strcat copies characters. But your code sample does no checks that output is big enough for all the stuff you are putting into it.
If I am concatenating two null terminated strings, will strcat append two null terminating characters to the resultant string?
No.
I somehow think that, this is not the correct way because strlen has to iterate over the characters to get the length. So this operation will have a O(n) complexity.
You are right read Joel Spolsky on why C-strings suck. There are few ways around it. The ways include either not using C strings (for example use Pascal strings and create your own library to handle them), or not use C (use say C++ which has a string class - which is slow for different reasons, but you could also write your own to handle Pascal strings more easily than in C for example)
Regarding adding a char to a C string; a C string is simply a char array with a nul terminator, so long as you preserve the terminator it is a string, there's no magic.
char* straddch( char* str, char ch )
{
char* end = &str[strlen(str)] ;
*end = ch ;
end++ ;
*end = 0 ;
return str ;
}
Just like strcat(), you have to know that the array that str is created in is long enough to accommodate the longer string, the compiler will not help you. It is both inelegant and unsafe.
If I am concatenating two null
terminated strings, will strcat append
two null terminating characters to the
resultant string?
No, just one, but what ever follows that may just happen to be nul, or whatever happened to be in memory. Consider the following equivalent:
char* my_strcat( char* s1, const char* s2 )
{
strcpy( &str[strlen(str)], s2 ) ;
}
the first character of s2 overwrites the terminator in s1.
In the above code,delimiter is a local
variable which gets destroyed after
foo returned. Is it OK to append it to
variable output?
In your example delimiter is not a string, and initialising a pointer with a char makes no sense. However if it were a string, the code would be fine, strcat() copies the data from the second string, so the lifetime of the second argument is irrelevant. Of course you could in your example use a char (not a char*) and the straddch() function suggested above.

Resources