Need help processing CHAR strings with printf - c

I'm using softserial to communicate with a bluetooth modem and I am pushing strings to the serial by using the following code:
char bt_string = "test";
bluetooth_println(bt_string);
I need to be able to replace the string with
printf(" Error: cmd=%02hX, res=%02hX\n", CMD_SEND_CID, res);
I have tried the following code
char bt_string;
sprintf(bt_string, " Error: cmd=%02hX, res=%02hX\n", CMD_SEND_CID, res);
bluetooth_println(bt_string);
But it fails to output anything. I'm obviously misunderstanding something. Thanks for any help.

You need to provide a buffer for your string.
char bt_string[256]; // <-- or any size that you are sure will be enough for what you will put in.
Additionally, for safety you can use snprintf to avoid any buffer overflow:
#define MAX_BT_STRING 256
char bt_string[MAX_BT_STRING];
snprintf(bt_string, MAX_BT_STRING," Error: cmd=%02hX, res=%02hX\n", CMD_SEND_CID, res);
bluetooth_println(bt_string);
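A minimal sketch of the same idea with a truncation check. The helper name format_error and its parameters are hypothetical, not part of the asker's code, and it assumes the command and result codes are passed as unsigned int (so it uses %02X rather than %02hX). snprintf returns the length it wanted to write, so a return value at or above the buffer size means the text was cut short.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats the error text into buf.
 * Returns 0 on success, -1 if the text did not fit or formatting failed. */
int format_error(char *buf, size_t size, unsigned cmd, unsigned res)
{
    int n = snprintf(buf, size, " Error: cmd=%02X, res=%02X\n", cmd, res);
    if (n < 0 || (size_t)n >= size) {
        return -1;   /* truncated: buf holds only the part that fit */
    }
    return 0;
}
```

A caller would declare something like char bt_string[64], call format_error(bt_string, sizeof bt_string, CMD_SEND_CID, res), and pass bt_string to bluetooth_println only when the call returns 0.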

char *str and char str[] are distinctly different. Check this question for more details.
In your problem, suppose bt_string had been declared as const char *bt_string = "test", where bt_string is a pointer to the first char of the string "test". (As posted, char bt_string declares a single char, which cannot hold a string at all.) That string occupies 5 bytes (don't forget the terminator \0).
In the next step:
sprintf(bt_string, " Error: cmd=%02hX, res=%02hX\n", CMD_SEND_CID, res);
You dump more than 5 bytes into bt_string, which has only 5 bytes of space. Everything beyond the fifth byte overwrites whatever follows bt_string in memory, which may cause a serious failure or nothing visible at all, depending on what is stored there.
To settle this problem, you have to allocate enough memory space:
allocate on the stack, as A.S.H answered; the contents are no longer valid after the function returns.
allocate via malloc;
use the static keyword to force the string to be stored in either the BSS section or the DATA section.
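The three options in the list above can be sketched as follows. All function names and the MSG_SIZE constant are hypothetical, chosen only for illustration, and the format string is a shortened stand-in for the asker's.

```c
#include <stdio.h>
#include <stdlib.h>

#define MSG_SIZE 64  /* assumed large enough for the formatted text */

/* Option 1: automatic (stack) storage owned by the caller.
 * The caller's array is valid only while the caller's function runs. */
void fill_caller_buffer(char *buf, size_t size, unsigned cmd)
{
    snprintf(buf, size, "cmd=%02X", cmd);
}

/* Option 2: heap storage. The caller must free() the result. */
char *make_heap_message(unsigned cmd)
{
    char *buf = malloc(MSG_SIZE);
    if (buf != NULL) {
        snprintf(buf, MSG_SIZE, "cmd=%02X", cmd);
    }
    return buf;
}

/* Option 3: static storage. Valid for the whole program run, but every
 * call overwrites the same buffer, so it is not reentrant. */
const char *make_static_message(unsigned cmd)
{
    static char buf[MSG_SIZE];
    snprintf(buf, sizeof buf, "cmd=%02X", cmd);
    return buf;
}
```

On a small embedded target the static or caller-owned variants are often preferred over malloc, but that is a design judgment rather than a rule.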

Related

Giving array a bigger value doesn't increase its size?

Here's what I did:
#include <stdio.h>
#include <string.h>
int main() {
char name[] = "longname";
printf("Name = %s \n",name);
strcpy(name,"evenlongername");
printf("Name = %s \n",name);
printf("size of the array is : %zu",sizeof(name));
return 0;
}
It works, but how? I thought that once memory is assigned to an array in a program, it is not possible to change it. But, the output of this program is:
Name = longname
Name = evenlongername
size of the array is 9
So the compiler affirms that the size of the array is still 9. How is it able to store the word 'evenlongername' which has a size of 15 bytes (including the string terminator)?
In this case, name is allocated to fit "longname", which is 9 bytes. When you copy "evenlongername" into it, you're writing outside the bounds of that array. Writing outside the bounds is undefined behavior, which means it may or may not work. Sometimes it'll work, other times you'll get a segfault, and yet other times you'll get weird behavior.
So the compiler affirms that the size of the array is still 9. How is it able to store the word 'evenlongername' which has a size of 15 bytes(including the string terminator)?
You are using a dangerous function (see Bugs), strcpy, which blindly copies the source string to the destination buffer without knowing the buffer's size; by copying 15 bytes into a 9-byte buffer, you have overflowed it. Your program may appear to work if the memory access happens to be valid and it doesn't overwrite something important.
Because C is a lower-level programming language, a C char[] is "barebone" mapping of memory, and not a "smart" container like C++ std::vector which automatically manages its size for you as you dynamically add and remove elements. If you are still not clear about the philosophy of C in this, I'd recommend you read *YOU* are full of bullshit. Very classic and rewarding.
Using sizeof on a char array will return the size of the buffer, not the length of the null-terminated string in the buffer. If you use strcpy to try and overflow the array, and it just happens to work (it's still undefined behavior), sizeof is still going to report the size used at declaration. That never changes.
If what you're interested in is observing how the length of a string changes with different assignments:
Use an adequate buffer to store every string you're going to test.
Use the function strlen in <string.h> which will give you the actual length of the string, and not the length of your buffer, which, once declared, is constant.
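A short sketch of the difference between the two measurements. The function name measure and the 32-byte capacity are assumptions made for the example: the buffer is sized to hold every test string, strlen reports the current string length, and sizeof reports the fixed capacity.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical demo: copies text into a fixed 32-byte buffer if it fits.
 * Stores the string length in *len and returns the buffer's capacity,
 * which stays the same no matter which string is stored. */
size_t measure(const char *text, size_t *len)
{
    char name[32] = "";              /* capacity fixed at declaration */
    if (strlen(text) < sizeof name) {
        strcpy(name, text);          /* safe: length was checked first */
    }
    *len = strlen(name);             /* characters before the '\0' */
    return sizeof name;              /* always 32 */
}
```

Calling measure("longname", &len) gives a length of 8, and measure("evenlongername", &len) gives 14, while the returned capacity is 32 both times.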

Confusion in "strcat function in C assumes the destination string is large enough to hold contents of source string and its own."

So I read that strcat function is to be used carefully as the destination string should be large enough to hold contents of its own and source string. And it was true for the following program that I wrote:
#include <stdio.h>
#include <string.h>
int main(){
char *src, *dest;
printf("Enter Source String : ");
fgets(src, 10, stdin);
printf("Enter destination String : ");
fgets(dest, 20, stdin);
strcat(dest, src);
printf("Concatenated string is %s", dest);
return 0;
}
But not true for the one that I wrote here:
#include <stdio.h>
#include <string.h>
int main(){
char src[11] = "Hello ABC";
char dest[15] = "Hello DEFGIJK";
strcat(dest, src);
printf("concatenated string %s", dest);
getchar();
return 0;
}
This program ends up adding both without considering that destination string is not large enough. Why is it so?
The strcat function has no way of knowing exactly how long the destination buffer is, so it assumes that the buffer passed to it is large enough. If it's not, you invoke undefined behavior by writing past the end of the buffer. That's what's happening in the second piece of code.
The first piece of code is also invalid because both src and dest are uninitialized pointers. When you pass them to fgets, it reads whatever garbage value they contain, treats it as a valid address, then tries to write values to that invalid address. This is also undefined behavior.
One of the things that makes C fast is that it doesn't check to make sure you follow the rules. It just tells you the rules and assumes that you follow them, and if you don't bad things may or may not happen. In your particular case it appeared to work but there's no guarantee of that.
For example, when I ran your second piece of code it also appeared to work. But if I changed it to this:
#include <stdio.h>
#include <string.h>
int main(){
char dest[15] = "Hello DEFGIJK";
strcat(dest, "Hello ABC XXXXXXXXXX");
printf("concatenated string %s", dest);
return 0;
}
The program crashes.
I think your confusion is not actually about the definition of strcat. Your real confusion is that you assumed that the C compiler would enforce all the "rules". That assumption is quite false.
Yes, the first argument to strcat must be a pointer to memory sufficient to store the concatenated result. In both of your programs, that requirement is violated. You may be getting the impression, from the lack of error messages in either program, that perhaps the rule isn't what you thought it was, that somehow it's valid to call strcat even when the first argument is not a pointer to enough memory. But no, that's not the case: calling strcat when there's not enough memory is definitely wrong. The fact that there were no error messages, or that one or both programs appeared to "work", proves nothing.
Here's an analogy. (You may even have had this experience when you were a child.) Suppose your mother tells you not to run across the street, because you might get hit by a car. Suppose you run across the street anyway, and do not get hit by a car. Do you conclude that your mother's advice was incorrect? Is this a valid conclusion?
In summary, what you read was correct: strcat must be used carefully. But let's rephrase that: you must be careful when calling strcat. If you're not careful, all sorts of things can go wrong, without any warning. In fact, many style guides recommend not using functions such as strcat at all, because they're so easy to misuse if you're careless. (Functions such as strcat can be used perfectly safely as long as you're careful -- but of course not all programmers are sufficiently careful.)
The strcat() function is indeed to be used carefully because it doesn't protect you from anything. If the source string isn't null-terminated, the destination string isn't null-terminated, or the destination doesn't have enough space, strcat will still copy data. Therefore, it is easy to overwrite data you didn't mean to overwrite. It is your responsibility to make sure you have enough space. Using strncat() instead of strcat will also give you some extra safety.
Edit: Here's an example:
#include <stdio.h>
#include <string.h>
int main()
{
char s1[16] = {0};
char s2[16] = {0};
strcpy(s2, "0123456789abcdefOOPS WAY TOO LONG");
/* ^^^ purposefully copy too much data into s2 */
printf("-%s-\n",s1);
return 0;
}
I never assigned to s1, so the output should ideally be --. However, because of how the compiler happened to arrange s1 and s2 in memory, the output I actually got was -OOPS WAY TOO LONG-. The strcpy(s2,...) overwrote the contents of s1 as well.
On gcc, -Wall or -Wstringop-overflow will help you detect situations like this one, where the compiler knows the size of the source string. However, in general, the compiler can't know how big your data will be. Therefore, you have to write code that makes sure you don't copy more than you have room for.
Both snippets invoke undefined behavior - the first because src and dest are not initialized to point anywhere meaningful, and the second because you are writing past the end of the array.
C does not enforce any kind of bounds checking on array accesses - you won't get an "Index out of range" exception if you try to write past the end of an array. You may get a runtime error if you try to access past a page boundary or clobber something important like the frame pointer, but otherwise you just risk corrupting data in your program.
Yes, you are responsible for making sure the target buffer is large enough for the final string. Otherwise the results are unpredictable.
I'd like to point out what is actually happening in the 2nd program in order to illustrate the problem.
It allocates 15 bytes at the memory location starting at dest and copies 14 bytes into it (including the null terminator):
char dest[15] = "Hello DEFGIJK";
...and 11 bytes at src with 10 bytes copied into it:
char src[11] = "Hello ABC";
The strcat() call then copies 10 bytes (9 chars plus the null terminator) from src into dest, starting right after the 'K' in dest. The resulting string at dest will be 23 bytes long including the null terminator. The problem is, you allocated only 15 bytes at dest, and the memory adjacent to that memory will be overwritten, i.e. corrupted, leading to program instability, wrong results, data corruption, etc.
Note that the strcat() function knows nothing about the amount of memory you've allocated at dest (or src, for that matter). It is up to you to make sure you've allocated enough memory at dest to prevent memory corruption.
By the way, the first program doesn't allocate memory at dest or src at all, so your calls to fgets() are corrupting memory starting at those locations.
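One way to take on that responsibility is to check the sizes before concatenating. The sketch below assumes the caller knows the destination's capacity and that dest already holds a terminated string; the helper name concat_checked is hypothetical.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical helper: appends src to dest only if the result, including
 * the terminating '\0', fits in dest_size bytes.
 * Returns 0 on success, -1 if there is not enough room (dest is unchanged). */
int concat_checked(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);
    size_t need = strlen(src);
    if (used >= dest_size || need >= dest_size - used) {
        return -1;
    }
    strcat(dest, src);   /* safe: used + need + 1 <= dest_size */
    return 0;
}
```

With char dest[15] = "Hello DEFGIJK", the call concat_checked(dest, sizeof dest, "Hello ABC") is rejected, because the 23 bytes needed do not fit in 15.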

dynamic memory allocation for strings in c

I found this code working perfectly.
#include <stdio.h>
#include <stdlib.h>
int main(int argc,char *argv[])
{
char* s; /* input string */
s=malloc(sizeof(s));
int c;
if(argc==1){ // if file name not given
while (gets(s)){
puts(s);
}
}
}
What I don't understand is how the string s is stored in memory. I am allocating memory only for the pointer s, which is 4 bytes. Now where does the input string given by the user get stored?
It's only safe for the first four bytes. The fifth byte will overrun the allocated data and tramp on something else which will yield undefined behaviour (might crash, might not).
Also, gets does write a terminating '\0' after the characters it reads, and that byte needs room too, so a 4-byte allocation safely holds at most three characters plus the terminator. Never rely on the memory after your buffer happening to be usable or to contain zeros. Since gets cannot be told the buffer size, prefer fgets, which takes the size as an argument.
Rather than this you should do
s=malloc(sizeof(*s)*(number_of_chars+1));
You set number_of_chars to appropriate value, so that you allocate memory to store those many characters. +1 is for last '\0' character.
With your approach you are allocating only sizeof(s) bytes, the size of a pointer (4 in your case), so that is the most you can safely store, terminator included.
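As a minimal sketch of that allocation rule, the hypothetical helper below (copy_to_heap is an illustrative name) sizes the buffer from the string it has to hold instead of from the pointer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of text, or NULL if malloc fails.
 * The caller must free() the result. */
char *copy_to_heap(const char *text)
{
    size_t number_of_chars = strlen(text);
    char *s = malloc(sizeof(*s) * (number_of_chars + 1));  /* +1 for '\0' */
    if (s != NULL) {
        memcpy(s, text, number_of_chars + 1);   /* copies the '\0' too */
    }
    return s;
}
```

For input of unknown length, the same rule applies: pick a capacity, allocate capacity + 1 bytes, and read with a function that takes that size, such as fgets.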
You've allocated sizeof(void*) bytes of memory and are filling it with user-provided data. You have an address and you are writing to it, which is fine from the compiler's point of view (maybe it's really what you want, who knows). Even if your program doesn't crash when you exceed the allocation, it's still an error. It's just memory: something else could be stored in that area, and you'll overwrite it, so expect serious trouble if you ever do that.
It appears to work because the allocation is only a few bytes (the size of a pointer).
When you then type 10 bytes of input, the write overflows the allocation, and the extra data lands in the memory beyond it, which succeeds only if that memory happens to be writable.
You may get an error if the write reaches memory the process cannot access, and no error at all if it merely exceeds what you allocated.
puts prints data until it reaches '\0'.
So the output you see can be explained, but it is undefined behavior and not something to rely on.

Which method is correct for Initializing a wchar_t string?

I am writing a program and I need to initialize a message buffer which will hold text. I am able to make it work, however I am writing below various ways used to initialize the strings in C and I want to understand the difference. Also, which is the most appropriate method for initializing a wchar_t/char string?
Method I:
wchar_t message[100];
based on my understanding, this will allocate a memory space of 200 bytes (I think size of wchar_t is 2 bytes on Windows OS). This memory allocation is static and it will be allocated inside the .data section of the executable at the time of compiling.
message is also a memory address itself that points to the first character of the string.
This method of initializing a string works good for me.
Method II:
wchar_t *message;
message=(wchar_t *) malloc(sizeof(wchar_t) * 100);
This method will first initialize the variable message as a pointer to wchar_t. It is an array of wide characters.
next, it will dynamically allocate memory for this string. I think I have written the syntax for it correctly.
When I use this method in my program, it does not read the text after the space in a string.
Example text: "This is a message"
It will read only "This" into the variable message and no text after that.
Method III:
wchar_t *message[100];
This will define message as an array of 100 wide characters and a pointer to wchar_t. This method of initializing message works good. However, I am not sure if it is the right way. Because message in itself is pointing to the first character in the string. So, initializing it with the size, is it correct?
I wanted to understand it in more depth, the correct way of initializing a string. This same concept can be extended to a string of characters as well.
The magic is the encoding-prefix L:
#include <stdlib.h>
#include <wchar.h>
...
wchar_t m1[] = L"Hello World";
wchar_t m2[42] = L"Hello World";
wchar_t * pm = L"Hello World";
...
wcscat(m2, L" again");
pm = calloc(123, sizeof *pm);
wcscpy(pm, L"bye");
See also the related part of the C11 Standard.
It really depends on what you want to do and how you use the data. If you need it globally, by all means, define a static array. If you only need it in a function, do the same in the function. If you want to pass the data around between functions, over a longer lifetime, malloc the memory and use that.
However, your method III is wrong: it is an array of 100 wchar_t pointers. If you want to create a 100-element wchar_t array and a pointer, you need to use:
wchar_t message[100], *message_pointer;
Also, concerning terminology: you are only declaring a variable in the method I, you never assign anything to it.
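A hedged sketch of the dynamic variant (Method II) with the wide-character functions from wchar.h. The helper name make_wide_message and the capacity parameter are hypothetical; the point is only that the allocation is sized in wchar_t units and the text is checked against it before copying.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: allocates a wide-character buffer with room for
 * capacity characters plus the terminator and copies text into it.
 * Returns NULL if text does not fit or allocation fails; caller frees. */
wchar_t *make_wide_message(const wchar_t *text, size_t capacity)
{
    if (wcslen(text) > capacity) {
        return NULL;
    }
    wchar_t *message = malloc(sizeof *message * (capacity + 1));
    if (message != NULL) {
        wcscpy(message, text);
    }
    return message;
}
```

Spaces in the text are copied like any other character here, so make_wide_message(L"This is a message", 100) yields the full 17-character string.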

How strcpy works behind the scenes?

This may be a very basic question for some. I was trying to understand how strcpy works actually behind the scenes. for example, in this code
#include <stdio.h>
#include <string.h>
int main ()
{
char s[6] = "Hello";
char a[20] = "world isnsadsdas";
strcpy(s,a);
printf("%s\n",s);
printf("%zu\n", sizeof(s));
return 0;
}
As I am declaring s to be a static array with a size smaller than that of the source, I thought it wouldn't print the whole word, but it did print world isnsadsdas. So I thought that this strcpy function might be allocating a new size if the destination is smaller than the source. But when I check sizeof(s), it is still 6, yet it is printing out more than that. How is that actually working?
You've just caused undefined behaviour, so anything can happen. In your case, you're getting lucky and it's not crashing, but you shouldn't rely on that happening. Here's a simplified strcpy implementation (but it's not too far off from many real ones):
char *strcpy(char *d, const char *s)
{
char *saved = d;
while (*s)
{
*d++ = *s++;
}
*d = 0;
return saved;
}
sizeof is just returning you the size of your array from compile time. If you use strlen, I think you'll see what you expect. But as I mentioned above, relying on undefined behaviour is a bad idea.
http://natashenka.ca/wp-content/uploads/2014/01/strcpy8x11.png
strcpy is considered dangerous for reasons like the one you are demonstrating. The two buffers you created are local variables stored in the stack frame of the function. Here is roughly what the stack frame looks like:
http://upload.wikimedia.org/wikipedia/commons/thumb/d/d3/Call_stack_layout.svg/342px-Call_stack_layout.svg.png
FYI things are put on top of the stack meaning it grows backwards through memory (This does not mean the variables in memory are read backwards, just that newer ones are put 'behind' older ones). So that means if you write far enough into the locals section of your function's stack frame, you will write forward over every other stack variable after the variable you are copying to and break into other sections, and eventually overwrite the return pointer. The result is that if you are clever, you have full control of where the function returns. You could make it do anything really, but it isn't YOU that is the concern.
As you seem to know, having made your first buffer 6 chars long for a 5-character string, C strings end in a null byte \x00. The strcpy function copies bytes until the source byte is 0, but it does not check that the destination is that long, which is why it can copy past the boundary of the array. This is also why your print reads past the end of the buffer: it reads until \x00. Interestingly, the strcpy may have written into the data of a, depending on the order the compiler gave the arrays on the stack, so a fun exercise could be to also print a and see if you get something like 'snsadsdas'. I can't be sure what it would look like even if a is being polluted, because there are sometimes bytes in between the stack entries for various reasons.
If this buffer holds say, a password to check in code with a hashing function, and you copy it to a buffer in the stack from wherever you get it (a network packet if a server, or a text box, etc) you very well may copy more data from the source than the destination buffer can hold and give return control of your program to whatever user was able to send a packet to you or try a password. They just have to type the right number of characters, and then the correct characters that represent an address to somewhere in ram to jump to.
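You can use strcpy if you check the bounds and maybe trim the source string, but it is considered bad practice. There are functions that take a maximum length, like http://www.cplusplus.com/reference/cstring/strncpy/ (note that strncpy does not write a terminating null byte when the source is at least as long as the limit, so you must add one yourself).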
Oh, and lastly, this is all called a buffer overflow. Some compilers insert a small random value, often called a stack canary, between the local buffers and the saved return address. Before the function returns, generated code compares that value with a saved copy and terminates the program if they differ. This solves a lot of security problems, but it is still possible to copy bytes far enough into the stack to overwrite the pointer to the function that handles a changed canary, which lets you do the same thing. It just becomes a lot harder to do right.
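A length-limited copy along the lines suggested above can be sketched like this. The helper name copy_bounded is hypothetical; it wraps the standard strncpy and adds the terminator that strncpy omits when the source is too long.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical helper: copies at most dst_size - 1 characters and always
 * terminates dst. Returns 1 if src was truncated, 0 otherwise. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return 1;                      /* no room even for the '\0' */
    }
    strncpy(dst, src, dst_size - 1);   /* may leave dst unterminated... */
    dst[dst_size - 1] = '\0';          /* ...so terminate explicitly */
    return strlen(src) >= dst_size;
}
```

With char s[6], copy_bounded(s, sizeof s, "world isnsadsdas") stores "world" and reports truncation instead of writing past the array. Whether silent truncation is acceptable depends on the application; rejecting the input may be the better choice for something like a password.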
In C there is no bounds checking of arrays; it's a trade-off that gives better performance at the risk of shooting yourself in the foot.
strcpy() doesn't care whether the target buffer is big enough, so copying too many bytes causes undefined behavior.
That is one of the reasons a newer variant, strcpy_s(), was introduced, where you specify the target buffer size. It is part of the optional Annex K of C11, so not every C library provides it.
Note that sizeof(s) is determined at compile time. Use strlen() to find the number of characters stored in s. When you call strcpy(), the contents of the destination are replaced by the source string, so your output won't be "Helloworld isnsadsdas".
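#include <stdio.h>
#include <string.h>
int main(void)
{
char s[20] = "Hello"; /* large enough to hold the contents of a */
char a[20] = "world isnsadsdas";
strcpy(s,a);
printf("%s\n",s);
printf("%zu\n", strlen(s)); /* prints 16: the string length, not the array size */
return 0;
}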
You are relying on undefined behaviour, in that the compiler has chosen to place the two arrays where your code happens to work. This may not work in future.
As to the sizeof operator, this is figured out at compile time.
Once you use adequate array sizes you need to use strlen to fetch the length of the strings.
The best way to understand how strcpy works behind the scene is...reading its source code!
You can read the source for GLibC : http://fossies.org/dox/glibc-2.17/strcpy_8c_source.html . I hope it helps!
At the end of every string/character array there is a null terminator character '\0' which marks the end of the string/character array.
strcpy() performs its task until it sees the '\0' character.
printf() also performs its task until it sees the '\0' character.
sizeof() on the other hand is not interested in the content of the array, only its allocated size (how big it is supposed to be), thus not taking into consideration where the string/character array actually ends (how big it actually is).
As opposed to sizeof(), there is strlen() that is interested in how long the string actually is (not how long it was supposed to be) and thus counts the number of characters until it reaches the end ('\0' character) where it stops (it doesn't include the '\0' character).
Better Solution is
char *strcpy(char *p,char const *q)
{
char *saved=p;
while(*p++=*q++);
return saved;
}

Resources