String termination - char c=0 vs char c='\0' - c

When terminating a string, it seems to me that logically char c=0 is equivalent to char c='\0', since the "null" (ASCII 0) byte is 0, but people usually write '\0' instead. Is this purely a matter of preference, or is one of them better "practice"?
What is the preferred choice?
EDIT: K&R says: "The character constant '\0' represents the character with value zero, the null character. '\0' is often written instead of 0 to emphasize the character nature of some expression, but the numeric value is just 0."

http://en.wikipedia.org/wiki/Ascii#ASCII_control_code_chart
Binary  | Oct | Dec | Hex | Abbr | Unicode | Control char | C escape code | Name
0000000 | 000 | 0   | 00  | NUL  | ␀       | ^@           | \0            | Null character
There's no difference, but the more idiomatic one is '\0'.
Putting it down as char c = 0; could mean that you intend to use it as a number (e.g. a counter). '\0' is unambiguous.

'\0' is just a character constant, the same as 'A', '0' or '\n'.
If you write char c = '\0';, it's the same as char c = 0;
If you write char c = 'A';, it's the same as char c = 65; (in ASCII).
It's just a character representation, and it's good practice to write it when you really mean the NUL byte that terminates a string. Since char in C is a one-byte integral type, '\0' doesn't have any special meaning beyond the value 0.
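As a minimal sketch of the equivalence described above, the following terminates two identical buffers, once with 0 and once with '\0', and compares the results. The function name terminators_are_equivalent is made up for this illustration. (In C the constant '\0' even has type int, so the two spellings differ only in how they read.)

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: terminates the same buffer both ways and reports
   whether the results are byte-for-byte identical. */
bool terminators_are_equivalent(void)
{
    char a[4] = { 'a', 'b', 'c', 'x' };
    char b[4] = { 'a', 'b', 'c', 'x' };

    a[3] = 0;      /* integer constant zero */
    b[3] = '\0';   /* character constant with value zero */

    /* Both buffers now hold the string "abc". */
    return '\0' == 0
        && memcmp(a, b, sizeof a) == 0
        && strlen(a) == 3
        && strlen(b) == 3;
}
```

Calling terminators_are_equivalent() should return true on any conforming implementation, since both assignments store the same value.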

The preferred choice is whichever lets people reading your code understand how you use the variable: as a number or as a character.
Best practice is to use 0 when you mean the variable as a number and '\0' when you mean it as a character.

The above answers are already quite clear. I just share what I learned about this issue with a demo.
#include <stdlib.h>
#include <stdio.h>

char *
mystrcat(char *dest, char *src) {
    size_t i, j;
    for (i = 0; dest[i] != '\0'; i++)
        ;
    for (j = 0; src[j] != '\0'; j++)
        dest[i+j] = src[j];
    dest[i+j] = '\0';
    return dest;
}

int main() {
    char *str = malloc(20); // malloc allocates memory, but doesn't initialize it
    // str[0] = '\0';
    str[0] = 0;
    for (int k = 0; k < 10; k++) {
        char s[2];
        sprintf(s, "%d", k);
        mystrcat(str, s);
    }
    printf("debug:%s\n", str);
    return 0;
}
In the above program, I used malloc to allocate the buffer, but malloc doesn't initialize the memory. So after the mystrcat operation (which is nearly the same as the strcat function in glibc), the string may contain garbage, since mystrcat first scans the uninitialized memory for a terminator.
So I need to initialize the memory. In this case str[0] = 0 and str[0] = '\0' both make it work.

Related

Converting string to one with escape sequences

I have a char string containing hexadecimal characters (without 0x or \x):
char *A = "0a0b0c";
from which I want to obtain
const char *B = "\x0a\x0b\x0c";
Is there an efficient way to do this? Thanks!
EDIT: To be clear, I want the resultant string to contain the 3 characters \x0a, \x0b, \x0c, not a 12 character string that says "\x0a\x0b\x0c" where the \ and x are read as individual characters.
This is what I have tried:
const char *B[12];
for (j = 0; j < 4; ++j) {
B[4 * j + 0] = '\\';
B[4 * j + 1] = 'x';
B[4 * j + 2] = A[2 * j];
B[4 * j + 3] = A[2 * j + 1];
};
B[12] = '\0';
which gives me a 12 character string "\x0a\x0b\x0c", but I want B to be as if it was assigned thus:
const char *B = "\x0a\x0b\x0c";
There are multiple confusions in your code:
the input string has 6 characters and a null terminator
the output string should be defined as char B[3]; or possibly char B[4]; if you intend to set a null terminator after the 3 converted bytes. It cannot be const if you fill it in at run time.
the definition const char *B[12]; in your code defines an array of 12 pointers to strings, which is a very different beast.
The for is fine, but it does not do what you want at all. You want to convert the hexadecimal encoded values to byte values, not insert extra \ and x characters.
the trailing ; after the } is useless
you set a null terminator at B[12], which is beyond the end of B.
Here is a corrected version using sscanf:
const char *A = "0a0b0c";
char B[4] = { 0 };   /* not const: sscanf stores into it */
for (int j = 0; j < 3; j++) {
    sscanf(&A[j * 2], "%2hhx", (unsigned char *)&B[j]);
}
The conversion format %2hhx means convert at most the first 2 bytes at A[j * 2] as an unsigned integer encoded in hexadecimal and store the resulting value into the unsigned char at B[j]. The cast is only necessary to avoid a compiler warning.
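The sscanf approach described above could be wrapped in a small function, sketched here under a few assumptions: the name hex_decode and its parameters are hypothetical, the input is expected to consist of pairs of hex digits, and the output buffer is declared as unsigned char so that no cast is needed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: decodes pairs of hex digits from `hex` into `out`.
   Writes at most `out_size` bytes and returns the number of bytes decoded.
   Stops at the end of the input, at a dangling single character, or at
   the first pair that sscanf cannot convert. */
size_t hex_decode(const char *hex, unsigned char *out, size_t out_size)
{
    size_t n = 0;

    while (n < out_size && hex[2 * n] != '\0' && hex[2 * n + 1] != '\0') {
        /* %2hhx reads at most two characters into an unsigned char */
        if (sscanf(&hex[2 * n], "%2hhx", &out[n]) != 1)
            break;
        n++;
    }
    return n;
}
```

With an input of "0a0b0c" and room for three bytes, the function should store the values 0x0a, 0x0b and 0x0c. It does not add a terminator; a caller that wants one needs to reserve an extra byte and set it.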
You can write a function that uses sprintf to write the desired text into a string, and then concatenates that with the destination string.
Something along these lines...
#include <stdio.h>
#include <string.h>

/* Appends "\x" followed by the two characters at start to B. */
void createB(char *B, const char *start)
{
    char temp[10];
    sprintf(temp, "\\x%c%c", start[0], start[1]);
    strcat(B, temp);
}

int main(void)
{
    char A[] = "0a0b0c";
    char B[13] = {'\0'};   /* 3 groups of 4 characters plus the terminator */
    for (int i = 0; A[i] != '\0'; i = i + 2)
    {
        createB(B, A + i);
    }
    printf("%s\n", B);
    return 0;
}
$ ./main.out
\x0a\x0b\x0c
You can modify that to suit your needs or make it more efficient as you feel.
Feel free to edit it to add the necessary safety checks; I have just provided the working logic.
If you simply want to add "\x" before each '0' in the string-literal A with the result in a new string B, a simple and direct loop is all that is required, and storage in B sufficient to handle the addition for "\x" for each '0' in A.
For example:
#include <stdio.h>
#define MAXC 32
int main (void) {
char *A = "0a0b0c",
*pa = A,
B[MAXC],
*pb = B;
do { /* loop over all chars in A */
if (*pa && *pa == '0') { /* if chars remain && char is '0' */
*pb++ = '\\'; /* write '\' to B, adv ptr */
*pb++ = 'x'; /* write 'x' to B, adv ptr */
}
*pb++ = *pa; /* write char from A, adv ptr */
} while (*pa++); /* while chars remain (writes nul-terminating char) */
puts (B); /* output result */
}
You cannot simply change A to an array with char A[] = "0a0b0c"; and then write back to A, as there would be insufficient space in A to handle the character addition. You can always declare A large enough and then shift the characters to the right by two for each addition of "\x", but it makes more sense just to write the results to a new string.
Example Use/Output
$ ./bin/straddescx
\x0a\x0b\x0c
If you need something different, let me know and I'm happy to help further. This is probably one of the more direct ways to handle the addition of the character sequence you want.
#include <stdio.h>

int main(void)
{
    char str1[] = "0a0b0c";
    char str2[1000];
    size_t i, j;
    i = j = 0;
    /* sizeof yields a size_t, so print it with %zu */
    printf("sizeof str1 is %zu.\n", sizeof(str1) - 1);
    for (i = 0; i < sizeof(str1) - 1; i += 2)
    {
        str2[j] = '\\';
        str2[j+1] = 'x';
        str2[j+2] = str1[i];
        str2[j+3] = str1[i+1];
        j += 4;
    }
    str2[j] = '\0';
    printf("%s\n", str2);
    return 0;
}
I think you can do it like this.
Assuming no bad input, assuming 'a' to 'f' are sequentially in order, assuming no uppercase:
// remember to #include <ctype.h> and <stdio.h>
char *input = "0a0b0c";
char *p = input;
int v;
while (*p) {
    v = (isdigit((unsigned char)*p) ? *p - '0' : *p - 'a' + 10) * 16;
    p++;
    v += isdigit((unsigned char)*p) ? *p - '0' : *p - 'a' + 10;
    p++;
    printf("0x%02x", v); // use v
}
While using char A[] = "0a0b0c";, as proposed by kiran, would make it possible to change the string, it will not allow you to insert characters, because that would make the string longer and hence not fit into the available memory. This in turn is a problem if you cannot create the target string right away with the needed size.
You could know the needed size in advance, if the input is always of the same length and always requires the same number of inserted characters, e.g. if like in your example, the target string is double the size of the input string. For a simple character array definition, you would need to know the size already at compile time.
char A[7] = "0a0b0c"; /* not 6, because space for the terminating \0 is needed */
char B[13] = ""; /* 2*6+1 */
So you can stay with char *A = "0a0b0c"; and make your life easier by setting up memory of appropriate size to serve as target. For that you need to first determine the length of the needed memory, then allocate it.
Determining the size is easy if you know that it will be twice the input size.
/* must be inside a function; this does not work as a file-scope definition */
size_t iLengthB = 2 * strlen(A); /* needs <string.h> */
char* B = malloc(iLengthB+1); /* mind the terminator */
Then loop over A, copying each pair of characters to B and prepending the two characters "\x" to each pair. I assume that this part is obvious to you. Otherwise please show how you set up the program as described above and write a loop that outputs each character from A separately. Then, after you have demonstrated that effort, I can help more.
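For reference, the allocate-then-copy steps described above might look like the following sketch. The function name prefix_pairs is invented for this example, and it assumes the input has an even number of characters.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string in which every
   pair of characters from `a` is preceded by a literal backslash and 'x'.
   Assumes strlen(a) is even; a trailing odd character is dropped.
   Returns NULL on allocation failure; the caller frees the result. */
char *prefix_pairs(const char *a)
{
    size_t len = strlen(a);
    char *b = malloc(2 * len + 1);   /* twice the input plus terminator */
    size_t j = 0;

    if (b == NULL)
        return NULL;

    for (size_t i = 0; i + 1 < len; i += 2) {
        b[j++] = '\\';
        b[j++] = 'x';
        b[j++] = a[i];
        b[j++] = a[i + 1];
    }
    b[j] = '\0';
    return b;
}
```

Note that this produces the 12-character text with visible backslashes, which is what this answer describes, not the 3-byte result the question's edit asks for.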

Parse string to number in C

I tried to write a function to convert a string to an int:
int convert(char *str, int *n){
int i;
if (str == NULL) return 0;
for (i = 0; i < strlen(str); i++)
if ((isdigit(*(str+i))) == 0) return 0;
*n = *str;
return 1;
}
So what's wrong with my code?
*n = *str means:
Set the 4 bytes of memory that n points to, to the 1 byte of memory that str points to. This is perfectly fine but it's probably not your intention.
Why are you trying to convert a char* to an int* in the first place? If you literally just need to do a conversion and make the compiler happy, you can just do int *foo = (int*)bar where bar is the char*.
Sorry, I don't have the reputation to make this a comment.
The function definitely does not perform as intended.
Here are some issues:
you should include <ctype.h> for isdigit() to be properly defined.
isdigit(*(str+i)) has undefined behavior if str contains negative char values. You should cast the argument:
isdigit((unsigned char)str[i])
the function returns 0 if there is any non digit character in the string. What about "-1" and "+2"? atoi and strtol are more lenient with non digit characters, they skip initial white space, process an optional sign and subsequent digits, stopping at the first non digit.
the test for (i = 0; i < strlen(str); i++) is very inefficient: strlen may be invoked for each character in the string, with O(N^2) time complexity. Use this instead:
for (i = 0; str[i] != '\0'; i++)
*n = *str does not convert the number represented by the digits in str, it merely stores the value of the first character into n, for example '0' will convert to 48 on ASCII systems. You should instead process every digit in the string, multiplying the value converted so far by 10 and adding the value represented by the digit with str[i] - '0'.
Here is a corrected version with your restrictive semantics:
int convert(const char *str, int *n) {
    int value = 0;
    if (str == NULL)
        return 0;
    while (*str) {
        if (isdigit((unsigned char)*str)) {
            value = value * 10 + *str++ - '0';
        } else {
            return 0;
        }
    }
    *n = value;
    return 1;
}
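If the more lenient strtol behaviour mentioned in the list above is acceptable, a variant could look roughly like this. It is only a sketch: the name convert_strtol is hypothetical, and the choice to reject trailing characters and out-of-range values is an assumption about what the caller wants.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical alternative to convert(): accepts leading white space and
   an optional sign, as strtol does, but rejects empty input, trailing
   characters, and values that do not fit in an int.
   Returns 1 on success and stores the value in *n; returns 0 otherwise. */
int convert_strtol(const char *str, int *n)
{
    char *end;
    long value;

    if (str == NULL || n == NULL)
        return 0;

    errno = 0;
    value = strtol(str, &end, 10);

    if (end == str || *end != '\0')      /* no digits, or trailing junk */
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;

    *n = (int)value;
    return 1;
}
```

Unlike the hand-written loop, this version also guards against overflow, because strtol reports out-of-range input through errno.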
conversion of char* pointer to int*
#include <stdio.h>

int main(void)
{
    char c, *cc;
    int i, *ii;
    float f, *ff;

    c = 'A';   /* ASCII value of A gets stored in c */
    i = 25;
    f = 3.14f;
    cc = &c;
    ii = &i;
    ff = &f;
    /* print pointers with %p, cast to void * */
    printf("\n Address contained in cc = %p", (void *)cc);
    printf("\n Address contained in ii = %p", (void *)ii);
    printf("\n Address contained in ff = %p", (void *)ff);
    printf("\n value of c = %c", *cc);
    printf("\n value of i = %d", *ii);
    printf("\n value of f = %f", *ff);
    return 0;
}

Getting different lengths for the same operation with different number values in C

I have 2 for-loops which populate arrays with letters from the alphabet. I have a lowercase array set, and an uppercase array set. The problem is when I initialize the arrays with the letters, the lengths are coming back different.
char uppercase[26];
char lowercase[26];
int indexUpper = 0;
int indexLower = 0;
// Get uppercase array:
for(int a = 65; a <= 90; a++){
uppercase[indexUpper] = a;
indexUpper++;
}
// Get lowercase array:
for(int b = 97; b <= 122; b++){
lowercase[indexLower] = b;
indexLower++;
}
printf("UPPERCASE = %lu\n", strlen(uppercase));
printf("LOWERCASE = %lu\n", strlen(lowercase));
$=> UPPERCASE = 26
$=> LOWERCASE = 27
I apologize if this is a no brainer. I am truly trying to learn and comprehend the C language and its rules. Thanks to all who contribute.
strlen() reads the character array until it finds a NUL byte ('\0', numerical value zero). Your arrays don't contain any, since you haven't assigned one there.
That means that strlen will continue reading past the end of the array, which is illegal, and the resulting behaviour is not defined. Getting a 27 is rather mild, you could be getting arbitrary numbers, or your program could crash.
If you want to use strlen(), you should explicitly assign a NUL byte at the end of the string, and of course allocate space for it.
Perhaps something like this:
#include <stdio.h>
#include <string.h>
int main(void)
{
char upper[27];
int i;
for (i = 0 ; i < 26; i++) {
/* This only works in a character set where the letters
are contiguous */
upper[i] = 'A' + i;
}
/* i == 26 now */
upper[i] = '\0';
printf("len: %u\n", (unsigned) strlen(upper));
return 0;
}
(Though using strlen here at all seems somewhat pointless, since you already know the number of items in those arrays.)
When using strlen the char array must be nul terminated - but yours isn't so you have undefined behavior.
To print the size of the arrays try:
printf("UPPERCASE = %zu\n", sizeof uppercase);
printf("LOWERCASE = %zu\n", sizeof lowercase);

C: convert int[ ] to string

Technically I realize they are slightly different as the array is null terminated. But looking for a way to convert
int charArray[] = {'h', 'e', 'l', 'l', 'o'}; //ascii chars = ints
to
char *string;
Since charArray is not a string, you can't use the standard functions like strcpy(), or strlen(). Instead, copy every character, and add '\0' at the end. sizeof(charArray) / sizeof(int) can tell you how many characters to copy.
size_t sz = sizeof(charArray) / sizeof(int);
char *string = malloc(sz + 1);   /* needs <stdlib.h>; check for NULL in real code */
for (size_t i = 0; i < sz; i++)
{
    string[i] = charArray[i];
}
string[sz] = '\0';
You can't convert between int[] and char * in c. You can iterate over the array and build a char* with the desired value, though, then add \0 to the end. You can also typecast, by using (char*) charArray, but this is prone to lots of problems, like the missing \0 terminator. It won't work for strcpy, for example.
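To illustrate why the plain cast mentioned above goes wrong, here is a small sketch. The function name length_after_cast is made up, and it assumes sizeof(int) is greater than 1, which holds on common platforms but is not guaranteed by the language.

```c
#include <string.h>

/* Hypothetical demonstration: returns the length strlen() reports when the
   int array is simply reinterpreted as char *. Assumes sizeof(int) > 1 and
   character values below 256, so every element contains zero bytes. */
size_t length_after_cast(void)
{
    int charArray[] = { 'h', 'e', 'l', 'l', 'o' };

    /* Reading an object through a char pointer is permitted, and under the
       assumptions above a zero byte is found inside the first element, so
       strlen() stays within the array. */
    return strlen((const char *)charArray);
}
```

Under those assumptions the reported length is at most 1 rather than 5: the padding bytes of the first int act as a terminator, and whether the 'h' comes before or after them depends on byte order.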
char arr[] = {'h','e','l','\0'}; // C++, not C: std::string needs a NUL-terminated array here
string str(arr);
cout<<str<<endl; // output hel
charArray is of type int[5] whereas string is of type char *. They are different and incompatible types and you can't convert one into the other. What you can do is define an array of characters of sufficient size and then copy the integers (implicitly converted to char type) from charArray into it.
// length of charArray
int length = sizeof charArray / sizeof charArray[0];
char string[length + 1]; // +1 for the null byte (a variable-length array, C99)
for (int i = 0; i < length; i++)
    string[i] = charArray[i]; // copy the characters
string[length] = '\0'; // add the terminating null byte

Character Array

I am trying to build a char array of words using calloc.
What I have:
char** word;
word=(char**)calloc(12,sizeof(char*));
for(i=0;i<12;i++){
word[i]=(char*)calloc(50,sizeof(char));
}
Is this correct if I want a char array that has 12 fields each capable of storing 50 characters?
Thanks!
The code is correct. Some points:
No need to cast return value of calloc() ( Do I cast the result of malloc? )
sizeof(char) is guaranteed to be 1
So code could be rewritten as:
char** word;
int i;
word = calloc(12, sizeof(char*));
for (i = 0; i < 12; i++)
word[i] = calloc(50, 1);
In C, most of the functions that operate on 'strings' require the char array to be null terminated (printf("%s\n", word[i]); for example). If it is required that the buffers hold 50 characters and be used as 'strings', then allocate an additional character for the null terminator:
word[i] = calloc(51, 1);
As commented by eq- a less error prone approach to using sizeof is:
word = calloc(12, sizeof(*word));
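Putting these points together, one possible shape for the allocation, with failure checks and a matching cleanup routine, is sketched below. The names alloc_words and free_words and their parameters are hypothetical and not part of the original question.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates `count` zero-filled buffers, each able to
   hold `max_len` characters plus a terminator. Returns NULL on failure,
   after releasing whatever had been allocated so far. */
char **alloc_words(size_t count, size_t max_len)
{
    char **word = calloc(count, sizeof *word);
    if (word == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        word[i] = calloc(max_len + 1, 1);
        if (word[i] == NULL) {
            while (i > 0)            /* undo the earlier allocations */
                free(word[--i]);
            free(word);
            return NULL;
        }
    }
    return word;
}

/* Releases everything obtained from alloc_words(). */
void free_words(char **word, size_t count)
{
    if (word == NULL)
        return;
    for (size_t i = 0; i < count; i++)
        free(word[i]);
    free(word);
}
```

For the case in the question this would be called as alloc_words(12, 50). Because calloc zero-fills the memory, every buffer starts out as a valid empty string.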

Resources