I need to concatenate some strings, and I need to include NULL bytes. I don't want to treat a '\0' as a terminating byte. I want to save my valuable NULL bytes!
In a code example, if
char *a = "\0hey\0\0";
I need to printf in a format that will output "\0hey\0\0".
-AUstin
How about:
int i;
for(i = 0; i < 6; i++) /* 6 = number of bytes in a, not counting the implicit terminator */
printf("%c", a[i]);
If you want a printf-like function that does this when you specify %s in a format string, you could wrap the code above in your own function. But as @Neil mentioned, you'll struggle to find an alternative to looking for null bytes to determine the length of a string. For that, I guess you could use some kind of escape character.
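As a minimal sketch of the "own function" idea, assuming the caller always knows the byte count: the helper name print_bytes is hypothetical, while fwrite is the standard library call that writes a fixed number of bytes without looking for a terminator.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes exactly len bytes of data to out,
   NUL bytes included. Returns the number of bytes actually written. */
size_t print_bytes(FILE *out, const char *data, size_t len)
{
    return fwrite(data, 1, len, out);
}
```

Calling print_bytes(stdout, a, 6) would emit all six bytes of a, including the three NULs.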
The issue here is that the length of the string a cannot be easily determined. For example, your code..
char *a = "\0hey\0\0";
.. allocates seven bytes to the string, the last being the NULL terminator. Using a function like strlen would return 0.
If you know the precise length of the string, then you can write or iterate over the bytes thus:
#ifdef ESCAPE_NULLS
int i;
for (i = 0; i < 6; i++) /* 6 bytes, matching the write() call below */
if (a[i] == 0)
printf("\\0");
else
printf("%c", a[i]);
#else
write(1, a, 6);
#endif
But you have to know about the 6.
The alternative is not to use NULL-terminated strings, and instead implement an alternative storage mechanism for your bytes; for example a length-encoded array.
#include <stdio.h>
#include <unistd.h> /* write() */
typedef struct {
int length;
char *bytes;
} bytearr;
void my_printf(bytearr *arr)
{
#ifdef ESCAPE_NULLS
int i;
for (i = 0; i < arr->length; i++) /* stop before index length */
if (arr->bytes[i] == 0)
printf("\\0");
else
printf("%c", arr->bytes[i]);
#else
write(1, arr->bytes, arr->length);
#endif
}
int main(void)
{
bytearr foo = {
6, "\0hey\0\0"
};
my_printf(&foo);
}
Graceless, but hopefully you get the idea.
Edit: 2011-05-31
Rereading the question I just noticed the word "concatenate". If the NULL characters are to be copied faithfully from one place in memory to another (not backslash-escape), and you know the total number of bytes in each array beforehand, then you can simply use memcpy.
#include <string.h>
char *a = "\0hey\0\0"; /* 6 */
char *b = "word\0up yo"; /* 10 */
char *c = "\0\0\0\0"; /* 4 */
int main(void)
{
char z[20];
char *zp = z;
/* memcpy returns its destination pointer, so advance zp by hand */
memcpy(zp, a, 6); zp += 6;
memcpy(zp, b, 10); zp += 10;
memcpy(zp, c, 4); zp += 4;
/* now z contains all 20 bytes, including 8 NULLs */
return 0;
}
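The same concatenation can be sketched with a small bounds-checked helper, so the running total is tracked in one place. The name append_bytes and its parameters are assumptions made for illustration; only memcpy comes from the answer above.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical helper: copies n bytes from src to dst + used, provided
   they fit within cap, and returns the new number of bytes used.
   If they do not fit, nothing is copied and used is returned unchanged. */
size_t append_bytes(char *dst, size_t cap, size_t used,
                    const char *src, size_t n)
{
    if (used > cap || n > cap - used)
        return used;
    memcpy(dst + used, src, n);   /* NUL bytes are copied like any other */
    return used + n;
}
```

Each call takes the previous return value as used, so after appending a, b and c the count would be 20.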
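char *a="\0hey\0\0";
int alen = 6; /* bytes in a, not counting the implicit terminator */
char buf[20] = {0};
int bufSize = 20;
int i=0;
int j=0;
/* leave room for a two-character escape plus the terminator */
while( i+2<bufSize && j<alen )
{
if(a[j]=='\0') {
buf[i++]='\\';
buf[i++]='0';
j++;
}
else {
buf[i++] = a[j++];
}
}
buf[i] = '\0';
printf("%s", buf); /* never pass data as the format string */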
Related
I take a hex string as input and convert it to a char string in C. The hex string can contain 0x00, which becomes a 0 byte when converted, and that terminates the string. I have to store the value in a char string because the API uses that.
My code so far:
int hex_to_int(unsigned char c) {
int first =0;
int second =0;
int result=0;
if(c>=97 && c<=102)
c-=32;
first=c / 16 - 3;
second =c % 16;
result = first*10 + second;
if(result > 9) result--;
return result;
}
unsigned char hex_to_ascii(unsigned char c, unsigned char d){
unsigned char a='0';
int high = hex_to_int(c) * 16;
int low = hex_to_int(d);
a= high+low;
return a;
}
unsigned char* HextoString(unsigned char *st){
int length = strlen((const char*)st);
unsigned char* result=(unsigned char*)malloc(length/2+1);
unsigned char arr[500];
int i;
unsigned char buf = 0;
int j=0;
for(i = 0; i < length; i++)
{
if(i % 2 != 0)
{
arr[j++]=(unsigned char)hex_to_ascii(buf, st[i]);
}
else
{
buf = st[i];
}
}
arr[length/2+1]='\0';
memcpy(result,arr,length/2+1);
return result;
}
You can store any values in a char array. But if you want to store a value of 0x00, you cannot use the string functions on this array. So you have to use an integer variable to store the length of the data you want to store. You can then write functions that use this integer.
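A minimal sketch of such a function, assuming a hypothetical ByteBuf type that pairs the data with its length. It uses memcmp, which compares a fixed number of bytes and does not stop at 0x00 the way strcmp does.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical type: a byte buffer that carries its own length. */
typedef struct {
    size_t len;
    const unsigned char *data;
} ByteBuf;

/* Compares two buffers byte for byte, including any 0x00 bytes.
   Returns 1 if they are equal, 0 otherwise. */
int bytebuf_equal(ByteBuf a, ByteBuf b)
{
    return a.len == b.len && memcmp(a.data, b.data, a.len) == 0;
}
```

Two buffers that differ only after an embedded 0x00 would compare equal with strcmp but not with bytebuf_equal.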
As you provided more information now, I can tell you that your function doesn't cut anything as it loops through the whole C-string which you provided for example as input "0a12345600a0020b12". The "problem" is that if you want to get the length (strlen()) of the output string after the conversion for example then it will stop at '\0' and you will get a "wrong" length in terms of your original input string.
It is exactly as Xaver's answer says: save the length alongside the string and work with that length, not the one you would get from C-string functions like strlen().
To show that and in order to provide a right length information I've added a struct definition to your code that defines a string type consisting of a size_t len and an unsigned char* str called HexString. With the additional length information you can handle a 0 byte. Also I made little changes to your code, e.g. you don't need that character buffer arr on the stack.
With your input "0a12345600a0020b12", printing every byte of the resulting C-string in hexadecimal gives:
<0a> <12> <34> <56> <00> <a0> <02> <0b> <12> <00>
The last <00> is the null terminator.
The code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct
{
size_t len; /* C-string length + '\0' */
unsigned char* str;
} HexString;
int hex_to_int(unsigned char c)
{
int first =0;
int second =0;
int result=0;
if (c >= 97 && c <= 102) /* 97 = 'a'; 102 = 'f' */
c -= 32;
first = c / 16 - 3;
second = c % 16;
result = first * 10 + second;
if (result > 9) result--;
return result;
}
unsigned char hex_to_ascii(unsigned char c, unsigned char d)
{
unsigned char a = '0';
int high = hex_to_int(c) * 16;
int low = hex_to_int(d);
a = high + low;
return a;
}
HexString HextoString(const char* const st)
{
HexString result;
size_t length = strlen(st);
result.len = length/2+1;
result.str = malloc(length/2+1);
size_t i;
size_t j = 0;
unsigned char buf = 0;
for (i = 0; i < length; i++)
{
if (i % 2 != 0)
{
result.str[j++] = hex_to_ascii(buf, st[i]);
}
else
{
buf = (unsigned char)st[i];
}
}
result.str[length/2] = '\0'; /* last valid index of the length/2+1 byte buffer */
return result;
}
int main()
{
size_t i;
HexString hexString = HextoString("0a12345600a0020b12");
for (i = 0; i < hexString.len; ++i)
{
printf("<%02x> ", hexString.str[i]);
}
free(hexString.str);
return 0;
}
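The arithmetic in hex_to_int works for valid digits but returns a meaningless value for any other character. As a hedged alternative, here is a sketch of a digit decoder that reports invalid input. The name hex_digit_value is hypothetical, and like the original it assumes an ASCII-compatible character set in which 'a'..'f' and 'A'..'F' are contiguous.

```c
/* Hypothetical replacement for hex_to_int: returns 0..15 for a valid
   hexadecimal digit and -1 for any other character. */
int hex_digit_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}
```

A caller could then reject the whole input as soon as either digit of a pair yields -1.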
I wanted to split an array to 2 arrays that the first one contains the lowercased letters of the original array and the second one contains the uppercased letters and from some reason it prints some unrelated chars.
#include <stdio.h>
#include <string.h>
#define LEN 8
int main(void)
{
char str[] = "SHaddOW";
char smallStr[LEN], bigStr[LEN];
int i = 0;
int indexSmall = 0;
int indexBig = 0;
for (i = 0; i <= LEN; i++)
{
if (str[i] <= 'Z')
{
smallStr[indexSmall] = str[i];
indexSmall++;
}
if (str[i] >= 'Z')
{
bigStr[indexBig] = str[i];
indexBig++;
}
}
printf("1: ");
puts(smallStr);
printf("2: ");
puts(bigStr);
system("PAUSE");
return 0;
}
Don't define the length before you create the string to test.
Compute its length after defining the string to test.
Copy the characters as you encounter them, but as @Ed Heal says, you must add a null terminator so that you can print out the two strings (they aren't really strings until they are null-terminated).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int main (void)
{
char str[] = "SHaddOW";
int len = strlen(str) +1;
char smallStr[len], bigStr[len];
char term[] = {'\0'};
int n, s, b;
s=0;
b=0;
for(n=0; n<len; n++) {
if(islower(str[n])) {
memcpy(smallStr +s, str +n, 1);
s++;
} else if (isupper(str[n])){
memcpy(bigStr +b, str +n, 1);
b++;
}
}
memcpy(smallStr + s, term, 1);
memcpy(bigStr + b , term, 1 );
printf("Upper: %s\n", bigStr);
printf("Lower: %s\n", smallStr);
}
Output:
Upper: SHOW
Lower: add
Add this to the if structure (and other code to support it)
} else {
memcpy(anyStr +a, str +n, 1);
a++;
}
then:
char str[] = ".S1H2a3d4d5O6W.";
and:
printf("Anything else: %s\n", anyStr);
returns:
Upper: SHOW
Lower: add
Anything else: .123456.
A more compact approach with (perhaps) more meaningful variable names:
#include <stdio.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>
int main ( void ) {
const char str[] = "SHaddOW";
size_t len = strlen(str); /* better to get actual length */
char lowers[len + 1]; /* add one for the nul char */
char uppers[len + 1]; /* (see below) */
int c;
int i = 0;
int n_upper = 0;
int n_lower = 0;
while ((c = str[i++]) != '\0') {
if (isupper(c)) uppers[n_upper++] = c; /* no need to reinvent */
if (islower(c)) lowers[n_lower++] = c; /* the wheel here */
}
uppers[n_upper] = '\0'; /* the nul char ('\0') marks */
lowers[n_lower] = '\0'; /* the end of a C "string" */
printf("1: %s\n", lowers);
printf("2: %s\n", uppers);
return 0;
}
Notes
If you are super concerned about efficiency you could add an else before if (islower...
Adding const means you "promise" the characters in the array won't be changed.
The type size_t is an unsigned integer type and may be larger than int. It is the correct type for the return value of strlen(). It is defined in <stddef.h> and is also made available by <stdio.h>, <stdlib.h> and <string.h>. Nonetheless, using int will almost always work (on most systems a string would have to be 'yooooge' for its length to be bigger than an int can hold).
The variable c is declared as int instead of char because int is the proper type for the isXXXXX() functions (which are defined in <ctype.h>). It is also a good habit to get into because of the parallels between this loop and another common idiom while ((c = fgetc(fp)) != EOF) ....
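One related detail, sketched under the assumption that the input may contain bytes outside the basic character set: the isXXXXX() functions expect a value representable as unsigned char (or EOF), so a plain char that happens to be negative should be converted first. The function name count_upper is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts the uppercase letters in a NUL-terminated string. Each char is
   converted to unsigned char first, because passing a negative value
   other than EOF to isupper() is undefined behavior. */
size_t count_upper(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (isupper((unsigned char)*s))
            n++;
    }
    return n;
}
```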
You should consider using isupper() and islower() functions. Code would be cleaner. And what if you have some non alpha characters? Your conditions won't work.
for (i = 0; i < LEN; i++)
{
if (islower(str[i]))
{
smallStr[indexSmall] = str[i];
indexSmall++;
}
else if (isupper(str[i]))
{
bigStr[indexBig] = str[i];
indexBig++;
}
}
As @Ed Heal mentioned, to avoid printing rubbish you should add a null character to each array after the for loop.
smallStr[indexSmall] = '\0';
bigStr[indexBig] = '\0';
I have multiple int variables, each about 4 to 6 digits long.
I want to combine them into one big string (char *) with the symbol '>' between the integers,
which would be something like:
int a = 123456, b = 2244, c = 23456, d = 54321;
char * str;
and the output string would be like this: 123456>2244>23456>54321\0
Try this
char string[100];
snprintf(string, sizeof(string), "%d>%d>%d>%d", a, b, c, d);
If you mean an array with variable length, you can do it this way
#include <stdio.h>
#include <string.h>
int main()
{
int array[5] = {1, 2, 3, 4, 5};
char string[1024];
int size;
size = sizeof(string);
size -= snprintf(string, sizeof(string), "%d", array[0]);
for (size_t i = 1 ; ((i < sizeof(array) / sizeof(array[0])) && (size > 0)) ; ++i)
{
char current[100];
snprintf(current, sizeof(current), ">%d", array[i]);
size -= strlen(current);
if (size > 0) /* size == 0 would leave no room for the terminator */
strcat(string, current);
}
printf("%s\n", string);
return 0;
}
How about:
char buf[n*7+1]; /* big enough for 6 digits per, plus >. See my comment below. */
char * cur = buf; /* points to start of buf. */
int i;
for(i=0; i<n; ++i) {
/* sprintf returns the number of characters converted, so
this advances the pointer to the end of the added chars. */
cur += sprintf(cur, "%d>", array[i]);
}
*(cur - 1) = '\0'; /* get rid of the last > */
I haven't compiled this, so you may have to make some mods.
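For comparison, here is a bounds-checked sketch of the same pointer-advancing idea. It uses snprintf instead of sprintf and writes the separator before every value except the first, so no trailing '>' has to be removed. The function name join_ints and its parameters are assumptions, not part of the answer above.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes the n values as "v0>v1>...>vN" into out.
   Returns 0 on success, -1 if out (of size cap) is too small. */
int join_ints(char *out, size_t cap, const int *values, size_t n)
{
    size_t used = 0;
    size_t i;

    if (cap == 0)
        return -1;
    out[0] = '\0';                      /* n == 0 yields an empty string */

    for (i = 0; i < n; i++) {
        /* snprintf returns the length it wanted to write */
        int w = snprintf(out + used, cap - used, "%s%d",
                         i == 0 ? "" : ">", values[i]);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;                  /* error or truncated output */
        used += (size_t)w;
    }
    return 0;
}
```

With the four values from the question and a large enough buffer, the result would be 123456>2244>23456>54321.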
I got an assignment for which I have to write a program that takes the letters in the first parameter string and finds them in the second parameter string, like so:
./a.out "lolabab" "ablcocllcab"
The program needs to print "loab", because each letter should only be printed once.
Here's the main part of my program:
char *do_stuff(char *s1, char *s2)
{
int i, j, k;
char *out;
out = malloc(sizeof(char) * str_len(s1));
i = 0;
j = 0;
k = 0;
while (s2[j] != '\0' && s1[i] != '\0')
{
if (s2[j] == s1[i])
{
if (check_char(out, s1[i]) == 0)
{
out[k] = s1[i];
k++;
}
i++;
j = -1;
}
j++;
}
return (out);
}
My question is: if I don't initialize "out", I have a problem.
I initialize it with malloc at the moment, but I am not allowed to use malloc :).
Every other way I tried does not seem to work for me (segmentation fault).
So how do I initialize a string without using malloc?
It's probably obvious, but I'm new at this, so please help. Thanks!
You can always pass the output buffer as a parameter
void do_stuff(char *s1, char *s2, char *out /* some large enough char [] */)
{
int i, j, k;
i = 0;
j = 0;
k = 0;
while (s2[j] != '\0' && s1[i] != '\0')
{
if (s2[j] == s1[i])
{
if (check_char(out, s1[i]) == 0)
{
out[k] = s1[i];
k++;
}
i++;
j = -1;
}
j++;
}
}
and in the calling function
char result[SOME_REASONABLE_SIZE] = {0} /* initialize it for the check_char function */;
do_stuff(argv[1], argv[2], result);
You should of course check that the program received both arguments.
One more thing, try not to use strlen in the check char function, pass the current string length k to it, that way your program would be more efficient.
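The question does not show check_char, so the following is only a sketch of what a length-taking version might look like, assuming it should return 0 when the character has not been stored yet (as the calls above imply). The name check_char_n is hypothetical.

```c
#include <stddef.h>

/* Hypothetical variant of check_char: returns 1 if c occurs among the
   first len characters of out, 0 otherwise. It never looks for a
   terminator, so out does not need to be zero-initialized. */
int check_char_n(const char *out, size_t len, char c)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (out[i] == c)
            return 1;
    }
    return 0;
}
```

Inside do_stuff the call would then become check_char_n(out, k, s1[i]) == 0.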
Use the fact that the number of characters is constant (and relatively small):
#include <limits.h>
#define CHAR_NUM (1<<CHAR_BIT)
#define FLAG(x) (1<<(x))
void get_common_chars(char* s1,char* s2,char out[CHAR_NUM])
{
int i,n;
int flags[CHAR_NUM] = {0};
for (i=0; s1[i]!=0; i++)
flags[(unsigned char)s1[i]] |= FLAG(1);
for (i=0; s2[i]!=0; i++)
flags[(unsigned char)s2[i]] |= FLAG(2);
n = 0;
for (i=0; i<CHAR_NUM; i++)
if (flags[i] == (FLAG(1)|FLAG(2))) /* == binds tighter than |, so parenthesize */
out[n++] = (char)i;
out[n] = 0;
}
If you're only interested in lowercase letters, then you can further improve it:
#define MIN_CHAR 'a'
#define MAX_CHAR 'z'
#define CHAR_NUM (MAX_CHAR-MIN_CHAR+1)
#define FLAG(x) (1<<(x))
void get_common_chars(char* s1,char* s2,char out[CHAR_NUM+1])
{
int i,n;
int flags[CHAR_NUM] = {0};
for (i=0; s1[i]!=0; i++)
if (MIN_CHAR <= s1[i] && s1[i] <= MAX_CHAR)
flags[s1[i]-MIN_CHAR] |= FLAG(1);
for (i=0; s2[i]!=0; i++)
if (MIN_CHAR <= s2[i] && s2[i] <= MAX_CHAR)
flags[s2[i]-MIN_CHAR] |= FLAG(2); /* mark characters seen in s2 */
n = 0;
for (i=0; i<CHAR_NUM; i++)
if (flags[i] == (FLAG(1)|FLAG(2))) /* == binds tighter than |, so parenthesize */
out[n++] = (char)(MIN_CHAR+i);
out[n] = 0; /* out needs CHAR_NUM+1 bytes if every letter is common */
}
Here is a usage example:
#include <stdio.h>
int main(int argc,char* argv[])
{
char common_chars[CHAR_NUM + 1]; /* one extra byte for the terminator */
if (argc >= 3)
{
get_common_chars(argv[1],argv[2],common_chars);
printf("%s\n",common_chars);
}
return 0;
}
If I understand correctly what you need, you should not create a new string, but use the command-line parameters, which are available in the arguments of main().
When you write
int main(int argc, char** argv) {
The compiler will arrange so that argc is the number of command-line arguments, and argv is an array of strings with the arguments. The first, argv[0], is the program name, and the rest are arguments passed to the program.
So this is one way to get your assignment done (high-level description only -- the rest is yours!)
Take the first argument, argv[1], and loop over it, character by character. For each character, try to find it in the other argument, argv[2]. If you find it, print the single character.
No need to allocate memory at all!
edit: if you don't want to print doubles, then one way would be to keep a static array that you could use as an index of already printed characters:
static int printed[26] = { 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 };
When you print c, set its position to 1. And only print if the character's position is zero.
It's up to you to work out how to find the index of an arbitrary character (and to decide whether you want to differentiate between upper and lower case).
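For readers who want to compare notes afterwards, here is one possible sketch of that approach. It makes assumptions the assignment may not allow: it uses strchr from <string.h>, indexes the "already printed" table by unsigned char so any character works, and writes into a caller-supplied buffer instead of printing. The name common_in_order is hypothetical.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical helper: copies to out each character of s1 that also
   occurs in s2, once, in order of first appearance in s1.
   out must have room for strlen(s1) + 1 bytes. No malloc is used. */
void common_in_order(const char *s1, const char *s2, char *out)
{
    unsigned char seen[UCHAR_MAX + 1] = {0};  /* 1 = already emitted */
    size_t k = 0;

    for (; *s1 != '\0'; s1++) {
        unsigned char c = (unsigned char)*s1;
        if (!seen[c] && strchr(s2, *s1) != NULL) {
            seen[c] = 1;
            out[k++] = *s1;
        }
    }
    out[k] = '\0';
}
```

With the arguments from the question, "lolabab" and "ablcocllcab", out would hold "loab".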
My version of strncat is copying one too many chars into the destination and I cannot figure out why.
#include <stdio.h>
#define MAX_CHARS 20
void nconcatenate(char *start, char *end, int n)
{
if(sizeof start + n > MAX_CHARS)
return;
while(*start++);
start--; /* now points to the final char of start, the \0 */
int i;
for(i = 0; (*start++ = *end++) && i < n; i++);
*start = '\0';
}
int main()
{
char start[MAX_CHARS] = "str";
char *end = "ingy!";
nconcatenate(start, end, 3);
printf("start = %s\n", start);
return 0;
}
Using 3 as 'n' outputs
stringy
which is one too many chars.
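It comes down to the order of evaluation in the condition
(*start++ = *end++) && i < n
The copy (*start++ = *end++) runs first, and only then is i < n checked. So when i reaches n, one more character has already been copied before the loop stops. Testing i < n before copying avoids the extra character.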
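A sketch of a version that checks the count before copying is shown below. The extra cap parameter is an assumption added for illustration: inside the original function, sizeof start is the size of a char pointer rather than of the caller's array, so the capacity has to be passed in explicitly.

```c
#include <stddef.h>

/* Sketch of a corrected nconcatenate: appends at most n characters of
   src to dst, then terminates dst. cap is the total size of dst; the
   call does nothing if n characters plus the terminator might not fit. */
void nconcatenate(char *dst, size_t cap, const char *src, size_t n)
{
    size_t len = 0;
    size_t i;

    while (len < cap && dst[len] != '\0')      /* find the end of dst */
        len++;
    if (len >= cap || n >= cap - len)          /* no room for n chars + '\0' */
        return;

    for (i = 0; i < n && src[i] != '\0'; i++)  /* count is checked before copying */
        dst[len + i] = src[i];
    dst[len + i] = '\0';
}
```

With start holding "str", a call such as nconcatenate(start, sizeof start, end, 3) in main would leave "string" in start.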