Subtracting from pointer to get length - c

I wanted to find the length of a part of a string after searching for it within a bigger string.
I cannot use strlen since I am dealing with binary data.
char *temp= "this is some random text";
char *temp1 = strstr(temp,"some");
int len = strlen(temp);
int len1 =0;
len1 = temp+len - temp1;
I expect len1 to be the length of "some random text", but it comes out negative (and even its absolute value is wrong).

If your data is not null-terminated, then you cannot call strstr() on it, for the same reason you can't call strlen(): both can end up scanning past the end of your data. If strstr() finds a match out there (which is quite possible; reading past the end of an array is not guaranteed to crash the program), your pointer arithmetic gives a negative value, because you're subtracting a larger address from a smaller one.
On the other hand, if your data is properly null-terminated, then your problem is probably that strstr() doesn't find the substring and thus returns NULL. Are you checking for NULL? Otherwise, what you end up doing is:
len1 = temp + len - (char*)NULL;
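For the null-terminated case, here is a minimal sketch of doing the check before any pointer arithmetic. The helper name remaining_from is made up for illustration, and it assumes both arguments really are null-terminated text:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the number of characters from the first
 * occurrence of needle to the end of haystack, or -1 if needle is absent.
 * Both arguments must be null-terminated strings. */
ptrdiff_t remaining_from(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);

    const char *match = strstr(haystack, needle);
    if (match == NULL) {
        return -1;              /* no match: never subtract from NULL */
    }
    /* haystack + strlen(haystack) points at the terminator, which is at
     * or after match, so the difference cannot be negative. */
    return (haystack + strlen(haystack)) - match;
}
```

With the string from the question, remaining_from("this is some random text", "some") gives 16, the length of "some random text".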

Final answer:
You're looking for len - (temp1 - temp). The length of the first part is temp1 - temp. Subtract it from the length of the entire string to get the length of the remaining part.
Longer answer:
Since strlen (which you used in your example, even though it only works for proper text) reads until it finds a \0 character, you can simply use strlen(temp1) for the length of the last part of the input. If you are really concerned that calling strlen twice will harm your performance (really?), then you can use len - (temp1 - temp).
You only need pointer subtraction if you are interested in the length of the first part of the input.
If you want to work with binary arrays that contain \0 at non-terminal positions, you cannot use strlen at all in your code. However, you must have some way to specify the length of the entire input: either you keep it in an integer variable, or you have a specific delimiter and a length-computing function.
If you have the integer variable for the length then, since the length of the first part of the input is obtained by pointer subtraction, you only have to compute len - (temp1 - temp).
If you have a length-computing function, simply call it with temp1 as argument.
PS: Don't forget to check whether strstr returns NULL. Also note that you cannot use strstr at all if you have binary data with \0 inside the buffer.
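For the binary case, a rough sketch under the assumption that the total length is kept in a separate variable. The names find_bytes and remaining_len are invented for this example; the search is a plain memcmp loop rather than any particular library routine:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the first occurrence of needle (needle_len
 * bytes) inside hay (hay_len bytes). Embedded zero bytes are treated like
 * any other byte. Returns a pointer into hay, or NULL if there is no match. */
const unsigned char *find_bytes(const unsigned char *hay, size_t hay_len,
                                const unsigned char *needle, size_t needle_len)
{
    assert(hay != NULL && needle != NULL);

    if (needle_len == 0 || needle_len > hay_len) {
        return NULL;
    }
    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0) {
            return hay + i;
        }
    }
    return NULL;
}

/* Length of the part of hay that starts at match: len - (match - hay).
 * match must point into hay. */
size_t remaining_len(const unsigned char *hay, size_t hay_len,
                     const unsigned char *match)
{
    return hay_len - (size_t)(match - hay);
}
```

The caller checks the result of find_bytes for NULL first and only then calls remaining_len, mirroring the NULL check needed with strstr.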


C - strcpy() function restrictions

I am incredibly new in C programming and I'm having a hard time understanding some aspects of it, including the strcpy() function.
I am doing some quizzes and came across the following question:
To assure the correctness of the following strcpy(d,s) call, which of the following conditions must always be met:
a. sizeof(d) >= strlen(s) + 1
b. sizeof(d) >= sizeof(s)
c. sizeof(d) >= strlen(s)
d. strlen(d) >= strlen(s)
e. strlen(d) >= strlen(s) + 1
After doing some research, I found that the destination string must be large enough to store the copied string. This led me to either answer b or d.
However the correct answer is 'a' and I cannot understand why, and cannot find any documentation. Could someone please explain in more details what the restrictions of strcpy() are?
It sort of depends on how the variables are declared and defined. However, judging by the fact that the answer is a, I'm fairly sure they were declared and defined like this:
// Assume SIZE_0 and SIZE_1 are some integer values
char d[SIZE_0];
char s[SIZE_1];
// Assign a bunch of characters to `s` here and null terminate it
// Assume `s` now has `LEN` number of characters + 1 for the null terminator, for a total of `LEN + 1`
// Of course, `LEN + 1` is either less than, or equal to, `SIZE_1`
Now let's get the values cleared up real quick-
strlen(s) -> Returns LEN, as it counts the number of characters until the null terminator
sizeof(s) -> Returns SIZE_1
sizeof(d) -> Returns SIZE_0
strlen(d) -> Doesn't work as you'd think. strlen needs a null terminator, and d is currently uninitialized, so it has no defined length (not even 0) unless you set d[0] to '\0' yourself
So, it's evident that for strcpy(d, s) to work, sizeof(d) (which is the only valid call) MUST BE more than, or equal to, LEN + 1 (LEN for all the characters from s and +1 for the null terminator).
Of course, strlen on s will return LEN, so we'll need strlen(s) + 1.
And that is why you need sizeof(d) to be more than, or equal to, strlen(s) + 1.
I must say though, if strlen did return the capacity (aka size) of the string, instead of the current length, your assumption would work.
At the end of the day, always remember the difference between capacity (size) and length, especially in C.
It is copying a sequence of character bytes with a null terminator. For example the string "ABC" is represented 65,66,67,0.
strlen calculates the length of the string excluding the null terminator (e.g. 3 for "ABC"). sizeof (for an array) gives you the amount of memory set aside for the array. The string it contains may be shorter.
char s[20] = "ABC"; would give you a string of length 3, using 4 bytes including the terminator, in a reserved memory space of 20 bytes.
To safely copy from s to d, you need to have enough space in d to receive the string and its terminator (without the terminator you won't have a valid string). Hence strlen(s)+1.
The size of s is irrelevant as not all of the reserved space will be copied. strlen(d) is irrelevant also - any existing string in d will be overwritten.
So option a is correct.
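Condition (a) can be written down as a tiny check. This is only a sketch; the name strcpy_fits is invented here, and the caller is assumed to know the destination's capacity:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when a destination of dest_size bytes can hold
 * src plus its terminator, i.e. condition (a) from the quiz. */
bool strcpy_fits(size_t dest_size, const char *src)
{
    assert(src != NULL);
    return dest_size >= strlen(src) + 1;
}
```

With char s[20] = "ABC"; and char d[4];, strcpy_fits(sizeof d, s) is true even though sizeof(d) is smaller than sizeof(s), which is why option b is not required.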
All character strings in C (the kind used by functions such as strcpy) must have a nul-terminator (a zero-value char signalling the end of the string). So, to store a string like "abc", you will need any char[] array to have at least four elements: one for each of the letters plus one for the nul-terminator.
The strlen() function returns the number of characters in the given string, not including that nul-terminator; but the strcpy() function copies all characters including the terminator, so the destination buffer must be at least one char greater than the strlen of the source.
Also note that the sizeof(d) calculation will only work if d is declared as an array of char (e.g. char d[42]); if that array is passed to a function, it will 'decay' to a pointer, and the array's size will not be (implicitly) known to that function; see this discussion: How to find the 'sizeof' (a pointer pointing to an array)?.
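To illustrate the decay point: inside a function the destination is only a pointer, so the capacity has to travel with it. A possible sketch, where copy_checked and its dst_size parameter are assumptions made for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Inside this function dst is a plain char *, so sizeof(dst) would be the
 * size of a pointer, not of the caller's array. The capacity is therefore
 * passed in explicitly. Returns 0 on success, -1 if src does not fit. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t needed = strlen(src) + 1;    /* characters plus terminator */
    if (needed > dst_size) {
        return -1;
    }
    memcpy(dst, src, needed);
    return 0;
}
```

The caller, which still sees the real array, passes sizeof: copy_checked(d, sizeof d, s).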

Char string length not getting initialized properly despite literally putting in the integer size I want it to be?

I'm working with char arrays in C. I'm setting the size in a previous step. When I print it out it clearly shows the num_digits as 1.
But then when I use it to declare a char array of size num_digits, it's setting the size of the array as 6.
In the next step when I print strlen(number_array), it prints 6. Printing it out I get something with a lot of question marks. Does anyone know why this is happening?
int num_digits = get_num_digits(number);
printf("Num digits are %d\n", num_digits);
char number_array[num_digits];
printf("String len of array: %d\n", strlen(number_array));
You need to null terminate your array.
char number_array[num_digits + 1];
number_array[num_digits] = '\0';
Without this null terminator, C has no way of knowing when you've reached the end of the string.
Just use sizeof instead of strlen if what you want is the array's storage size (note that sizeof yields a size_t, which is printed with %zu):
printf("Size of array: %zu\n", sizeof(number_array));
There are a couple possible issues I see here:
As noted in Michael Bianconi's answer, C character arrays holding strings require null terminators. You would explicitly set this with something like:
number_array[num_digits] = '\0'; /* Requires an array of num_digits + 1 elements; see below */
Rather than just setting the last element to null, pre-initializing the entire character array to nulls might be helpful. Some compilers may do this for you, but if not you'll need to do this explicitly with something like:
for (int i = 0; i < num_digits + 1; i ++) number_array[i] = '\0';
Note that with gcc I had to use C99 mode using -std=c99 to get this to compile, as the compiler didn't like the initialization within the for statement.
Also, the code presented sets the length of the character array to the number of digits. We don't know what get_num_digits returns, but if it returns the actual number of significant digits in an integer, the array will come up one short (see above and the other answer), as you need an extra character for the null terminator. An example: if the number is 123456 and get_num_digits returns 6, you would need to set the length of number_array to 7 instead of 6 (i.e. num_digits + 1).
char number_array[num_digits]; allocates some space for a string. It's an array of num_digits characters. Strings in C are represented as an array of characters, with a null byte at the end. (A null byte has the value zero, not to be confused with the digit character '0'.) So this array has room for a string of up to num_digits - 1 characters.
sizeof(number_array) gives you the array storage size. That's the total amount of space you have for a string plus its null terminator. At any given time, the array can contain a string of any length up to sizeof(number_array) - 1, or it might not contain a string at all if the array doesn't contain a null terminator.
strlen(number_array) gives you the length of the string contained in the array. If the array doesn't contain a null terminator, this call may return a garbage value or crash your program (or make demons fly out of your nose, but most computers fortunately lack the requisite hardware).
Since you haven't initialized number_array, it contains whatever happened to be there in memory before. Depending on how your system works, this may or may not vary from one execution of the program to the next, and this certainly does vary depending on what the program has been doing and on the compiler and operating system.
What you need to do is:
Give the array enough room for the null terminator.
Initialize the array to an empty string by setting the first character to zero.
Optionally, initialize the whole array to zero. This is not necessary, but it may simplify further work with the array.
Use %zu rather than %d to print a size. %d is for an int, but sizeof and strlen yield a size_t, which depending on your system may or may not be the same size as an int.
char number_array[num_digits + 1];
number_array[0] = 0; // or memset(number_array, 0, sizeof(number_array));
printf("Storage size of array: %zu\n", sizeof(number_array));
printf("The array contains an empty string: length=%zu\n", strlen(number_array));
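Putting the pieces together, here is a sketch of actually storing the digits. The real get_num_digits is not shown in the question, so count_digits below is a stand-in, and number_to_string is likewise a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the question's get_num_digits(), whose real definition is
 * not shown: counts the decimal digits of a non-negative number. */
int count_digits(unsigned int number)
{
    int digits = 1;
    while (number >= 10) {
        number /= 10;
        digits++;
    }
    return digits;
}

/* Writes number as decimal text into out. out_size must be at least
 * count_digits(number) + 1 to leave room for the terminator.
 * Returns the string length, or -1 if out is too small. */
int number_to_string(unsigned int number, char *out, size_t out_size)
{
    assert(out != NULL);

    int written = snprintf(out, out_size, "%u", number);
    if (written < 0 || (size_t)written >= out_size) {
        return -1;              /* error, or the output was truncated */
    }
    return written;
}
```

For 123456 the array needs 7 elements: six digits plus the terminator. With only 6 elements number_to_string reports failure instead of leaving an unterminated array behind.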

How do I properly store characters in an array using read?

I have written the following code, and I don't understand why read is not storing the characters the way I expect:
char temp;
char buf[256];
while (something) {
    read(in, &temp, 1);
    buf[strlen(buf)] = temp;
}
If I print temp and the last place of the buf array as I am reading, sometimes they don't match up. For example maybe the character is 'd' but the array contains % or the character is 0 and the array contains .
I am reading less than 256 characters but it doesn't matter because I am printing as I am reading.
Am I missing something obvious?
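Yes, you're not initializing buf, so strlen(buf) reads indeterminate bytes and its result is undefined. Setting only buf[0] = 0 is not enough, because each append overwrites the terminator and the next byte is still uninitialized. If you keep using strlen, zero the whole buffer first:
memset(buf, 0, sizeof buf);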
Also, it's better to keep track of the length instead of calling strlen each iteration to avoid a Shlemiel the painter algorithm.
You should also be checking for errors in the call to read(2) -- if it returns -1 or 0, you should break out of your loop, since it means either an error occurred or you reached the end of the file/input stream.
Don't use strlen in this code. strlen relies on its argument being a null-terminated C string, so unless you initialize your entire buffer to 0, this code doesn't work.
At any rate strlen isn't a good choice to use when buffering data, even if you know that you're working with printable string data, if only because strlen will traverse the string every time just to get your length.
Keep a separate counter, named e.g. numRead, only append to buf at the numRead position, and increment numRead by the amount that you read.
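A sketch of that counter-based loop, assuming a POSIX system. The function name read_into is invented for this example, and for brevity it does not retry a read that was interrupted by a signal:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads from fd one byte at a time until end of input, an error, or until
 * buf (capacity cap) has room only for the terminator. The running count
 * num_read replaces strlen(buf). Returns the number of bytes stored, or -1
 * on a read error. buf is always null-terminated on return. */
ssize_t read_into(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);

    size_t num_read = 0;
    while (num_read + 1 < cap) {
        char temp;
        ssize_t n = read(fd, &temp, 1);
        if (n < 0) {
            buf[num_read] = '\0';
            return -1;              /* read error */
        }
        if (n == 0) {
            break;                  /* end of input */
        }
        buf[num_read++] = temp;     /* append at the tracked position */
    }
    buf[num_read] = '\0';
    return (ssize_t)num_read;
}
```

Because the position comes from num_read, the buffer never needs to be zeroed beforehand, and each append costs constant time instead of a scan of everything read so far.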

C Sprintf() appends junk characters

I tried to use sprintf to append an int, a string and an int.
sprintf(str,"%d:%s:%d",count-1,temp_str,start_id);
Here, the value of start_id is always the same. The value of temp_str which is a char * increases every time. I get correct output for some time, and then sprintf starts printing junk characters between temp_str and start_id, so my str gets corrupted.
Can anyone explain this behavior?
example
at count 11
11:1:2:3:1:2:3:1:2:3:1:21:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2
at count 8
8:1:2:3:1:2:3:1:2:3:1:21:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1�:2
I don't understand why and how "�" is appended to the string
Either temp_str is not null-terminated at some point or you've blown the buffer for str and some other memory access is affecting it.
Without seeing the code, it's a little hard to tell but, if you double the size of str and the problem behaviour changes, then it's probably the latter.
1> Try to memset your str buffer to 0 before using sprintf.
2> "The value of temp_str which is a char * increases every time": what do you mean by this? It should be an ordinary character pointer to a null-terminated string, and that string is what gets copied into str.
3> The total size of all three arguments combined (plus the separators and the terminator) must not exceed the size of the str buffer.
It looks like the string temp_str isn't NUL-terminated. You can either terminate it before the call to sprintf, or if you know the length you want to print, use the %.*s formatting operator like this:
int str_len = ...; // Calculate length of temp_str
sprintf(str, "%d:%.*s:%d", count-1, str_len, temp_str, start_id);
You are running off the end of temp_str. Check your bounds and make sure it's null-terminated. Stop incrementing when you get to the end.
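Both suggested causes can be guarded against at once with snprintf and a precision on %s. The following is only a sketch: format_entry and its parameters are invented for illustration, and it assumes the caller knows how many valid bytes temp_str holds:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats "count:text:start_id" into out, printing at most text_len bytes
 * of text so that a missing terminator in text is never read past.
 * Returns the formatted length, or -1 if the result did not fit in
 * out_size bytes. */
int format_entry(char *out, size_t out_size, int count,
                 const char *text, int text_len, int start_id)
{
    assert(out != NULL && text != NULL && text_len >= 0);

    int n = snprintf(out, out_size, "%d:%.*s:%d",
                     count, text_len, text, start_id);
    if (n < 0 || (size_t)n >= out_size) {
        return -1;              /* encoding error or truncated output */
    }
    return n;
}
```

A return value of -1 tells the caller that str is too small, which is the buffer-overrun case; the %.*s precision covers the unterminated temp_str case.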

C, sprintf and "sum" of string and int

I haven't used C for a long time, and now I have to modify a small piece of code. There is one thing I can't understand:
char filename[20];
filename[0] = '\0';
for (j=0; j < SHA_DIGEST_LENGTH; j++){
sprintf(filename + strlen(filename),"%02x",result[j]);
}
In the first line a string of 20 characters is declared.
In the second line the first char is set to '\0', so it is an empty string, I suppose.
In the for loop I don't understand the "sum" of filename and its length... The first parameter of sprintf should be a buffer to which the formatted string on the right is copied. What is the result of that sum? It seems to me like I'm trying to add an array and an integer...
What am I missing?
It's pointer arithmetic. strlen returns the number of characters before the NUL terminator. The result of the addition will point to this terminator. E.g. if the current string is "AA" (followed by a NUL), strlen is 2. filename + 2 points to the NUL. It will write the next hex characters (e.g. BB) over the NUL and the next character. It will then NUL-terminate it again (at filename + 4). So then you'll have "AABB" (then NUL).
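The pointer arithmetic itself can be isolated in a few lines. This is a sketch; end_of is a name chosen for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns a pointer to the terminator of s, i.e. the position where an
 * append should start writing. s + strlen(s) is the same address as
 * &s[strlen(s)]: the array name decays to a pointer to its first element,
 * and adding an integer moves that pointer forward by that many chars. */
char *end_of(char *s)
{
    assert(s != NULL);
    return s + strlen(s);
}
```

So sprintf(filename + strlen(filename), ...) is sprintf(end_of(filename), ...): the output starts on top of the old terminator and a new terminator is written after it.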
It doesn't really make sense though. It wastes a lot of time looking for those NULs. Specifically, it's a quadratic algorithm: the first call examines 1 character, then 3, 5, 7, ..., up to 2 * SHA_DIGEST_LENGTH - 1. It could just be:
sprintf(filename + 2 * j,"%02x",result[j]);
There's another problem. A hexadecimal representation of a SHA-1 sum takes 40 characters, since a byte requires two characters. Then, you have a final NUL terminator, so there should be 41. Otherwise, there's a buffer overflow.
Why don't you declare
char filename[SHA_DIGEST_LENGTH*2 + 1]; /* +1 for the terminating null character */
This is because the SHA-1 digest length is 20 bytes. If you were just printing the raw digest you would not need the additional memory, but since you want the hexadecimal string of the digest, you can use the above declaration.
A strlen operation returns the length of the string up to the first null terminating character.
So basically when you do the following :
sprintf(filename + strlen(filename),"%02x",result[j]);
In the first iteration, the 2-character hexadecimal representation of the first byte of the SHA-1 digest is copied into filename. Say that is AA; now you need to move your pointer two places to copy the next byte.
After the second iteration it becomes AABB.
After the 20th iteration you have the entire string AABBCC......AA [40 bytes], plus 1 byte for the '\0' null terminating character.
First iteration, when j = 0, you will write 3 chars (yes, including the '\0' terminating the string) onto the beginning of filename, since strlen() then returns 0.
Next round, strlen() returns 2, and it will continue writing after the first two chars.
Be careful not to step outside the 20-char space allocated. A common mistake is to forget the space required for the string terminator.
EDIT: make sure that SHA_DIGEST_LENGTH is not greater than 9.
you are adding strlen(filename) only to do concatenation of result[j]
Each iteration concatenates the current result[j] at the end of filename so each time you need to know to offset within the filename where the concatenation should take place.
Replace the code with:
char filename[SHA_DIGEST_LENGTH*2+1];
for (j=0; j < SHA_DIGEST_LENGTH; j++){
sprintf(filename + 2*j,"%02x",result[j]);
}
Faster, simpler, and the bugs are gone.
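The same idea can be wrapped in a reusable function that also checks the buffer size. A minimal sketch, where hex_encode and its parameters are assumptions for illustration rather than part of any SHA library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes the lowercase hex form of the n bytes in digest into out.
 * out_size must be at least 2 * n + 1. Returns 0 on success, -1 if out is
 * too small. */
int hex_encode(const unsigned char *digest, size_t n,
               char *out, size_t out_size)
{
    assert(digest != NULL && out != NULL);

    if (out_size < 2 * n + 1) {
        return -1;
    }
    for (size_t j = 0; j < n; j++) {
        /* Byte j always lands at offset 2 * j, so no strlen is needed.
         * The limit of 3 covers two hex digits plus the terminator. */
        snprintf(out + 2 * j, 3, "%02x", (unsigned int)digest[j]);
    }
    out[2 * n] = '\0';          /* also covers n == 0 */
    return 0;
}
```

For a SHA-1 digest the call would look like hex_encode(result, SHA_DIGEST_LENGTH, filename, sizeof filename) with filename declared as a 41-element array.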
