Finding the Beginning of a string in C


To solve a problem, I am looking for a way to stop a loop once it has reached the beginning of a string, assuming the loop starts from the end and decrements. Is there an alternative way to do this without finding the length of the string first and decrementing till the number is zero?
Please keep in mind the only functions I can use are malloc, free and write.

This is not possible, because there is nothing special about a string's contents at the beginning. C strings have a "sentinel value" at their end - '\0' - but the first character, and the byte in memory before the first character, can have any value.

is there an alternative way to do this without finding the length of the string first and decrementing till the number is zero?
Apparently you already know where the end of the string is. I suppose you must have a pointer to the terminator character, since you think you do not know the string length.
If finding the length of the string is a viable option at all, however, then you must already know where the beginning is, too. And if you know where the beginning is and you know where the end is, then you already know the length: it is end - beginning. But you do not need to keep a separate counter to iterate backward from the end of a string to the beginning, supposing that you do know where both the end and the beginning are. You can simply use pointer comparisons instead. For example:
int count_a_backwards(const char *beginning, const char *end) {
    int count = 0;
    for (const char *c = end; c > beginning; ) {
        if (*--c == 'a') count += 1;
    }
    return count;
}
If in fact you do not know where the beginning of the string is, however, then you cannot identify it at all, at least not in the general case. Perhaps you can recognize the beginning if you have some kind of prior knowledge about the string's contents, or about its alignment, or some such, but in general, the beginning of a string cannot be recognized.
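As a minimal sketch of the pointer-comparison idea under the asker's constraints, assume the beginning of the string is known (for example, because it came from malloc). The helper name reversed_copy is hypothetical and not from the original posts. It finds the end by walking forward, then walks back until the pointer equals the start, so no separate counter is decremented.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a newly allocated reversed copy of s,
   or NULL if allocation fails. The caller frees the result. */
char *reversed_copy(const char *s)
{
    const char *end = s;
    while (*end != '\0')        /* walk forward to the terminator */
        ++end;

    char *out = malloc((size_t)(end - s) + 1);
    if (out == NULL)
        return NULL;

    char *dst = out;
    while (end != s)            /* stop once we are back at the start */
        *dst++ = *--end;
    *dst = '\0';
    return out;
}
```

The result could then be printed with write() from <unistd.h> and released with free(), which stays within the allowed set of functions.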

Please keep in mind the only functions I can use are malloc, free and write.
If you are using malloc, then the function returns a pointer to the first byte of the allocated memory. So if the allocated array will contain a string, its beginning is already known.
The task is to find the end of the string.
You can use either the standard C function strlen or write your own loop that finds the end of the stored string.
So if you have two pointers, one pointing to the beginning of a string and the other pointing to its end, traversing the string in reverse order is not hard.
Note that if you have a character array that contains a string like this
char s[] = "Hello";
then the expressions s, s + 1, s + 2 and so on each point to a string: "Hello", "ello", "llo" and so on, respectively.
You could find the beginning of a string from a pointer to its end, provided that the first element of the array contains a unique symbol that serves as a sentinel value. However, in general this is a very rare case.
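To illustrate that rare case, here is a small sketch that assumes the buffer was deliberately laid out with a '\0' sentinel in the byte just before the string's first character. The function name find_start and this layout are assumptions made for the example; nothing in C provides such a sentinel automatically.

```c
#include <stddef.h>

/* Assumption: the byte immediately before the first character of the
   string is a '\0' sentinel placed there by whoever built the buffer.
   'end' points at the string's terminating '\0'. */
const char *find_start(const char *end)
{
    const char *p = end;
    while (p[-1] != '\0')   /* step back until the sentinel is just behind us */
        --p;
    return p;
}
```

For a buffer initialised as char buf[] = "\0Hello";, calling find_start(buf + 6) returns buf + 1, the address of 'H'.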
Here is a demonstrative program that shows how you can traverse a string in the reverse order without using standard C string functions except a function that places a string in a dynamically allocated array.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    enum { N = 12 };
    char *s = malloc( N );
    if ( s == NULL ) return EXIT_FAILURE;  /* allocation failed */

    strcpy( s, "Hello World" );
    puts( s );

    char *p = s;
    while ( *p ) ++p;                      /* find the terminating '\0' */
    while ( p != s ) putchar( *--p );      /* walk back to the beginning */
    putchar( '\n');

    free( s );
    return 0;
}
The program output is
Hello World
dlroW olleH

Related

Array pointer issue in C - array only contains last value

I'm trying to add (well, append really) the letters in the alphabet to an empty char array. However, I appear to run into some sort of pointer issue I don't understand, as my array contains only the last character. I tried moving the letter char outside of the for loop, but the compiler didn't like that. I also looked on here about how to create a list of all alphabetical chars, and one of the better answers was to type them all in 1 at a time. However, my problem means I don't fully understand for loops and pointers in C, and I want to.
#include <stdio.h>
int main(void) {
    char *empty_list[26];
    for (int i = 0; i < 26; i++){
        char letter = i + 65;
        empty_list[i] = &letter;
    }
    printf("%s", *empty_list);
    return 0;
}

The main problem is your declaration:
char *empty_list[26];
defines an array of 26 pointers to characters. In your current code you assign each element in the array the address of the variable letter. Since letter is out of scope by the time you print, it is only luck that it prints the last one; it could equally have printed garbage or crashed if the code in between were more complex. It could also have printed additional garbage after the letter, since there is no way of knowing whether there is a string terminating character (\0) after it.
In your existing code, printf("%s", *empty_list); prints the first pointer from the array as a null-terminated string. If you ignore the loss of scope and assume the memory contents are still around, that will be the last value from the loop, since all pointers in your array point to the memory where letter was stored, and that memory holds the last value from the loop.
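The aliasing can be checked with a small sketch. To keep the behaviour well defined, this version assumes letter is declared outside the loop so that it is still alive when the pointers are inspected; the function name pointers_all_alias is hypothetical.

```c
#include <stdbool.h>

/* Stores the address of the same variable 26 times, then checks that
   every slot holds that one address and therefore sees the last value. */
bool pointers_all_alias(void)
{
    char *slots[26];
    char letter = 0;                  /* outlives the loop, unlike the original */

    for (int i = 0; i < 26; i++) {
        letter = (char)('A' + i);
        slots[i] = &letter;           /* same address on every iteration */
    }

    for (int i = 0; i < 26; i++) {
        if (slots[i] != &letter || *slots[i] != 'Z')
            return false;
    }
    return true;
}
```

Every slot holds the same address, so reading through any of them yields whatever was written last ('Z' here), never the letter that was current when the slot was filled.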
If your intention was to create an array with the letters then it should be:
char empty_list[27];
It needs to be 27 as you need to leave space for the string terminating character at the end. One way to fill that in would be to use:
empty_list[26] = '\0';
after the end of your for loop and before you print the contents of the array (do not include the asterisk here - because it is an array the compiler will automatically take the address of the first element):
printf("%s", empty_list);
As brothir mentioned in the comments when you assign the value of the letter to the element in the array it should be without the ampersand:
empty_list[i] = letter;

There are a few things wrong with your code.
Firstly, the type of empty_list is presently an array of pointers to char, when it really should be an array of char, since your intent is to print it out as if it were the latter in the call to printf after your loop. char empty_list[26]; is the right kind of declaration, although the size still needs adjusting, as described in the last point.
Secondly, in your loop, you assign &letter when all you need is letter. Heck, you don't even need the intermediate variable letter. Just empty_list[i] = i + 'A'; will suffice.
Lastly, you are passing empty_list to printf to satisfy a format specifier %s, which expects a null-terminated string. What you need to do is add another element to empty_list and set that to zero:
char empty_list[27];
// populated 0..25 with 'A'..'Z' in your code...
empty_list[26] = '\0';
printf("%s\n", empty_list);
// Output: ABC...Z

With the above help (much appreciated), my working code to create an array of letters in C is below:
#include <stdio.h>

int main(void) {
    // create an array with 1 extra space for the null terminator
    char empty_list[27];
    // add the null terminator so printf knows where the string ends
    empty_list[26] = '\0';
    for (int i = 0; i < 26; i++) {
        // 'A' is 65 in ASCII; adding i gives the i-th uppercase letter
        char letter = 'A' + i;
        // insert each char into the array sequentially
        empty_list[i] = letter;
    }
    printf("%s", empty_list);
    return 0;
}

Resetting a char pointer to the top of an array

I am writing a function and I need to count the length of an array:
while(*substring){
    substring++;
    length++;
}
When I exit the loop, will that pointer still point to the start of the array? For example:
If the array is "Hello",
when I exit the loop will the pointer be pointing at:
H or the NULL?
If it is pointing at NULL, how do I make it point at H?

Strings in C are stored with a null character (denoted \0) at the end.
Thus, one might declare a string as follows.
char *str="Hello!";
In memory, this will look like Hello!0 (or rather, a string of numbers corresponding to each letter followed by a zero).
Your code looks like this:
substring=str;
length=0;
while(*substring){
    substring++;
    length++;
}
When you reach the end of this loop, *substring will be equal to 0 and substring will contain the address of the 0 character mentioned above. The value of substring will not change unless you explicitly do so.
To make it point at the beginning of the string again, you could use substring - length, since pointer arithmetic lets you subtract an integer from a pointer. Alternatively, you could save the starting location before you begin:
beginning=str;
substring=str;
length=0;
while(*substring){
    substring++;
    length++;
}
substring=beginning;
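A compact sketch of the subtraction approach, with the hypothetical function name rewind_after_count and an output parameter added purely for illustration:

```c
#include <stddef.h>

/* Counts the characters in str, then steps the cursor back by the
   same amount. Returns the rewound cursor; stores the length in
   *length_out. */
const char *rewind_after_count(const char *str, size_t *length_out)
{
    const char *substring = str;
    size_t length = 0;

    while (*substring) {        /* stops with substring at the '\0' */
        substring++;
        length++;
    }

    substring -= length;        /* back to the first character */
    *length_out = length;
    return substring;
}
```

For the string "Hello", the returned pointer equals the original str and the stored length is 5.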

It's pointing at the null terminator of the array. Just remember the position in another variable, or subtract length from the pointer.

A pointer, once moved, will not automatically move to any other location. So once the while loop finishes, the pointer will be pointing at the null character '\0', which terminates the string.
To move back to the start of the string, subtract the string length, which you are already calculating by incrementing the length variable.
Sample code:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char name1[] = "test program";  /* sized by the initializer, terminator included */
    char *name = name1;
    size_t len = strlen(name);
    while (*name)
    {
        name++;                     /* advance to the terminating '\0' */
    }
    name = name - len;              /* step back to the first character */
    printf("\n%s\n", name);
    return 0;
}
Hope this helps...

At the end of the loop, *substring will be 0. That's the condition for the loop to end:
while(*substring)
So while (the value pointed to by substring) is not equal to 0, do stuff.
Once *substring is 0 (i.e. the end of the string), the loop stops, so substring will be pointing at the null terminator.
If you want to bring it back to H, do substring - length.
However, the function you are writing already exists. It's in string.h and it's size_t strlen(const char*). size_t is an unsigned integer type large enough to hold the size of any object (typically 32 bits on a 32-bit platform and 64 bits on a 64-bit platform).

How does this string reverse recursion work?

I usually understand recursions pretty well, but because I'm new to C functions like strcpy and pointers I couldn't figure out how this recursion reverses a string:
char *reverse(char *string)
{
    if (strlen(string) <= 1)
        return string;
    else
    {
        char temp = *string;
        strcpy(string, reverse(string+1));
        *(string+strlen(string)) = temp;
        return string;
    }
}
The strcpy part seems a little bit complicated to me, and also what's the purpose of this line:
*(string+strlen(string)) = temp;?
I realize that after flipping the string you need to add the character that was at the beginning to the end of the string, but I'm not sure I understand the logic behind this code.

This code is extremely inefficient, but what it does is:
1. Save the original first character.
2. Recursively reverse the rest of the string (string+1 is a pointer to the second character in the string).
3. Copy the rest of the (reversed) string one character to the left.
4. Put the original first character at the end (*(string+strlen(string)) = temp;).
The *(string+strlen(string)) = temp; is equivalent to string[strlen(string)] = temp; if that is easier to understand.
I would not recommend using this code at all, since it is extremely inefficient: it copies the entire string (and measures its length twice) in every recursive call, not to mention wasting stack space.
A much better implementation would be:
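#include <string.h>

void reverse(char *s) {
    size_t len = strlen(s);
    if (len < 2) return;        /* nothing to swap; also avoids s - 1 on "" */
    char *e = s + len - 1;      /* last character before the terminator */
    while (e > s) {
        char tmp = *s;
        *s = *e;
        *e = tmp;
        s++; e--;
    }
}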

*(string+strlen(string)) = temp is so-called pointer arithmetic - that code is equivalent to string[strlen(string)] = temp. Therefore, this puts the temp character to the end of the string. Note that the string still remains zero-terminated as reverse() returns string of the same length as its argument.
The reverse(string+1) is again pointer arithmetic, this time same as reverse(&string[1]) - i.e., reverse() will mirror the string from the second character onwards, but then strcpy() will place it at the beginning of the string, overwriting the first character that is stored in temp and put at the end of the string.
However, the overall code looks needlessly convoluted and inefficient, so I'd think twice before drawing any lessons on how to do things from it.

This is how the code works. The input string is divided in two parts,
the first character, and the rest. The first character is stored in temp,
and the rest is reversed through a recursive call. The result of the recursive call is placed at the beginning of the result, and the character in temp is placed at the end.
string is [1234567890]
temp is 1, string+1 is [234567890]
reverse(string+1) is [098765432], temp is 1
The strcpy line is the part that copies the result of reverse(string+1) to the beginning of string, and *(string+strlen(string)) = temp is the part that copies temp to the end of string.
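One caveat about the walkthrough above: the source and destination of that strcpy call overlap, and the C standard leaves strcpy undefined for overlapping objects. The sketch below keeps the same recursive idea but swaps in memmove, which is specified to handle overlap. The name reverse_recursive is hypothetical; this is an illustration of the steps, not the original author's code.

```c
#include <string.h>

/* Same recursion as in the question, with memmove in place of strcpy. */
char *reverse_recursive(char *string)
{
    size_t len = strlen(string);
    if (len <= 1)
        return string;

    char temp = *string;                    /* save the first character */
    reverse_recursive(string + 1);          /* reverse the rest in place */
    memmove(string, string + 1, len - 1);   /* shift the reversed rest left by one */
    string[len - 1] = temp;                 /* put the saved character at the end */
    return string;
}
```

The terminator at string[len] is never touched, so the result stays properly terminated and has the same length as the input.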

C version of strpos and substr?

I'm really surprised I can't figure out a way to do this effectively. I've tried strstr, a combination of things with sscanf, and nothing seems to work the way I would expect it to based on my experience in other languages.
I have a string containing "ABCDEFG HIJ K BEGINTheMiddleEND LMNO PQRS". I do not know where "BEGINTheMiddleEND" is in the string, and I would like to end up with a string that equals "TheMiddle" by finding the occurrences of "BEGIN" and "END" and grabbing what is in between.
What is the most efficient way to accomplish this (find and sub-string)?
Thanks!
-- EDIT BASED ON ANSWERS --
I have tried this:
char *searchString = "ABCDEFG HIJ K BEGINTheMiddleEND LMNO PQRS";
char *t1, *t2;
t1 = strstr(searchString, "BEGIN");
t2 = strstr(t1, "END");
But something must be wrong from a pointer standpoint as it doesn't work for me. Strstr only takes two arguments, so I'm not sure what you mean by starting at the previous pointer. I'm also not sure how to then use those pointers to substring it, as they are not integer values like strpos returns, but character pointers.
Thanks again.
-- EDIT WITH FINAL CODE --
For anyone else who hits this, the final, working code:
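char searchString[] = "ABCDEFG HIJ K BEGINTheMiddleEND LMNO PQRS";  /* an array, so it may be modified */
char *b = strstr(searchString, "BEGIN");
char *e = b ? strstr(b, "END") : NULL;
if (e) {
    int offset = e - b;
    b[offset] = 0;   /* cut the string just before "END" */
}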
Where "b" is now equal to "BEGINTheMiddle". (which as it turns out is what I needed in this case).
Thanks again everyone.

You need to realize what a string is: a 0-delimited sequence of chars.
strstr does what it says: it finds the beginning of the given substring.
So calling strstr with the needle "BEGIN" takes you to the position of this substring. The pointer pointing to "BEGIN", advanced by 5 characters is pointing to "TheMiddle" onward to the next 0 char. By searching for "END" you can find the end pointer, and then you need to copy the substring into a new string array (or cut it, by replacing the "E" with a 0; or implement your own string functions that do not use 0 terminated strings, so they can arbitrarily overlap).
That is probably the step that you are still missing: actually copy the string. E.g. using
t3 = strndup(t1, t2 - t1);
Take the string ABCDEF0, where 0 is an actual 0 character. A pointer to the beginning points to the full string, a pointer pointing to the E points to "EF" only. If you want to get a string "AB", you need to either copy that to "AB0", or replace C by 0.
strstr does not do the copying for you. It just finds the position. If you want an index, you can do int offset = newPosition - oldPosition;, but if you need to continue searching, it's easier to work with the newPosition pointer.
All this is less intuitive than e.g. String operations in Java. Except for truncating strings, it actually is more efficient as far as I know, and if you realize the 0-terminated memory layout, it makes a lot of sense. It's only when you think of strings as arrays that it may seem odd to have a pointer somewhere in the middle, and continue using it like a regular array. That makes "sub = string + offset" the C way of writing "sub = string.substring(offset)".

Use strstr() twice, but the second time start from the position returned by the first call to strstr() plus strlen("BEGIN").
This will be efficient because the first pointer returned from strstr() is going to be the beginning of BEGIN, therefore you won't be looking through the whole string again but start at the BEGIN-ing and look for the END from there; which means that at the most you run through the whole string once.
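A minimal sketch of that two-call approach, copying the text between the markers into a fresh buffer so the original string is left untouched. The function name extract_between and its parameter names are hypothetical, chosen for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of the text between the first
   occurrence of 'begin' and the next occurrence of 'end', or NULL if
   either marker is missing or allocation fails. Caller frees. */
char *extract_between(const char *text, const char *begin, const char *end)
{
    const char *start = strstr(text, begin);
    if (start == NULL)
        return NULL;
    start += strlen(begin);                 /* skip past the opening marker */

    const char *stop = strstr(start, end);  /* search only from there on */
    if (stop == NULL)
        return NULL;

    size_t n = (size_t)(stop - start);
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, start, n);
    out[n] = '\0';
    return out;
}
```

Called with the question's string and the markers "BEGIN" and "END", this yields "TheMiddle". Because the second search starts after the first match, the input is scanned at most once overall.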

I hope this will help
#include <stdio.h>
#include <string.h>

int strpos(const char *haystack, const char *needle, int offset);

int main(void)
{
    const char *p = "Hello there all y'al, hope that you are all well";
    int pos = strpos(p, "all", 0);
    printf("First all at : %d\n", pos);
    pos = strpos(p, "all", pos + 1);   /* resume just past the first match */
    printf("Second all at : %d\n", pos);
    return 0;
}

/* Returns the index of needle in haystack at or after offset, or -1.
   The caller must ensure 0 <= offset <= strlen(haystack). */
int strpos(const char *haystack, const char *needle, int offset)
{
    const char *p = strstr(haystack + offset, needle);
    if (p)
        return (int)(p - haystack);
    return -1;
}

The char array in C. How to find actual length of valid input?

Suppose I have an array of characters, say char x[100].
Now, I take input from the user and store it in the char array. The user input is less than 100 characters. If I want to do some operation on the valid values, how do I find how many valid values there are in the char array? Is there a C function or some way to find the actual length of the valid values, which will be less than 100 in this case?

Yes, C has the function strlen() (from string.h), which gives you the number of characters in a string stored in a char array. How does it know this? By definition, every C "string" must end with the null character. If it does not, you have no way of knowing how long the string is or, in other words, which memory locations of the array are actually "useful" and which are just garbage. Note also that sizeof(your_string) returns the size of the array (in bytes) and NOT the length of the string.
Luckily, most C library string functions that create "strings" or read input and store it into a char array will automatically attach a null character at the end to terminate the "string". Some do not (for example strncpy()). Be sure to read their descriptions carefully.
Also, take notice that this means that the buffer supplied must be at least one character longer than the specified input length. So, in your case, you must actually supply a char array of length 101 to read in 100 characters (the extra byte is for the null character).
Example usage:
#include <stdio.h>
#include <string.h>
int main(void)
{
    char *string = "Hello World";
    printf("%lu\n", (unsigned long)strlen(string));
    return 0;
}
strlen() can be implemented as:
size_t strlen(const char * str)
{
    const char *s;
    for (s = str; *s; ++s);
    return(s - str);
}
As you see, the end of a string is found by searching for the first null character in the array.
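When it is not certain that the array was ever terminated, a bounded search avoids reading past the end of the buffer. The sketch below uses the standard memchr for that. The name valid_length and the choice to return the capacity when no terminator is found are assumptions made for this example.

```c
#include <string.h>

/* Returns the number of characters before the first '\0' in buf,
   looking at no more than 'capacity' bytes. If no terminator is found
   within the buffer, returns capacity. */
size_t valid_length(const char *buf, size_t capacity)
{
    const char *nul = memchr(buf, '\0', capacity);
    return nul ? (size_t)(nul - buf) : capacity;
}
```

For char x[100] = "hello";, valid_length(x, sizeof x) gives 5 while sizeof x is still 100, which is the sizeof versus string length distinction made in the answer.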

That depends on entirely where you got the input. Most likely strlen will do the trick.

Every time you read a string into an array, it ends with a null character. You just have to find where the null character is in the array.
You can do this manually; otherwise, strlen() will solve your problem.

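/* getchar() from <stdio.h> is used here in place of the non-standard getche() */
int ch;        /* int, so that EOF can be told apart from any character */
int len = 0;
while ( (ch = getchar()) != '\n' && ch != EOF )
{
    len++;
}
or read the input as a string with %s and then use strlen on it.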
