Getting substring from string in C

I have a string "abcdefg-this-is-a-test" and I want to delete the first 6 characters of the string. This is what I am trying:
char contentSave2[180] = "abcdefg-this-is-a-test";
strncpy(contentSave2, contentSave2+8, 4);
No luck so far, processor gets stuck and resets itself.
Any help will be appreciated.
Question: How can I trim down a string in C?
////EDIT////
I also tried this:
memcpy(contentSave2, &contentSave2[6], 10);
Doesn't work, same problem.

size_t len = strlen(contentSave2);
size_t i;
for (i = 6; i < len; i++)
    contentSave2[i - 6] = contentSave2[i];   /* shift each character 6 places left */
contentSave2[i - 6] = '\0';                  /* terminate the shortened string */
This deletes the first 6 characters. Adjust the offset to suit your requirement. If you would rather use a library function, try memmove.

The problem with your first code snippet is that it copies the middle four characters to the beginning of the string, and then stops.
Unfortunately, you cannot expand it to cover the entire string, because in that case the source and output buffers would overlap, causing UB:
If the strings overlap, the behavior is undefined.
Overlapping buffers is the problem with your second attempt: memcpy does not allow overlapping buffers, so the behavior is undefined.
If all you need is to remove characters at the beginning of the string, you do not need to copy it at all: simply take the address of the initial character, and use it as your new string:
char *strWithoutPrefix = &contentSave2[8];
For copying of strings from one buffer to another use memcpy:
char middle[5];
memcpy(middle, &contentSave2[8], 4);
middle[4] = '\0'; // "this"
For copying potentially overlapping buffers use memmove:
char contentSave2[180] = "abcdefg-this-is-a-test";
printf("%s\n", contentSave2);
memmove(contentSave2, contentSave2+8, strlen(contentSave2)-8+1);
printf("%s\n", contentSave2);

You can simply use a pointer: contentSave2 decays to a pointer to its first character, so this is quick and short.
char* ptr = contentSave2 + 6;
ptr[0] will be equal to contentSave2[6]

You can use memmove function.
It is specially used when source and destination memory addresses overlap.
A small word of advice: try to avoid copying between overlapping source and destination buffers. It is simply bug-prone.
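As a minimal sketch of the memmove approach, the helper below removes the first n characters of a string in place. The name drop_prefix is hypothetical and not from the question; it assumes s is a writable, null-terminated buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: removes the first n characters of s in place. */
void drop_prefix(char *s, size_t n)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (n > len) {
        n = len;                /* never read past the terminator */
    }
    /* memmove is defined for overlapping source and destination;
       len - n + 1 bytes includes the terminating '\0'. */
    memmove(s, s + n, len - n + 1);
}
```

Called as drop_prefix(contentSave2, 8), this would leave the text that follows "abcdefg-" in the buffer.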

The following snippet should work fine:
#include <stdio.h>
#include <string.h>
int main() {
char contentSave2[180] = "abcdefg-this-is-a-test";
strncpy(contentSave2, contentSave2+8, 4);
printf("%s\n", contentSave2);
return 0;
}
I would suggest posting the rest of your code, because your issue is elsewhere. As others pointed out, watch out for overlap when you use strncpy, though in this specific case it should work: the four bytes read from offset 8 do not overlap the four bytes written at offset 0.

Related

C code removing wrong character from string

This code is supposed to remove any leading spaces from the given string, and it was working correctly. Then, for seemingly no reason at all, it started removing characters in the middle of the word. In this example the word "CHEDDAR" is given, which has no leading spaces so it should be passed back the same as it was input, however it's returning "CHEDDR" and I have no idea why. Does anyone know how this is even possible? I assume it has to do with pointers and memory, but I am not fluent in C and I need some help. Running on RHEL. Thanks.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define REMOVE_LEADING_SPACES(input) \
{ \
stripFrontChar( input, ' ' ); \
}
char *stripFrontChar(char *startingString, char removeChar) {
while (*startingString == removeChar)
strcpy(startingString, startingString + 1);
return (startingString);
}
void main(argc, argv)
char **argv;int argc; {
char *result = "CHEDDAR";
REMOVE_LEADING_SPACES(result);
printf("%s\n", result);
}
EDIT: It's a little late now, but based on the comments I should have shown that the word (CHEDDAR, which I used as an example) is read from a file, not a literal as shown in my code. I was trying to simplify it for the question and I realize now it's a completely different scenario, so I shouldn't have. Thanks, looks like I need to use memmove.
EDIT2: There actually is a space, as in " CHEDDAR", so I really just need to change it to memmove. Thanks again, everyone.
You copy a string using overlapping memory area:
strcpy(startingString, startingString + 1);
From the C standard:
7.24.2.3 The strcpy function
If copying takes place between objects that overlap, the behavior is undefined.
You need to use memmove (and provide the proper length) or you need to move the characters on your own. You can also improve the performance if you start by counting the characters that need to be removed and then copy everything in one go.
Another issue that was pointed out by Joop Eggen in a comment:
char *result = "CHEDDAR";
You are not allowed to modify string literals.
If you try to remove leading characters, you invoke undefined behaviour.
You should change this to
char result[] = "CHEDDAR";
As your sample string does not contain a leading space, this does not cause trouble yet. But you should fix it nevertheless.
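The "count first, then move once" idea could be sketched as follows. The name strip_front_char is hypothetical, and the sketch assumes the argument is a writable array (not a string literal).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement for stripFrontChar: counts the leading
   characters to drop, then shifts the rest with a single memmove. */
char *strip_front_char(char *s, char remove_char)
{
    assert(s != NULL);
    size_t skip = 0;
    while (s[skip] != '\0' && s[skip] == remove_char) {
        skip++;
    }
    if (skip > 0) {
        /* +1 moves the terminating '\0' as well */
        memmove(s, s + skip, strlen(s + skip) + 1);
    }
    return s;
}
```

Because the buffer is modified in place, the caller can keep using the original pointer, for example when it later has to be passed to free.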
This code
strcpy(startingString, startingString + 1);
copies overlapping strings.
Per 7.24.2.3 The strcpy function, paragraph 2 of the C standard:
The strcpy function copies the string pointed to by s2 (including the terminating null character) into the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.
You are invoking undefined behavior.
The answers that point out the overlapping copy identify one source of undefined behavior. Another is that you modify a string literal: on most platforms literals are stored in read-only memory, and writing to one will abort the program.
Instead of:
char *result = "CHEDDAR";
Use:
char result[] = "CHEDDAR";
(Note: looking at how most strcpy functions will have been implemented, namely a loop that terminates when seeing the null character of the source string, then the overlap that you use will still see the null character of the source and place it in the destination (down-copying). Copying the other way around (up-copying) would not see the null terminator anymore, as it will have been overwritten, and may continue copying beyond the destination string.)
In your case, where no modification is needed and no allocs are done, you only need to find the start without copying anything.
char *stripFrontChar(char *startingString, char removeChar) {
for( ; *startingString == removeChar; startingString++)
;
return (startingString);
}
But you have to use the return value of stripFrontChar():
printf("%s\n", stripFrontChar(result, ' '));

Confusion in "strcat function in C assumes the destination string is large enough to hold contents of source string and its own."

So I read that strcat function is to be used carefully as the destination string should be large enough to hold contents of its own and source string. And it was true for the following program that I wrote:
#include <stdio.h>
#include <string.h>
int main(){
char *src, *dest;
printf("Enter Source String : ");
fgets(src, 10, stdin);
printf("Enter destination String : ");
fgets(dest, 20, stdin);
strcat(dest, src);
printf("Concatenated string is %s", dest);
return 0;
}
But not true for the one that I wrote here:
#include <stdio.h>
#include <string.h>
int main(){
char src[11] = "Hello ABC";
char dest[15] = "Hello DEFGIJK";
strcat(dest, src);
printf("concatenated string %s", dest);
getchar();
return 0;
}
This program ends up adding both without considering that destination string is not large enough. Why is it so?
The strcat function has no way of knowing exactly how long the destination buffer is, so it assumes that the buffer passed to it is large enough. If it's not, you invoke undefined behavior by writing past the end of the buffer. That's what's happening in the second piece of code.
The first piece of code is also invalid because both src and dest are uninitialized pointers. When you pass them to fgets, it reads whatever garbage value they contain, treats it as a valid address, then tries to write values to that invalid address. This is also undefined behavior.
One of the things that makes C fast is that it doesn't check to make sure you follow the rules. It just tells you the rules and assumes that you follow them, and if you don't bad things may or may not happen. In your particular case it appeared to work but there's no guarantee of that.
For example, when I ran your second piece of code it also appeared to work. But if I changed it to this:
#include <stdio.h>
#include <string.h>
int main(){
char dest[15] = "Hello DEFGIJK";
strcat(dest, "Hello ABC XXXXXXXXXX");
printf("concatenated string %s", dest);
return 0;
}
The program crashes.
I think your confusion is not actually about the definition of strcat. Your real confusion is that you assumed that the C compiler would enforce all the "rules". That assumption is quite false.
Yes, the first argument to strcat must be a pointer to memory sufficient to store the concatenated result. In both of your programs, that requirement is violated. You may be getting the impression, from the lack of error messages in either program, that perhaps the rule isn't what you thought it was, that somehow it's valid to call strcat even when the first argument is not a pointer to enough memory. But no, that's not the case: calling strcat when there's not enough memory is definitely wrong. The fact that there were no error messages, or that one or both programs appeared to "work", proves nothing.
Here's an analogy. (You may even have had this experience when you were a child.) Suppose your mother tells you not to run across the street, because you might get hit by a car. Suppose you run across the street anyway, and do not get hit by a car. Do you conclude that your mother's advice was incorrect? Is this a valid conclusion?
In summary, what you read was correct: strcat must be used carefully. But let's rephrase that: you must be careful when calling strcat. If you're not careful, all sorts of things can go wrong, without any warning. In fact, many style guides recommend not using functions such as strcat at all, because they're so easy to misuse if you're careless. (Functions such as strcat can be used perfectly safely as long as you're careful -- but of course not all programmers are sufficiently careful.)
The strcat() function is indeed to be used carefully because it doesn't protect you from anything. If the source string isn't null-terminated, the destination string isn't null-terminated, or the destination string doesn't have enough space, strcat will still copy data. Therefore, it is easy to overwrite data you didn't mean to overwrite. It is your responsibility to make sure you have enough space. Using strncat() instead of strcat will also give you some extra safety.
Edit Here's an example:
#include <stdio.h>
#include <string.h>
int main()
{
char s1[16] = {0};
char s2[16] = {0};
strcpy(s2, "0123456789abcdefOOPS WAY TOO LONG");
/* ^^^ purposefully copy too much data into s2 */
printf("-%s-\n",s1);
return 0;
}
I never assigned to s1, so the output should ideally be --. However, because of how the compiler happened to arrange s1 and s2 in memory, the output I actually got was -OOPS WAY TOO LONG-. The strcpy(s2,...) overwrote the contents of s1 as well.
On gcc, -Wall or -Wstringop-overflow will help you detect situations like this one, where the compiler knows the size of the source string. However, in general, the compiler can't know how big your data will be. Therefore, you have to write code that makes sure you don't copy more than you have room for.
Both snippets invoke undefined behavior - the first because src and dest are not initialized to point anywhere meaningful, and the second because you are writing past the end of the array.
C does not enforce any kind of bounds checking on array accesses - you won't get an "Index out of range" exception if you try to write past the end of an array. You may get a runtime error if you try to access past a page boundary or clobber something important like the frame pointer, but otherwise you just risk corrupting data in your program.
Yes, you are responsible for making sure the target buffer is large enough for the final string. Otherwise the results are unpredictable.
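One way to take on that responsibility is to check the sizes before concatenating. This is only a sketch: append_checked is a hypothetical name, and it assumes dest already holds a null-terminated string and that dest_size is the real capacity of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends src to dest only if the result fits.
   dest_size is the total capacity of dest in bytes.
   Returns 0 on success, -1 if it would overflow (dest is left unchanged). */
int append_checked(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL);
    size_t used = strlen(dest);
    size_t extra = strlen(src);
    /* need used + extra + 1 bytes in total */
    if (used >= dest_size || extra >= dest_size - used) {
        return -1;
    }
    memcpy(dest + used, src, extra + 1);   /* copies the '\0' too */
    return 0;
}
```

With the arrays from the second program, append_checked(dest, sizeof dest, src) would refuse the copy, since 13 + 9 + 1 bytes do not fit in 15.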
I'd like to point out what is actually happening in the 2nd program in order to illustrate the problem.
It allocates 15 bytes at the memory location starting at dest and copies 14 bytes into it (including the null terminator):
char dest[15] = "Hello DEFGIJK";
...and 11 bytes at src with 10 bytes copied into it:
char src[11] = "Hello ABC";
The strcat() call then copies 10 bytes (9 chars plus the null terminator) from src into dest, starting right after the 'K' in dest. The resulting string at dest will be 23 bytes long including the null terminator. The problem is, you allocated only 15 bytes at dest, and the memory adjacent to that memory will be overwritten, i.e. corrupted, leading to program instability, wrong results, data corruption, etc.
Note that the strcat() function knows nothing about the amount of memory you've allocated at dest (or src, for that matter). It is up to you to make sure you've allocated enough memory at dest to prevent memory corruption.
By the way, the first program doesn't allocate memory at dest or src at all, so your calls to fgets() are corrupting memory starting at those locations.

Malloc, realloc, and returning pointers in C

So I am trying to get information from an html page. I use curl to get the html page. I then try to parse the html page and store the information I need in a character array, but I do not know what the size of the array should be. Keep in mind this is for an assignment so I won't be giving too much code, so I am supposed to dynamically allocate memory, but since I do not know what size it is, I have to keep allocating memory with realloc. Everything is fine within the function, but once it is returned, there is nothing stored within the pointer. Here is the code. Also if there is some library that would do this for me and you know about it, could you link me to it, would make my life a whole lot easier. Thank you!
char * parse(int * input)
{
char * output = malloc(sizeof(char));
int start = 270;
int index = start;
while(input[index]!='<')
{
output = realloc(output, (index-start+1)*sizeof(char));
output[index-start]=input[index];
index++;
}
return output;
}
The strchr function finds the first occurrence of its second argument in its first argument.
So here you'd have to find a way to run strchr starting at input[start], passing it the character '<' as second argument and store the length that strchr finds. This then gives you the length that you need to allocate for output.
Don't forget the '\0' character at the end.
Use a library function to copy the string from input to output.
Since this is an assignment, you'll probably find out the rest by yourself ...
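For reference, the steps above could look roughly like this. It is a sketch under assumptions: copy_until is a hypothetical name, and the input is taken to be a null-terminated char string (the question's code passes an int *, which strchr cannot search).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the text from start up to
   (not including) the first stop character, or NULL if stop is not found
   or allocation fails. The caller frees the result. */
char *copy_until(const char *start, char stop)
{
    assert(start != NULL);
    const char *end = strchr(start, stop);
    if (end == NULL) {
        return NULL;
    }
    size_t len = (size_t)(end - start);
    char *out = malloc(len + 1);           /* +1 for the '\0' */
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, start, len);
    out[len] = '\0';
    return out;
}
```

A call such as copy_until(page + 270, '<') would mirror the question's fixed start offset, assuming page is the downloaded text.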
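This is how to read a line dynamically:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
    size_t mem = 270;
    char *str = malloc(mem);
    if (str == NULL || fgets(str, (int)mem, stdin) == NULL) {
        free(str);
        return 1;
    }
    size_t len = strlen(str);
    while (len > 0 && str[len - 1] != '\n') { // no newline yet: buffer was too small
        char *bigger = realloc(str, mem * 2); // double the amount of space
        if (bigger == NULL)
            break;
        str = bigger;
        mem *= 2;
        // read the rest (hopefully) of the line into the new space
        if (fgets(str + len, (int)(mem - len), stdin) == NULL)
            break; // end of input
        len += strlen(str + len);
    }
    printf("%s", str);
    free(str);
    return 0;
}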
Your output needs to end with '\0'. A pointer is just a pointer to the beginning of the string, and has no length, so without a '\0' (NUL) as a sentinel, you don't know where the end is.
You generally don't want to call realloc for every individual new character. It would usually make more sense to malloc() output to be the strlen() of input and then realloc() it once at the end.
Alternatively, you should double it in size each time you realloc it instead of just adding one byte. That requires you to keep track of the current allocated length in a separate variable though, so that you know when you need to realloc.
You might read up on the function strcspn, it can be faster than using a while loop.
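The doubling strategy with a separately tracked capacity might be sketched like this. The struct strbuf type and strbuf_push are hypothetical names, and the initial capacity of 16 is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string: len is the number of characters stored,
   cap is the number of bytes currently allocated. */
struct strbuf {
    char *data;
    size_t len;
    size_t cap;
};

/* Appends one character, doubling the allocation when it is full.
   Returns 0 on success, -1 if realloc fails (the buffer stays valid). */
int strbuf_push(struct strbuf *b, char c)
{
    assert(b != NULL);
    if (b->len + 2 > b->cap) {                 /* room for c and '\0' */
        size_t new_cap = b->cap ? b->cap * 2 : 16;
        char *p = realloc(b->data, new_cap);
        if (p == NULL) {
            return -1;
        }
        b->data = p;
        b->cap = new_cap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';                    /* keep it a valid C string */
    return 0;
}
```

Starting from struct strbuf b = {NULL, 0, 0}, the parse loop would call strbuf_push for each character and return b.data at the end. The result stays null-terminated after every push.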

How strcpy works behind the scenes?

This may be a very basic question for some. I was trying to understand how strcpy works actually behind the scenes. for example, in this code
#include <stdio.h>
#include <string.h>
int main ()
{
char s[6] = "Hello";
char a[20] = "world isnsadsdas";
strcpy(s,a);
printf("%s\n",s);
printf("%d\n", sizeof(s));
return 0;
}
As I am declaring s to be a static array with a size less than that of the source, I thought it wouldn't print the whole word, but it did print world isnsadsdas. So I thought that strcpy might be allocating a new size if the destination is smaller than the source. But when I check sizeof(s), it is still 6, yet it prints more than that. How is that actually working?
You've just caused undefined behaviour, so anything can happen. In your case, you're getting lucky and it's not crashing, but you shouldn't rely on that happening. Here's a simplified strcpy implementation (but it's not too far off from many real ones):
char *strcpy(char *d, const char *s)
{
char *saved = d;
while (*s)
{
*d++ = *s++;
}
*d = 0;
return saved;
}
sizeof is just returning you the size of your array from compile time. If you use strlen, I think you'll see what you expect. But as I mentioned above, relying on undefined behaviour is a bad idea.
http://natashenka.ca/wp-content/uploads/2014/01/strcpy8x11.png
strcpy is considered dangerous for reasons like the one you are demonstrating. The two buffers you created are local variables stored in the stack frame of the function. Here is roughly what the stack frame looks like:
http://upload.wikimedia.org/wikipedia/commons/thumb/d/d3/Call_stack_layout.svg/342px-Call_stack_layout.svg.png
FYI things are put on top of the stack meaning it grows backwards through memory (This does not mean the variables in memory are read backwards, just that newer ones are put 'behind' older ones). So that means if you write far enough into the locals section of your function's stack frame, you will write forward over every other stack variable after the variable you are copying to and break into other sections, and eventually overwrite the return pointer. The result is that if you are clever, you have full control of where the function returns. You could make it do anything really, but it isn't YOU that is the concern.
As you seem to know, since you made your first buffer 6 chars long for a 5-character string, C strings end in a null byte \x00. The strcpy function copies bytes until the source byte is 0, but it does not check that the destination is that long, which is why it can copy past the boundary of the array. This is also why your print reads the buffer past its size: it reads until \x00. Interestingly, the strcpy may have written into the data of a, depending on the order the compiler gave the arrays in the stack, so a fun exercise could be to also print a and see if you get something like 'snsadsdas'. I can't be sure what it would look like even if it is polluting a, because there are sometimes bytes in between the stack entries for various reasons.
If this buffer holds say, a password to check in code with a hashing function, and you copy it to a buffer in the stack from wherever you get it (a network packet if a server, or a text box, etc) you very well may copy more data from the source than the destination buffer can hold and give return control of your program to whatever user was able to send a packet to you or try a password. They just have to type the right number of characters, and then the correct characters that represent an address to somewhere in ram to jump to.
You can use strcpy if you check the bounds and maybe trim the source string, but it is considered bad practice. There are more modern functions that take a max length like http://www.cplusplus.com/reference/cstring/strncpy/
Oh and lastly, this is all called a buffer overflow. Some compilers add a nice little blob of bytes randomly chosen by the OS before and after every stack entry. After every copy the OS checks these bytes against its copy and terminates the program if they differ. This solves a lot of security problems, but it is still possible to copy bytes far enough into the stack to overwrite the pointer to the function to handle what happens when those bytes have been changed thus letting you do the same thing. It just becomes a lot harder to do right.
In C there is no bounds checking of arrays, its a trade off in order to have better performance at the risk of shooting yourself in the foot.
strcpy() doesn't care whether the target buffer is big enough so copying too many bytes will cause undefined behavior.
That is one of the reasons a newer variant of strcpy, strcpy_s(), was introduced, where you can specify the target buffer size.
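Note that strcpy_s belongs to the optional Annex K of C11 and is not available in every C library. As a portable sketch of the same idea, a bounded copy can be built on snprintf, which is standard since C99. The name copy_bounded is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dest, truncating if necessary.
   dest is always null-terminated. Returns 0 if the whole string fit,
   1 if it was truncated (or if snprintf reported an error). */
int copy_bounded(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL && dest_size > 0);
    /* snprintf writes at most dest_size - 1 characters plus '\0' and
       returns the length the full string would have needed. */
    int needed = snprintf(dest, dest_size, "%s", src);
    return needed < 0 || (size_t)needed >= dest_size;
}
```

With the arrays from the question, copy_bounded(s, sizeof s, a) would report truncation instead of writing past the end of s.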
Note that sizeof(s) is determined at compile time. Use strlen() to find the number of characters stored in s. When you perform strcpy(), the destination string is replaced by the source string, so your output won't be "Helloworld isnsadsdas".
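#include <stdio.h>
#include <string.h>
int main(void)
{
char s[6] = "Hello";
char a[20] = "world isnsadsdas";
strcpy(s,a); /* still overflows s: undefined behaviour */
printf("%s\n",s);
printf("%zu\n", strlen(s)); /* strlen returns size_t, so use %zu */
return 0;
}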
You are relying on undefined behaviour, in that the compiler happens to have placed the two arrays where your code appears to work. This may not work in future.
As to the sizeof operator, this is figured out at compile time.
Once you use adequate array sizes you need to use strlen to fetch the length of the strings.
The best way to understand how strcpy works behind the scene is...reading its source code!
You can read the source for GLibC : http://fossies.org/dox/glibc-2.17/strcpy_8c_source.html . I hope it helps!
At the end of every string/character array there is a null terminator character '\0' which marks the end of the string/character array.
strcpy() performs its task until it sees the '\0' character.
printf() also performs its task until it sees the '\0' character.
sizeof() on the other hand is not interested in the content of the array, only its allocated size (how big it is supposed to be), thus not taking into consideration where the string/character array actually ends (how big it actually is).
As opposed to sizeof(), there is strlen() that is interested in how long the string actually is (not how long it was supposed to be) and thus counts the number of characters until it reaches the end ('\0' character) where it stops (it doesn't include the '\0' character).
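To make the distinction concrete, the two can be combined into a check before copying: sizeof supplies the capacity of the destination array and strlen the length of the source. This is only a sketch, and fits_in is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: returns 1 when src, together with its terminating
   '\0', fits in a buffer of capacity bytes, and 0 otherwise. */
int fits_in(const char *src, size_t capacity)
{
    assert(src != NULL);
    return strlen(src) < capacity;   /* strlen excludes the '\0' */
}
```

For the arrays in the question, fits_in(a, sizeof s) is false, because a holds 16 characters and s has room for only 6 bytes. Note that sizeof only gives the capacity when applied to the array itself, not to a pointer to it.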
Better Solution is
char *strcpy(char *p,char const *q)
{
char *saved=p;
while(*p++=*q++);
return saved;
}

Make a copy of a char*

I have a function that accepts a char* as one of its parameters. I need to manipulate it, but leave the original char* intact. Essentially, I want to create a working copy of this char*. It seems like this should be easy, but I am really struggling.
My first (naive) attempt was to create another char* and set it equal to the original:
char* linkCopy = link;
This doesn't work, of course, because all I did was cause them to point to the same place.
Should I use strncpy to accomplish this?
I have tried the following, but it causes a crash:
char linkCopy[sizeof(link)] = strncpy(linkCopy, link, sizeof(link));
Am I missing something obvious...?
EDIT: My apologies, I was trying to simplify the examples, but I left some of the longer variable names in the second example. Fixed.
The sizeof will give you the size of the pointer. Which is often 4 or 8 depending on your processor/compiler, but not the size of the string pointed to. You can use strlen and strcpy:
// +1 because of '\0' at the end
char * copy = malloc(strlen(original) + 1);
strcpy(copy, original);
...
free(copy); // at the end, free it again.
I've seen some answers propose use of strdup, but that's a posix function, and not part of C.
You might want to take a look at the strdup (man strdup) function:
char *linkCopy = strdup(link);
/* Do some work here */
free(linkCopy);
Edit: And since you need it to be standard C, do as others have pointed out:
char *linkCopy = malloc(strlen(link) + 1);
/* Note that strncpy is unnecessary here since you know both the size
* of the source and destination buffers
*/
strcpy(linkCopy, link);
/* Do some work */
free(linkCopy);
Since strdup() is not in ANSI/ISO standard C, if it's not available in your compiler's runtime, go ahead and use this:
/*
** Portable, public domain strdup() originally by Bob Stout
*/
#include <stdlib.h>
#include <string.h>
char* strdup(const char* str)
{
char* newstr = (char*) malloc( strlen( str) + 1);
if (newstr) {
strcpy( newstr, str);
}
return newstr;
}
Use strdup, or strndup if you know the size (more secure).
Like:
char* new_char = strdup(original);
... manipulate it ...
free(new_char);
P.S.: Neither function is part of standard C.
Some answers, including the accepted one are a bit off. You do not strcpy a string you have just strlen'd. strcpy should not be used at all in modern programs.
The correct thing to do is a memcpy.
EDIT: memcpy is very likely to be faster in any architecture, strcpy can only possibly perform better for very short strings and should be avoided for security reasons even if they are not relevant in this case.
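A minimal sketch of that suggestion, assuming a helper with the hypothetical name copy_string: the length is measured once with strlen and then reused for both the allocation and the memcpy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of original, or NULL if
   allocation fails. The caller frees the result. */
char *copy_string(const char *original)
{
    assert(original != NULL);
    size_t size = strlen(original) + 1;   /* include the terminator */
    char *copy = malloc(size);
    if (copy != NULL) {
        memcpy(copy, original, size);     /* buffers cannot overlap here */
    }
    return copy;
}
```

The copy can then be modified freely without touching the original string.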
You are on the right track, you need to use strcpy/strncpy to make copies of strings. Simply assigning them just makes an "alias" of it, a different name that points to the same thing.
Your main problem in your second attempt is that you can't assign to an array that way. The second problem is you seem to have come up with some new names in the function call that I can't tell where they came from.
What you want is:
char linkCopy[sizeof(link)];
strncpy(linkCopy, chLastLink, sizeof(link));
but be careful, sizeof does not always work the way you want it to on strings. Use strlen, or use strdup.
Like sean.bright said strdup() is the easiest way to deal with the copy. But strdup() while widely available is not std C. This method also keeps the copied string in the heap.
char *linkCopy = strdup(link);
/* Do some work here */
free(linkCopy);
If you are committed to using a stack allocated string and strncpy() you need some changes. You wrote:
char linkCopy[sizeof(link)]
That creates a char array (aka string) on the stack that is the size of a pointer (probably 4 bytes). Your third parameter to strncpy() has the same problem. You probably want to write:
char linkCopy[strlen(link)+1];
strncpy(linkCopy,link,strlen(link)+1);
You don't say whether you can use C++ instead of C, but if you can use C++ and the STL it's even easier:
std::string newString( original );
Use newString as you would have used the C-style copy above, its semantics are identical. You don't need to free() it, it is a stack object and will be disposed of automatically.

Resources