I am trying to remove the whitespace at the start of a string. I have the index of the first non-whitespace character, so I tried this:
int firstNonWhitespace = ...;
char *line = ...;
char *realStart = line + firstNonWhitespace;
strcpy(line, realStart);
but I got "Abort trap: 6" at runtime.
However, it works if I copy the realStart string to a temporary string, and then copy the temporary string to line:
int firstNonWhitespace = ...;
char *line = ...;
char *realStart = line + firstNonWhitespace;
char *tstring = malloc(strlen(realStart) + 1);
strcpy(tstring, realStart);
strncpy(line, tstring, strlen(line));
free(tstring);
There are two problems with your code.
The source and destination in the call to strcpy() overlap, which results in undefined behaviour.
line (and therefore realStart) might well point to a non-writable area of memory, such as a string literal.
The faster way is
line += firstNonWhitespace;
but that might have consequences for your memory management if that memory was dynamically allocated, because free() must be called with the original pointer. Only do this if you know what you are doing.
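#include <stdio.h>
#include <string.h>

int main(void)
{
    char a[] = " hey";
    size_t i = 0;
    /* find the first character that is not a space */
    while (a[i] == ' ')
        i++;
    /* memmove() is safe for overlapping regions; +1 copies the '\0' */
    memmove(a, a + i, strlen(a + i) + 1);
    printf("%s\n", a);
    return 0;
}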
Your problem is likely that you are not allowed to modify string literals, i.e. the code
#include <string.h>

int main(void) {
    int firstNonWhitespace = 3;
    char *line = " foo"; /* points to a string literal */
    char *realStart = line + firstNonWhitespace;
    strcpy(line, realStart); /* writes into the literal */
}
may or may not work, depending on whether your platform protects the string literal " foo" against modification. Modifying a string literal is undefined behavior under the language standard, so the string must first be copied into writable memory.
Also, since strcpy() is not guaranteed to work correctly on overlapping strings (you might get lucky, though), use memmove() to do the moving.
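As a minimal sketch of the memmove() approach, assuming the string lives in writable memory (an array or a malloc'd buffer, not a string literal). The function name ltrim is made up for this illustration:

```c
#include <ctype.h>
#include <string.h>

/* Remove leading whitespace from s in place.
   s must point to writable memory, not to a string literal.
   ltrim is a hypothetical name chosen for this sketch. */
char *ltrim(char *s)
{
    size_t skip = 0;

    /* Count leading whitespace; the cast keeps isspace() well defined. */
    while (isspace((unsigned char)s[skip]))
        skip++;

    /* memmove() permits overlap; the +1 also copies the terminating '\0'. */
    if (skip > 0)
        memmove(s, s + skip, strlen(s + skip) + 1);

    return s;
}
```

Called on an array declared as char a[] = "   hey";, this leaves "hey" in a. The original pointer stays valid, so a malloc'd buffer can still be passed to free() afterwards.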
Related
#include <stdio.h>
void append(char* s, char n);
void splitstr(char* string);
int main()
{
splitstr("COMPUTE 1-1");
printf("\n");
splitstr("COMPUTE 1+1");
printf("\n");
splitstr("COMPUTE 1*1");
return 0;
}
void append(char* s, char ch) {
while(*s != '\0'){
s = s + 1;
}
*s = ch;
s = s + 1;
*s = '\0';
}
void splitstr(char* string){
int count = 1;
char* expression = "";
while(*string != '\0'){
if(count > 8){
append(expression, *string);
string = string + 1;
count = count + 1;
}else{
string = string + 1;
count = count + 1;
}
}
printf("%s",expression);
}
Example Input and Output:
Input: COMPUTE 1+1
Output: 1+1
Input: COMPUTE 2-6
Output: 2-6
Originally, this code does not include stdio.h (I am doing this for testing on an online C compiler) because I am building an OS from scratch so I need to write all the functions by myself. I think the problem might be in the append function but I cannot find it.
Instead of
char* expression = "";
do
char expression[MAX_expression_length + 1] = "";
or use realloc in the append function (which requires expression to come from malloc in the first place).
I think this line is the culprit:
append(expression, *string);
Notice how expression is declared:
char* expression = "";
In other words, expression points to a single byte, a lone \0. On the first call the while loop in append() never runs, because *s is already \0, so every write that follows lands on or past that one byte.
The segfault likely happens at the bottom of append(). After the while loop, you write ch over the \0, then increment s and write a new \0 to the location it now points to. That location was never allocated (s refers to splitstr()'s expression, which is a single byte long). Furthermore, because expression points to a string literal, depending on your platform it may be placed in an area of memory marked read-only. Consequently, this is an attempt to write into memory that may not belong to the process and may not be writable, raising the fault.
expression points to a string literal, and trying to modify a string literal leads to undefined behavior.
You need to define expression as an array of char large enough to store your final result:
char expression[strlen(string)+1]; // VLA
Since your result isn’t going to be any longer than the source string, this should be sufficient (provided your implementation supports VLAs). A VLA cannot have an initializer, so set expression[0] = '\0'; before the first call to append().
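Putting these answers together, here is one possible sketch that avoids the standard library, since the question mentions a freestanding OS. It assumes the caller supplies a writable buffer; the extra parameters out and cap are additions for this illustration and are not part of the original code:

```c
#include <stddef.h>

/* Append ch to the string in buf, which has room for cap bytes in total.
   Returns 0 on success, -1 if the buffer is full. */
int append(char *buf, size_t cap, char ch)
{
    size_t len = 0;

    while (buf[len] != '\0')
        len++;
    if (len + 1 >= cap)
        return -1;            /* no room for ch plus the terminator */
    buf[len] = ch;
    buf[len + 1] = '\0';
    return 0;
}

/* Copy everything after the first 8 characters of string into out.
   out must be writable and cap must be at least 1. */
void splitstr(const char *string, char *out, size_t cap)
{
    int count = 1;

    out[0] = '\0';            /* start with an empty, writable string */
    while (*string != '\0') {
        if (count > 8)
            append(out, cap, *string);
        string++;
        count++;
    }
}
```

With char expression[32]; and a call such as splitstr("COMPUTE 1+1", expression, sizeof expression);, the buffer ends up holding "1+1", and the caller can print it however the OS allows.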
I have to do an exercise and I have this structure given:
typedef struct {
char *str;
unsigned int len;
} String;
My task is to write a string concatenation function that joins "Kartoffel" and "puffer" into "Kartoffelpuffer" (potato fritter).
String concat(String l, String r)
Both Strings l and r should not be changed after running the function.
First I created the two Strings in the main:
String1 *l = malloc(sizeof(String1));
String1 *r = malloc(sizeof(String1));
(*l).str = malloc(sizeof("Kartoffel"));
(*r).str = malloc(sizeof("puffer"));
(*l).str = "Kartoffel";
(*r).str = "puffer";
(*l).len = 9;
(*r).len = 6;
Then I wrote the concat function:
String1 concat(String1 l, String1 r) {
unsigned int i = 0;
String1 *newStr = malloc(sizeof(String1));
/* +1 for '\0' at the end */
newStr->str = malloc(l.len + r.len + 1);
newStr->str = l.str;
/* The following line is not working */
newStr->str[l.len] = *r.str;
newStr->len = l.len + r.len;
return *newStr;
}
What I'm trying to do is work with pointer arithmetic.
When there is a pointer which points to the beginning of a storage area, like char *str, it should be possible to move the pointer with a[b] or *((a) + (b)), right? When I run the code I get a segmentation fault (I hope that's the right translation. Original: "Speicherzugriffsfehler").
If someone could give me a hint I would be thankful. PS: Sorry for my English.
First, (*l).str = "Kartoffel"; makes (*l).str point to the "Kartoffel" string literal, meaning that the original memory allocated to (*l).str with malloc() is lost. Same for (*r).str = "puffer";. One of the things you can do to avoid this is copy the string into the allocated buffer by looping over the characters in a for loop (since you can't use string.h).
Then, in your concat() function, you do the same thing. You allocate the memory for newStr->str with malloc() (properly allocating an extra char for the null-terminator), but on the next line you re-assign that pointer to point to l.str, which is still pointing to the string literal. Then, with newStr->str[l.len] = *r.str; you are attempting to modify the string literal, which in C is undefined behavior.
The way to fix this could be, again, to copy the two strings into the buffer allocated with newStr->str = malloc(l.len+r.len+1);.
After allocating memory to newStr and newStr->str:
Two pointers could be used. char *to, *from;
Set the pointers with to = newStr->str; and from = l.str;
copy the characters with *to = *from;
Advance the pointers with to++; and from++;
Repeat until *from == 0
Set from with from = r.str;
to does not need to be reset as it is correctly positioned at the end of newStr->str.
Repeat the copy of characters.
Repeat advancing the pointers.
Set a terminating 0 with *to = 0;
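The steps above could be sketched roughly as follows. This version returns the struct by value instead of allocating it with malloc(), which is a design choice of this sketch rather than something the exercise requires, and it assumes l.len and r.len match the actual string lengths:

```c
#include <stdlib.h>

typedef struct {
    char *str;
    unsigned int len;
} String;

/* Build a new String holding l followed by r; l and r are not modified.
   result.str is NULL if allocation fails. The caller frees result.str. */
String concat(String l, String r)
{
    String result;
    char *to;
    const char *from;

    result.len = l.len + r.len;
    result.str = malloc(result.len + 1);   /* +1 for the terminating '\0' */
    if (result.str == NULL) {
        result.len = 0;
        return result;
    }

    to = result.str;
    for (from = l.str; *from != '\0'; from++)
        *to++ = *from;                     /* copy the left string */
    for (from = r.str; *from != '\0'; from++)
        *to++ = *from;                     /* continue with the right string */
    *to = '\0';

    return result;
}
```

Because only result.str is heap-allocated, a single free(result.str) releases everything once the caller is done with the result.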
Thank you very much for your help!
I wrote another method to copy the string as you guys said.
char * copyStr (char * dest,char * src){
unsigned int index;
for (index = 0; src[index] != '\0'; index++) {
dest[index] = src[index];
}
dest[index] = '\0';
return dest;
}
And I edited my concat like that:
String1 concat (String1 l, String1 r){
String1 *newStr = malloc(sizeof(String1));
newStr->str = malloc(l.len+r.len+1);
copyStr(newStr->str,l.str);
copyStr((newStr->str+l.len),r.str);
newStr->len = l.len+r.len;
return *newStr;
}
With newStr->str + l.len the pointer is moved. If l.len is 9, the pointer will point to the 10th byte, which is where the first string l ends (its terminating '\0'). So the string r will be copied into memory directly behind the first string l.
This code works:
char * k = "asd";
char * j = malloc(sizeof(char) * 3);
memmove(j,k,3);
printf("%s",j);
while this code gives an error:
char * k = "asd";
char * j = malloc(sizeof(char) * 3);
memmove(k,k+1,3);
printf("%s",k); // output should be "sd"
Am I thinking about this wrong? Why does it give an error? I'm planning to use it for collapsing runs of multiple whitespace characters ("aaa.......bbb" (dots are spaces) -> "aaa bbb").
Thank you.
A declaration like
char *k = "asd";
typically causes the string literal to be stored in a read-only data segment. (For historical reasons, C compilers tend not to warn about this, even though declaring the pointer as const char *k = "asd" would be safer.)
If you want the string contents to be modifiable, you will need to use an array instead, like
char k[] = "asd";
When you do char *k = "asd", the string "asd" is usually placed in a read-only part of memory and the pointer k is made to point there. You cannot write to this location using memmove().
You should instead use char k[] = "asd".
The statement
memmove(k,k+1,3);
tries to shift the characters of the string literal "asd" by one. String literals are not modifiable; any attempt to modify one invokes undefined behavior.
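For the whitespace use case mentioned in the question, a rough sketch with memmove() on a writable array might look like this. The function name squeeze_spaces is invented for this example, and it only treats the space character as whitespace:

```c
#include <string.h>

/* Collapse each run of spaces in s to a single space, in place.
   s must be writable (an array or malloc'd memory, not a literal). */
void squeeze_spaces(char *s)
{
    char *p = s;

    while (*p != '\0') {
        if (p[0] == ' ' && p[1] == ' ') {
            /* Shift the tail left by one; the +1 copies the '\0' too. */
            memmove(p, p + 1, strlen(p + 1) + 1);
        } else {
            p++;
        }
    }
}
```

With char k[] = "aaa      bbb";, calling squeeze_spaces(k) leaves "aaa bbb" in k. Each memmove() shifts the whole tail, so for long strings a single pass with separate read and write indices would be cheaper.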
I'm trying to learn C programming and spent some time practicing with pointers this morning, by writing a little function to replace the lowercase characters in a string to their uppercase counterparts. This is what I got:
#include <stdio.h>
#include <string.h>
char *to_upper(char *src);
int main(void) {
char *a = "hello world";
printf("String at %p is \"%s\"\n", a, a);
printf("Uppercase becomes \"%s\"\n", to_upper(a));
printf("Uppercase becomes \"%s\"\n", to_upper(a));
return 0;
}
char *to_upper(char *src) {
char *dest;
int i;
for (i=0;i<strlen(src);i++) {
if ( 71 < *(src + i) && 123 > *(src + i)){
*(dest+i) = *(src + i) ^ 32;
} else {
*(dest+i) = *(src + i);
}
}
return dest;
}
This runs fine and prints exactly what it should (including the repetition of the "HELLO WORLD" line), but afterwards ends in a Segmentation fault. What I can't understand is that the function is clearly compiling, executing and returning successfully, and the flow in main continues. So is the Segmentation fault happening at return 0?
dest is uninitialised in your to_upper() function. So, you're overwriting some random part of memory when you do that, and evidently that causes your program to crash as you try to return from main().
If you want to modify the value in place, initialise dest:
char *dest = src;
If you want to make a copy of the value, try:
char *dest = strdup(src);
If you do this, you will need to make sure somebody calls free() on the pointer returned by to_upper() (unless you don't care about memory leaks).
Like everyone else has pointed out, the problem is that dest hasn't been initialized and is pointing to a random location that contains something important. You have several choices of how to deal with this:
1. Allocate the dest buffer dynamically and return that pointer value, which the caller is responsible for freeing;
2. Assign dest to point to src and modify the value in place (in which case you'll have to change the declaration of a in main() from char *a = "hello world"; to char a[] = "hello world";, otherwise you'll be trying to modify the contents of a string literal, which is undefined);
3. Pass the destination buffer as a separate argument.
Option 1 -- allocate the target buffer dynamically:
char *to_upper(char *src)
{
char *dest = malloc(strlen(src) + 1);
...
}
Option 2 -- have dest point to src and modify the string in place:
int main(void)
{
char a[] = "hello world";
...
}
char *to_upper(char *src)
{
char *dest = src;
...
}
Option 3 -- have main() pass the target buffer as an argument:
int main(void)
{
char *a = "hello world";
char *b = malloc(strlen(a) + 1); // or char b[12];
...
printf("Uppercase becomes %s\n", to_upper(a,b));
...
free(b); // omit if b is declared as an array
return 0;
}
char *to_upper(char *src, char *dest)
{
...
return dest;
}
Of the three, I prefer the third option; you're not modifying the input (so it doesn't matter whether a is an array of char or a pointer to a string literal) and you're not splitting memory management responsibilities between functions (i.e., main() is solely responsible for allocating and freeing the destination buffer).
I realize you're trying to familiarize yourself with how pointers work and some other low-level details, but bear in mind that a[i] is easier to read and follow than *(a+i). Also, there are number of functions in the standard library such as islower() and toupper() that don't rely on specific encodings (such as ASCII):
#include <ctype.h>
...
if (islower((unsigned char)src[i]))
    dest[i] = toupper((unsigned char)src[i]);
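A complete version of the third option might look like the sketch below. It assumes the caller provides a destination buffer of at least strlen(src) + 1 bytes; the const qualifier on src is an addition made here:

```c
#include <ctype.h>
#include <stddef.h>

/* Write an uppercase copy of src into dest and return dest.
   dest must have room for strlen(src) + 1 bytes. */
char *to_upper(const char *src, char *dest)
{
    size_t i;

    for (i = 0; src[i] != '\0'; i++) {
        /* toupper() leaves non-lowercase characters unchanged;
           the cast avoids undefined behavior for negative char values. */
        dest[i] = (char)toupper((unsigned char)src[i]);
    }
    dest[i] = '\0';   /* terminate the copy */

    return dest;
}
```

Since src is only read, passing a pointer to a string literal such as "hello world" is fine here, with char b[12]; as the destination.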
As others have said, your problem is that you never allocate any space for dest. There is another, more subtle problem with your code.
To convert to uppercase, you test whether a given char lies between 71 and 123, and if it does, you XOR the value with 32. This assumes ASCII encoding of characters (and even in ASCII, the lowercase letters span 97 to 122, so the lower bound of 71 also catches some uppercase letters and punctuation). ASCII is the most widely used encoding, but it is not the only one.
It is better to write code that works for every type of encoding. If we were sure that 'a', 'b', ..., 'z', and 'A', 'B', ..., 'Z', are contiguous, then we could calculate the offset from the lowercase letters to the uppercase ones and use that to change case:
/* WARNING: WRONG CODE */
if (c >= 'a' && c <= 'z') c = c + 'A' - 'a';
But unfortunately, the C standard gives no such guarantee. EBCDIC is an example of an encoding in which the letters are not contiguous.
So, to convert to uppercase, you can either do it the easy way:
#include <ctype.h>
int d = toupper(c);
or, roll your own:
#include <string.h>

/* Modifies src in place and returns it */
char *to_upper(char *src)
{
    static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
    static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    size_t i;
    size_t m = strlen(src);
    for (i = 0; i < m; ++i) {
        /* look the character up among the lowercase letters */
        const char *tmp = strchr(lower, src[i]);
        if (tmp != NULL) {
            /* same position in upper is the uppercase counterpart */
            src[i] = upper[tmp - lower];
        }
    }
    return src;
}
The advantage of toupper() is that it consults the current locale when converting characters to upper case. This may convert æ to Æ, for example, which is usually the correct thing to do. Note: I use only English and Hindi characters myself, so I could be wrong about my particular example!
As noted by others, your problem is that char *dest is uninitialized. You can modify src's memory in place, as Greg Hewgill suggests, or you can use malloc to reserve some:
char *dest = (char *)malloc(strlen(src) + 1);
Note that the use of strdup suggested by Greg performs this call to malloc under the covers. The '+ 1' is to reserve space for the null terminator, '\0', which you should also be copying from src to dest. (Your current example only goes up to strlen, which does not include the null terminator.) Can I suggest that you add a line like this after your loop?
*(dest + i) = 0;
This will correctly terminate the string. Note that this only applies if you choose to go the malloc route. Modifying the memory in place or using strdup will take care of this problem for you. I'm just pointing it out because you mentioned you were trying to learn.
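For reference, the malloc route with explicit termination could be sketched like this. The name to_upper_copy and the use of toupper() in place of the XOR trick are choices made for this illustration:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated uppercase copy of src, or NULL if
   allocation fails. The caller must free() the result. */
char *to_upper_copy(const char *src)
{
    size_t len = strlen(src);
    char *dest = malloc(len + 1);   /* +1 for the null terminator */
    size_t i;

    if (dest == NULL)
        return NULL;

    for (i = 0; i < len; i++)
        dest[i] = (char)toupper((unsigned char)src[i]);
    dest[len] = '\0';               /* strlen() did not count this byte */

    return dest;
}
```

Each call allocates a fresh buffer, so calling it twice inside printf() as in the question would need both results stored and freed.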
Hope this helps.
void reverse(char *str){
int i,j;
char temp;
for(i=0,j=strlen(str)-1; i<j; i++, j--){
temp = *(str + i);
*(str + i) = *(str + j);
*(str + j) = temp;
printf("%c",*(str + j));
}
}
int main (int argc, char const *argv[])
{
char *str = "Shiv";
reverse(str);
printf("%s",str);
return 0;
}
When I use char *str = "Shiv", the lines in the swapping part of my reverse function (i.e. str[i] = str[j]) don't seem to work; however, if I declare str as char str[] = "Shiv", the swapping works. What is the reason for this? I was puzzled by the behavior: I kept getting the message "Bus error" when I tried to run the program.
When you use char *str = "Shiv";, you don't own the memory pointed to, and you're not allowed to write to it. The actual bytes for the string could be a constant inside the program's code.
When you use char str[] = "Shiv";, the 4(+1) char bytes and the array itself are on your stack, and you're allowed to write to them as much as you please.
The char *str = "Shiv" gets a pointer to a string constant, which may be loaded into a protected area of memory (e.g. part of the executable code) that is read only.
char *str = "Shiv";
This should be :
const char *str = "Shiv";
And now you'll have an error ;)
Try
int main (int argc, char const *argv[])
{
char *str = malloc(5*sizeof(char)); //4 chars + '\0'
strcpy(str,"Shiv");
reverse(str);
printf("%s",str);
free(str); //Not needed for such a small example, but to illustrate
return 0;
}
instead. That will get you read/write memory when using pointers. Declaring an array with [] allocates writable space on the stack directly, but pointing a char * at a string literal doesn't.
String literals are non-modifiable objects in both C and C++. An attempt to modify a string literal always results in undefined behavior. This is exactly what you observe when you get your "Bus error" with
char *str = "Shiv";
variant. In this case your 'reverse' function will make an attempt to modify a string literal. Thus, the behavior is undefined.
The
char str[] = "Shiv";
variant will create a copy of the string literal in a modifiable array 'str', and then 'reverse' will operate on that copy. This will work fine.
P.S. Don't create non-const-qualified pointers to string literals. Your first variant should have been
const char *str = "Shiv";
(note the extra 'const').
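A small sketch of the array variant in action, with reverse() rewritten slightly so that it also handles the empty string (the size_t indices and the length check are additions made here):

```c
#include <string.h>

/* Reverse str in place; str must point to writable memory. */
void reverse(char *str)
{
    size_t len = strlen(str);
    size_t i, j;

    if (len < 2)
        return;                /* nothing to swap; avoids len - 1 wrapping */

    for (i = 0, j = len - 1; i < j; i++, j--) {
        char temp = str[i];
        str[i] = str[j];
        str[j] = temp;
    }
}
```

With char str[] = "Shiv";, calling reverse(str) leaves "vihS" in the array. With const char *str = "Shiv"; the same call is rejected or warned about at compile time instead of crashing at run time.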
String literals (your "Shiv") are not modifiable.
You assign to a pointer the address of such a string literal, then you try to change the contents of the string literal by dereferencing the pointer value. That's a big NO-NO.
Declare str as an array instead:
char str[] = "Shiv";
This creates str as an array of 5 characters and copies the characters 'S', 'h', 'i', 'v' and '\0' to str[0], str[1], ..., str[4]. The values in each element of str are modifiable.
When I want to use a pointer to a string literal, I usually declare it const. That way, the compiler can help me by issuing a message when my code wants to change the contents of a string literal:
const char *str = "Shiv";
Imagine you could do the same with integers.
/* Just having fun, this is not C! */
int *ptr = &5; /* address of 5 */
*ptr = 42; /* change 5 to 42 */
printf("5 + 1 is %d\n", *(&5) + 1); /* 6? or 43? :) */
Quote from the Standard:
6.4.5 String literals
...
6 ... If the program attempts to modify such an array [a string literal], the behavior is undefined.
char *str is a pointer to a block of characters (the string). But that block belongs to the string literal, which sits in memory you are not allowed to modify, so you cannot write through the pointer like that.
Interesting that I've never noticed this. I was able to replicate this condition in VS2008 C++.
Typically, it is a bad idea to do in-place modification of constants.
In any case, this post explains this situation pretty clearly.
The first (char[]) is local data you can edit (since the array is local data).
The second (char *) is a local pointer to global, static (constant) data. You are not allowed to modify constant data.
If you have an older GNU C compiler, you can compile with -fwritable-strings to keep the global string from being made constant, but this is not recommended (the option was removed in GCC 4.0).