I have worked with Python, where you can build a string such as --- by "multiplying" the character "-" by an integer, for example result=3*"-". I am stuck trying to do the same thing in C.
How can I do this in C, for example:
#include <stdio.h>
int main (void)
{
int height=0;
int n=0;
char symbol='#';
printf("Height: ");
scanf("%d",&height);
n=height+1;
while (n>=2)
{
printf("symbol*n");
n=n-1;
}
return 0;
}
So it prints an inverted pyramid for height=5:
#####
####
###
##
#
Thank you in advance!!
There isn't a built-in way to repeat the output like that. You have to code it yourself.
void multiputchar(char c, size_t count)
{
for (size_t i = 0; i < count; i++) // size_t matches the type of count
putchar(c);
}
For a library function, you might care about whether putchar() fails, so you might be better to write:
int multiputchar(char c, size_t count)
{
for (size_t i = 0; i < count; i++) // size_t matches the type of count
{
if (putchar(c) == EOF)
return(EOF);
}
return (unsigned char)c;
}
But if the return value will always be ignored, the first is simpler. The cast is necessary to ensure that if your char type is signed, you can tell the difference between a failure and successful output of ÿ (y-umlaut, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS, 0xFF in 8859-1 and related code sets).
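As a rough sketch of how such a helper could drive the pyramid from the question: the names put_repeated and print_inverted_pyramid are made up for this illustration and are not part of any library. The loop counts from height down to 1, which matches the expected output shown in the question.

```c
#include <stdio.h>

/* Hypothetical helper: write `count` copies of `c` to stdout.
   Returns EOF on failure, otherwise the character as an unsigned char. */
int put_repeated(char c, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (putchar(c) == EOF)
            return EOF;
    }
    return (unsigned char)c;
}

/* Hypothetical driver: print rows of height, height-1, ..., 1 symbols.
   Returns the number of rows printed, or -1 on an output error. */
int print_inverted_pyramid(char symbol, int height)
{
    int rows = 0;
    for (int n = height; n >= 1; n--) {
        if (put_repeated(symbol, (size_t)n) == EOF || putchar('\n') == EOF)
            return -1;
        rows++;
    }
    return rows;
}
```

Calling print_inverted_pyramid(symbol, height) after the scanf in the question's main would replace the while loop there.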
In C, you also need to handle the memory you use. So if you wanted a "---" string, you would also need to allocate space for that string. Once allocated the space, you would fill it with the given character.
And afterwards, you'd have to free the area.
So:
char *charmul(char c, int n)
{
int i;
char *buffer; // Buffer to allocate
buffer = malloc(n+1); // To store N characters we need N bytes plus a zero
for (i = 0; i < n; i++)
buffer[i] = c;
buffer[n] = 0;
return buffer;
}
Then we'd need to add error checking:
char *charmul(char c, int n)
{
int i;
char *buffer; // Buffer to allocate
buffer = malloc(n+1); // To store N characters we need N bytes plus a zero
if (NULL == buffer)
return NULL;
for (i = 0; i < n; i++)
buffer[i] = c;
buffer[n] = 0;
return buffer;
}
Your source would become:
#include <stdio.h>
#include <stdlib.h> // malloc, free
// charmul here
int main (void)
{
int height=0;
int n=0;
char symbol='#';
printf("Height: ");
scanf("%d",&height);
n=height+1;
while (n>=2)
{
char *s;
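s = charmul(symbol, n - 1); // n runs from height+1 down to 2, so print n-1 symbols
if (s == NULL)
return 1; // out of memory
printf("%s\n", s);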
free(s); s = NULL;
n=n-1;
}
return 0;
}
An alternative implementation would seek to reduce the number of malloc's, to enhance performance. To do so you'd need to also pass to the function a pointer to the previous buffer, which, if shorter, could be recycled with no need for a further malloc, and if longer, would be free'd and reallocated (or one could use realloc). You would then do a free() only of the last nonrecycled value:
char *charmul_recycle(char c, int n, char *prevbuf)
{
int i;
if (prevbuf && ((size_t)n > strlen(prevbuf))) // previous buffer too short: discard it
{
free(prevbuf); prevbuf = NULL;
}
if (NULL == prevbuf)
{
prevbuf = malloc(n+1); // To store N characters we need N bytes plus a zero
if (NULL == prevbuf)
return NULL;
}
for (i = 0; i < n; i++)
prevbuf[i] = c;
prevbuf[n] = 0;
return prevbuf;
}
char *buffer = NULL;
while(n > 2)
{
buffer = charmul_recycle(symbol, n, buffer);
if (NULL == buffer)
{
fprintf(stderr, "out of memory\n");
abort();
}
printf("%s\n", buffer);
n--;
}
free(buffer); buffer = NULL; // free only the last buffer still in use
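The realloc alternative mentioned above can be sketched as follows. This is only an illustration: charmul_realloc is a hypothetical name, and it takes a size_t count rather than the int used above. It relies on the standard behaviour that realloc(NULL, size) acts like malloc(size).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical realloc-based variant of charmul_recycle.
   Returns a buffer holding n copies of c, reusing prevbuf's storage.
   On failure it returns NULL and prevbuf is left untouched, so the
   caller still owns it and must free it. */
char *charmul_realloc(char c, size_t n, char *prevbuf)
{
    char *buf = realloc(prevbuf, n + 1);   /* n characters plus the zero */
    if (buf == NULL)
        return NULL;
    memset(buf, c, n);
    buf[n] = '\0';
    return buf;
}
```

This is shorter than the recycling version, at the cost of calling the allocator on every iteration. Note the different failure contract: here the old buffer survives a failed call, so assign the result to a temporary before overwriting the old pointer.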
Of course the whole thing can be done with a single straight allocation and a progressive shortening of the string (by placing s[n] to be zero), but then we wouldn't be using the "generating multiple character" features:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (void)
{
char *string;
int height=0;
int n=0;
char symbol='#';
printf("Height: ");
scanf("%d",&height);
n=height+1;
string = malloc(n); // allocate just enough memory (n is height+1)
memset(string, symbol, n); // fill string with symbols
while (--n) // with n ever decreasing...
{
string[n] = 0; // Truncate string after n characters
printf("%s\n", string); // output string
}
free(string); // string = NULL; // finally free the string
return 0;
}
Update (thanks to Jonathan Leffler): above, there's a potentially dangerous "overoptimization". Immediately before each use of string, string is correctly zero-terminated ("string[n] = 0;"). But it remains true that I have allocated a string variable and filled it with stuff, and did not immediately zero-terminate it. In the above code, it all works out perfectly. It's still bad coding practice, because if the code was reused and the cycle removed, and the string used for some other purpose (which in this case is unlikely, but still...), the nonterminated string might become a subtle bug.
The quickest fix is to slap a termination after the allocation:
string = malloc(n); // allocate just enough memory (n is height+1)
memset(string, symbol, n-1); // fill string with symbols
string[n-1] = 0; // zero-terminate string
I've now wandered far from the original topic, but this would mean that in this instance the string is correctly zero-terminated twice. To avoid this, the code can be rewritten into a "cut and paste safe" version, also more clearly showing the extra zero as addition to n:
n=height; // Number of characters
string = malloc(n+1); // Allocate memory for characters plus zero
memset(string, symbol, n); // Store the characters
string[n] = 0; // Store the zero
while (n) // While there are characters
{
printf("%s\n", string); // Print the string
string[--n] = 0; // Reduce it to one character less than before
}
The cycle now accepts any valid string with meaningful n, and if it is removed, the string is left in a useable state.
Related
My str_split function returns (or at least I think it does) a char** - so a list of strings essentially. It takes a string parameter, a char delimiter to split the string on, and a pointer to an int to place the number of strings detected.
The way I did it, which may be highly inefficient, is to make a buffer of x length (x = length of string), then copy element of string until we reach delimiter, or '\0' character. Then it copies the buffer to the char**, which is what we are returning (and has been malloced earlier, and can be freed from main()), then clears the buffer and repeats.
Although the algorithm may be iffy, the logic is definitely sound as my debug code (the _D) shows it's being copied correctly. The part I'm stuck on is when I make a char** in main and set it equal to the return value of my function. It doesn't return null, crash the program, or throw any errors, but it doesn't quite seem to work either. I'm assuming this is what is meant by the term Undefined Behavior.
Anyhow, after a lot of thinking (I'm new to all this) I tried something else, which you will see in the code, currently commented out. When I use malloc to copy the buffer to a new string, and pass that copy to aforementioned char**, it seems to work perfectly. HOWEVER, this creates an obvious memory leak as I can't free it later... so I'm lost.
When I did some research I found this post, which follows the idea of my code almost exactly and works, meaning there isn't an inherent problem with the format (return value, parameters, etc) of my str_split function. YET his only has 1 malloc, for the char**, and works just fine.
Below is my code. I've been trying to figure this out and it's scrambling my brain, so I'd really appreciate help!! Sorry in advance for the 'i', 'b', 'c' it's a bit convoluted I know.
Edit: should mention that with the following code,
ret[c] = buffer;
printf("Content of ret[%i] = \"%s\" \n", c, ret[c]);
it does indeed print correctly. It's only when I call the function from main that it gets weird. I'm guessing it's because it's out of scope ?
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define DEBUG
#ifdef DEBUG
#define _D if (1)
#else
#define _D if (0)
#endif
char **str_split(char[], char, int*);
int count_char(char[], char);
int main(void) {
int num_strings = 0;
char **result = str_split("Helo_World_poopy_pants", '_', &num_strings);
if (result == NULL) {
printf("result is NULL\n");
return 0;
}
if (num_strings > 0) {
for (int i = 0; i < num_strings; i++) {
printf("\"%s\" \n", result[i]);
}
}
free(result);
return 0;
}
char **str_split(char string[], char delim, int *num_strings) {
int num_delim = count_char(string, delim);
*num_strings = num_delim + 1;
if (*num_strings < 2) {
return NULL;
}
//return value
char **ret = malloc((*num_strings) * sizeof(char*));
if (ret == NULL) {
_D printf("ret is null.\n");
return NULL;
}
int slen = strlen(string);
char buffer[slen];
/* b is the buffer index, c is the index for **ret */
int b = 0, c = 0;
for (int i = 0; i < slen + 1; i++) {
char cur = string[i];
if (cur == delim || cur == '\0') {
_D printf("Copying content of buffer to ret[%i]\n", c);
//char *tmp = malloc(sizeof(char) * slen + 1);
//strcpy(tmp, buffer);
//ret[c] = tmp;
ret[c] = buffer;
_D printf("Content of ret[%i] = \"%s\" \n", c, ret[c]);
//free(tmp);
c++;
b = 0;
continue;
}
//otherwise
_D printf("{%i} Copying char[%c] to index [%i] of buffer\n", c, cur, b);
buffer[b] = cur;
buffer[b+1] = '\0'; /* extend the null char */
b++;
_D printf("Buffer is now equal to: \"%s\"\n", buffer);
}
return ret;
}
int count_char(char base[], char c) {
int count = 0;
int i = 0;
while (base[i] != '\0') {
if (base[i++] == c) {
count++;
}
}
_D printf("Found %i occurence(s) of '%c'\n", count, c);
return count;
}
You are storing pointers to a buffer that exists on the stack. Using those pointers after returning from the function results in undefined behavior.
To get around this requires one of the following:
Allow the function to modify the input string (i.e. replace delimiters with null-terminator characters) and return pointers into it. The caller must be aware that this can happen. Note that modifying a string literal, which is what would happen with the literal you are passing here, is undefined behavior in C, so you would instead need to do:
char my_string[] = "Helo_World_poopy_pants";
char **result = str_split(my_string, '_', &num_strings);
In this case, the function should also make it clear that a string literal is not acceptable input, and keep its first parameter non-const (char *string); a const char *string parameter only fits a version that leaves the input untouched.
Allow the function to make a copy of the string and then modify the copy. You have expressed concerns about leaking this memory, but that concern is mostly to do with your program's design rather than a necessity.
It's perfectly valid to duplicate each string individually and then clean them all up later. The main issue is that it's inconvenient, and also slightly pointless.
Let's address the second point. You have several options, but if you insist that the result be easily cleaned-up with a call to free, then try this strategy:
When you allocate the pointer array, also make it large enough to hold a copy of the string:
// Allocate storage for `num_strings` pointers, plus a copy of the original string,
// then copy the string into memory immediately following the pointer storage.
char **ret = malloc((*num_strings) * sizeof(char*) + strlen(string) + 1);
char *buffer = (char*)&ret[*num_strings];
strcpy(buffer, string);
Now, do all your string operations on buffer. For example:
// Extract all delimited substrings. Here, buffer will always point at the
// current substring, and p will search for the delimiter. Once found,
// the substring is terminated, its pointer appended to the substring array,
// and then buffer is pointed at the next substring, if any.
int c = 0;
for(char *p = buffer; *buffer; ++p)
{
if (*p == delim || !*p) {
char *next = p;
if (*p) {
*p = '\0';
++next;
}
ret[c++] = buffer;
buffer = next;
}
}
When you need to clean up, it's just a single call to free, because everything was stored together.
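Putting the two fragments above together, a complete function might look like the sketch below. The name str_split_single is hypothetical. Unlike the question's version, this sketch returns a one-element array when no delimiter is present and keeps empty fields (for example after a trailing delimiter) as "" entries; both are choices made for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Splits `string` on `delim`. The returned block holds the pointer array
   followed by a private copy of the text, so one free() releases it all. */
char **str_split_single(const char *string, char delim, int *num_strings)
{
    size_t len = strlen(string);
    int count = 1;
    for (size_t i = 0; i < len; i++)
        count += (string[i] == delim);

    char **ret = malloc((size_t)count * sizeof *ret + len + 1);
    if (ret == NULL)
        return NULL;

    char *copy = (char *)&ret[count];   /* text lives right after the pointers */
    memcpy(copy, string, len + 1);

    int c = 0;
    ret[c++] = copy;                    /* first field starts at the copy */
    for (char *p = copy; *p != '\0'; ++p) {
        if (*p == delim) {
            *p = '\0';                  /* terminate the current field */
            ret[c++] = p + 1;           /* next field starts after the delimiter */
        }
    }
    *num_strings = count;
    return ret;
}
```

Because the input is only read, it can be a string literal here, and the parameter can be const.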
The string pointers you store into the ret array with ret[c] = buffer; point to an automatic array that goes out of scope when the function returns. Using them afterwards has undefined behavior. You should allocate these strings with strdup().
Note also that it might not be appropriate to return NULL when the string does not contain a separator. Why not return an array with a single string?
Here is a simpler implementation:
#include <stdlib.h>
#include <string.h> /* memcpy */
char **str_split(const char *string, char delim, int *num_strings) {
int i, n, from, to;
char **res;
for (n = 1, i = 0; string[i]; i++)
n += (string[i] == delim);
*num_strings = 0;
res = malloc(sizeof(*res) * n);
if (res == NULL)
return NULL;
for (i = from = to = 0;; from = to + 1) {
for (to = from; string[to] != delim && string[to] != '\0'; to++)
continue;
res[i] = malloc(to - from + 1);
if (res[i] == NULL) {
/* allocation failure: free memory allocated so far */
while (i > 0)
free(res[--i]);
free(res);
return NULL;
}
memcpy(res[i], string + from, to - from);
res[i][to - from] = '\0';
i++;
if (string[to] == '\0')
break;
}
*num_strings = n;
return res;
}
I wanted to know if there was a way to use scanf so I can take in an unknown number of string arguments and put them into a char* array. I have seen it being done with int values, but can't find a way for it to be done with char arrays. Also the arguments are entered on the same line separated by spaces.
Example:
user enters hello goodbye yes, hello gets stored in array[0], goodbye in array[1] and yes in array[2]. Or the user could just enter hello and then the only thing in the array would be hello.
I do not really have any code to post, as I have no real idea how to do this.
You can do something like this to read until the "\n":
scanf("%[^\n]",buffer);
You need to allocate a big enough buffer beforehand.
Now go through the buffer, count the number of words, and allocate the necessary space (char **array = ..., a dynamic allocation). Then go through the buffer again and copy it string by string into the array.
An example:
int words = 1;
char buffer[128];
int result = scanf("%127[^\n]",buffer);
if(result > 0)
{
char **array;
for(int i = 0; buffer[i]!='\0'; i++)
{
if(buffer[i]==' ' || buffer[i]=='\n' || buffer[i]=='\t')
{
words++;
}
}
array = malloc(words * sizeof(char*));
// Using RoadRunner suggestion
array[0] = strtok (buffer," ");
for(int w = 1; w < words; w++)
{
array[w] = strtok (NULL," ");
}
}
As mentioned in the comments, you should use fgets instead if you can: fgets(buffer,128,stdin);
More about strtok
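The counting loop above assumes one word per separator, so repeated spaces can leave trailing array entries set to NULL. A small sketch that stores only the tokens strtok actually returns is shown below; split_words is a hypothetical helper name and the caller supplies the array.

```c
#include <string.h>

/* Splits `line` in place on spaces, tabs and newlines.
   Stores up to `max` token pointers in `out` and returns how many were stored.
   The pointers refer into `line`, so `line` must outlive them. */
size_t split_words(char *line, char **out, size_t max)
{
    size_t count = 0;
    char *tok = strtok(line, " \t\n");
    while (tok != NULL && count < max) {
        out[count++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    return count;
}
```

A typical call would read a line with fgets(buffer, sizeof buffer, stdin) and then pass buffer to split_words; the return value is the real number of words.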
If you have an upper bound to the number of strings you may receive from the user, and to the number of characters in each string, and all strings are entered on a single line, you can do this with the following steps:
read the full line with fgets(),
parse the line with sscanf() with a format string with the maximum number of %s conversion specifiers.
Here is an example for up to 10 strings, each up to 32 characters:
char buf[400];
char s[10][32 + 1];
int n = 0;
if (fgets(buf, sizeof buf, stdin)) {
n = sscanf(buf, "%32s%32s%32s%32s%32s%32s%32s%32s%32s%32s",
s[0], s[1], s[2], s[3], s[4], s[5], s[6], s[7], s[8], s[9]);
if (n < 0)
n = 0; // sscanf returns EOF when the line holds no strings
}
// `n` contains the number of strings
// s[0], s[1]... contain the strings
If the maximum number is not known, or if the maximum length of a single string is not fixed, or if the strings can be input on successive lines, you will need to iterate with a simple loop:
char buf[200];
char **s = NULL;
int n = 0; // must start at 0: it is used as the array length below
while (scanf("%199s", buf) == 1) {
char **s1 = realloc(s, (n + 1) * sizeof(*s));
if (s1 == NULL || (s1[n] = strdup(buf)) == NULL) {
printf("allocation error");
exit(1);
}
s = s1;
n++;
}
// `n` contains the number of strings
// s[0], s[1]... contain pointers to the strings
Aside from the error handling, this loop is comparable to the hard-coded example above but it still has a maximum length for each string. Unless you can use a scanf() extension to allocate the strings automatically (%ms in POSIX.1-2008, or the older %as on GNU systems), the code will be more complicated to handle any number of strings with any possible length.
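To give an idea of what that extra complexity looks like, here is a sketch that reads one word of any length using only standard C. The function name read_word and the starting capacity of 16 are arbitrary choices for this illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads one whitespace-delimited word of any length from `in`.
   Returns a malloc'd string the caller must free, or NULL at end of
   input or on allocation failure. */
char *read_word(FILE *in)
{
    int ch;
    while ((ch = getc(in)) != EOF && isspace(ch))
        ;                               /* skip leading whitespace */
    if (ch == EOF)
        return NULL;

    size_t cap = 16, len = 0;
    char *word = malloc(cap);
    if (word == NULL)
        return NULL;

    do {
        if (len + 1 == cap) {           /* keep room for the terminator */
            char *tmp = realloc(word, cap * 2);
            if (tmp == NULL) {
                free(word);
                return NULL;
            }
            word = tmp;
            cap *= 2;
        }
        word[len++] = (char)ch;
    } while ((ch = getc(in)) != EOF && !isspace(ch));

    word[len] = '\0';
    return word;
}
```

In the loop above, read_word(stdin) could stand in for the scanf("%199s", buf) plus strdup(buf) pair, with a NULL result ending the loop.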
You can use:
fgets to read input from user. You have an easier time using this instead of scanf.
malloc to allocate memory for pointers on the heap. You can use a starting size, like in this example:
size_t currsize = 10;
char **strings = malloc(currsize * sizeof(*strings)); /* always check return value */
and when space is exceeded, then realloc more space as needed:
currsize *= 2;
strings = realloc(strings, currsize * sizeof(*strings)); /* always check return value */
When finished using the memory requested from malloc() and realloc(), it's always good to free the pointers at the end.
strtok to parse the input at every space. When copying over the char * pointer from strtok(), you must also allocate space for strings[i], using malloc() or strdup.
Here is an example I wrote a while ago which does something very similar to what you want:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define INITSIZE 10
#define BUFFSIZE 100
int
main(void) {
char **strings;
size_t currsize = INITSIZE, str_count = 0, slen;
char buffer[BUFFSIZE];
char *word;
const char *delim = " ";
int i;
/* Allocate initial space for array */
strings = malloc(currsize * sizeof(*strings));
if(!strings) {
printf("Issue allocating memory for array of strings.\n");
exit(EXIT_FAILURE);
}
printf("Enter some words(Press enter again to end): ");
while (fgets(buffer, BUFFSIZE, stdin) != NULL && strlen(buffer) > 1) {
/* grow array as needed */
if (currsize == str_count) {
currsize *= 2;
strings = realloc(strings, currsize * sizeof(*strings));
if(!strings) {
printf("Issue reallocating memory for array of strings.\n");
exit(EXIT_FAILURE);
}
}
/* Remove newline from fgets(), and check for buffer overflow */
slen = strlen(buffer);
if (slen > 0) {
if (buffer[slen-1] == '\n') {
buffer[slen-1] = '\0';
} else {
printf("Exceeded buffer length of %d.\n", BUFFSIZE);
exit(EXIT_FAILURE);
}
}
/* Parsing of words from stdin */
word = strtok(buffer, delim);
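while (word != NULL) {
/* a single line can hold several words, so grow here as well */
if (str_count == currsize) {
currsize *= 2;
strings = realloc(strings, currsize * sizeof(*strings));
if (!strings) {
printf("Issue reallocating memory for array of strings.\n");
exit(EXIT_FAILURE);
}
}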
/* allocate space for one word, including nullbyte */
strings[str_count] = malloc(strlen(word)+1);
if (!strings[str_count]) {
printf("Issue allocating space for word.\n");
exit(EXIT_FAILURE);
}
/* copy strings into array */
strcpy(strings[str_count], word);
str_count++;
word = strtok(NULL, delim);
}
}
/* print and free strings */
printf("Your array of strings:\n");
for (i = 0; i < str_count; i++) {
printf("strings[%d] = %s\n", i, strings[i]);
free(strings[i]);
strings[i] = NULL;
}
free(strings);
strings = NULL;
return 0;
}
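The "always check return value" advice for realloc is easiest to follow with a temporary pointer, so that the old block is not lost if the call fails. A minimal sketch, with grow_string_array as a hypothetical helper name and a starting size of 10 borrowed from the example above:

```c
#include <stdlib.h>

/* Doubles the capacity of a char* array. On success updates *strings and
   *capacity and returns 0; on failure returns -1 and leaves both unchanged,
   so the caller can still free the old block. */
int grow_string_array(char ***strings, size_t *capacity)
{
    size_t new_cap = (*capacity == 0) ? 10 : *capacity * 2;
    char **tmp = realloc(*strings, new_cap * sizeof **strings);
    if (tmp == NULL)
        return -1;
    *strings = tmp;
    *capacity = new_cap;
    return 0;
}
```

Starting from char **strings = NULL and size_t currsize = 0, the first call allocates the initial array, since realloc(NULL, size) behaves like malloc(size).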
I would like to create an array of string variables, where the number of elements depends on the user's input. For example, if the user's input is 3, then he can input 3 strings. Let's say "aaa", "bbb" and "ccc". They are stored by the same pointer to char(*ptr) but with different index.
code:
int main()
{
int t;
scanf("%d", &t);
getchar();
char *ptr = malloc(t*sizeof(char));
int i;
for(i=0;i<t;i++)
{
gets(*(ptr[i]));
}
for(i=0;i<t;i++)
{
puts(*(ptr[i]));
}
return 0;
}
t is the number of elements, *ptr is the pointer to array. I would like to store "aaa", "bbb" and "ccc" in ptr[0], ptr[1] and ptr[2]. However, errors have been found in gets and puts statement and i am not able to work out a solution. Would someone give a help to me? Thank you!
You shouldn't use gets(): it carries an unavoidable risk of buffer overrun, was deprecated in C99, and was removed in C11.
Only one character can be stored in char. If the maximum length of strings to be inputted is fixed, you can allocate an array whose elements are arrays of char. Otherwise, you should use an array of char*.
Try this (this is for former case):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* the maximum length of strings to be read */
#define STRING_MAX 8
int main(void)
{
int t;
if (scanf("%d", &t) != 1)
{
fputs("read t error\n", stderr);
return 1;
}
getchar();
/* +2 for newline and terminating null-character */
char (*ptr)[STRING_MAX + 2] = malloc(t*sizeof(char[STRING_MAX + 2]));
if (ptr == NULL)
{
perror("malloc");
return 1;
}
int i;
for(i=0;i<t;i++)
{
if (fgets(ptr[i], sizeof(ptr[i]), stdin) == NULL)
{
fprintf(stderr, "read ptr[%d] error\n", i);
return 1;
}
/* remove newline character */
char *lf;
if ((lf = strchr(ptr[i], '\n')) != NULL) *lf = '\0';
}
for(i=0;i<t;i++)
{
puts(ptr[i]);
}
free(ptr);
return 0;
}
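You can use the code below. An array of strings is like a 2D char array, so you can use a pointer to pointer: allocate the array of char * pointers with malloc, then allocate storage for each string before reading into it. In C the result of malloc does not need a cast. MAX_LEN is an assumed upper bound on the length of each string.
#include <stdio.h>
#include <stdlib.h>
#define MAX_LEN 100 /* assumed maximum length of one string */
int main(void)
{
int t;
if (scanf("%d", &t) != 1 || t <= 0)
return 1;
char **ptr = malloc(t * sizeof(char *)); /* array of t pointers */
if (ptr == NULL)
return 1;
int i;
for (i = 0; i < t; i++)
{
ptr[i] = malloc(MAX_LEN + 1); /* storage for the i'th string */
if (ptr[i] == NULL || scanf("%100s", ptr[i]) != 1)
return 1;
}
for (i = 0; i < t; i++)
{
puts(ptr[i]);
free(ptr[i]);
}
free(ptr);
return 0;
}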
Here is an example, of a clean if slightly memory inefficient way to handle this. A more memory efficient solution would use one string of MAX_LINE_LENGTH and copy to strings of precise lengths.. which is why one contiguous block of memory for the strings is a bad idea.
The asserts also just demonstrate where real checks are needed as malloc is allowed to fail in production where asserts do nothing.
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#define MAX_LINE_LENGTH 2048
int
main(void) {
int tot, i;
char **strheads; /* A pointer to the start of the char pointers */
if (scanf("%d\n", &tot) < 1)
return (1);
strheads = malloc(tot * sizeof (char *));
assert(strheads != NULL);
/* now we have our series of n pointers to chars,
but nowhere allocated to put the char strings themselves. */
for (i = 0; i < tot; i++) {
strheads[i] = malloc(sizeof (char) * MAX_LINE_LENGTH);
assert(strheads[i] != NULL);
/* now we have a place to put the i'th string,
pointed to by pointer strheads[i] */
(void) fgets(strheads[i], MAX_LINE_LENGTH, stdin);
}
(void) printf("back at ya:\n");
for (i = 0; i < tot; i++) {
fputs(strheads[i], stdout);
free(strheads[i]); /* goodbye, i'th string */
}
free(strheads); /* goodbye, char pointers [0...tot] */
return (0);
}
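The more memory-efficient variant described above (one scratch buffer of MAX_LINE_LENGTH, then a copy of the precise length) might be sketched like this. The function name read_line_exact is hypothetical, and unlike the program above this sketch strips the trailing newline.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE_LENGTH 2048

/* Reads one line into a scratch buffer, strips the newline, and returns
   a copy sized to fit. Returns NULL at end of input or if allocation
   fails; the caller frees the result. */
char *read_line_exact(FILE *in)
{
    char scratch[MAX_LINE_LENGTH];
    if (fgets(scratch, sizeof scratch, in) == NULL)
        return NULL;
    scratch[strcspn(scratch, "\n")] = '\0';   /* drop the trailing newline, if any */

    size_t len = strlen(scratch);
    char *copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, scratch, len + 1);       /* includes the terminator */
    return copy;
}
```

Each strheads[i] would then be assigned from read_line_exact(stdin). Lines longer than MAX_LINE_LENGTH - 1 characters are still split across calls, as with any fixed fgets buffer.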
I have a program that reverses a string from an input of a variable length character array. The function returns a variable length character array and is printed. When I print the output, I do get the reversed string, but there are garbage characters appended to it in my console print.
Is this a "legal" operation in terms of returning to buffers? Can someone please critique my code and suggest a better alternative if it is not the right approach?
Thanks.
#include <stdio.h>
#include <stdlib.h>
char *reverse_string(char *input_string);
char *reverse_string(char *input_string)
{
int i=0;
int j=0;
char *return_string;
char filled_buffer[16];
while (input_string[i]!='\0')
i++;
while (i!=0)
{
filled_buffer[j]=input_string[i-1];
i--;
j++;
}
return_string=filled_buffer;
printf("%s", return_string);
return return_string;
}
int main (void)
{
char *returned_string;
returned_string=reverse_string("tasdflkj");
printf("%s", returned_string);
return 1;
}
This is my output from Xcode - jklfdsat\347\322̲\227\377\231\235
No, it isn't safe to return a pointer to a local string in a function. C won't stop you doing it (though sometimes the compiler will warn you if you ask it to; in this case, the local variable return_string prevents it giving the warning unless you change the code to return filled_buffer;). But it is not safe. Basically, the space gets reused by other functions, and so they merrily trample on what was once a neatly formatted string.
Can you explain this comment in more detail — "No, it isn't safe..."
The local variables (as opposed to string constants) go out of scope when the function returns. Returning a pointer to an out-of-scope variable is undefined behaviour, which is something to be avoided at all costs. When you invoke undefined behaviour, anything can happen — including the program appearing to work — and there are no grounds for complaint, even if the program reformats your hard drive. Further, it is not guaranteed that the same thing will happen on different machines, or even with different versions of the same compiler on your current machine.
Either pass the output buffer to the function, or have the function use malloc() to allocate memory which can be returned to and freed by the calling function.
Pass output buffer to function
#include <stdio.h>
#include <string.h>
int reverse_string(char *input_string, char *buffer, size_t bufsiz);
int reverse_string(char *input_string, char *buffer, size_t bufsiz)
{
size_t j = 0;
size_t i = strlen(input_string);
if (i >= bufsiz)
return -1;
buffer[i] = '\0';
while (i != 0)
{
buffer[j] = input_string[i-1];
i--;
j++;
}
printf("%s\n", buffer);
return 0;
}
int main (void)
{
char buffer[16];
if (reverse_string("tasdflkj", buffer, sizeof(buffer)) == 0)
printf("%s\n", buffer);
return 0;
}
Memory allocation
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *reverse_string(char *input_string);
char *reverse_string(char *input_string)
{
size_t j = 0;
size_t i = strlen(input_string) + 1;
char *string = malloc(i);
if (string != 0)
{
string[--i] = '\0';
while (i != 0)
{
string[j] = input_string[i-1];
i--;
j++;
}
printf("%s\n", string);
}
return string;
}
int main (void)
{
char *buffer = reverse_string("tasdflkj");
if (buffer != 0)
{
printf("%s\n", buffer);
free(buffer);
}
return 0;
}
Note that the sample code includes a newline at the end of each format string; it makes it easier to tell where the ends of the strings are.
This is an alternative main() which shows that the allocated memory returned is OK even after multiple calls to the reverse_string() function (which was modified to take a const char * instead of a plain char * argument, but was otherwise unchanged).
int main (void)
{
const char *strings[4] =
{
"tasdflkj",
"amanaplanacanalpanama",
"tajikistan",
"ablewasiereisawelba",
};
char *reverse[4];
for (int i = 0; i < 4; i++)
{
reverse[i] = reverse_string(strings[i]);
if (reverse[i] != 0)
printf("[%s] reversed [%s]\n", strings[i], reverse[i]);
}
for (int i = 0; i < 4; i++)
{
printf("Still valid: %s\n", reverse[i]);
free(reverse[i]);
}
return 0;
}
Also (as pwny pointed out in his answer before I added this note to mine), you need to make sure your string is null terminated. It still isn't safe to return a pointer to the local string, even though you might not immediately spot the problem with your sample code. This accounts for the garbage at the end of your output.
First, returning a pointer to a local like that isn't safe. The idiom is to receive a pointer to a large enough buffer as a parameter to the function and fill it with the result.
The garbage is probably because you're not null-terminating your result string. Make sure you append '\0' at the end.
EDIT: This is one way you could write your function using idiomatic C.
//buffer must be >= string_length + 1
void reverse_string(char *input_string, char* buffer, size_t string_length)
{
int i = string_length;
int j = 0;
while (i != 0)
{
buffer[j] = input_string[i-1];
i--;
j++;
}
buffer[j] = '\0'; //null-terminate the string
printf("%s", buffer);
}
Then, you call it somewhat like:
#include <stdio.h>
#include <string.h>
#define MAX_LENGTH 16
int main()
{
char* foo = "foo";
size_t length = strlen(foo);
char buffer[MAX_LENGTH];
if(length < MAX_LENGTH)
{
reverse_string(foo, buffer, length);
printf("%s", buffer);
}
else
{
printf("Error, string to reverse is too long");
}
}
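When the caller already owns a writable copy of the string, a third option (not one of the two approaches above, just a common alternative) is to reverse it in place, which sidesteps the question of who owns the result. The name reverse_in_place is made up for this sketch.

```c
#include <string.h>

/* Reverses `s` in place by swapping characters from both ends.
   `s` must be a writable array, not a string literal. */
void reverse_in_place(char *s)
{
    size_t len = strlen(s);
    if (len < 2)
        return;                         /* nothing to swap */
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
    }
}
```

The caller would declare char text[] = "tasdflkj"; and pass text, since writing to a string literal is undefined behaviour. The terminator stays where it was, so no extra null-termination step is needed.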
I am writing some code that needs to read fasta files, so part of my code (included below) is a fasta parser. As a single sequence can span multiple lines in the fasta format, I need to concatenate multiple successive lines read from the file into a single string. I do this by realloc'ing the string buffer after reading every line, to be the current length of the sequence plus the length of the line read in. I do some other stuff, like stripping white space etc. All goes well for the first sequence, but fasta files can contain multiple sequences. So similarly, I have a dynamic array of structs with two strings (title, and actual sequence), both being "char *". Again, as I encounter a new title (introduced by a line beginning with '>') I increment the number of sequences, and realloc the sequence list buffer. The realloc segfaults on allocating space for the second sequence with
*** glibc detected *** ./stackoverflow: malloc(): memory corruption: 0x09fd9210 ***
Aborted
For the life of me I can't see why. I've run it through gdb and everything seems to be working (i.e. everything is initialised, the values seems sane)... Here's the code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include <math.h>
#include <errno.h>
//a struture to keep a record of sequences read in from file, and their titles
typedef struct {
char *title;
char *sequence;
} sequence_rec;
//string convenience functions
//checks whether a string consists entirely of white space
int empty(const char *s) {
int i;
i = 0;
while (s[i] != 0) {
if (!isspace(s[i])) return 0;
i++;
}
return 1;
}
//substr allocates and returns a new string which is a substring of s from i to
//j exclusive, where i < j; If i or j are negative they refer to distance from
//the end of the s
char *substr(const char *s, int i, int j) {
char *ret;
if (i < 0) i = strlen(s)-i;
if (j < 0) j = strlen(s)-j;
ret = malloc(j-i+1);
strncpy(ret,s,j-i);
return ret;
}
//strips white space from either end of the string
void strip(char **s) {
int i, j, len;
char *tmp = *s;
len = strlen(*s);
i = 0;
while ((isspace(*(*s+i)))&&(i < len)) {
i++;
}
j = strlen(*s)-1;
while ((isspace(*(*s+j)))&&(j > 0)) {
j--;
}
*s = strndup(*s+i, j-i);
free(tmp);
}
int main(int argc, char**argv) {
sequence_rec *sequences = NULL;
FILE *f = NULL;
char *line = NULL;
size_t linelen;
int rcount;
int numsequences = 0;
f = fopen(argv[1], "r");
if (f == NULL) {
fprintf(stderr, "Error opening %s: %s\n", argv[1], strerror(errno));
return EXIT_FAILURE;
}
rcount = getline(&line, &linelen, f);
while (rcount != -1) {
while (empty(line)) rcount = getline(&line, &linelen, f);
if (line[0] != '>') {
fprintf(stderr,"Sequence input not in valid fasta format\n");
return EXIT_FAILURE;
}
numsequences++;
sequences = realloc(sequences,sizeof(sequence_rec)*numsequences);
sequences[numsequences-1].title = strdup(line+1); strip(&sequences[numsequences-1].title);
rcount = getline(&line, &linelen, f);
sequences[numsequences-1].sequence = malloc(1); sequences[numsequences-1].sequence[0] = 0;
while ((!empty(line))&&(line[0] != '>')) {
strip(&line);
sequences[numsequences-1].sequence = realloc(sequences[numsequences-1].sequence, strlen(sequences[numsequences-1].sequence)+strlen(line)+1);
strcat(sequences[numsequences-1].sequence,line);
rcount = getline(&line, &linelen, f);
}
}
return EXIT_SUCCESS;
}
You should use strings that look something like this:
struct string {
size_t len; /* bytes currently in use */
size_t cap; /* bytes allocated for ptr */
char *ptr;
};
This prevents strncpy bugs like what it seems you saw, and allows you to do strcat and friends faster.
You should also use a doubling array for each string. This prevents too many allocations and memcpys. Something like this:
int sstrcat(struct string *a, const struct string *b)
{
size_t len = a->len + b->len;
if (a->cap < len) {
size_t cap = a->cap ? a->cap : 16; /* doubling 0 would never grow */
while (cap < len) {
cap *= 2;
}
char *p = realloc(a->ptr, cap);
if (p == NULL) {
return ENOMEM; /* a->ptr is still valid here */
}
a->ptr = p;
a->cap = cap;
}
memcpy(&a->ptr[a->len], b->ptr, b->len);
a->len = len;
return 0;
}
I now see you are doing bioinformatics, which means you probably need more performance than I thought. You should use strings like this instead:
struct string {
int len;
char ptr[]; /* C99 flexible array member; ptr[0] is a GNU extension */
};
This way, when you allocate a string object, you call malloc(sizeof(struct string) + len) and avoid a second call to malloc. It's a little more work but it should help measurably, in terms of speed and also memory fragmentation.
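A minimal sketch of that single-allocation layout, assuming a C99 compiler: the struct is renamed fstring and the constructor fstring_new is a hypothetical helper, both chosen only for this illustration. It also reserves one extra byte for a terminator, which the malloc(sizeof(struct string) + len) call above does not.

```c
#include <stdlib.h>
#include <string.h>

struct fstring {
    size_t len;
    char data[];   /* flexible array member; storage follows the header */
};

/* Allocates the header and the characters in one block,
   so a single free() releases both. */
struct fstring *fstring_new(const char *src, size_t len)
{
    struct fstring *s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;
    s->len = len;
    memcpy(s->data, src, len);
    s->data[len] = '\0';   /* terminator so data also works with "%s" */
    return s;
}
```

The trade-off is that growing such a string means reallocating the whole object, so every pointer to it has to be updated.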
Finally, if this isn't actually the source of error, it looks like you have some corruption. Valgrind should help you detect it if gdb fails.
One potential issue is here:
strncpy(ret,s,j-i);
return ret;
ret might not get a null terminator. See man strncpy:
char *strncpy(char *dest, const char *src, size_t n);
...
The strncpy() function is similar, except that at most n bytes of src
are copied. Warning: If there is no null byte among the first n bytes
of src, the string placed in dest will not be null terminated.
There's also a bug here:
j = strlen(*s)-1;
while ((isspace(*(*s+j)))&&(j > 0)) {
What if strlen(*s) is 0? You'll end up reading (*s)[-1].
You also don't check in strip() that the string doesn't consist entirely of spaces. If it does, you'll end up with j < i.
edit: Just noticed that your substr() function doesn't actually get called.
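For reference, a version of substr() that terminates its result explicitly could look like the sketch below. It also makes two changes that go beyond the strncpy point: negative indices are added to the length (the original subtracts them), and copying starts at s + i rather than at s. Returning NULL for out-of-range indices is a choice made for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of s[i..j), or NULL on bad indices or
   allocation failure. Negative i or j count back from the end of s. */
char *substr(const char *s, int i, int j)
{
    int len = (int)strlen(s);
    if (i < 0) i += len;
    if (j < 0) j += len;
    if (i < 0 || j > len || i > j)
        return NULL;

    size_t n = (size_t)(j - i);
    char *ret = malloc(n + 1);
    if (ret == NULL)
        return NULL;
    memcpy(ret, s + i, n);   /* copy from offset i, not from the start of s */
    ret[n] = '\0';           /* terminate explicitly */
    return ret;
}
```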
I think the memory corruption problem might be the result of how you're handling the data used in your getline() calls. Basically, line is reallocated via strndup() in the calls to strip(), so the buffer size being tracked in linelen by getline() will no longer be accurate. getline() may overrun the buffer.
while ((!empty(line))&&(line[0] != '>')) {
strip(&line); // <-- assigns a `strndup()` allocation to `line`
sequences[numsequences-1].sequence = realloc(sequences[numsequences-1].sequence, strlen(sequences[numsequences-1].sequence)+strlen(line)+1);
strcat(sequences[numsequences-1].sequence,line);
rcount = getline(&line, &linelen, f); // <-- the buffer `line` points to might be
// smaller than `linelen` bytes
}
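One way to avoid the mismatch is to trim the line inside the buffer that getline() owns, so the pointer and the size recorded in linelen stay in step. A sketch under that assumption, with strip_in_place as a hypothetical replacement for strip():

```c
#include <ctype.h>
#include <string.h>

/* Trims leading and trailing whitespace inside the existing buffer.
   The pointer and its allocation are unchanged, so a size tracked
   elsewhere (such as getline's) stays valid. Returns the new length. */
size_t strip_in_place(char *s)
{
    size_t start = 0;
    size_t end = strlen(s);

    while (start < end && isspace((unsigned char)s[start]))
        start++;
    while (end > start && isspace((unsigned char)s[end - 1]))
        end--;

    memmove(s, s + start, end - start);   /* regions may overlap */
    s[end - start] = '\0';
    return end - start;
}
```

The loop would then call strip_in_place(line) instead of strip(&line). The bounds checks also cover the empty and all-whitespace cases raised in the other answer.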