error while reading input from file c - c

I have been trying to read input from a file, but something doesn't work correctly. Instead of reading the word "Words" that exists in the text, the printf always shows 2 additional random characters that are not in the file.
The function is:
void search(struct word *w,FILE *f){
char *c;
char c2;
int i,j,k,l;
c=(char*)malloc(120*sizeof(char));
i=1;
while(f!=NULL) {
c2=fgetc(f);
while(c2!=EOF) {
while(c2!='\n') {
k=0;
while(c2!=' ') {
*(c+k)=c2;
k=k+1;
c2=getc(f);
}
if(w->name==c)
insert(i,j+1,name,&w);
}
memset(c, 0, sizeof(c));
j=j+k+1;
}
i=i+1;
}
}
}
the main function is
int main()
{
struct word *s;
s=(struct word*)malloc(sizeof(struct word));
s->name=(char*)malloc(20*sizeof(char));
s->result=NULL;
scanf("%s",s->name);
search(s);
printres(s);
system("pause");
exit(0);
}
and the structs are
struct position
{
char *filename;
int line;
int place;
struct position *next;
};
struct word
{
char *name;
struct word *right;
struct word *left;
struct position *result;
};
Why do these additional 2 characters appear? What should I do?

at first glance, this seems wrong to me
memset(c, 0, sizeof(c));
mainly because c is a char*, so sizeof(c) is the size of a pointer (typically 4 or 8 bytes, depending on the platform), not the 120 bytes you allocated.
And this test is wrong also:
if(w->name=c)
you probably want
if(strcmp(w->name,c) == 0)
after terminating the buffer by '\0'.
Also, you should check to not overflow your buffer, otherwise results will be unpredictable.
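The three points above (terminate the buffer, compare with strcmp(), never write past the end) can be sketched as follows. This is only an illustration under assumptions: read_word and next_word_is are hypothetical helpers that do not appear in the question, and words are assumed to be separated by whitespace.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads the next whitespace-delimited word from f
 * into buf (capacity cap, including the terminator).
 * Returns the number of characters stored, or 0 at end of input.
 * Characters that do not fit are read and discarded. */
size_t read_word(FILE *f, char *buf, size_t cap)
{
    int ch;            /* int, so EOF is distinguishable from any character */
    size_t n = 0;

    assert(f != NULL && buf != NULL && cap > 0);

    /* skip leading whitespace */
    while ((ch = fgetc(f)) != EOF && isspace(ch))
        ;

    /* copy until whitespace or EOF, never writing past cap - 1 */
    while (ch != EOF && !isspace(ch)) {
        if (n + 1 < cap)
            buf[n++] = (char)ch;
        ch = fgetc(f);
    }
    buf[n] = '\0';     /* terminate before any strcmp() */
    return n;
}

/* Hypothetical helper: returns 1 if the next word in f equals target. */
int next_word_is(FILE *f, const char *target)
{
    char buf[120];
    return read_word(f, buf, sizeof buf) > 0 && strcmp(buf, target) == 0;
}
```

In search(), a call such as next_word_is(f, w->name) would take the place of the pointer comparison.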

Without indentation, your code is hard to read.
I assume you're trying to store words line by line and word by word from a text file. But I'm unable to make out what your insert() function does. Besides, this code has a lot of faults.
I'll list them out first, and then you can state what you're exactly doing in this.
in main(): search(s)->file pointer parameter missing.
The prototype is : void search(struct word *w,FILE *f).
You have to do the following:
i. First open the file using the fopen() function.
ii. Pass the pointer obtained in step i as the second argument to your search() function.
This is just the solution I'm giving, you lack knowledge in this. You'll have to read a lot more on using files in c.
statement if(w->name=c):
It's "==" and not "=". Here you only assigned w->name as c.
You were trying to compare pointers and not the characters in them! Comparing pointers will be no use. You allocated different memory for both, how can the addresses be same? Read more on comparing strings in c.
Before comparing, you have to terminate a string with '\0'(null character). This often leads to unwanted characters being printed otherwise.
There'll be a lot of resources online which will have the exact answer to what you want to do. You can use a Google search. I can only point out so many faults, since your entire code has faults here and there. Learn more I'd say.

I haven't studied the logic of your code, so I'm just going to point out some of the problems:
c=(char*)malloc(120*sizeof(char));
The cast here is mostly useless. It can be used to warn you when you try to allocate data of the wrong type, but many people recommend the following way to allocate memory:
c = malloc (120 * sizeof *c);
This automatically allocates 120 items of the right size, regardless of the actual type of c. You should also check the return value of malloc(). Allocations can fail.
char c2;
c2=fgetc(f);
while(c2!=EOF) {
The fgetc() function returns a signed int representing a character in the range of an unsigned char, or EOF which is a negative value.
Plain char is either compatible with signed char or unsigned char. If it is unsigned, then it can never store the value of EOF, so the comparison c2 != EOF will always be true and the loop never ends. If it is signed, then it can (typically, not necessarily) store EOF, but it will have the same value as an actual character.
You should store the return value of fgetc() in an int variable, compare it to EOF, and then convert it to char.
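A minimal sketch of that rule, assuming nothing beyond the standard library; count_bytes is a made-up name used only for illustration. Because the result of fgetc() stays in an int, a byte with the value 0xFF in the file is counted like any other byte instead of being mistaken for EOF.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the bytes remaining in f. The return value of fgetc() is kept
 * in an int, so the byte 0xFF is not mistaken for EOF. */
long count_bytes(FILE *f)
{
    int ch;
    long n = 0;

    assert(f != NULL);
    while ((ch = fgetc(f)) != EOF)
        n++;
    return n;
}
```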
if(w->name==c)
Is this meant to be a string comparison? It doesn't work that way in C. You're only comparing pointers. To compare the actual strings, you'd have to call strcmp() or similar.
insert(i,j+1,name,&w);
(Undefined function)
memset(c, 0, sizeof(c));
I assume that you're trying to set the entire buffer to 0, but the size is wrong. This zeroes sizeof (char *) bytes, not the 120 bytes you allocated.
s=(struct word*)malloc(sizeof(struct word));
s->name=(char*)malloc(20*sizeof(char));
Pointless casts and no check for allocation failure. This should be (in my opinion):
s = malloc (sizeof *s);
if (s) {
s->name = malloc (20 * sizeof *s->name);
}
if (s && s->name) {
/* Go to work */
} else {
/* Allocation failure */
}
/* All done. Free memory. */
if (s) {
free (s->name);
}
free (s);
Because free(0) is a no-op, this cleanup works with all of the branches above.
search(s);
The search() function wants a second argument (FILE *).
printres(s);
(Undefined function)

Related

How to return a string to main function?

I am trying to write code to implement strchr function in c. But, I'm not able to return the string.
I have seen discussions on how to return string but I'm not getting desired output
const char* stchr(const char *,char);
int main()
{
char *string[50],*p;
char ch;
printf("Enter a sentence\n");
gets(string);
printf("Enter the character from which sentence should be printed\n");
scanf("%c",&ch);
p=stchr(string,ch);
printf("\nThe sentence from %c is %s",ch,p);
}
const char* stchr(const char *string,char ch)
{
int i=0,count=0;
while(string[i]!='\0'&&count==0)
{
if(string[i++]==ch)
count++;
}
if(count!=0)
{
char *temp[50];
int size=(strlen(string)-i+1);
strncpy(temp,string+i-1,size);
temp[strlen(temp)+1]='\0';
printf("%s",temp);
return (char*)temp;
}
else
return 0;
}
I should get the output similar to strchr function but output is as follows
Enter a sentence
i love cooking
Enter the character from which sentence should be printed
l
The sentence from l is (null)
There are basically only two real errors in your code, plus one line that, IMHO, should certainly be changed. Here are the errors, with the solutions:
(1) As noted in the comments, the line:
char *string[50],*p;
is declaring string as an array of 50 character pointers, whereas you just want an array of 50 characters. Use this, instead:
char string[50], *p;
(2) There are two problems with the line:
char *temp[50];
First, as noted in (1), you are declaring an array of character pointers, not an array of characters. Second, as this is a locally-defined ('automatic') variable, it will cease to exist when the function exits, so your p variable in main will point to memory that is no longer valid. To fix this, you can declare the (local) variable as static, which means it will remain fixed in memory (but see the added footnote on the use of static variables):
static char temp[50];
Lastly, again as mentioned in the comments, you should not be using the gets function, as this is now obsolete (although some compilers still support it). Instead, you should use the fgets function, and use stdin as the 'source file':
fgets(string, sizeof string, stdin); // gets() has been removed! Here, the 2nd argument is the buffer size, terminator included.
Another minor issue is your use of the strlen and strncpy functions. The former actually returns a value of type size_t (always an unsigned integral type) not int (always signed); the latter uses such a size_t type as its final argument. So, you should have this line, instead of what you currently have:
size_t size = (strlen(string) - i + 1);
Feel free to ask for further clarification and/or explanation.
EDIT: Potential Problem when using the static Solution
As noted in the comments by Basya, the use of static data can cause issues that can be hard to track down when developing programs that have multiple threads: if two different threads try to access the data at the same time, you will get (at best) a "data race" and, more likely, difficult-to-trace unexpected behaviour. A better way, in such circumstances, is to dynamically allocate memory for the variable from the "heap," using the standard malloc function (defined in <stdlib.h> - be sure to #include this header):
char* temp = malloc(50);
If you use this approach, be sure to release the memory when you're done with it, using the free() function. In your example, this would be at the end of main:
free(p);
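Another option, which is not part of the answer above: the standard strchr() copies nothing at all. It returns a pointer into the string it was given, so neither a static buffer nor malloc() is needed. A minimal sketch of that behaviour follows; the name find_char is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns a pointer to the first occurrence of ch in s,
 * or NULL if ch does not occur. Nothing is copied or allocated. */
const char *find_char(const char *s, char ch)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == ch)
            return s;
    }
    return NULL;
}
```

With the input from the question, find_char(string, 'l') points at the 'l' of "love", so printing the result with %s shows the rest of the sentence from there. The caller should check for NULL before printing.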

I don't know what's going wrong with this code?

I am writing a simple program which accepts strings of any length from the user and just displays them. But my code is not working correctly: it accepts the strings but does not print them correctly.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
main()
{
int i,len;
static int n=5;
char a[20];
char **s;
s=malloc(5*sizeof(char));
char *p;
for(i=0;i<n;i++)
{
scanf("%s",a);
if(*a=='1') /*to exit from loop*/
{
break;
}
len=strlen(a);
p=malloc((len+1)*sizeof(char));
strcpy(p,a);
s[i]=p;
if(i==n-1)
{
s=realloc(s,(5+i*5)*sizeof(char));
n=5+i;
}
}
for(i=0;i<n-1;i++)
{
printf("%s ",s[i]);
}
free(p);
p=NULL;
return 0;
}
There are multiple issues, but at first look, the most prominent one is,
s=malloc(5*sizeof(char));
is wrong. s is of type char **, so you'd need to allocate memory worth of char * there. In other words, you expect s to point to a char * element, so, you need to allocate memory accordingly.
To avoid this sort of mistake, never rely on hardcoded data types; rather, use the form
s = malloc( 5 * sizeof *s); // same as s=malloc( 5 * sizeof (*s))
where the size is essentially determined from the type of the variable. Two advantages:
You avoid mistakes like the one above.
The code becomes more resilient: you don't need to change the malloc() statement if you choose to change the data type.
That said, scanf("%s",a); is also potentially dangerous and can cause a buffer overflow when the input is longer than expected. You should always limit the input scanning length, using the maximum field width, like
scanf("%19s",a); // a is array of dimension 20, one for terminating null
As for the logic: when you don't know or don't dictate the length of the input string beforehand, you cannot scan the input into a fixed-size array. The basic way of getting this done would be:
1. Allocate a moderate length buffer dynamically, using an allocator function like malloc().
2. Keep reading the input stream one character at a time, using fgetc() or alike.
3. If the read is complete (for example, fgetc() returns EOF), you've read the complete input.
4. If the allocated memory has run out, re-allocate the original buffer with realloc() and continue from step 2.
And don't forget to free() the memory.
Otherwise, you may use fgets() to read the input in chunks and keep reallocating as mentioned above.
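The character-by-character approach described above could be sketched like this. The function name read_all, the initial size of 16 bytes and the doubling strategy are all assumptions chosen for illustration, not something prescribed by the answer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads everything left in `in` into a freshly allocated, NUL-terminated
 * string. Returns NULL on allocation failure. The caller must free() it. */
char *read_all(FILE *in)
{
    size_t cap = 16;             /* moderate initial size, arbitrary */
    size_t len = 0;
    char *buf;
    int ch;

    assert(in != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while ((ch = fgetc(in)) != EOF) {
        if (len + 1 >= cap) {    /* no room for this char plus the '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);       /* the old block is still ours to free */
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    buf[len] = '\0';
    return buf;
}
```

Assigning the result of realloc() to a temporary first keeps the original pointer available if the reallocation fails.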

How to put a char into a empty pointer of a string in pure C

I want to store a single char into a char array pointer, and that action is in a while loop, adding a new char every time. I strictly want the text stored in a variable and not printed, because I am going to compare it. Here's my code:
#include <stdio.h>
#include <string.h>
int main()
{
char c;
char *string;
while((c=getchar())!= EOF) //gets the next char in stdin and checks if stdin is not EOF.
{
char temp[2]; // I was trying to convert c, a char to temp, a const char so that I can use strcat to concernate them to string but printf returns nothing.
temp[0]=c; //assigns temp
temp[1]='\0'; //null end point
strcat(string,temp); //concernates the strings
}
printf(string); //prints out the string.
return 0;
}
I am using GCC on Debian (POSIX/UNIX operating system) and want to have Windows compatibility.
EDIT:
I notice some miscommunication about what I actually intend to do, so I will explain: I want to create a system where I can input an unlimited number of characters, have that input stored in a variable, and have it read back to me from that variable. To get around using realloc and malloc, I made it read the next available char until EOF. Keep in mind that I am a beginner in C (though most of you have probably guessed that already) and haven't had a lot of experience with memory management.
If you want unlimited amount of character input, you'll need to actively manage the size of your buffer. Which is not as hard as it sounds.
first use malloc to allocate, say, 1000 bytes.
read until this runs out.
use realloc to allocate 2000
read until this runs out.
like this:
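#include <stdio.h>
#include <stdlib.h>
int main(void){
    size_t buf_size=1000;
    char* buf=malloc(buf_size);
    int c; //int, not char, so that EOF can be told apart from real characters
    size_t n=0;
    if(buf==NULL)
        return 1;
    while((c=getchar())!= EOF)
    {
        buf[n++] = (char)c;
        if(n>=buf_size-1)
        {
            //buffer is full (one byte is kept for the trailing 0): grow it
            buf_size+=1000;
            char* tmp=realloc(buf, buf_size);
            if(tmp==NULL)
            {
                free(buf);
                return 1;
            }
            buf=tmp;
        }
    }
    buf[n] = '\0'; //add trailing 0 at the end, to make it a proper string
    //do stuff with buf;
    free(buf);
    return 0;
}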
You won't get around using malloc-oids if you want unlimited input.
You have undefined behavior.
You never set string to point anywhere, so you can't dereference that pointer.
You need something like:
char buf[1024] = "", *string = buf;
that initializes string to point to valid memory where you can write, and also sets that memory to an empty string so you can use strcat().
Note that looping strcat() like this is very inefficient, since it needs to find the end of the destination string on each call. It's better to just use pointers.
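A rough sketch of the pointer-based alternative, assuming a fixed-size buffer like the buf above; read_into is a hypothetical helper and it takes a FILE * so that it can be used with stdin or any other stream. The write pointer always marks the end of the text, so nothing has to search for it the way strcat() does on every call.

```c
#include <assert.h>
#include <stdio.h>

/* Reads characters from in into buf until EOF or until cap - 1 characters
 * have been stored, then terminates buf. Returns the number stored. */
size_t read_into(FILE *in, char *buf, size_t cap)
{
    char *end;     /* next free position in buf */
    char *limit;   /* last position, reserved for the terminator */
    int ch;

    assert(in != NULL && buf != NULL && cap > 0);
    end = buf;
    limit = buf + cap - 1;
    while (end < limit && (ch = fgetc(in)) != EOF)
        *end++ = (char)ch;
    *end = '\0';
    return (size_t)(end - buf);
}
```

For the program in the question, a call would look like read_into(stdin, buf, sizeof buf).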
char *string;
You've declared an uninitialised variable with this statement. With some compilers, in debug this may be initialised to 0. In other compilers and a release build, you have no idea what this is pointing to in memory. You may find that when you build and run in release, your program will crash, but appears to be ok in debug. The actual behaviour is undefined.
You need to either create a variable on the stack by doing something like this
char string[100]; // assuming you're not going to receive more than 99 characters (100 including the NULL terminator)
Or, on the heap: -
char *string = malloc(100);
In which case you'll need to free the character array when you're finished with it.
Assuming you don't know how many characters the user will type, I suggest you keep track in your loop, to ensure you don't try to concatenate beyond the memory you've allocated.
Alternatively, you could limit the number of characters that a user may enter.
const int MAX_CHARS = 100;
char string[MAX_CHARS + 1]; // +1 for Null terminator
int numChars = 0;
while((numChars < MAX_CHARS) && ((c=getchar())!= EOF))
{
...
++numChars;
}
As I wrote in comments, you cannot avoid malloc() / calloc() and probably realloc() for a problem such as you have described, where your program does not know until run time how much memory it will need, and must not have any predetermined limit. In addition to the memory management issues on which most of the discussion and answers have focused, however, your code has some additional issues, including:
getchar() returns type int, and to correctly handle all possible inputs you must not convert that int to char before testing against EOF. In fact, for maximum portability you need to take considerable care in converting to char, for if default char is signed, or if its representation has certain other allowed (but rare) properties, then the value returned by getchar() may exceed its maximum value, in which case the result of a direct conversion is implementation-defined (or an implementation-defined signal is raised). (In truth, though, this issue is often ignored, usually to no ill effect in practice.)
Never pass a user-provided string to printf() as the format string. It will not do what you want for some inputs, and it can be exploited as a security vulnerability. If you want to just print a string verbatim then fputs(string, stdout) is a better choice, but you can also safely do printf("%s", string).
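To illustrate the last point, here is a small sketch; print_verbatim is a made-up wrapper, not a standard function. The text is passed to fputs(), which has no format string at all, so characters such as % in the input are written out unchanged.

```c
#include <assert.h>
#include <stdio.h>

/* Writes s to out exactly as it is. Because s is never used as a format
 * string, sequences such as "%s" or "%n" in it have no special meaning.
 * Returns 0 on success, EOF on a write error. */
int print_verbatim(FILE *out, const char *s)
{
    assert(out != NULL && s != NULL);
    return fputs(s, out) == EOF ? EOF : 0;
}
```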
Here's a way to approach your problem that addresses all of these issues:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#define INITIAL_BUFFER_SIZE 1024
int main()
{
char *string = malloc(INITIAL_BUFFER_SIZE);
size_t cap = INITIAL_BUFFER_SIZE;
size_t next = 0;
int c;
if (!string) {
// allocation error
return 1;
}
while ((c = getchar()) != EOF) {
if (next + 1 >= cap) {
/* insufficient space for another character plus a terminator */
cap *= 2;
string = realloc(string, cap);
if (!string) {
/* memory reallocation failure */
/* memory was leaked, but it's ok because we're about to exit */
return 1;
}
}
#if (CHAR_MAX != UCHAR_MAX)
/* char is signed; ensure defined behavior for the upcoming conversion */
if (c > CHAR_MAX) {
c -= UCHAR_MAX + 1;
#if ((CHAR_MAX != (UCHAR_MAX >> 1)) || (CHAR_MAX == (-1 * CHAR_MIN)))
/* char's representation has more padding bits than unsigned
char's, or it is represented as sign/magnitude or ones' complement */
if (c < CHAR_MIN) {
/* not representable as a char */
return 1;
}
#endif
}
#endif
string[next++] = (char) c;
}
string[next] = '\0';
fputs(string, stdout);
free(string);
return 0;
}

too many open files c

I have been trying to create a simple program. However, I encountered an error:
gmon.out:too many open files
I am not clear on why it says I have "too many open files". It does not appear I am using files.
#include<stdio.h>
#include<ctype.h>
#include<math.h>
#include<stdlib.h>
#include<string.h>
struct position
{
int line;
int place;
struct position *next;
};
struct file
{
struct position *info;
struct file *next;
char *name;
};
struct word
{
char *name;
struct word *right;
struct word *left;
struct file *result;
};
int main()
{
int i;
struct word *d,*c;
char *s="brutus";
printf("%s",s);
c=(struct word*)malloc(sizeof(struct word));
strcpy(c->name,s);
c->left=NULL;
c->right=NULL;
for(i=1;i<=10;i++)
{
d=(struct word*)malloc(sizeof(struct word));
if(d==NULL)
exit(0);
scanf("%s",s);
printf("4");
s=d->name;
printf("%s",d->name);
d->left=NULL;
d->right=NULL;
}
system("pause");
exit(0);
}
What should I do about it? Thank you in advance for your time!
First off:
gmon.out:too many open files
Means that you're compiling with the -p flag (profiling). gmon.out is the default file-name used by gprof. Just ditch-the-switch, and you won't get that problem anymore.
Of course, not profiling code isn't great, but you'd do well to address a couple of issues first, before setting about actually profiling your code.
Some of these, quite numerous, issues are:
char *s="brutus";
printf("%s",s);
c=(struct word*)malloc(sizeof(struct word));
strcpy(c->name,s);
List of issues:
char *s should be const char *s, because it points to read-only memory.
Next, Do not cast the return of malloc
Check the return value of functions like malloc, they tell you something
struct word is a struct of which all members are pointers. After allocating the struct, those pointers are invalid: you need to allocate memory for those members, too
strcpy expects the destination (c->name) to be a valid pointer, as I explained above: this is not the case here
What, then, should this code look like:
const char *s = "brutus";
c = malloc(sizeof *c);
if (c == NULL)
{
fprintf(stderr, "Could not allocate memory for struct word\n");
exit( EXIT_FAILURE );
}
//allocate enough memory to store the string
c->name = malloc(
(strlen(s)+1) * sizeof *c->name
);
//OR same, but shorter, works because the type char is guaranteed by the standard to be 1 byte in size
c->name = malloc(strlen(s)+1);
if (c->name == NULL)
exit( EXIT_FAILURE );//could not allocate mem
c->name[0] = '\0';//set to empty string, now we can use safer functions:
strncat(c->name, s, strlen(s));
After you address these issues, seriously re-think your approach, and ask yourself what it is you're actually trying to do here:
for(i=1;i<=10;i++)
{
d=(struct word*)malloc(sizeof(struct word));
if(d==NULL)
exit(0);
scanf("%s",s);
printf("4");
s=d->name;
}
You're allocating a struct 10 times, each time re-assigning it to d. You never free this memory, though, which is bad practice.
Again: don't cast the return of malloc, but that's the least of your worries.
if (d == NULL)
exit(0);
Ok, now you check the return of malloc. Great. But why on earth are you terminating with 0 (indicative of a successful run). There's a macro for this, too. You could've written:
if (d == NULL)
exit( EXIT_SUCCESS);
Clearly, EXIT_SUCCESS is not what you should communicate; use exit( EXIT_FAILURE ) here instead.
That const char *s is now being used to store user input. That's not going to work, though, as it points to read-only memory, so forget about the unsafe scanf("%s", s); statement. Use a stack variable, and make sure the input buffer is cleared, or use a safe alternative.
But then you go and do something as absurd as this:
s = d->name;
Again, d->name, like in the case with c, is an invalid pointer. Why assign it to s here? there's no point, no reason... only madness.
Bottom line: Kill this code before it hatches, start again, and please use these tips/recommendations and critiques as a guideline.
I have no idea why you're getting a 'too many open files', but this line:
strcpy(c->name,s)
is writing data to random memory, which could cause all kinds of problems.
You need to malloc() that c->name first.
Also that scanf to s looks suspicious, and d->name is never assigned anything either.
The reason that you're getting 'too many open files' is probably because some memory is getting overwritten in such a way that just happens to trigger that particular error. Welcome to the world of undefined behaviour. IE: If you overwrite random memory, basically anything can happen.
The first bug is in the line
strcpy(c->name,s);
At that point, c->name is an uninitialised pointer so the program will crash if you are lucky.
Reading your comment: You fixed the second bug. The first bug is still unfixed. And there's the third bug in the line
s=d->name;
This string copy will run off through memory, starting at whatever c->name points to until it finds a null terminator.
strcpy(c->name,s);
You have allocated space for c but not for the name pointer in c.
c->name = malloc([some length]);
c->name points somewhere, but you don't know where until you malloc it. That's why you're getting a seemingly random error: you're copying the bytes of s to an unknown location, and you are clobbering whatever happens to be stored there.
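Putting the advice from these answers together, allocating a node and its name could look roughly like this. It is a sketch under assumptions: word_new and word_free are hypothetical helpers, and the struct is reduced to the three members that matter here (the result member from the question is left out).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified version of the struct from the question. */
struct word {
    char *name;
    struct word *left;
    struct word *right;
};

/* Allocates a node and a private copy of name.
 * Returns NULL if either allocation fails. */
struct word *word_new(const char *name)
{
    struct word *w;
    size_t len;

    assert(name != NULL);
    w = malloc(sizeof *w);
    if (w == NULL)
        return NULL;

    len = strlen(name) + 1;          /* + 1 for the terminating '\0' */
    w->name = malloc(len);
    if (w->name == NULL) {
        free(w);
        return NULL;
    }
    memcpy(w->name, name, len);
    w->left = NULL;
    w->right = NULL;
    return w;
}

/* Releases a node created by word_new(). Passing NULL is allowed. */
void word_free(struct word *w)
{
    if (w != NULL) {
        free(w->name);
        free(w);
    }
}
```

Keeping allocation and release in a matching pair of functions makes it harder to forget the inner malloc() or the inner free().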

best practice for returning a variable length string in c

I have a string function that accepts a pointer to a source string and returns a pointer to a destination string. This function currently works, but I'm worried I'm not following the best practice regarding malloc, realloc, and free.
The thing that's different about my function is that the length of the destination string is not the same as the source string, so realloc() has to be called inside my function. I know from looking at the docs...
http://www.cplusplus.com/reference/cstdlib/realloc/
that the memory address might change after the realloc. This means I can't "pass by reference" like a C programmer might for other functions; I have to return the new pointer.
So the prototype for my function is:
//decode a uri encoded string
char *net_uri_to_text(char *);
I don't like the way I'm doing it because I have to free the pointer after running the function:
char * chr_output = net_uri_to_text("testing123%5a%5b%5cabc");
printf("%s\n", chr_output); //testing123Z[\abc
free(chr_output);
Which means that malloc() and realloc() are called inside my function and free() is called outside my function.
I have a background in high level languages, (perl, plpgsql, bash) so my instinct is proper encapsulation of such things, but that might not be the best practice in C.
The question: Is my way best practice, or is there a better way I should follow?
full example
Compiles and runs with two warnings on unused argc and argv arguments, you can safely ignore those two warnings.
example.c:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *net_uri_to_text(char *);
int main(int argc, char ** argv) {
char * chr_input = "testing123%5a%5b%5cabc";
char * chr_output = net_uri_to_text(chr_input);
printf("%s\n", chr_output);
free(chr_output);
return 0;
}
//decodes uri-encoded string
//send pointer to source string
//return pointer to destination string
//WARNING!! YOU MUST USE free(chr_result) AFTER YOU'RE DONE WITH IT OR YOU WILL GET A MEMORY LEAK!
char *net_uri_to_text(char * chr_input) {
//define variables
int int_length = strlen(chr_input);
int int_new_length = int_length;
char * chr_output = malloc(int_length);
char * chr_output_working = chr_output;
char * chr_input_working = chr_input;
int int_output_working = 0;
unsigned int uint_hex_working;
//while not a null byte
while(*chr_input_working != '\0') {
//if %
if (*chr_input_working == *"%") {
//then put correct char in
sscanf(chr_input_working + 1, "%02x", &uint_hex_working);
*chr_output_working = (char)uint_hex_working;
//printf("special char:%c, %c, %d<\n", *chr_output_working, (char)uint_hex_working, uint_hex_working);
//realloc
chr_input_working++;
chr_input_working++;
int_new_length -= 2;
chr_output = realloc(chr_output, int_new_length);
//output working must be the new pointer plus how many chars we've done
chr_output_working = chr_output + int_output_working;
} else {
//put char in
*chr_output_working = *chr_input_working;
}
//increment pointers and number of chars in output working
chr_input_working++;
chr_output_working++;
int_output_working++;
}
//last null byte
*chr_output_working = '\0';
return chr_output;
}
It's perfectly ok to return malloc'd buffers from functions in C, as long as you document the fact that they do. Lots of libraries do that, even though no function in the standard library does.
If you can compute (a not too pessimistic upper bound on) the number of characters that need to be written to the buffer cheaply, you can offer a function that does that and let the user call it.
It's also possible, but much less convenient, to accept a buffer to be filled in; I've seen quite a few libraries that do that like so:
/*
* Decodes the uri-encoded string encoded into buf of length len (including NUL).
* Returns the number of characters the decoded result needs. If that number is
* greater than len, the result did not fit and you should try again with a larger buffer.
*/
size_t net_uri_to_text(char const *encoded, char *buf, size_t len)
{
size_t space_needed = 0;
while (decoding_needs_to_be_done()) {
// decode characters, but only write them to buf
// if it wouldn't overflow;
// increment space_needed regardless
}
return space_needed;
}
Now the caller is responsible for the allocation, and would do something like
size_t len = SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH;
char *result = xmalloc(len);
len = net_uri_to_text(input, result, len);
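if (len > SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH) {
// the result did not fit: grow the buffer to the reported size and try again
result = xrealloc(result, len);
len = net_uri_to_text(input, result, len);
}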
(Here, xmalloc and xrealloc are "safe" allocating functions that I made up to skip NULL checks.)
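For illustration, the skeleton above might be filled in along these lines. This is a sketch, not the answerer's implementation: the name uri_decode_into is made up, the return value follows the snprintf() convention of reporting the space needed, and the choice to copy a malformed '%' through unchanged is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Decodes the percent-encoded string `encoded` into buf (capacity len,
 * including the terminator). Returns the number of bytes the complete
 * result needs, terminator included. If that exceeds len, the output was
 * truncated and the call should be repeated with a larger buffer.
 * A '%' not followed by two hex digits is copied through unchanged. */
size_t uri_decode_into(const char *encoded, char *buf, size_t len)
{
    size_t needed = 0;

    assert(encoded != NULL && (buf != NULL || len == 0));
    while (*encoded != '\0') {
        char out = *encoded;
        if (out == '%' && isxdigit((unsigned char)encoded[1])
                       && isxdigit((unsigned char)encoded[2])) {
            char hex[3] = { encoded[1], encoded[2], '\0' };
            out = (char)strtoul(hex, NULL, 16);
            encoded += 3;
        } else {
            encoded++;
        }
        if (needed + 1 < len)        /* keep one byte for the terminator */
            buf[needed] = out;
        needed++;                    /* counted even when nothing is written */
    }
    if (len > 0)
        buf[needed < len ? needed : len - 1] = '\0';
    return needed + 1;
}
```

A caller can pass a small buffer first, compare the return value with the buffer size, and allocate exactly that many bytes for a second call if the first result was truncated.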
The thing is that C is low-level enough to force the programmer to get her memory management right. In particular, there's nothing wrong with returning a malloc()ated string. It's a common idiom to return mallocated objects and have the caller free() them.
And anyways, if you don't like this approach, you can always take a pointer to the string and modify it from inside the function (after the last use, it will still need to be free()d, though).
One thing, however, that I don't think is necessary is explicitly shrinking the string. If the new string is shorter than the old one, there's obviously enough room for it in the memory chunk of the old string, so you don't need to realloc().
(Apart from the fact that you forgot to allocate one extra byte for the terminating NUL character, of course...)
And, as always, you can just return a different pointer each time the function is called, and you don't even need to call realloc() at all.
If you accept one last piece of good advice: it's advisable to const-qualify your input strings, so the caller can ensure that you don't modify them. Using this approach, you can safely call the function on string literals, for example.
All in all, I'd rewrite your function like this:
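#include <ctype.h>
#include <stdlib.h>
#include <string.h>
/* Returns a malloc()ed, decoded copy of s, or NULL on allocation failure. The caller must free() it. */
char *unescape(const char *s)
{
    size_t l = strlen(s);
    char *p = malloc(l + 1), *r = p;
    if (p == NULL)
        return NULL;
    while (*s) {
        // only decode a '%' that is really followed by two hex digits
        if (*s == '%' && isxdigit((unsigned char)s[1]) && isxdigit((unsigned char)s[2])) {
            char buf[3] = { s[1], s[2], 0 };
            *p++ = (char)strtol(buf, NULL, 16); // yes, I prefer this over scanf()
            s += 3;
        } else {
            *p++ = *s++;
        }
    }
    *p = 0;
    return r;
}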
And call it as follows:
int main()
{
const char *in = "testing123%5a%5b%5cabc";
char *out = unescape(in);
printf("%s\n", out);
free(out);
return 0;
}
It's perfectly OK to return newly-malloc-ed (and possibly internally realloced) values from functions, you just need to document that you are doing so (as you do here).
Other obvious items:
Instead of int int_length you might want to use size_t. This is "an unsigned type" (usually unsigned int or unsigned long) that is the appropriate type for lengths of strings and arguments to malloc.
You need to allocate n+1 bytes initially, where n is the length of the string, as strlen does not include the terminating 0 byte.
You should check for malloc failing (returning NULL). If your function will pass the failure on, document that in the function-description comment.
sscanf is pretty heavy-weight for converting the two hex bytes. Not wrong, except that you're not checking whether the conversion succeeds (what if the input is malformed? you can of course decide that this is the caller's problem but in general you might want to handle that). You can use isxdigit from <ctype.h> to check for hexadecimal digits, and/or strtoul to do the conversion.
Rather than doing one realloc for every % conversion, you might want to do a final "shrink realloc" if desirable. Note that if you allocate (say) 50 bytes for a string and find it requires only 49 including the final 0 byte, it may not be worth doing a realloc after all.
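The final "shrink realloc" mentioned in the last item could be sketched as below; shrink_to_fit is a hypothetical helper name. If realloc() fails, the original block is untouched and still valid, so the sketch simply keeps using it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shrinks a malloc'd string to exactly strlen(s) + 1 bytes.
 * If realloc() fails, the original block is still valid, so it is returned. */
char *shrink_to_fit(char *s)
{
    char *tmp;

    assert(s != NULL);
    tmp = realloc(s, strlen(s) + 1);
    return tmp != NULL ? tmp : s;
}
```

The caller has to continue with the returned pointer, since realloc() is allowed to move the block even when shrinking.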
I would approach the problem in a slightly different way. Personally, I would split your function in two. The first function to calculate the size you need to malloc. The second would write the output string to the given pointer (which has been allocated outside of the function). That saves several calls to realloc, and will keep the complexity the same. A possible function to find the size of the new string is:
/* Returns the length of the decoded string, not counting the terminating '\0'. */
int getNewSize (const char *string) {
    const char *i;
    int size = 0, percent = 0;
    for (i = string; *i != '\0'; i++, size++) {
        if (*i == '%')
            percent++;
    }
    return size - percent * 2;
}
However, as mentioned in other answers there is no problem in returning a malloc'ed buffer as long as you document it!
In addition to what was already mentioned in the other postings, you should also document the fact that the string is reallocated. If your code is called with a static string or a string allocated with alloca, you may not reallocate it.
I think you are right to be concerned about splitting up mallocs and frees. As a rule, whatever makes it, owns it and should free it.
In this case, where the strings are relatively small, one good procedure is to make the string buffer larger than any possible string it could contain. For example, URLs have a de facto limit of about 2000 characters, so if you malloc 10000 characters you can store any possible URL.
Another trick is to store both the length and capacity of the string at its front, so that (assuming a 4-byte int) *(int *)mystring == length of string and *(int *)(mystring + 4) == capacity of string. Thus, the string itself only starts at the 8th position, mystring + 8. By doing this you can pass around a single pointer to a string and always know how long it is and how much memory capacity the string has. You can make macros that automatically generate these offsets and make "pretty code".
The value of using buffers this way is you do not need to do a reallocation. The new value overwrites the old value and you update the length at the beginning of the string.
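One possible way to express the same idea without hand-computed offsets is a small header struct with a C99 flexible array member. This swaps the raw byte offsets described above for named fields; the names lstring, lstring_new and lstring_set are invented for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed string: the header and the characters
 * live in one allocation, so a single pointer carries both sizes. */
struct lstring {
    size_t length;     /* characters currently stored, without '\0' */
    size_t capacity;   /* usable bytes in data, including the '\0' */
    char data[];       /* C99 flexible array member */
};

/* Allocates an empty string with room for capacity bytes. */
struct lstring *lstring_new(size_t capacity)
{
    struct lstring *s;

    assert(capacity > 0);
    s = malloc(sizeof *s + capacity);
    if (s == NULL)
        return NULL;
    s->length = 0;
    s->capacity = capacity;
    s->data[0] = '\0';
    return s;
}

/* Overwrites the contents with text. Returns 0 on success, or -1 without
 * changing anything if text does not fit in the existing capacity. */
int lstring_set(struct lstring *s, const char *text)
{
    size_t len;

    assert(s != NULL && text != NULL);
    len = strlen(text);
    if (len + 1 > s->capacity)
        return -1;
    memcpy(s->data, text, len + 1);
    s->length = len;
    return 0;
}
```

Because the whole object comes from one malloc(), a single free() on the struct pointer releases both the header and the characters.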

Resources