Can the following be made simpler / more efficient? - c

I'm trying to convert some code from a dynamic-typed language to C. Please
bear with me as I have no practical experience yet with C.
I have a dispatcher function that decides how to convert its input based on
the value of the flag argument.
void output_dispatcher(char *str, int strlen, int flag) {
char output[501];
char *result;
switch (flag) {
/* No conversion */
case 0:
result = str;
break;
case 1:
result = convert_type1(output, str, strlen);
len = strlen(result);
break;
/* ... */
}
/* do something with result */
}
I currently have 5 different output converters and they all (even future
ones) are guaranteed to only produce 300-500 characters. From my reading, it
is preferable to use a heap variable than dynamically allocate space on the
stack, if possible. The function declaration for one looks like:
static char * convert_type1(char *out, const char *in, int inlen);
I want to avoid the strlen in the dispatcher, since it is unnecessary to
recalculate the output size because the output converters know it when they
construct the output. Also, since I'm passing in a pointer to the output
variable, I shouldn't need to return the result pointer, right? So I modify
it to the following, but get an 'incompatible type' compilation error.
void output_dispatcher(char *str, int strlen, int flag) {
char output[501];
switch (flag) {
/* No conversion */
case 0:
output = str; /* ERROR: incompatible type */
break;
case 1:
strlen = convert_type1(output, str, strlen);
break;
/* ... */
}
/* do something with result */
}
Can this approach work, or is there a better way to go?

To avoid the recalculation your output converters would need to have a prototype like this:
static char * convert_type1(char *out, const char *in, int *len);
called thus:
result = convert_type1(output, str, &strlen);
Internally the output converter would need to read the contents of the pointer now containing the string length, and overwrite the contents of that pointer before returning.
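A minimal sketch of such a converter, assuming a hypothetical conversion (upper-casing) that is not in the original code. The name convert_upper is made up for illustration; the point is only how *len is read on entry and overwritten before returning. The caller must supply an out buffer of at least *len + 1 bytes.

```c
#include <ctype.h>

/* Hypothetical converter: upper-cases the input.
   On entry *len is the input length; on return it is the output length. */
static char *convert_upper(char *out, const char *in, int *len)
{
    int i;

    for (i = 0; i < *len; i++)
        out[i] = (char)toupper((unsigned char)in[i]);
    out[i] = '\0';

    *len = i;   /* report the output length so the caller needs no strlen() */
    return out;
}
```

The dispatcher would call it as result = convert_upper(output, str, &len); and then use len directly.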
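On the issue of heap vs. stack: a stack buffer such as output is fine as long as the result is only used inside output_dispatcher. You need the heap (or a buffer supplied by the caller) only if the result must outlive the function, since variables allocated on the stack disappear as soon as the function ends.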

The line:
output = str;
is giving you problems because, while arrays and pointers are similar, they're not the same.
"output" is an array, not a pointer.
str = output;
will work, because the array name decays to a pointer to its first element, which can be stored in a char pointer.
But the opposite does not, because the "output" variable is not just a pointer to the array, but the array itself, and an array cannot be assigned to.
For example, if you had:
char output[501];
char output1[501];
and you did:
output1 = output;
This would not compile either: C never copies an array by assignment, so you would have to copy the contents with memcpy() or strcpy() instead.
So, you're just a little confused about arrays and ptrs.

char output[501];
output = str; /* ERROR: incompatible type */
=>
strncpy(output, str, sizeof(output));
Note that you should first check that 'output' is big enough to hold 'str': strncpy() does not add a terminating '\0' when 'str' fills the whole buffer.

The error in this case makes sense. output is a buffer that will hold some char data, while str is a pointer to some other area in memory. You don't want to assign the address of what str is pointing to to output, right? If you want to go with this approach, I think you would just copy the data pointed to by str into output. Better yet, just use str if no conversion is required.

C does not allow arrays to be modified by direct assignment - you must individually modify the array members. Thus, if you want to copy the string pointed to by str into the array output, you must use:
strcpy(output, str);
or perhaps
memcpy(output, str, strlen + 1);
(In both cases, after first checking that strlen < sizeof output).
Note that naming a local variable strlen, thus shadowing the standard function of that name, is going to be more than a little confusing for someone who looks at your code later. I'd pick another name.
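The length check mentioned in these answers can be folded into a small helper. This is only a sketch: copy_checked is a hypothetical name, and it assumes the caller already knows the source length, as the dispatcher does.

```c
#include <stddef.h>
#include <string.h>

/* Copies srclen bytes of src into dst and terminates the result.
   Returns 0 on success, or -1 if dst (cap bytes) is too small. */
static int copy_checked(char *dst, size_t cap, const char *src, size_t srclen)
{
    if (srclen >= cap)
        return -1;          /* no room for the data plus '\0' */
    memcpy(dst, src, srclen);
    dst[srclen] = '\0';
    return 0;
}
```

In the no-conversion case the dispatcher could call copy_checked(output, sizeof output, str, len), although simply using str avoids the copy altogether.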

Related

Returning and printing a string from a function in C

I am trying to return and print a function in C. Printing it out works in the function just fine, but when I try to print it after returning it from the function, I get nonsense.
I have already tried a lot and I think I have seen at least 6 stack overflow posts similar to this and this is the closest thing I can get to working that is not a segmentation fault or an error.
Code:
char* getBitstring(unsigned short int instr) {
//this is what the code below is going to convert into. It is set to default
//as a 16 bit string full of zeros to act as a safety default.
char binaryNumber[] = "0000000000000000";
//....
//doing things to binaryNumber
//.....
printf("don't get excited yet %s\n", binaryNumber); //note, this works
return binaryNumber;
}
int main(int argc, char *argv[]) {
char *a = getBitstring(0x1234);
printf("%s", a); //this fails
return 0;
}
Here is the output:
don't get excited yet 0001001000110100
������e��r�S�����$�r�#�t�$�r�����ͅS�������t����
This is because you are returning a pointer to an object allocated in automatic memory; using that pointer after the function has returned is undefined behavior.
If you want to return a string from a function, you need to return either a dynamically-allocated block, or a statically allocated block.
Another choice is to pass the buffer into the function, and provide the length as the return value of the function, in the way the file reading functions do it:
size_t getBitstring(unsigned short int instr, char* buf, size_t buf_size) {
... // Fill in the buffer without going over buf_size
printf("don't get excited yet %s\n", buf);
return strlen(buf);
}
Here is how you call this function:
char binaryNumber[] = "0000000000000000";
size_t len = getBitstring(instr, binaryNumber, sizeof(binaryNumber));
Now binaryNumber is an automatic variable in the context of the caller, so the memory would be around while the caller needs it.
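Since the question elides the conversion itself, here is one possible way to fill the caller's buffer. It is a sketch under the assumption that the function should produce the 16 low bits of instr, most significant bit first; returning 0 for a too-small buffer is a choice made for this example.

```c
#include <stddef.h>

/* Writes the 16-bit binary representation of instr into buf.
   Returns the number of characters written (excluding the terminator),
   or 0 if buf_size is too small to hold 16 digits plus '\0'. */
size_t getBitstring(unsigned short instr, char *buf, size_t buf_size)
{
    size_t i;

    if (buf_size < 17)
        return 0;

    for (i = 0; i < 16; i++)
        buf[i] = (instr & (0x8000u >> i)) ? '1' : '0';
    buf[16] = '\0';

    return 16;
}
```

Because the buffer belongs to the caller, nothing has to be freed and nothing dangles after the call.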
This is a good example of a problem people hit when they're learning C that they probably wouldn't hit if they were operating in Java or another more modern language that doesn't expose the details of memory layout to the user.
While everyone's answer here is probably technically correct, I'm going to try a different tack to see if I can answer your question without just giving you a line of code that will fix it.
First you need to understand what's going on when returning a
variable defined inside a function, like this:
int f(void) {
int x = 0;
/* do some crazy stuff with x */
return x;
}
What happens when you call return x;? I'm probably omitting some
details here, but what is essentially going on is that the calling
context gets the value stored inside the variable named x at
that time. The storage that is allocated to store that value is
no longer guaranteed to contain that value after the function is
over.
Second, we need to understand what happens when we refer to an array
by its name. Say we have a 'string':
char A[] = "12345";
In C, this is actually equivalent to declaring an array of
characters that ends in a \0:
char A[6] = { '1' , '2' , '3' , '4' , '5' , '\0' };
Then this is sort of like declaring six chars A[0], A[1], ...
, A[5]. The variable A is an array of type char[6], but in most
expressions it is converted to a char * holding the memory address
of A[0] (the beginning of the array).
Finally, we need to understand what happens when you call printf to
print a string like this:
printf("%s", A);
What you're saying here is "print all the bytes starting at memory
address A until you hit a byte that contains \0".
So, let's put it all together. When you run your code, the name binaryNumber decays to a pointer containing the memory address of the first character of the array, binaryNumber[0], and this address is what's returned by the function. BUT, you've declared the array inside of getBitstring, so as we know, the memory allocated for the array is no longer guaranteed to store the same values after getBitstring is over.
When you run printf("%s", a) you're telling C to "print all the bytes starting at memory address a until you get to a byte containing \0" -- but since that memory address is only guaranteed to contain valid values inside getBitstring, what you get is whatever garbage it happens to contain at the time when you call it outside of getBitstring.
So what can you do to resolve the problem? Well, you have several options; here is a (non-exhaustive) list:
You declare binaryNumber outside of getBitstring so that it's still valid when you try to access it in main
You declare binaryNumber as static as some others have suggested, which is effectively the same thing as above, except that the actual variable name binaryNumber is only valid inside the function, but the memory allocated to store the array is still valid outside the function.
You make a copy of the string using the strdup() function before you return it in your function. Remember that if you do this, you have to free() the pointer returned by strdup() after you're done with it, otherwise what you've got is a memory leak.
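The third option can be sketched as follows. The conversion loop is an assumption (the question elides it), getBitstringCopy is a made-up name, and malloc() plus memcpy() stand in for strdup(), which does the same job in one call on systems that provide it.

```c
#include <stdlib.h>
#include <string.h>

/* Builds the bit string in a local array, then returns a heap copy.
   The caller must free() the result. Returns NULL if allocation fails. */
char *getBitstringCopy(unsigned short instr)
{
    char binaryNumber[] = "0000000000000000";  /* local: gone after return */
    size_t size = sizeof binaryNumber;         /* 16 digits plus '\0' */
    char *copy;
    int i;

    for (i = 0; i < 16; i++)
        if (instr & (0x8000u >> i))
            binaryNumber[i] = '1';

    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, binaryNumber, size);      /* the copy outlives the call */
    return copy;
}
```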
Your binaryNumber character array only exists inside of your getBitstring function. Once that function returns, that memory is no longer allocated to your binaryNumber. To keep that memory allocated for use outside of that function you can do one of two things:
Return a dynamically allocated array
char* getBitstring(unsigned short int instr) {
// dynamically allocate array to hold 16 characters (+1 null terminator)
char* binaryNumber = malloc(17 * sizeof(char));
memset(binaryNumber, '0', 16); // Set the first 16 array elements to '0'
binaryNumber[16] = '\0'; // null terminate the string
//....
//doing things to binaryNumber
//.....
return binaryNumber;
}
int main(int argc, char *argv[]) {
char *a = getBitstring(0x1234);
printf("%s", a);
free(a); // Need to free memory, because you dynamically allocated it
return 0;
}
or pass the array into the function as an argument
void getBitstring(unsigned short int instr, char* binaryNumber, unsigned int arraySize ) {
//....
//doing things to binaryNumber (use arraySize)
//.....
}
int main(int argc, char *argv[]) {
char binaryNumber[] = "0000000000000000";
getBitstring(0x1234, binaryNumber, 16); // You must pass the size of the array
printf("%s", binaryNumber);
return 0;
}
Others have suggested making your binaryNumber array static for a quick fix. This would work, but I would avoid this solution, as it is unnecessary and has other side effects.
Create the returned string dynamically with the malloc() function. Allocate 16+1 bytes so there is room for the end-of-string terminator. This is not the safest approach, but it is easy to understand.
char * binaryNumber = (char*) malloc(17*sizeof(char));//create dynamic char sequence
strcpy(binaryNumber,"0000000000000000");//assign the default value with String copy
The final result will be;
char* getBitstring(unsigned short int instr) {
//this is what the code below is going to convert into. It is set to default
//as a 16 bit string full of zeros to act as a safety default.
char * binaryNumber = (char*) malloc(17*sizeof(char));//create dynamic char sequence
strcpy(binaryNumber,"0000000000000000");//assign the default value with String copy function
//....
//doing things to binaryNumber
//.....
printf("don't get excited yet %s\n", binaryNumber); //note, this works
return binaryNumber;
}
int main(int argc, char *argv[]) {
char *a = getBitstring(0x1234);
printf("%s", a); //this works now
free(a); //release the memory allocated in getBitstring
return 0;
}
Of course, include <string.h> for strcpy() and <stdlib.h> for malloc() and free().
Your char binaryNumber[] is local to the function getBitstring(). You cannot return the address of a local variable to another function.
The lifetime of binaryNumber is over when getBitstring() finishes execution. So, in your main(), char *a points to memory that is no longer valid.
The workaround:
define the array as static so that it does not go out-of-scope. [Not a good approach, but works]
or
use dynamic memory allocation and return the pointer. [don't forget to free later, to avoid memory leak.]
IMO, the second approach is way better.
You need static:
static char binaryNumber[] = "0000000000000000";
Without the static keyword, the array has automatic storage, and its contents are lost after the function returns. You probably know this.
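The side effect of the static approach that several answers warn about is that every call shares one buffer. A small sketch, using a hypothetical label() function rather than the code from the question:

```c
#include <stdio.h>

/* Returns a pointer to a static buffer: valid after the call,
   but overwritten by the next call. */
const char *label(int n)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "item-%d", n);
    return buf;
}
```

Calling label(1) and then label(2) yields the same pointer twice, and both now read "item-2". The same sharing makes such functions unsafe to call from several threads.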

How do I make a function return a pointer to a new string in C?

I'm reading K&R and I'm almost through the chapter on pointers. I'm not entirely sure if I'm going about using them the right way. I decided to try implementing itoa(n) using pointers. Is there something glaringly wrong about the way I went about doing it? I don't particularly like that I needed to set aside a large array to work as a string buffer in order to do anything, but then again, I'm not sure if that's actually the correct way to go about it in C.
Are there any general guidelines you like to follow when deciding to use pointers in your code? Is there anything I can improve on in the code below? Is there a way I can work with strings without a static string buffer?
/*Source file: String Functions*/
#include <stdio.h>
static char stringBuffer[500];
static char *strPtr = stringBuffer;
/* Algorithm: n % 10^(n+1) / 10^(n) */
char *intToString(int n){
int p = 1;
int i = 0;
while(n/p != 0)
p*=10, i++;
for(;p != 1; p/=10)
*(strPtr++) = ((n % p)/(p/10)) + '0';
*strPtr++ = '\0';
return strPtr - i - 1;
}
int main(){
char *s[3] = {intToString(123), intToString(456), intToString(78910)};
printf("%s\n",s[2]);
int x = stringToInteger(s[2]);
printf("%d\n", x);
return 0;
}
Lastly, can someone clarify for me what the difference between an array and a pointer is? There's a section in K&R that has me very confused about it; "5.5 - Character Pointers and Functions." I'll quote it here:
"There is an important difference between the definitions:
char amessage[] = "now is the time"; /*an array*/
char *pmessage = "now is the time"; /*a pointer*/
amessage is an array, just big enough to hold the sequence of characters and '\0' that
initializes it. Individual characters within the array may be changed but amessage will
always refer to the same storage. On the other hand, pmessage is a pointer, initialized
to point to a string constant; the pointer may subsequently be modified to point
elsewhere, but the result is undefined if you try to modify the string contents."
What does that even mean?
For itoa, the resulting string can't be longer than the decimal representation of INT_MAX plus a minus sign, so you'd be safe with a buffer of that length. The number of digits in a positive number is floor(log10(number)) + 1, so you'd need a buffer of size floor(log10(INT_MAX)) + 3, with space for the minus sign and the terminating \0.
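The same bound can be computed at compile time without log10. The sketch below assumes roughly one decimal digit per three bits, which slightly overestimates; INT_STR_MAX and intToStringBounded are hypothetical names, and snprintf() is used here in place of the hand-written digit loop.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Upper bound on the bytes needed for any int in decimal:
   about one digit per 3 bits, plus a sign and the terminating '\0'. */
#define INT_STR_MAX (sizeof(int) * CHAR_BIT / 3 + 3)

/* Formats n into buffer. Returns the number of characters written,
   or -1 if the buffer of the given size is too small. */
int intToStringBounded(int n, char *buffer, size_t size)
{
    int written = snprintf(buffer, size, "%d", n);

    if (written < 0 || (size_t)written >= size)
        return -1;
    return written;
}
```

A caller can then declare char buf[INT_STR_MAX]; and be sure that every int fits, including INT_MIN.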
Also, generally it's not a good practice to return pointers to 'black box' buffers from functions. Your best bet here would be to provide a buffer as a pointer argument in intToString, so then you can easily use any type of memory you like (dynamic, allocated on stack, etc.). Here's an example:
char *intToString(int n, char *buffer) {
// ...
char *bufferStart = buffer;
for(;p != 1; p/=10)
*(buffer++) = ((n % p)/(p/10)) + '0';
*buffer++ = '\0';
return bufferStart;
}
Then you can use it as follows:
char *buffer1 = malloc(30);
char buffer2[15];
intToString(10, buffer1); // providing pointer to heap allocated memory as a buffer
intToString(20, &buffer2[0]); // providing pointer to a fixed-size array (automatic storage)
what the difference between an array and a pointer is?
The answer is in your quote - a pointer can be modified to be pointing to another memory address. Compare:
int a[] = {1, 2, 3};
int b[] = {4, 5, 6};
int *ptrA = &a[0]; // the ptrA now contains pointer to a's first element
ptrA = &b[0]; // now it's b's first element
a = b; // it won't compile
Also, an array's size and storage are fixed where it is declared, while pointers are suitable for any allocation mechanism.
Regarding your code:
You are using a single static buffer for every call to intToString: this is bad because the string produced by the first call to it will be overwritten by the next.
Generally, functions that handle strings in C should either return a new buffer from malloc, or they should write into a buffer provided by the caller. Allocating a new buffer is less prone to problems due to running out of buffer space.
You are also using a static pointer for the location to write into the buffer, and it never rewinds, so that's definitely a problem: enough calls to this function, and you will run off the end of the buffer and crash.
You already have an initial loop that calculates the number of digits in the function. So you should then just make a new buffer that big using malloc, making sure to leave space for the \0, write in to that, and return it.
Also, since i is not just a loop index, change its name to something more obvious, like length.
That is to say: get rid of the global variables, and instead after computing length:
char *s, *result;
// compute length
s = result = malloc(length+1);
if (!s) return NULL; // out of memory
for(;p != 1; p/=10)
*(s++) = ((n % p)/(p/10)) + '0';
*s++ = '\0';
return result;
The caller is responsible for releasing the buffer when they're done with it.
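Put together as a complete function, the malloc-based version might look like the sketch below. It departs from the original algorithm in two labeled ways: digits are written from the end of the buffer backwards, and zero and negative numbers are handled, which the original loop does not do. The name intToStringAlloc is hypothetical.

```c
#include <stdlib.h>

/* Returns a newly allocated decimal string for n, or NULL on failure.
   The caller must free() the result. */
char *intToStringAlloc(int n)
{
    /* Work on the magnitude as unsigned so INT_MIN does not overflow. */
    unsigned int u = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;
    unsigned int t = u;
    size_t length = (n < 0) ? 1 : 0;   /* room for '-' if needed */
    char *result, *s;

    do {                               /* count digits; 0 has one digit */
        length++;
        t /= 10;
    } while (t != 0);

    result = malloc(length + 1);       /* +1 for the terminating '\0' */
    if (result == NULL)
        return NULL;

    s = result + length;
    *s = '\0';
    do {                               /* fill in digits, last one first */
        *--s = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (n < 0)
        *--s = '-';

    return result;
}
```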
Two other things I'd really recommend while learning about pointers:
Compile with all warnings turned on (-Wall etc) and if you get an error try to understand what caused it; they will have things to teach you about how you're using the language
Run your program under Valgrind or some similar checker, which will make pointer bugs more obvious, rather than causing silent corruption
Regarding your last question:
char amessage[] = "now is the time"; - is an array. An array cannot be reassigned to refer to something else (unlike a pointer); it occupies a fixed location in memory. If the array was declared inside a block, it is cleaned up at the end of the block (meaning you cannot return such an array from a function). You can, however, fiddle with the data inside the array as much as you like, so long as you don't exceed the size of the array.
E.g. this is legal amessage[0] = 'N';
char *pmessage = "now is the time"; - is a pointer. A pointer points to a block in memory, nothing more. "now is the time" is a string literal, which is typically stored inside the executable in a read-only location. You must not modify the data it points to: doing so is undefined behavior. You can, however, reassign the pointer to point to something else.
This is NOT legal: *pmessage = 'N'; will most likely segfault (note that you can use the array syntax with pointers; *pmessage is equivalent to pmessage[0]).
If you compile it with gcc using the -S flag, you can actually see "now is the time" stored in a read-only section of the generated assembly.
One other thing to point out is that arrays decay to pointers when passed as arguments to a function. The following two declarations are equivalent:
void foo(char arr[]);
and
void foo(char* arr);
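One way to see the decay in practice is to compare sizeof inside and outside a function. In this sketch (the function names are made up), the two prototypes for pointer_size are accepted as the same declaration, and the function only ever sees a pointer.

```c
#include <stddef.h>

size_t pointer_size(char arr[]);
size_t pointer_size(char *arr);     /* same declaration: no conflict */

size_t pointer_size(char *arr)
{
    return sizeof arr;              /* size of a pointer, not of the array */
}

size_t array_size(void)
{
    char amessage[] = "now is the time";

    return sizeof amessage;         /* 16: 15 characters plus '\0' */
}
```

This is why a function that receives an array usually needs the length as a separate argument.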
On how to use pointers and the difference between an array and a pointer, I recommend you read "Expert C Programming" (http://www.amazon.com/Expert-Programming-Peter-van-Linden/dp/0131774298/ref=sr_1_1?ie=UTF8&qid=1371439251&sr=8-1&keywords=expert+c+programming).
A better way to return strings from functions is to allocate dynamic memory (using malloc), fill it with the required string, return this pointer to the calling function, and then free it there.
Sample code :
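#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NAME_SIZE 20
char * func1(void)
{
char * c1 = NULL;
c1 = malloc(MAX_NAME_SIZE); /* MAX_NAME_SIZE bytes, not sizeof(MAX_NAME_SIZE) */
if (c1 == NULL)
return NULL; /* allocation failed */
strcpy(c1, "John");
return c1;
}
int main(void)
{
char * c2 = NULL;
c2 = func1();
if (c2 == NULL)
return 1;
printf("%s \n", c2);
free(c2); /* the caller releases the buffer */
return 0;
}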
And this works without the static strings.

best practice for returning a variable length string in c

I have a string function that accepts a pointer to a source string and returns a pointer to a destination string. This function currently works, but I'm worried I'm not following the best practice regarding malloc, realloc, and free.
The thing that's different about my function is that the length of the destination string is not the same as the source string, so realloc() has to be called inside my function. I know from looking at the docs...
http://www.cplusplus.com/reference/cstdlib/realloc/
that the memory address might change after the realloc. This means I can't "pass by reference" like a C programmer might for other functions; I have to return the new pointer.
So the prototype for my function is:
//decode a uri encoded string
char *net_uri_to_text(char *);
I don't like the way I'm doing it because I have to free the pointer after running the function:
char * chr_output = net_uri_to_text("testing123%5a%5b%5cabc");
printf("%s\n", chr_output); //testing123Z[\abc
free(chr_output);
Which means that malloc() and realloc() are called inside my function and free() is called outside my function.
I have a background in high level languages, (perl, plpgsql, bash) so my instinct is proper encapsulation of such things, but that might not be the best practice in C.
The question: Is my way best practice, or is there a better way I should follow?
full example
Compiles and runs with two warnings on unused argc and argv arguments, you can safely ignore those two warnings.
example.c:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *net_uri_to_text(char *);
int main(int argc, char ** argv) {
char * chr_input = "testing123%5a%5b%5cabc";
char * chr_output = net_uri_to_text(chr_input);
printf("%s\n", chr_output);
free(chr_output);
return 0;
}
//decodes uri-encoded string
//send pointer to source string
//return pointer to destination string
//WARNING!! YOU MUST USE free(chr_result) AFTER YOU'RE DONE WITH IT OR YOU WILL GET A MEMORY LEAK!
char *net_uri_to_text(char * chr_input) {
//define variables
int int_length = strlen(chr_input);
int int_new_length = int_length;
char * chr_output = malloc(int_length);
char * chr_output_working = chr_output;
char * chr_input_working = chr_input;
int int_output_working = 0;
unsigned int uint_hex_working;
//while not a null byte
while(*chr_input_working != '\0') {
//if %
if (*chr_input_working == *"%") {
//then put correct char in
sscanf(chr_input_working + 1, "%02x", &uint_hex_working);
*chr_output_working = (char)uint_hex_working;
//printf("special char:%c, %c, %d<\n", *chr_output_working, (char)uint_hex_working, uint_hex_working);
//realloc
chr_input_working++;
chr_input_working++;
int_new_length -= 2;
chr_output = realloc(chr_output, int_new_length);
//output working must be the new pointer plus how many chars we've done
chr_output_working = chr_output + int_output_working;
} else {
//put char in
*chr_output_working = *chr_input_working;
}
//increment pointers and number of chars in output working
chr_input_working++;
chr_output_working++;
int_output_working++;
}
//last null byte
*chr_output_working = '\0';
return chr_output;
}
It's perfectly ok to return malloc'd buffers from functions in C, as long as you document the fact that they do. Lots of libraries do that, even though no function in the standard library does.
If you can compute (a not too pessimistic upper bound on) the number of characters that need to be written to the buffer cheaply, you can offer a function that does that and let the user call it.
It's also possible, but much less convenient, to accept a buffer to be filled in; I've seen quite a few libraries that do that like so:
/*
* Decodes the uri-encoded string encoded into buf of length len (including NUL).
* Returns the number of characters needed. If that number is greater than len,
* the output did not fit and you should try again with a larger buffer.
*/
size_t net_uri_to_text(char const *encoded, char *buf, size_t len)
{
size_t space_needed = 0;
while (decoding_needs_to_be_done()) {
// decode characters, but only write them to buf
// if it wouldn't overflow;
// increment space_needed regardless
}
return space_needed;
}
Now the caller is responsible for the allocation, and would do something like
size_t len = SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH;
char *result = xmalloc(len);
len = net_uri_to_text(input, result, len);
if (len > SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH) {
// try again with a buffer of the reported size
result = xrealloc(result, len);
len = net_uri_to_text(input, result, len);
}
(Here, xmalloc and xrealloc are "safe" allocating functions that I made up to skip NULL checks.)
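A filled-in version of that interface could look like the sketch below. It is one interpretation, not the answerer's code: the name uri_decode_into is hypothetical, the return value is the number of bytes needed including the NUL, output that does not fit is truncated (rather than nothing being written), and a '%' not followed by two hex digits is copied through unchanged.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Decodes encoded into buf (len bytes). Returns the number of bytes
   needed, including the terminating NUL. If the result is greater
   than len, the output was truncated. */
size_t uri_decode_into(const char *encoded, char *buf, size_t len)
{
    size_t needed = 0;

    while (*encoded) {
        char c;

        if (encoded[0] == '%' &&
            isxdigit((unsigned char)encoded[1]) &&
            isxdigit((unsigned char)encoded[2])) {
            char hex[3] = { encoded[1], encoded[2], '\0' };
            c = (char)strtoul(hex, NULL, 16);
            encoded += 3;
        } else {
            c = *encoded++;
        }

        if (needed + 1 < len)       /* keep one byte free for the NUL */
            buf[needed] = c;
        needed++;                   /* count the byte regardless */
    }

    if (len > 0)
        buf[needed < len ? needed : len - 1] = '\0';
    return needed + 1;
}
```

The caller compares the return value with the buffer size and, if it is larger, reallocates to that size and calls again.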
The thing is that C is low-level enough to force the programmer to get her memory management right. In particular, there's nothing wrong with returning a malloc()ated string. It's a common idiom to return malloc()ated objects and have the caller free() them.
And anyways, if you don't like this approach, you can always take a pointer to the string and modify it from inside the function (after the last use, it will still need to be free()d, though).
One thing, however, that I don't think is necessary is explicitly shrinking the string. If the new string is shorter than the old one, there's obviously enough room for it in the memory chunk of the old string, so you don't need to realloc().
(Apart from the fact that you forgot to allocate one extra byte for the terminating NUL character, of course...)
And, as always, you can just return a different pointer each time the function is called, and you don't even need to call realloc() at all.
If you accept one last piece of good advice: it's advisable to const-qualify your input strings, so the caller can ensure that you don't modify them. Using this approach, you can safely call the function on string literals, for example.
All in all, I'd rewrite your function like this:
char *unescape(const char *s)
{
size_t l = strlen(s);
char *p = malloc(l + 1), *r = p;
while (*s) {
if (*s == '%') {
char buf[3] = { s[1], s[2], 0 };
*p++ = strtol(buf, NULL, 16); // yes, I prefer this over scanf()
s += 3;
} else {
*p++ = *s++;
}
}
*p = 0;
return r;
}
And call it as follows:
int main()
{
const char *in = "testing123%5a%5b%5cabc";
char *out = unescape(in);
printf("%s\n", out);
free(out);
return 0;
}
It's perfectly OK to return newly-malloc-ed (and possibly internally realloced) values from functions, you just need to document that you are doing so (as you do here).
Other obvious items:
Instead of int int_length you might want to use size_t. This is "an unsigned type" (usually unsigned int or unsigned long) that is the appropriate type for lengths of strings and arguments to malloc.
You need to allocate n+1 bytes initially, where n is the length of the string, as strlen does not include the terminating 0 byte.
You should check for malloc failing (returning NULL). If your function will pass the failure on, document that in the function-description comment.
sscanf is pretty heavy-weight for converting the two hex bytes. Not wrong, except that you're not checking whether the conversion succeeds (what if the input is malformed? you can of course decide that this is the caller's problem but in general you might want to handle that). You can use isxdigit from <ctype.h> to check for hexadecimal digits, and/or strtoul to do the conversion.
Rather than doing one realloc for every % conversion, you might want to do a final "shrink realloc" if desirable. Note that if you allocate (say) 50 bytes for a string and find it requires only 49 including the final 0 byte, it may not be worth doing a realloc after all.
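Applying the points in this list to the original function gives something like the following sketch. The name uri_decode_alloc is hypothetical, and copying a malformed '%' sequence through unchanged is an assumption about the desired behavior, not something the question specifies.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated decoded copy of in, or NULL if malloc fails.
   The caller must free() the result. */
char *uri_decode_alloc(const char *in)
{
    size_t n = strlen(in);
    char *out = malloc(n + 1);      /* decoded text is never longer */
    char *p = out;
    char *shrunk;

    if (out == NULL)
        return NULL;

    while (*in) {
        if (in[0] == '%' &&
            isxdigit((unsigned char)in[1]) &&
            isxdigit((unsigned char)in[2])) {
            char hex[3] = { in[1], in[2], '\0' };
            *p++ = (char)strtoul(hex, NULL, 16);
            in += 3;
        } else {
            *p++ = *in++;           /* ordinary or malformed: copy as-is */
        }
    }
    *p = '\0';

    /* One optional shrink at the end instead of one per '%'. */
    shrunk = realloc(out, (size_t)(p - out) + 1);
    return shrunk != NULL ? shrunk : out;
}
```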
I would approach the problem in a slightly different way. Personally, I would split your function in two. The first function to calculate the size you need to malloc. The second would write the output string to the given pointer (which has been allocated outside of the function). That saves several calls to realloc, and will keep the complexity the same. A possible function to find the size of the new string is:
int getNewSize (char *string) {
char *i = string;
int size = 0, percent = 0;
for (; *i != '\0'; i++, size++) {
if (*i == '%')
percent++;
}
return size - percent * 2;
}
However, as mentioned in other answers there is no problem in returning a malloc'ed buffer as long as you document it!
In addition to what was already mentioned in the other postings, you should also document the fact that the string is reallocated. If your code is called with a static string or a string allocated with alloca, you may not reallocate it.
I think you are right to be concerned about splitting up mallocs and frees. As a rule, whatever makes it, owns it and should free it.
In this case, where the strings are relatively small, one good procedure is to make the string buffer larger than any possible string it could contain. For example, URLs have a de facto limit of about 2000 characters, so if you malloc 10000 characters you can store any possible URL.
Another trick is to store both the length and capacity of the string at its front, so that *(int *)mystring == length of string and *(int *)(mystring + 4) == capacity of string (assuming a 4-byte int). Thus, the string itself only starts at the 8th position, mystring + 8. By doing this you can pass around a single pointer to a string and always know how long it is and how much memory capacity the string has. You can make macros that automatically generate these offsets and make "pretty code".
The value of using buffers this way is you do not need to do a reallocation. The new value overwrites the old value and you update the length at the beginning of the string.
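A variation on this idea that avoids manual offsets is a struct header followed by a flexible array member (C99). This is a swapped-in technique, not the raw-offset layout described above, and the names lstring, lstring_new and lstring_set are made up for the sketch.

```c
#include <stdlib.h>
#include <string.h>

struct lstring {
    size_t length;      /* characters currently stored */
    size_t capacity;    /* characters that fit, excluding '\0' */
    char data[];        /* the string itself follows the header */
};

/* Allocates an empty string with room for capacity characters. */
struct lstring *lstring_new(size_t capacity)
{
    struct lstring *s = malloc(sizeof *s + capacity + 1);

    if (s == NULL)
        return NULL;
    s->length = 0;
    s->capacity = capacity;
    s->data[0] = '\0';
    return s;
}

/* Overwrites the contents in place. Returns 0, or -1 if text is too long. */
int lstring_set(struct lstring *s, const char *text)
{
    size_t n = strlen(text);

    if (n > s->capacity)
        return -1;
    memcpy(s->data, text, n + 1);
    s->length = n;
    return 0;
}
```

A single free() on the pointer returned by lstring_new releases both the header and the string.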

Am I passing a copy of my char array, or a pointer?

I've been studying C, and I decided to practice using my knowledge by creating some functions to manipulate strings. I wrote a string reverser function, and a main function that asks for user input, sends it through stringreverse(), and prints the results.
Basically I just want to understand how my function works. When I call it with 'tempstr' as the first param, is that to be understood as the address of the first element in the array? Basically like saying &tempstr[0], right?
I guess answering this question would tell me: Would there be any difference if I assigned a char* pointer to my tempstr array and then sent that to stringreverse() as the first param, versus how I'm doing it now? I want to know whether I'm sending a duplicate of the array tempstr, or a memory address.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char* stringreverse(char* tempstr, char* returnptr);
printf("\nEnter a string:\n\t");
char tempstr[1024];
gets(tempstr);
char *revstr = stringreverse(tempstr, revstr); //Assigns revstr the address of the first character of the reversed string.
printf("\nReversed string:\n"
"\t%s\n", revstr);
main();
return 0;
}
char* stringreverse(char* tempstr, char* returnptr)
{
char revstr[1024] = {0};
int i, j = 0;
for (i = strlen(tempstr) - 1; i >= 0; i--, j++)
{
revstr[j] = tempstr[i]; //string reverse algorithm
}
returnptr = &revstr[0];
return returnptr;
}
Thanks for your time. Any other critiques would be helpful . . only a few weeks into programming :P
EDIT: Thanks to all the answers, I figured it out. Here's my solution for anyone wondering:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void stringreverse(char* s);
int main(void)
{
printf("\nEnter a string:\n\t");
char userinput[1024] = {0}; //Need to learn how to use malloc() xD
gets(userinput);
stringreverse(userinput);
printf("\nReversed string:\n"
"\t%s\n", userinput);
main();
return 0;
}
void stringreverse(char* s)
{
int i, j = 0;
char scopy[1024]; //Update to dynamic buffer
strcpy(scopy, s);
for (i = strlen(s) - 1; i >= 0; i--, j++)
{
*(s + j) = scopy[i];
}
}
First, a detail:
int main()
{
char* stringreverse(char* tempstr, char* returnptr);
That prototype should go outside main(), like this:
char* stringreverse(char* tempstr, char* returnptr);
int main()
{
As to your main question: the variable tempstr is a char*, i.e. the address of a character. If you use C's index notation, like tempstr[i], that's essentially the same as *(tempstr + i). The same is true of revstr, except that in that case you're returning the address of a block of memory that's about to be clobbered when the array it points to goes out of scope. You've got the right idea in passing in the address of some memory into which to write the reversed string, but you're not actually copying the data into the memory pointed to by that block. Also, the line:
returnptr = &revstr[0];
Doesn't do what you think. You can't assign a new pointer to returnptr; if you really want to modify returnptr, you'll need to pass in its address, so the parameter would be specified char** returnptr. But don't do that: instead, create a block in your main() that will receive the reversed string, and pass its address in the returnptr parameter. Then, use that block rather than the temporary one you're using now in stringreverse().
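A sketch of that suggestion, with the destination block owned by the caller. The extra dstsize parameter and the NULL return for a too-small buffer are assumptions added for safety; they are not part of the original function.

```c
#include <stddef.h>
#include <string.h>

/* Writes the reverse of src into dst (dstsize bytes).
   Returns dst, or NULL if dst is too small. src and dst must not overlap. */
char *stringreverse(const char *src, char *dst, size_t dstsize)
{
    size_t len = strlen(src);
    size_t i;

    if (len >= dstsize)
        return NULL;

    for (i = 0; i < len; i++)
        dst[i] = src[len - 1 - i];
    dst[len] = '\0';

    return dst;
}
```

In main() this would be used with two arrays, for example char revstr[1024]; followed by stringreverse(tempstr, revstr, sizeof revstr);.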
Basically I just want to understand how my function works.
One problem you have is that you are using revstr without initializing it or allocating memory for it. This is undefined behavior since you are writing into memory that doesn't belong to you. It may appear to work, but in fact what you have is a bug and can produce unexpected results at any time.
When I call it with 'tempstr' as the first param, is that to be understood as the address of the first element in the array? Basically like saying &tempstr[0], right?
Yes. When arrays are passed as arguments to a function, they are treated as regular pointers, pointing to the first element in the array. There would be no difference if you assigned &tempstr[0] to a char* before passing it to stringreverse(), because that's what the compiler is doing for you anyway.
The only time you will see a difference between arrays and pointers being passed to functions is in C++ when you start learning about templates and template specialization. But this question is C, so I just thought I'd throw that out there.
When I call it with 'tempstr' as the first param, is that to be understood as the
address of the first element in the array? Basically like saying &tempstr[0],
right?
char tempstr[1024];
tempstr is an array of characters. When tempstr is passed to a function, it decays to a pointer to its first element. So it's basically the same as passing &tempstr[0].
Would there be any difference if I assigned a char* pointer to my tempstr array and then sent that to stringreverse() as the first param, versus how I'm doing it now?
No difference. You might do -
char* pointer = tempstr ; // And can pass pointer
char *revstr = stringreverse(tempstr, revstr);
First, the right-hand side expression is evaluated, and its return value is assigned to revstr. But what is the revstr that is being passed? The program should allocate memory for it.
char revstr[1024] ;
char *retValue = stringreverse(tempstr, revstr) ;
// ^^^^^^ changed to be different.
Now, when tempstr and revstr are passed, they decay to pointers to their respective first elements. In that case, why would this go wrong?
revstr = stringreverse(tempstr, revstr) ;
Because arrays are not pointers: char* is different from char[], and an array cannot be assigned to. Hope it helps!
In response to your question about whether the thing passed to the function is an array or a pointer, the relevant part of the C99 standard (6.3.2.1/3) states:
Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue.
So yes, other than the introduction of another explicit variable, the following two lines are equivalent:
char x[] = "abc"; fn (x);
char x[] = "abc"; char *px = &(x[0]); fn (px);
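The sizeof exception in the quoted paragraph is the one place where the array and the pointer visibly differ. A small sketch, with function names (param_size, first_char) invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Inside a function the parameter is a plain pointer, regardless of
   whether the caller passed an array or a pointer. */
size_t param_size(char *p)
{
    return sizeof p;    /* size of a pointer, not of the caller's array */
}

/* Indexing works identically on both, because p[0] means *(p + 0). */
char first_char(char *p)
{
    assert(p != NULL);
    return p[0];
}
```

With char x[] = "abc"; and char *px = &(x[0]); the expression sizeof x is 4 in the caller (three characters plus the terminator), but param_size(x) and param_size(px) both return sizeof(char *).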
As to a critique, I'd like to raise the following.
While legal, I find it incongruous to have function prototypes (such as stringreverse) anywhere other than at file level. In fact, I tend to order my functions so that they're not usually necessary, making one less place where you have to change it, should the arguments or return type need to be changed. That would entail, in this case, placing stringreverse before main.
Don't ever use gets in a real program. It cannot be protected against buffer overflows. At a minimum, use fgets, which can be protected, or use a decent input function such as the one found here.
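As a rough sketch of the fgets approach (read_line is a hypothetical helper name, not a standard function):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf without ever writing
   more than size bytes. Returns 1 on success, 0 on end of file or
   read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    if (fgets(buf, (int)size, in) == NULL)
        return 0;

    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if any */
    return 1;
}
```

In the posted program this would replace gets(userinput) with read_line(stdin, userinput, sizeof userinput). A line longer than the buffer is not lost; the remainder simply stays in the stream for the next read.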
You cannot create a local variable within stringreverse and pass back the address of it. That's undefined behaviour. Once that function returns, that variable is gone and you're most likely pointing to whatever happens to replace it on the stack the next time you call a function.
There's no need to pass in the revstr variable either. If it were a pointer with backing memory (i.e., had space allocated for it), that would be fine but then there would be no need to return it. In that case you would allocate both in the caller:
char tempstr[1024];
char revstr[1024];
stringreverse (tempstr, revstr); // Note no return value needed
// since you're manipulating revstr directly.
You should also try to avoid magic numbers like 1024. Better to have lines like:
#define BUFFSZ 1024
char tempstr[BUFFSZ];
so that you only need to change it in one place if you ever need a new value (that becomes particularly important if you have lots of 1024 numbers with different meanings - global search and replace will be your enemy in that case rather than your friend).
To make your function more adaptable, you may want to consider allowing it to handle any length. You can do that by passing both buffers in, or by using malloc to dynamically allocate a buffer for you, something like:
char *reversestring (char *src) {
char *dst = malloc (strlen (src) + 1);
if (dst != NULL) {
// copy characters in reverse order.
}
return dst;
}
This puts the responsibility for freeing that memory on the caller but that's a well-worn way of doing things.
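Filling in the copy loop, one possible version looks like this. Taking src as const char * is my own assumption; the outline uses plain char *.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated reversed copy of src, or NULL if malloc
   fails. The caller is responsible for calling free() on the result. */
char *reversestring(const char *src)
{
    assert(src != NULL);

    size_t len = strlen(src);
    char *dst = malloc(len + 1);          /* +1 for the terminator */
    if (dst != NULL) {
        for (size_t i = 0; i < len; i++)
            dst[i] = src[len - 1 - i];    /* copy characters in reverse order */
        dst[len] = '\0';
    }
    return dst;
}
```

A caller would check the result for NULL, use it, and then free it.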
You should probably use one of the two canonical forms for main:
int main (int argc, char *argv[]);
int main (void);
It's also a particularly bad idea to call main from anywhere. While that may look like a nifty way to get an infinite loop, it almost certainly will end up chewing up your stack space :-)
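A plain loop gives the same "keep asking" behaviour without recursion. This is only a sketch: reverse_lines and reverse_in_place are hypothetical names, and it reads with fgets rather than gets.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Reverses s in place by swapping characters from both ends. */
static void reverse_in_place(char *s)
{
    size_t i = 0, j = strlen(s);
    while (i + 1 < j) {
        j--;
        char tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
        i++;
    }
}

/* Hypothetical driver: reverses every line read from in and writes it
   to out, stopping at end of file. Returns the number of lines handled. */
int reverse_lines(FILE *in, FILE *out)
{
    char line[1024];
    int count = 0;

    assert(in != NULL && out != NULL);
    while (fgets(line, (int)sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the newline */
        reverse_in_place(line);
        fprintf(out, "%s\n", line);
        count++;
    }
    return count;
}
```

Calling reverse_lines(stdin, stdout) from main processes input until end of file, and the stack depth stays constant however many lines are entered.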
All in all, this is probably the function I'd initially write. It allows the user to populate their own buffer if they want, or to specify they don't have one, in which case one will be created for them:
char *revstr (char *src, char *dst) {
// Cache size in case compiler not smart enough to do so.
// Then create destination buffer if none provided.
size_t sz = strlen (src);
if (dst == NULL) dst = malloc (sz + 1);
// Assuming buffer available, copy string.
if (dst != NULL) {
// Run dst end to start, null terminator first.
dst += sz; *dst = '\0';
// Copy character by character until null terminator in src.
// We end up with dst set to original correct value.
while (*src != '\0')
*--dst = *src++;
}
// Return reversed string (possibly NULL if malloc failed).
return dst;
}
In your stringreverse() function, you are returning the address of a local variable (revstr). This is undefined behaviour and is very bad. Your program may appear to work right now, but it will suddenly fail sometime in the future for reasons that are not obvious.
You have two general choices:
Have stringreverse() allocate memory for the returned string, and leave it up to the caller to free it.
Have the caller preallocate space for the returned string, and tell stringreverse() where it is and how big it is.
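The second choice could look roughly like this. The name reverse_into and the dstsize parameter are illustrative assumptions, not part of the posted code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the reverse of src into dst, a buffer of
   dstsize bytes owned by the caller. Returns dst, or NULL if the
   buffer is too small to hold the text plus its terminator. */
char *reverse_into(const char *src, char *dst, size_t dstsize)
{
    assert(src != NULL && dst != NULL);

    size_t len = strlen(src);
    if (len + 1 > dstsize)
        return NULL;

    for (size_t i = 0; i < len; i++)
        dst[i] = src[len - 1 - i];
    dst[len] = '\0';
    return dst;
}
```

A caller would declare char revstr[1024]; and call reverse_into(tempstr, revstr, sizeof revstr). Nothing needs to be freed, and no pointer to a dead local variable ever escapes the function.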

Different results using %c and loop vs. %s in printf with null terminated string

I have a variable 'jmp_code' that is declared as a char *. When I run the following commands
printf("char by char, the code is '%c,%c,%c,%c,'\n", *jmp_code, *(jmp_code+1), *(jmp_code+2),*(jmp_code+3));
printf("printing the string, the code is '%s'\n", jmp_code);
I get the following results
char by char, the code is '0,0,0, ,'
printing the string, the code is 'ö\├w≡F┴w'
I am using codeblocks. Here is the sample code I am playing with.
#include <stdio.h>
#include <string.h>
char * some_func(char * code);
char * some_func(char * code) {
char char_array[4];
strcpy(char_array, "000");
code = char_array;
return code;
}
int main ( void ) {
char * jmp_code = NULL;
jmp_code = some_func(jmp_code);
printf("char by char, the code is '%c,%c,%c,%c,'\n", *jmp_code, *(jmp_code+1), *(jmp_code+2),*(jmp_code+3));
printf("printing the string, the code is '%s'\n", jmp_code);
return 0;
}
I am quite confused by this. Any help would be appreciated.
Thanks
Some quick observations:
char * some_func(char * code) {
char char_array[4];
strcpy(char_array, "000");
code = char_array;
return code;
}
You can't assign strings using = in C. That messes things up: you're assigning the address of your locally allocated char_array to code, but you're not copying the contents of the memory. Also note that since char_array is (usually) allocated on the stack, it disappears when you return from that function. You could work around that with the static keyword, but I don't think that's the nicest of solutions here. You should use something along the lines of the following (big warning: this example is not massively secure, and you do need to check string lengths, but it is kept short for the sake of brevity):
void some_func(char * code) {
strcpy(code, "000");
return;
}
(Refer to this (and this) for secure string handling advice).
And call it via some_func(jmp_code) in main. If you're not sure what this does, read up on pointers.
Second problem.
char * jmp_code = NULL;
Currently, you've declared space enough for a pointer to a char type. If you want to use my suggestion above, you'll need either to use malloc() and free() or else declare char jmp_code[4] instead, such that the space is allocated.
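If you want the length check that the short example skips, one possible sketch is below. The name some_func_checked and the size parameter are my own additions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical size-checked variant: copies the code into the caller's
   buffer and reports whether all of it fit. */
int some_func_checked(char *code, size_t size)
{
    assert(code != NULL && size > 0);

    /* snprintf never writes more than size bytes and always terminates. */
    int n = snprintf(code, size, "%s", "000");
    return n >= 0 && (size_t)n < size;   /* 1 if the whole code was stored */
}
```

With char jmp_code[4]; in main, the call is some_func_checked(jmp_code, sizeof jmp_code). The same call works on a block obtained from malloc(4), as long as it is released with free() afterwards.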
What do I think's happening? Well, on my system, I'm getting:
and the code is '0,0,0,,'
and the code is ''
But I think it's chance that jmp_code points to the zeros on the stack provided by your some_func function. I think on your system that data has been overwritten.
Instead you're reading information that your terminal interprets as said character. Have a read of character encoding. I particularly recommend starting with The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
You're returning a pointer to a local array. char_array goes away when some_func() returns, but you keep using its address. You need to use malloc() to allocate an array and then free() it after you use it.
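A minimal sketch of that malloc()/free() approach, using a hypothetical make_code function in place of some_func:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap-based variant: the returned block outlives the
   call. Returns NULL if malloc fails; the caller must free() it. */
char *make_code(void)
{
    char *code = malloc(4);      /* "000" plus the terminating '\0' */
    if (code != NULL)
        strcpy(code, "000");
    return code;
}
```

In main this becomes char *jmp_code = make_code();, followed by a NULL check, the two printf calls, and finally free(jmp_code).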
You're printing from an invalid pointer. char_array is on the stack of some_func() function.
The function returns the pointer of something that is on the stack and will be no more after the function returns!
The first printf finds the stack still unchanged, the second, maybe, found it filled with... garbage!
It might be interesting to see:
const char *pos = jmp_code;
while (*pos)
printf("%d ", *pos++);
I think the char type cannot hold non-ASCII character codes. That is, your string probably contains UTF-8 or similar symbols, whose codes can be far larger than 255, while char values are limited to the range 0 to 255.

Resources