I'm using an array of strings in C to hold arguments given to a custom shell. I initialize the array of buffers using:
char *args[MAX_CHAR];
Once I parse the arguments, I send them to the following function to determine the type of IO redirection if there are any (this is just the first of 3 functions to check for redirection and it only checks for STDIN redirection).
int parseInputFile(char **args, char *inputFilePath) {
char *inputSymbol = "<";
int isFound = 0;
for (int i = 0; i < MAX_ARG; i++) {
if (strlen(args[i]) == 0) {
isFound = 0;
break;
}
if ((strcmp(args[i], inputSymbol)) == 0) {
strcpy(inputFilePath, args[i+1]);
isFound = 1;
break;
}
}
return isFound;
}
Once I compile and run the shell, it crashes with a SIGSEGV. Using GDB I determined that the shell is crashing on the following line:
if (strlen(args[i]) == 0) {
This is because the address of arg[i] (the first empty string after the parsed commands) is inaccessible. Here is the error from GDB and all relevant variables:
(gdb) next
359 if (strlen(args[i]) == 0) {
(gdb) p args[0]
$1 = 0x7fffffffe570 "echo"
(gdb) p args[1]
$2 = 0x7fffffffe575 "test"
(gdb) p args[2]
$3 = 0x0
(gdb) p i
$4 = 2
(gdb) next
Program received signal SIGSEGV, Segmentation fault.
parseInputFile (args=0x7fffffffd570, inputFilePath=0x7fffffffd240 "") at shell.c:359
359 if (strlen(args[i]) == 0) {
I believe that the p args[2] returning $3 = 0x0 means that because the index has yet to be written to, it is mapped to address 0x0 which is out of the bounds of execution. Although I can't figure out why this is because it was declared as a buffer. Any suggestions on how to solve this problem?
EDIT: Per Kaylum's comment, here is a minimal reproducible example
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>
#include <sys/stat.h>
#include<readline/readline.h>
#include<readline/history.h>
#include <fcntl.h>
// Defined values
#define MAX_CHAR 256
#define MAX_ARG 64
#define clear() printf("\033[H\033[J") // Clear window
#define DEFAULT_PROMPT_SUFFIX "> "
char PROMPT[MAX_CHAR], SPATH[1024];
int parseInputFile(char **args, char *inputFilePath) {
char *inputSymbol = "<";
int isFound = 0;
for (int i = 0; i < MAX_ARG; i++) {
if (strlen(args[i]) == 0) {
isFound = 0;
break;
}
if ((strcmp(args[i], inputSymbol)) == 0) {
strcpy(inputFilePath, args[i+1]);
isFound = 1;
break;
}
}
return isFound;
}
int ioRedirectHandler(char **args) {
char inputFilePath[MAX_CHAR] = "";
// Check if any redirects exist
if (parseInputFile(args, inputFilePath)) {
return 1;
} else {
return 0;
}
}
void parseArgs(char *cmd, char **cmdArgs) {
int na;
// Separate each argument of a command to a separate string
for (na = 0; na < MAX_ARG; na++) {
cmdArgs[na] = strsep(&cmd, " ");
if (cmdArgs[na] == NULL) {
break;
}
if (strlen(cmdArgs[na]) == 0) {
na--;
}
}
}
int processInput(char* input, char **args, char **pipedArgs) {
// Parse the single command and args
parseArgs(input, args);
return 0;
}
int getInput(char *input) {
char *buf, loc_prompt[MAX_CHAR] = "\n";
strcat(loc_prompt, PROMPT);
buf = readline(loc_prompt);
if (strlen(buf) != 0) {
add_history(buf);
strcpy(input, buf);
return 0;
} else {
return 1;
}
}
void init() {
char *uname;
clear();
uname = getenv("USER");
printf("\n\n \t\tWelcome to Student Shell, %s! \n\n", uname);
// Initialize the prompt
snprintf(PROMPT, MAX_CHAR, "%s%s", uname, DEFAULT_PROMPT_SUFFIX);
}
int main() {
char input[MAX_CHAR];
char *args[MAX_CHAR], *pipedArgs[MAX_CHAR];
int isPiped = 0, isIORedir = 0;
init();
while(1) {
// Get the user input
if (getInput(input)) {
continue;
}
isPiped = processInput(input, args, pipedArgs);
isIORedir = ioRedirectHandler(args);
}
return 0;
}
Note: If I forgot to include any important information, please let me know and I can get it updated.
When you write
char *args[MAX_CHAR];
you allocate room for MAX_CHAR pointers to char. You do not initialise the array. If it is a global variable, you will have initialised all the pointers to NULL, but you do it in a function, so the elements in the array can point anywhere. You should not dereference them before you have set the pointers to point at something you are allowed to access.
You also do this, though, in parseArgs(), where you do this:
cmdArgs[na] = strsep(&cmd, " ");
There are two potential issues here, but let's deal with the one you hit first. When strsep() is through the tokens you are splitting, it returns NULL. You test for that to get out of parseArgs() so you already know this. However, where your program crashes you seem to have forgotten this again. You call strlen() on a NULL pointer, and that is a no-no.
There is a difference between NULL and the empty string. An empty string is a pointer to a buffer that has the zero char first; the string "" is a pointer to a location that holds the character '\0'. The NULL pointer is a special value for pointers, often address zero, that means that the pointer doesn't point anywhere. Obviously, the NULL pointer cannot point to an empty string. You need to check if an argument is NULL, not if it is the empty string.
If you want to check both for NULL and the empty string, you could do something like
if (!args[i] || strlen(args[i]) == 0) {
If args[i] is NULL then !args[i] is true, so you will enter the if body if you have NULL or if you have a pointer to an empty string.
(You could also check the empty string with !(*args[i]); *args[i] is the first character that args[i] points at. So *args[i] is zero if you have the empty string; zero is interpreted as false, so !(*args[i]) is true if and only if args[i] is the empty string. Not that this is more readable, but it shows again the difference between empty strings and NULL).
I mentioned another issue with the parsed arguments. Whether it is a problem or not depends on the application. But when you parse a string with strsep(), you get pointers into the parsed string. You have to be careful not to free that string (it is input in your main() function) or to modify it after you have parsed the string. If you change the string, you have changed what all the parsed strings look at. You do not do this in your program, so it isn't a problem here, but it is worth keeping in mind. If you want your parsed arguments to survive longer than they do now, after the next command is passed, you need to copy them. The next command that is passed will change them as it is now.
In main
char input[MAX_CHAR];
char *args[MAX_CHAR], *pipedArgs[MAX_CHAR];
are all uninitialized. They contain indeterminate values. This could be a potential source of bugs, but is not the reason here, as
getInput modifies the contents of input to be a valid string before any reads occur.
pipedArgs is unused, so raises no issues (yet).
args is modified by parseArgs to (possibly!) contain a NULL sentinel value, without any indeterminate pointers being read first.
Firstly, in parseArgs it is possible to completely fill args without setting the NULL sentinel value that other parts of the program should rely on.
Looking deeper, in parseInputFile the following
if (strlen(args[i]) == 0)
contradicts the limits imposed by parseArgs that disallows empty strings in the array. More importantly, args[i] may be the sentinel NULL value, and strlen expects a non-NULL pointer to a valid string.
This termination condition should simply check if args[i] is NULL.
With
strcpy(inputFilePath, args[i+1]);
args[i+1] might also be the NULL sentinel value, and strcpy also expects non-NULL pointers to valid strings. You can see this in action when inputSymbol is a match for the final token in the array.
args[i+1] may also evaluate as args[MAX_ARGS], which would be out of bounds.
Additionally, inputFilePath has a string length limit of MAX_CHAR - 1, and args[i+1] is (possibly!) a dynamically allocated string whose length might exceed this.
Some edge cases found in getInput:
Both arguments to
strcat(loc_prompt, PROMPT);
are of the size MAX_CHAR. Since loc_prompt has a length of 1. If PROMPT has the length MAX_CHAR - 1, the resulting string will have the length MAX_CHAR. This would leave no room for the NUL terminating byte.
readline can return NULL in some situations, so
buf = readline(loc_prompt);
if (strlen(buf) != 0) {
can again pass the NULL pointer to strlen.
A similar issue as before, on success readline returns a string of dynamic length, and
strcpy(input, buf);
can cause a buffer overflow by attempting to copy a string greater in length than MAX_CHAR - 1.
buf is a pointer to data allocated by malloc. It's unclear what add_history does, but this pointer must eventually be passed to free.
Some considerations.
Firstly, it is a good habit to initialize your data, even if it might not matter.
Secondly, using constants (#define MAX_CHAR 256) might help to reduce magic numbers, but they can lead you to design your program too rigidly if used in the same way.
Consider building your functions to accept a limit as an argument, and return a length. This allows you to more strictly track the sizes of your data, and prevents you from always designing around the maximum potential case.
A slightly contrived example of designing like this. We can see that find does not have to concern itself with possibly checking MAX_ARGS elements, as it is told precisely how long the list of valid elements is.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_ARGS 100
char *get_input(char *dest, size_t sz, const char *display) {
char *res;
if (display)
printf("%s", display);
if ((res = fgets(dest, sz, stdin)))
dest[strcspn(dest, "\n")] = '\0';
return res;
}
size_t find(char **list, size_t length, const char *str) {
for (size_t i = 0; i < length; i++)
if (strcmp(list[i], str) == 0)
return i;
return length;
}
size_t split(char **list, size_t limit, char *source, const char *delim) {
size_t length = 0;
char *token;
while (length < limit && (token = strsep(&source, delim)))
if (*token)
list[length++] = token;
return length;
}
int main(void) {
char input[512] = { 0 };
char *args[MAX_ARGS] = { 0 };
puts("Welcome to the shell.");
while (1) {
if (get_input(input, sizeof input, "$ ")) {
size_t argl = split(args, MAX_ARGS, input, " ");
size_t redirection = find(args, argl, "<");
puts("Command parts:");
for (size_t i = 0; i < redirection; i++)
printf("%zu: %s\n", i, args[i]);
puts("Input files:");
if (redirection == argl)
puts("[[NONE]]");
else for (size_t i = redirection + 1; i < argl; i++)
printf("%zu: %s\n", i, args[i]);
}
}
}
Related
I am trying to write a simple program that will read words from a file and print the number of occurrences of a particular word passed to it as argument.
For that, I use fscanf to read the words and copy them into an array of strings that is dynamically allocated.
For some reason, I get an error message.
Here is the code for the readFile function:
void readFile(char** buffer, char** argv){
unsigned int i=0;
FILE* file;
file = fopen(argv[1], "r");
do{
buffer = realloc(buffer, sizeof(char*));
buffer[i] = malloc(46);
}while(fscanf(file, "%s", buffer[i++]));
fclose(file);
}
And here is the main function :
int main(int argc, char** argv){
char** buffer = NULL;
readFile(buffer, argv);
printf("%s\n", buffer[0]);
return 0;
}
I get the following error message :
realloc(): invalid next size
Aborted (core dumped)
I have looked at other threads on this topic but none of them seem to be of help. I could not apply whatever I learned there to my problem.
I used a debugger (VS Code with gdb). Data is written successfully into indices 0,1,2,3 of the buffer array but says error : Cannot access memory at address 0xfbad2488 for index 4 and pauses on exception.
Another thread on this topic suggests there might be a wild pointer somewhere. But I don't see one anywhere.
I have spent days trying to figure this out. Any help will be greatly appreciated.
Thanks.
Your algorithm is wrong on many fronts, including:
buffer is passed by-value. Any modifications where buffer = ... is the assignment will mean nothing to the caller. In C, arguments are always pass-by-value (arrays included, but their "value" is a conversion to temporary pointer to first element, so you get a by-ref synonym there whether you want it or not).
Your realloc usage is wrong. It should be expanding based on the iteration of the loop as a count multiplied by the size of a char *. You only have the latter, with no count multiplier. Therefore, you never allocate more than a single char * with that realloc call.
Your loop termination condition is wrong. Your fscanf call should check for the expected number of arguments to be processed, which in your case is 1. Instead, you're looking for any non-zero value, which EOF is going to be when you hit it. Therefore, the loop never terminates.
Your fscanf call is not protected from buffer overflow : You're allocating a static-sized string for each string read, but not limiting the %s format to the static size specified. This is a recipe for buffer-overflow.
No IO functions are ever checked for success/failure : The following APIs could fail, yet you never check that possibility: fopen, fscanf, realloc, malloc. In failing to do so, you're violating Henry Spencer's 6th Commandment for C Programmers : "If a function be advertised to return an error code in the event of difficulties, thou shalt check for that code, yea, even though the checks triple the size of thy code and produce aches in thy typing fingers, for if thou thinkest ``it cannot happen to me'', the gods shall surely punish thee for thy arrogance."
No mechanism for communicating the allocated string count to the caller : The caller of this function is expecting a resulting char**. Assuming you fix the first item in this list, you still have not provided the caller any means of knowing how long that pointer sequence is when readFile returns. An out-parameter and/or a formal structure is a possible solution to this. Or perhaps a terminating NULL pointer to indicate the list is finished.
(Moderate) You never check argc : Instead, you just send argv directly to readFile, and assume the file name will be at argv[1] and always be valid. Don't do that. readFile should take either a FILE* or a single const char * file name, and act accordingly. It would be considerably more robust.
(Minor) : Extra allocation : Even fixing the above items, you'll still leave one extra buffer allocation in your sequence; the one that failed to read. Not that it matter much in this case, as the caller has no idea how many strings were allocated in the first place (see previous item).
Shoring up all of the above would require a basic rewrite of nearly everything you have posted. In the end, the code would look so different, it's almost not worth trying to salvage what is here. Instead, look at what you have done, look at this list, and see where things went wrong. There's plenty to choose from.
Sample
#include <stdio.h>
#include <stdlib.h>
#define STR_MAX_LEN 46
char ** readFile(const char *fname)
{
char **strs = NULL;
int len = 0;
FILE *fp = fopen(fname, "r");
if (fp != NULL)
{
do
{
// array expansion
void *tmp = realloc(strs, (len+1) * sizeof *strs);
if (tmp == NULL)
{
// failed. cleanup prior success
perror("Failed to expand pointer array");
for (int i=0; i<len; ++i)
free(strs[i]);
free(strs);
strs = NULL;
break;
}
// allocation was good; save off new pointer
strs = tmp;
strs[len] = malloc( STR_MAX_LEN );
if (strs[len] == NULL)
{
// failed. cleanup prior sucess
perror("Failed to allocate string buffer");
for (int i=0; i<len; ++i)
free(strs[i]);
free(strs);
strs = NULL;
break;
}
if (fscanf(fp, "%45s", strs[len]) == 1)
{
++len;
}
else
{
// read failed. we're leaving regardless. the last
// allocation is thrown out, but we terminate the list
// with a NULL to indicate end-of-list to the caller
free(strs[len]);
strs[len] = NULL;
break;
}
} while (1);
fclose(fp);
}
return strs;
}
int main(int argc, char *argv[])
{
if (argc < 2)
exit(EXIT_FAILURE);
char **strs = readFile(argv[1]);
if (strs)
{
// enumerate and free in the same loop
for (char **pp = strs; *pp; ++pp)
{
puts(*pp);
free(*pp);
}
// free the now-defunct pointer array
free(strs);
}
return EXIT_SUCCESS;
}
Output (run against /usr/share/dict/words)
A
a
aa
aal
aalii
aam
Aani
aardvark
aardwolf
Aaron
Aaronic
Aaronical
Aaronite
Aaronitic
Aaru
Ab
aba
Ababdeh
Ababua
abac
abaca
......
zymotechny
zymotic
zymotically
zymotize
zymotoxic
zymurgy
Zyrenian
Zyrian
Zyryan
zythem
Zythia
zythum
Zyzomys
Zyzzogeton
Improvements
The secondary malloc in this code is completely pointless. You're using a fixed length word maximum size, so you could easily retool you array to be a pointer to use this:
char (*strs)[STR_MAX_LEN]
and simply eliminate the per-string malloc code entirely. That does leave the problem of how to tell the caller how many strings were allocated. In the prior version we used a NULL pointer to indicate end-of-list. In this version we can simply use a zero-length string. Doing this makes the declaration of readFile rather odd looking, but for returning a pointer-to-array-of-size-N, its' correct. See below:
#include <stdio.h>
#include <stdlib.h>
#define STR_MAX_LEN 46
char (*readFile(const char *fname))[STR_MAX_LEN]
{
char (*strs)[STR_MAX_LEN] = NULL;
int len = 0;
FILE *fp = fopen(fname, "r");
if (fp != NULL)
{
do
{
// array expansion
void *tmp = realloc(strs, (len+1) * sizeof *strs);
if (tmp == NULL)
{
// failed. cleanup prior success
perror("Failed to expand pointer array");
free(strs);
strs = NULL;
break;
}
// allocation was good; save off new pointer
strs = tmp;
if (fscanf(fp, "%45s", strs[len]) == 1)
{
++len;
}
else
{
// read failed. make the final string zero-length
strs[len][0] = 0;
break;
}
} while (1);
fclose(fp);
}
return strs;
}
int main(int argc, char *argv[])
{
if (argc < 2)
exit(EXIT_FAILURE);
char (*strs)[STR_MAX_LEN] = readFile(argv[1]);
if (strs)
{
// enumerate and free in the same loop
for (char (*s)[STR_MAX_LEN] = strs; (*s)[0]; ++s)
puts(*s);
free(strs);
}
return EXIT_SUCCESS;
}
The output is the same as before.
Another Improvement: Geometric Growth
With a few simple changes we can significantly cut down on the realloc invokes (we're currently doing one per string added) by only doing them in a double-size growth pattern. If each time we reallocate, we double the size of the prior allocation, we will make more and more space available for reading larger numbers of strings before the next allocation:
#include <stdio.h>
#include <stdlib.h>
#define STR_MAX_LEN 46
char (*readFile(const char *fname))[STR_MAX_LEN]
{
char (*strs)[STR_MAX_LEN] = NULL;
int len = 0;
int capacity = 0;
FILE *fp = fopen(fname, "r");
if (fp != NULL)
{
do
{
if (len == capacity)
{
printf("Expanding capacity to %d\n", (2 * capacity + 1));
void *tmp = realloc(strs, (2 * capacity + 1) * sizeof *strs);
if (tmp == NULL)
{
// failed. cleanup prior success
perror("Failed to expand string array");
free(strs);
strs = NULL;
break;
}
// save the new string pointer and capacity
strs = tmp;
capacity = 2 * capacity + 1;
}
if (fscanf(fp, "%45s", strs[len]) == 1)
{
++len;
}
else
{
// read failed. make the final string zero-length
strs[len][0] = 0;
break;
}
} while (1);
// shrink if needed. remember to retain the final empty string
if (strs && (len+1) < capacity)
{
printf("Shrinking capacity to %d\n", len);
void *tmp = realloc(strs, (len+1) * sizeof *strs);
if (tmp)
strs = tmp;
}
fclose(fp);
}
return strs;
}
int main(int argc, char *argv[])
{
if (argc < 2)
exit(EXIT_FAILURE);
char (*strs)[STR_MAX_LEN] = readFile(argv[1]);
if (strs)
{
// enumerate and free in the same loop
for (char (*s)[STR_MAX_LEN] = strs; (*s)[0]; ++s)
puts(*s);
// free the now-defunct pointer array
free(strs);
}
return EXIT_SUCCESS;
}
Output
The output is the same as before, but I added instrumentation to show when expansion happens to illustrate the expansions and final shrinking. I'll leave out the rest of the output (which is over 200k lines of words)
Expanding capacity to 1
Expanding capacity to 3
Expanding capacity to 7
Expanding capacity to 15
Expanding capacity to 31
Expanding capacity to 63
Expanding capacity to 127
Expanding capacity to 255
Expanding capacity to 511
Expanding capacity to 1023
Expanding capacity to 2047
Expanding capacity to 4095
Expanding capacity to 8191
Expanding capacity to 16383
Expanding capacity to 32767
Expanding capacity to 65535
Expanding capacity to 131071
Expanding capacity to 262143
Shrinking capacity to 235886
Right now, I'm attempting to familiarize myself with C by writing a function which, given a string, will replace all instances of a target substring with a new substring. However, I've run into a problem with a reallocation of a char* array. To my eyes, it seems as though I'm able to successfully reallocate the array string to a desired new size at the end of the main loop, then perform a strcpy to fill it with an updated string. However, it fails for the following scenario:
Original input for string: "use the restroom. Then I need"
Target to replace: "the" (case insensitive)
Desired replacement value: "th'"
At the end of the loop, the line printf("result: %s\n ",string); prints out the correct phrase "use th' restroom. Then I need". However, string seems to then reset itself: the call to strcasestr in the while() statement is successful, the line at the beginning of the loop printf("string: %s \n",string); prints the original input string, and the loop continues indefinitely.
Any ideas would be much appreciated (and I apologize in advance for my flailing debug printf statements). Thanks!
The code for the function is as follows:
int replaceSubstring(char *string, int strLen, char*oldSubstring,
int oldSublen, char*newSubstring, int newSublen )
{
printf("Starting replace\n");
char* strLoc;
while((strLoc = strcasestr(string, oldSubstring)) != NULL )
{
printf("string: %s \n",string);
printf("%d",newSublen);
char *newBuf = (char *) malloc((size_t)(strLen +
(newSublen - oldSublen)));
printf("got newbuf\n");
int stringIndex = 0;
int newBufIndex = 0;
char c;
while(true)
{
if(stringIndex > 500)
break;
if(&string[stringIndex] == strLoc)
{
int j;
for(j=0; j < newSublen; j++)
{
printf("new index: %d %c --> %c\n",
j+newBufIndex, newBuf[newBufIndex+j], newSubstring[j]);
newBuf[newBufIndex+j] = newSubstring[j];
}
stringIndex += oldSublen;
newBufIndex += newSublen;
}
else
{
printf("old index: %d %c --> %c\n", stringIndex,
newBuf[newBufIndex], string[stringIndex]);
newBuf[newBufIndex] = string[stringIndex];
if(string[stringIndex] == '\0')
break;
newBufIndex++;
stringIndex++;
}
}
int length = (size_t)(strLen + (newSublen - oldSublen));
string = (char*)realloc(string,
(size_t)(strLen + (newSublen - oldSublen)));
strcpy(string, newBuf);
printf("result: %s\n ",string);
free(newBuf);
}
printf("end result: %s ",string);
}
At first the task should be clarified regarding desired behavior and interface.
The topic "Char array..." is not clear.
You provide strLen, oldSublen newSublen, so it looks that you indeed want to work just with bulk memory buffers with given length.
However, you use strcasestr, strcpy and string[stringIndex] == '\0' and also mention printf("result: %s\n ",string);.
So I assume that you want to work with "null terminated strings" that can be passed by the caller as string literals: "abc".
It is not needed to pass all those lengths to the function.
It looks that you are trying to implement recursive string replacement. After each replacement you start from the beginning.
Let's consider more complicated sets of parameters, for example, replace aba by ab in abaaba.
Case 1: single pass through input stream
Each of both old substrings can be replaced: "abaaba" => "abab"
That is how the standard sed string replacement works:
> echo "abaaba" | sed 's/aba/ab/g'
abab
Case 2: recursive replacement taking into account possible overlapping
The first replacement: "abaaba" => "ababa"
The second replacement in already replaced result: "ababa" => "abba"
Note that this case is not safe, for example replace "loop" by "loop loop". It is an infinite loop.
Suppose we want to implement a function that takes null terminated strings and the replacement is done in one pass as with sed.
In general the replacement cannot be done in place of input string (in the same memory).
Note that realloc may allocate new memory block with new address, so you should return that address to the caller.
For implementation simplicity it is possible to calculate required space for result before memory allocation (Case 1 implementation). So reallocation is not needed:
#define _GNU_SOURCE
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
char* replaceSubstring(const char* string, const char* oldSubstring,
const char* newSubstring)
{
size_t strLen = strlen(string);
size_t oldSublen = strlen(oldSubstring);
size_t newSublen = strlen(newSubstring);
const char* strLoc = string;
size_t replacements = 0;
/* count number of replacements */
while ((strLoc = strcasestr(strLoc, oldSubstring)))
{
strLoc += oldSublen;
++replacements;
}
/* result size: initial size + replacement diff + sizeof('\0') */
size_t result_size = strLen + (newSublen - oldSublen) * replacements + 1;
char* result = malloc(result_size);
if (!result)
return NULL;
char* resCurrent = result;
const char* strCurrent = string;
strLoc = string;
while ((strLoc = strcasestr(strLoc, oldSubstring)))
{
memcpy(resCurrent, strCurrent, strLoc - strCurrent);
resCurrent += strLoc - strCurrent;
memcpy(resCurrent, newSubstring, newSublen);
resCurrent += newSublen;
strLoc += oldSublen;
strCurrent = strLoc;
}
strcpy(resCurrent, strCurrent);
return result;
}
int main()
{
char* res;
res = replaceSubstring("use the restroom. Then I need", "the", "th");
printf("%s\n", res);
free(res);
res = replaceSubstring("abaaba", "aba", "ab");
printf("%s\n", res);
free(res);
return 0;
}
Here is my code
//Split up the config by lines
int x;
int numberOfConfigLines = 0;
for (x = 0; x < strlen(buffer); x++)
{
if (buffer[x] == '\n') {
numberOfConfigLines++;
}
}
char *configLines[numberOfConfigLines];
tokenize(configLines, buffer, "\n", numberOfConfigLines);
The idea of this function is to count the amount of newlines in a buffer, then split the buffer into a strtok array using this:
#include <string.h>
#include <stdlib.h>
void tokenize(char **arrToStoreTokens, char *delimitedString, char *delimiter, int expectedTokenArraySize) {
//Create a clone of the original string to prevent making permanent changes to it
char *tempString = (char *)malloc(strlen(delimitedString) + 1);
strcpy(tempString, delimitedString);
if (expectedTokenArraySize >= 1) {
arrToStoreTokens[0] = strtok(tempString, delimiter);
int x;
for (x = 1; x < expectedTokenArraySize; x++ ) {
arrToStoreTokens[x] = strtok(NULL, delimiter);
}
}
//Dispose of temporary clone
free(tempString);
}
If I access arrToStoreTokens[0] directly, I get the correct result, however when I try to access configLines[0] once thetokenize function has ended, I get different results (can be unknown characters or simply empty)
Additionally, I believe this has only started occurring once I began running the program as root (for a different requirement) - I may be wrong though. - EDIT: Confirmed not to be the problem.
Any ideas?
strtok doesn't reallocate anything. It only makes cut and pointers of what you gave to it.
Your array stores pointers that strtok gives you, but don't copy contents.
So if you free your tempString variable, you free data that was pointed by return values of strtok. You have to keep it and free it only at the end.
Or you can make a strdup of each return of strtok to store it in your array to make a real copy of each token, but in this case, you shall have to free each token at the end.
The second solution would looks like this :
void tokenize(char **arrToStoreTokens, char *delimitedString, char *delimiter, int expectedTokenArraySize) {
//Create a clone of the original string to prevent making permanent changes to it
char *tempString = (char *)malloc(strlen(delimitedString) + 1);
strcpy(tempString, delimitedString);
if (expectedTokenArraySize >= 1) {
arrToStoreTokens[0] = strdup(strtok(tempString, delimiter)); // Here is the new part : strdup
int x;
for (x = 1; x < expectedTokenArraySize; x++ ) {
arrToStoreTokens[x] = strdup(strtok(NULL, delimiter)); // Here is the new part : strdup
}
}
//Dispose of temporary clone
free(tempString);
}
And after use of this array, you shall have to delete it, with a function like this :
void deleteTokens(char **arrToStoreTokens, int arraySize)
{
int x;
for (x = 0; x < arraySize; ++x)
{
free(arrToStoreTokens[x]);
}
}
I create an array (char *charheap;) of length 32 bytes in the heap, and initialize all the elements to be \0. Here is my main function:
int main(void) {
char *str1 = alloc_and_print(5, "hello");
char *str2 = alloc_and_print(5, "brian");
}
char *alloc_and_print(int s, const char *cpy) {
char *ncb = char_alloc(s);// allocate the next contiguous block
if (ret == NULL) {
printf("Failed\n");
} else {
strcpy(ncb, cpy);
arr_print();// print the array
}
return ncb;
}
Here is what I implement:
/char_alloc(s): find the FIRST contiguous block of s+1 NULL ('\0')
characters in charheap that does not contain the NULL terminator
of some previously allocated string./
char *char_alloc(int s) {
int len = strlen(charheap);
for (int i = 0; i < len; i++) {
if (charheap[0] == '\0') {
char a = charheap[0];
return &a;
} else if (charheap[i] == '\0') {
char b = charheap[i+1];
return &b;
}
}
return NULL;
}
Expected Output: (\ means \0)
hello\\\\\\\\\\\\\\\\\\\\\\\\\\\
hello\brian\\\\\\\\\\\\\\\\\\\\\
This solution is completely wrong and I just print out two failed. :(
Actually, the char_alloc should return a pointer to the start of contiguous block but I don't know how to implement it properly. Can someone give me a hint or clue ?
Your function is returning a pointer to a local variable, therefore the caller receives a pointer to invalid memory. Just return the pointer into the charheap, which is what you want.
return &charheap[0]; /* was return &a; which is wrong */
return &charheap[i+1]; /* was return &b; which is wrong */
Your for loop uses i < len for the terminating condition, but, since charheap is \0 filled, strlen() will return a size of 0. You want to iterate through the whole charheap, so just use the size of that array (32 in this case).
int len = 32; /* or sizeof(charheap) if it is declared as an array */
The above two fixes should be enough to get your program to behave as you expect (see demonstration).
However, you do not place a check to make sure there is enough room in your heap to accept the allocation check. Your allocation should fail if the distance between the start of the available memory and the end of the charheap is less than or equal to the desired size. You can enforce this easily enough by setting the len to be the last point you are willing to check before you know there will not be enough space.
int len = 32 - s;
Finally, when you try to allocate a third string, your loop will skip over the first allocated string, but will overwrite the second allocated string. Your loop logic needs to change to skip over each allocated string. You first check if the current location in your charheap is free or not. If it is not, you advance your position by the length of the string, plus one more to skip over the '\0' terminator for the string. If the current location is free, you return it. If you are not able to find a free location, you return NULL.
char *char_alloc(int s) {
int i = 0;
int len = 32 - s;
while (i < len) {
if (charheap[i] == '\0') return &charheap[i];
i += strlen(charheap+i) + 1;
}
return NULL;
}
I'm trying to make a quick function that gets a word/argument in a string by its number:
char* arg(char* S, int Num) {
char* Return = "";
int Spaces = 0;
int i = 0;
for (i; i<strlen(S); i++) {
if (S[i] == ' ') {
Spaces++;
}
else if (Spaces == Num) {
//Want to append S[i] to Return here.
}
else if (Spaces > Num) {
return Return;
}
}
printf("%s-\n", Return);
return Return;
}
I can't find a way to put the characters into Return. I have found lots of posts that suggest strcat() or tricks with pointers, but every one segfaults. I've also seen people saying that malloc() should be used, but I'm not sure of how I'd used it in a loop like this.
I will not claim to understand what it is that you're trying to do, but your code has two problems:
You're assigning a read-only string to Return; that string will be in your
binary's data section, which is read-only, and if you try to modify it you will get a segfault.
Your for loop is O(n^2), because strlen() is O(n)
There are several different ways of solving the "how to return a string" problem. You can, for example:
Use malloc() / calloc() to allocate a new string, as has been suggested
Use asprintf(), which is similar but gives you formatting if you need
Pass an output string (and its maximum size) as a parameter to the function
The first two require the calling function to free() the returned value. The third allows the caller to decide how to allocate the string (stack or heap), but requires some sort of contract about the minumum size needed for the output string.
In your code, when the function returns, then Return will be gone as well, so this behavior is undefined. It might work, but you should never rely on it.
Typically in C, you'd want to pass the "return" string as an argument instead, so that you don't have to free it all the time. Both require a local variable on the caller's side, but malloc'ing it will require an additional call to free the allocated memory and is also more expensive than simply passing a pointer to a local variable.
As for appending to the string, just use array notation (keep track of the current char/index) and don't forget to add a null character at the end.
Example:
int arg(char* ptr, char* S, int Num) {
int i, Spaces = 0, cur = 0;
for (i=0; i<strlen(S); i++) {
if (S[i] == ' ') {
Spaces++;
}
else if (Spaces == Num) {
ptr[cur++] = S[i]; // append char
}
else if (Spaces > Num) {
ptr[cur] = '\0'; // insert null char
return 0; // returns 0 on success
}
}
ptr[cur] = '\0'; // insert null char
return (cur > 0 ? 0 : -1); // returns 0 on success, -1 on error
}
Then invoke it like so:
char myArg[50];
if (arg(myArg, "this is an example", 3) == 0) {
printf("arg is %s\n", myArg);
} else {
// arg not found
}
Just make sure you don't overflow ptr (e.g.: by passing its size and adding a check in the function).
There are numbers of ways you could improve your code, but let's just start by making it meet the standard. ;-)
P.S.: Don't malloc unless you need to. And in that case you don't.
char * Return; //by the way horrible name for a variable.
Return = malloc(<some size>);
......
......
*(Return + index) = *(S+i);
You can't assign anything to a string literal such as "".
You may want to use your loop to determine the offsets of the start of the word in your string that you're looking for. Then find its length by continuing through the string until you encounter the end or another space. Then, you can malloc an array of chars with size equal to the size of the offset+1 (For the null terminator.) Finally, copy the substring into this new buffer and return it.
Also, as mentioned above, you may want to remove the strlen call from the loop - most compilers will optimize it out but it is indeed a linear operation for every character in the array, making the loop O(n**2).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *arg(const char *S, unsigned int Num) {
char *Return = "";
const char *top, *p;
unsigned int Spaces = 0;
int i = 0;
Return=(char*)malloc(sizeof(char));
*Return = '\0';
if(S == NULL || *S=='\0') return Return;
p=top=S;
while(Spaces != Num){
if(NULL!=(p=strchr(top, ' '))){
++Spaces;
top=++p;
} else {
break;
}
}
if(Spaces < Num) return Return;
if(NULL!=(p=strchr(top, ' '))){
int len = p - top;
Return=(char*)realloc(Return, sizeof(char)*(len+1));
strncpy(Return, top, len);
Return[len]='\0';
} else {
free(Return);
Return=strdup(top);
}
//printf("%s-\n", Return);
return Return;
}
int main(){
char *word;
word=arg("make a quick function", 2);//quick
printf("\"%s\"\n", word);
free(word);
return 0;
}