Trying to replace text in files throws an error - c

The point of the script is to take three parameters: find, replace, and prefix. find is the text to replace, replace is what to replace it with, and prefix is a special case: if prefix is in the text, you replace the prefix (some text) with prefix+replace. I would like to know why the code below throws an error right after printing "Opened file". It only seems to throw an error if the text being replaced is repeated, like "aaa" or "bbb", where "a" is what is being replaced.
Opened file.txt
*** Error in `./a.out': malloc(): memory corruption: 0x00005652fbc55980 ***
There's also the occasional segfault after printing "Trying to replace for file ...". I'm not fluent in C, and GDB on my system only gave missing-library errors, which have nothing to do with this.
Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dirent.h>
char concat(const char *s1, const char *s2)
{
char *result = calloc(strlen(s1)+strlen(s2)+1, 1);
strcpy(result, s1);
strcat(result, s2);
printf("Prefix will be replaced with %s.\n", result);
return result;
}
static int replaceString(char *buf, const char *find, const char *replace, const char *prefix)
{
int olen, rlen;
char *s, *d;
char *tmpbuf;
if (!buf || !*buf || !find || !*find || !replace)
return 0;
tmpbuf = calloc(strlen(buf) + 1, 1);
if (tmpbuf == NULL)
return 0;
olen = strlen(find);
rlen = strlen(replace);
s = buf;
d = tmpbuf;
while (*s) {
if (strncmp(s, find, olen) == 0) {
strcpy(d, replace);
s += olen;
d += rlen;
}
else
{
*d++ = *s++;
}
}
*d = '\0';
if(strcmp(buf, tmpbuf) == 0)
{
free(tmpbuf);
return 0;
}
else
{
strcpy(buf, tmpbuf);
free(tmpbuf);
printf("%s", buf);
printf("Replaced!\n");
return 1;
}
}
void getAndReplace(char* filename, char* find, char* replace, char* prefix)
{
long length;
FILE* f = fopen (filename, "r");
char* buffer = 0;
if (f)
{
fseek (f, 0, SEEK_END);
length = ftell (f);
fseek (f, 0, SEEK_SET);
buffer = calloc(length+1, 1); //If i use malloc here, any file other than the first has garbage added to it. Why?
if (buffer)
{
fread(buffer, 1, length, f);
}
fclose(f);
}
if(buffer)// && strlen(buffer) > 1)
{
int result = replaceString(buffer, find, replace, prefix);
if(result == 0)
{
printf("Trying to replace prefix.\n");
replace = concat(prefix, replace);
result = replaceString(buffer, prefix, replace, "");
}
else
{
printf("Successfully replaced %s with %s\n", find, replace);
}
if(result == 1)
{
FILE* fp = fopen(filename, "w+");
if(fp)
{
printf("Opened %s\n", filename);
fprintf(fp, buffer);
fclose(fp);
printf("File %s overwritten with changes.\n", filename);
}
}
else
{
printf("Nothing to replace for %s\n", filename);
}
}
else
{
printf("Empty file.");
}
if(buffer)
{
free(buffer);
}
}
int main(int argc, char **argv)
{
if(argc < 4)
{
printf("Not enough arguments given: ./hw3 <find> <replace> <prefix>\n");
return 1;
}
struct dirent *de;
DIR *dr = opendir(".");
if (dr == NULL)
{
printf("Could not open current directory\n");
return 0;
}
while ((de = readdir(dr)) != NULL)
{
if(strlen(de->d_name) > 4 && !strcmp(de->d_name + strlen(de->d_name) - 4, ".txt"))
{
printf("Trying to replace for file %s\n", de->d_name);
getAndReplace(de->d_name, argv[1], argv[2], argv[3]);
}
}
closedir(dr);
return 0;
}

I hope that your concat function
char concat(const char *s1, const char *s2);
is just a typo and you meant
char *concat(const char *s1, const char *s2);
otherwise the function would be returning a pointer as if it were a char.
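For reference, here is a minimal sketch of what a corrected concat could look like. It is not the asker's exact code: the NULL check on the allocation and the use of memcpy are my own choices, and the caller is assumed to free() the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding s1 followed by s2,
 * or NULL if the allocation fails. The caller must free() it. */
char *concat(const char *s1, const char *s2)
{
    assert(s1 != NULL && s2 != NULL);   /* passing NULL is a programming error */

    size_t len1 = strlen(s1);
    size_t len2 = strlen(s2);
    char *result = malloc(len1 + len2 + 1);   /* +1 for the '\0' */
    if (result == NULL)
        return NULL;

    memcpy(result, s1, len1);
    memcpy(result + len1, s2, len2 + 1);      /* also copies s2's '\0' */
    return result;
}
```

With the return type declared as char *, the compiler no longer truncates the pointer to a single char on return.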
Using valgrind would give more details where exactly you are reading/writing where you are not allowed to and
where you are leaking memory. Without that it's hard to pinpoint the exact
place. One thing I noticed is that depending on the length of find and replace,
you might not have enough memory for tmpbuf which would lead to a buffer
overflow.
I think that the best way to write the replaceString is by making it
allocate the memory it needs itself, rather than providing it a buffer to write into.
Because you are getting both find and replace from the user, you don't know
how large the resulting buffer will need to be. You could calculate it
beforehand, but you don't do that. If you want to pass a pre-allocated buffer to
replaceString, I'd pass it as a double pointer, so that replaceString can do
realloc on it when needed. Or allocate the memory in the function and return a
pointer to the allocated memory.
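If you prefer to calculate the size beforehand, something along these lines could work. This is only a sketch: replaced_size is a hypothetical helper name (it is not in your program), and it assumes non-overlapping matches scanned from left to right, the same way your loop consumes them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes (including the '\0') needed to hold `text` after every
 * non-overlapping occurrence of `find` is replaced by `replace`.
 * Hypothetical helper, not part of the original program. */
size_t replaced_size(const char *text, const char *find, const char *replace)
{
    assert(text != NULL && find != NULL && replace != NULL);

    size_t find_len = strlen(find);
    size_t replace_len = strlen(replace);
    size_t size = strlen(text) + 1;

    if (find_len == 0)
        return size;              /* an empty pattern is treated as "no match" */

    for (const char *p = strstr(text, find); p != NULL;
         p = strstr(p + find_len, find)) {
        size -= find_len;         /* the match goes away ... */
        size += replace_len;      /* ... and the replacement takes its place */
    }
    return size;
}
```

For the text "aaa" with find "a" and replace "bb" this gives 7 bytes, whereas strlen(buf) + 1 is only 4, which is the kind of mismatch that overflows tmpbuf.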
This would be my version:
char *replaceString(const char *haystack, const char *needle, const char *replace)
{
if(haystack == NULL || needle == NULL || *needle == '\0' || replace == NULL)
return NULL;
char *dest = NULL, *tmp;
size_t needle_len = strlen(needle);
size_t replace_len = strlen(replace);
size_t curr_len = 0;
while(*haystack)
{
char *found = strstr(haystack, needle);
size_t copy_len1 = 0;
size_t new_size = 0;
size_t pre_found_len = 0;
if(found == NULL)
{
copy_len1 = strlen(haystack) + 1;
new_size = curr_len + copy_len1;
} else {
pre_found_len = found - haystack;
copy_len1 = pre_found_len;
new_size = curr_len + pre_found_len + replace_len + 1;
}
tmp = realloc(dest, new_size);
if(tmp == NULL)
{
free(dest);
return NULL;
}
dest = tmp;
strncpy(dest + curr_len, haystack, copy_len1);
if(found == NULL)
return dest; // last replacement, copied to the end
strncpy(dest + curr_len + pre_found_len, replace, replace_len + 1);
curr_len += pre_found_len + replace_len;
haystack += pre_found_len + needle_len;
}
return dest;
}
The idea in this version is similar to yours, but mine reallocates the memory as
it goes. I changed the name of the arguments to have the same name as the
strstr function does based on my documentation:
man strstr
char *strstr(const char *haystack, const char *needle);
Because I'm going to update haystack to point past the characters copied, I
use this loop:
while(*haystack)
{
...
}
which means it is going to stop when the '\0'-terminating byte is reached.
The first thing is to use strstr to locate a substring that matches needle.
Based on whether a substring is found, I calculate how many bytes I need to
copy up to the substring, and the new size of the buffer. After that I
reallocate the memory for the buffer and copy everything until the substring,
then append the replacement, update the curr_len variable and update the
haystack pointer to point past the substring.
If the substring is not found, no more replacements are needed. So we have to
copy the string pointed to by haystack and return the constructed string. The
new size of the destination is curr_len + strlen(haystack) + 1 (the +1
because I want the strncpy function to also copy the '\0'-terminating byte).
And it has to copy strlen(haystack) + 1 bytes. After the first strncpy, the
function returns dest.
If the substring is found, then we have to copy everything until the substring,
append the replacement and update the current length and the haystack pointer.
First I calculate the length of the string before the found substring and save it in
pre_found_len. The new size of the destination will be
curr_len + pre_found_len + replace_len + 1 (the current length + length of
string until substring + the length of the replacement + 1 for the
'\0'-terminating byte). Now the first strncpy copies only pre_found_len
bytes. Then it copies the replacement.
Now you can call it like this:
int main(void)
{
const char *orig = "Is this the real life? Is this just fantasy?";
char *text = replaceString(orig, "a", "_A_");
if(text)
{
puts(orig);
puts(text);
}
free(text);
}
which will output:
Is this the real life? Is this just fantasy?
Is this the re_A_l life? Is this just f_A_nt_A_sy?
Now you can use this function in getAndReplace to replace the prefix:
char *getAndReplace(char* filename, char* find, char* replace, char* prefix)
{
...
char *rep1 = replaceString(buffer, find, replace);
if(rep1 == NULL)
{
// error
free(buffer);
return NULL;
}
char *prefix_rep = malloc(strlen(replace) + strlen(prefix) + 1);
if(prefix_rep == NULL)
{
// error
free(buffer);
free(rep1);
return NULL;
}
sprintf(prefix_rep, "%s%s", prefix, replace); // prefix + replace, as the question asks
char *rep2 = replaceString(rep1, prefix, prefix_rep);
if(rep2 == NULL)
{
// error
free(buffer);
free(rep1);
free(prefix_rep);
return NULL;
}
// rep2 has all the replacements
...
// before leaving
free(buffer);
free(rep1);
free(prefix_rep);
// returning all replacements
return rep2;
}
When using malloc & co, don't forget to check if they return NULL and don't
forget to free the memory when not needed.

Related

How to append a char to a String in C using a function

I was writing a lexical analyzer in which I need to append a char to a string (a char *). For some reason, the code below is resulting in string having a value of "(null)" when I print it to stdout. The function is given below.
void append_char(char *buffer, char c) {
if(buffer == NULL) {
buffer = malloc(sizeof(char));
if(buffer == NULL) {
fprintf(stderr, "COuld not allcocate memory to buffer\n");
}
} else {
buffer = realloc(buffer, sizeof(buffer) + sizeof(char));
}
buffer[sizeof(buffer) - 1] = c;
}
When I run the lines
char *buf = NULL;
append_char(buf, 'a');
append_char(buf, '\0');
printf("buffer: %s\n", buf);
it prints (null) to stdout. How can I fix this?
Pass by value
append_char(char *buffer, char c) does not affect the caller's buf in main(): append_char(buf, 'a');. buf remains NULL. This leads to OP's output.
Insufficient size
Insufficient size for the newly allocated string. No room for the null character.
Wrong size
With char *buffer, sizeof(buffer) is the size of a pointer, not the amount allocated beforehand.
Lost memory
When buffer = realloc(buffer, sizeof(buffer) + sizeof(char)); fails (realloc() returns NULL) , the original value of buffer is lost. Save the result and test.
Note: OK to call realloc(NULL, ...).
char *append_char(char *buffer, char c) {
size_t old_length = buffer ? strlen(buffer) : 0;
size_t new_length = old_length + 1; // +1 for c
// Size needed for a string is its length + 1
char *new_buffer = realloc(buffer, new_length + 1); // +1 for \0
if (new_buffer == NULL) {
fprintf(stderr, "Could not allocate memory to buffer\n");
free(buffer);
return NULL;
}
new_buffer[old_length] = c;
new_buffer[old_length + 1] = '\0';
return new_buffer;
}
// Usage
buf = append_char(buf, 'a');
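The "Wrong size" point can be checked in isolation. The sketch below uses two hypothetical helper names of my own to contrast what sizeof reports for a char * parameter with what strlen reports:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What the original code measured: the size of the pointer itself.
 * The result is the same for every string (commonly 4 or 8). */
size_t size_via_sizeof(const char *buffer)
{
    return sizeof(buffer);
}

/* What it needed: the number of characters currently stored. */
size_t size_via_strlen(const char *buffer)
{
    assert(buffer != NULL);
    return strlen(buffer);
}
```

size_via_sizeof returns the same value for "a" and for a much longer string, which is why the original realloc never grew the buffer in step with its contents.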
There are a number of problems with your program:
buffer is a local variable of append_char(). As soon as this function returns, the buffer variable becomes unusable. You need to pass in the address of buffer to write the address returned by malloc() and realloc() to buffer.
buf is a pointer to a char. sizeof (buf) does not depend on the number of characters in buf.
You do not check the return value from realloc().
You are not allocating space for the NUL terminator.
You do not call free() after you are done.
Here is one way to achieve what you are trying to do (although I am not trying to fix all the problems I mentioned above):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void append_char(char **buffer, char c) {
if(*buffer == NULL) {
if((*buffer = malloc(sizeof(char) + 1)) == NULL) { /* + 1 for the NUL terminator */
fprintf(stderr, "Could not allocate memory to buffer\n");
return;
}
(*buffer)[0] = (*buffer)[1] = '\0';
} else {
*buffer = realloc(*buffer, strlen(*buffer) + sizeof(char) + 1 /* for the NUL terminator */);
}
(*buffer)[strlen(*buffer) + 1] = '\0';
(*buffer)[strlen(*buffer)] = c;
}
int main(void)
{
char *buf = NULL;
append_char(&buf, 'a');
append_char(&buf, '\0');
printf("buffer: %s\n", buf);
}

How to return an array of character pointers from a function back to main?

I'm trying to store a list of files with a .txt extension from my current working directory in an array. I've gotten as far as putting the array together; now I just need to return it. I'm new to C and I'm not used to the pointer situation. How can I return an array of char pointers? I have the array created, and it's allocated based on how many text files I find in the directory.
I'm getting two errors when I try to compile which leads me to believe that my understanding of pointers is not what I thought it was. I also have been reading that my array will be destroyed when returning from the function, because it is on the stack. I'm not sure how to fix that. Any help or criticism would be welcome.
// prototypes
char* getLine();
void replaceWord();
char * getTxtFilesInDir(char*);
bool isTxt(char *);
int main(int argc, char **argv) {
// Check to see if correct number of arguments is given
if (argc < 4) {
printf("ERROR: Not enough parameters supplied. Please supply a find,"
"replace, and prefix parameter and try again.\n");
return -1;
}
// Initialize variables
char *find, *replace, *prefix;
find=argv[1];
replace=argv[2];
prefix=argv[3];
char cd[1024];
// Get the Current working directory to grab files
if (getcwd(cd, sizeof(cd)) != NULL) {
printf("Current working dir is: %s\n", cd);
} else {
perror("getcwd() error");
}
// Get the values of the arguments
printf("Value of find: %s\nValue of replace: %s\nValue of prefix: %s\n",
find, replace, prefix);
// create an array of the files that are valid
char* files = getTxtFilesInDir(cd);
return 0;
}
char* getTxtFilesInDir(char* directory) {
DIR *d;
struct dirent *dir;
d = opendir(directory);
int n=0, i=0;
// get the number of text files in the directory
if (d) {
while((dir = readdir(d)) != NULL) {
if (isTxt(dir->d_name)) {
n++;
}
}
}
rewinddir(d);
// Create the array of text files depending on the size
static char* txtFiles[n];
// add the text files to the array
if (d) {
printf("Found: \n");
while ((dir = readdir(d)) != NULL)
{
if (isTxt(dir->d_name)) {
printf("%s\n", dir->d_name);
txtFiles[i]= (char*) malloc (strlen(dir->d_name)+1);
strncpy(txtFiles[i], dir->d_name, strlen(dir->d_name));
i++;
}
}
closedir(d);
}
return txtFiles;
}
bool isTxt(char *file) {
size_t len = strlen(file);
return len > 4 && strcmp(file + len -4, ".txt") == 0;
}
If you want that getTxtFilesInDir returns an array of strings, it should
return a char**, not a char*.
Also you don't need to declare the variable as static. You declare a variable
in a function as static, when you want that the variable remains the same for
all calls of the function. In this case this is probably not what you want to
do.
Without modifying too much of your initial code, you can first allocate memory
for an array of strings, and then resize it when you've got a new entry. In
this example I do both at once, because realloc(NULL, somesize) is the same as
doing malloc(somesize). That's why it's important to initialize *len = 0 and
txtFiles = NULL, so that this trick works. You should also pass a pointer to a size_t where you store the number of entries:
char **getTxtFilesInDir(const char* directory, size_t *len) {
if(directory == NULL || len == NULL)
return NULL;
...
*len = 0;
char **txtFiles = NULL, **tmp;
char *str;
if (d) {
printf("Found: \n");
while ((dir = readdir(d)) != NULL)
{
if (isTxt(dir->d_name))
{
tmp = realloc(txtFiles, (*len + 1) * sizeof *txtFiles);
if(tmp == NULL)
return txtFiles; // return what you've got so far
str = malloc(strlen(dir->d_name) + 1);
if(str == NULL)
{
if(txtFiles == NULL) // first time, free everything
{
free(tmp);
return NULL;
}
return tmp; // return all you've got so far
}
strcpy(str, dir->d_name); // no need for strncpy, you've allocated
// enough memory
txtFiles = tmp;
txtFiles[(*len)++] = str;
}
}
closedir(d);
}
return txtFiles;
}
The important bits here are how you expand the memory with realloc and when the
memory for the string is allocated with malloc. I do that before txtFiles = tmp;,
so that I don't have to write too many if(...==NULL) checks. If something goes
wrong along the way, the function returns all the file names it already has
stored.
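The same grow-by-one idea can be tried without a directory at hand. This is a sketch under my own assumptions: append_name and free_names are hypothetical helpers, and unlike the function above this variant stores the realloc result right away and reports failure through its return value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Appends a copy of `name` to the array *list, which holds *len entries.
 * Returns 0 on success, -1 on allocation failure. On failure *len is
 * unchanged and *list still points to a valid array of *len entries. */
int append_name(char ***list, size_t *len, const char *name)
{
    assert(list != NULL && len != NULL && name != NULL);

    /* Grow via a temporary so the old block is not lost if realloc fails.
     * realloc(NULL, n) behaves like malloc(n), so *list may start as NULL. */
    char **tmp = realloc(*list, (*len + 1) * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    *list = tmp;   /* the old pointer may be stale now, so store this first */

    char *copy = malloc(strlen(name) + 1);
    if (copy == NULL)
        return -1; /* the array has a spare slot, but *len did not change */
    strcpy(copy, name);

    (*list)[(*len)++] = copy;
    return 0;
}

/* Frees every string and then the array itself. */
void free_names(char **list, size_t len)
{
    for (size_t i = 0; i < len; ++i)
        free(list[i]);
    free(list);
}
```

Start with a NULL list and a length of 0, call append_name once per file name, and hand both to free_names when you are done.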
Now in main you do:
int main(int argc, char **argv)
{
...
size_t len = 0;
char **files = getTxtFilesInDir(cd, &len);
if(files == NULL)
{
fprintf(stderr, "No files found\n");
return 1;
}
for(size_t i = 0; i < len; ++i)
printf("file: %s\n", files[i]);
// freeing memory
for(size_t i = 0; i < len; ++i)
free(files[i]);
free(files);
return 0;
}
I also want to comment on your original way of copying:
strncpy(txtFiles[i], dir->d_name, strlen(dir->d_name));
strncpy is great when you want to limit the number of bytes to be copied. In
your case you've already allocated enough memory, so there is no need to. And
when you use strncpy, the limit should be bound to the number of bytes
available to the destination, not the source, otherwise you might copy more
bytes than it can hold. In your case you won't get a valid C string, because you
are limiting the copy to the length of dir->d_name, so the '\0'-terminating byte
won't be copied.
man strcpy
#include <string.h>
char *strcpy(char *dest, const char *src);
char *strncpy(char *dest, const char *src, size_t n);
[...]
The strncpy() function is similar, except that at most n bytes of src are copied. Warning: If there is no null byte among the first n
bytes of src, the string placed in dest will not be null-terminated.
When you use strncpy you must make sure that the copy is a valid string by
setting the '\0'-terminating byte yourself. Because you've allocated
strlen(dir->d_name)+1 bytes for the string:
strncpy(txtFiles[i], dir->d_name, strlen(dir->d_name));
txtFiles[i][strlen(dir->d_name)] = 0;
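When the limit really has to come from the destination, a small wrapper keeps the two steps together. A sketch only: copy_bounded is a hypothetical name, and it assumes the destination has room for at least one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies src into dst (capacity dst_size bytes, at least 1), truncating
 * if needed, and always terminates the result. Returns dst. */
char *copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    strncpy(dst, src, dst_size - 1);   /* the limit comes from the destination */
    dst[dst_size - 1] = '\0';          /* strncpy may not have written one */
    return dst;
}
```

If src is shorter than the limit, strncpy pads the rest of dst with '\0' bytes, so the explicit terminator only matters when the copy was truncated.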
Also, the exit status of a program is an unsigned value; on POSIX systems it goes from 0 to 255.
In main you should return 1 (or EXIT_FAILURE) instead of -1 on error.

Extract the file name and its extension in C

So we have a path string /home/user/music/thomas.mp3.
What is an easy way to extract the file name (without extension, "thomas") and its extension ("mp3") from this string? One function for the file name, and one for the extension. And we only have GNU libc in our hands.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_FILENAME_SIZE 256
char *filename(char *str) {
char *result;
char *last;
if ((last = strrchr(str, '.')) != NULL ) {
if ((*last == '.') && (last == str))
return str;
else {
result = (char*) malloc(MAX_FILENAME_SIZE);
snprintf(result, MAX_FILENAME_SIZE, "%.*s", (int)(last - str), str); /* sizeof result would be the pointer size */
return result;
}
} else {
return str;
}
}
char *extname(char *str) {
char *result;
char *last;
if ((last = strrchr(str, '.')) != NULL) {
if ((*last == '.') && (last == str))
return "";
else {
result = (char*) malloc(MAX_FILENAME_SIZE);
snprintf(result, MAX_FILENAME_SIZE, "%s", last + 1); /* sizeof result would be the pointer size */
return result;
}
} else {
return ""; // Empty/NULL string
}
}
Use basename to get the filename and then you can use something like this to get the extension.
const char *get_filename_ext(const char *filename) {
const char *dot = strrchr(filename, '.');
if(!dot || dot == filename) return "";
return dot + 1;
}
Edit:
Try something like.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libgen.h>
static void printFileInfo(char *path) {
char *bname;
char *path2 = strdup(path);
bname = basename(path2);
printf("%s.%s\n",bname, get_filename_ext(bname));
free(path2);
}
Regarding your actual code (all the other answers so far say to scrap that and do something else, which is good advice, however I am addressing your code as it contains blunders that it'd be good to learn about in advance of next time you try to write something).
Firstly:
strncpy(str, result, (size_t) (last-str) + 1);
is not good. You have dest and src around the wrong way; and further, this function does not null-terminate the output (unless the input is short enough, which it isn't). Generally speaking, strncpy is almost never a good solution to a problem; use either strcpy if you know the length, or snprintf.
Simpler and less error-prone would be:
snprintf(result, sizeof result, "%.*s", (int)(last - str), str);
Similarly in the other function,
snprintf(result, sizeof result, "%s", last + 1);
The snprintf function never overflows buffer and always produces a null-terminated string, so long as you get the buffer length right!
Now, even if you fixed those then you have another fundamental problem in that you are returning a pointer to a buffer that is destroyed when the function returns. You could fix ext by just returning last + 1, since that is null-terminated anyway. But for filename you have the usual set of options:
return a pointer and a length, and treat it as a length-counted string, not a null-terminated one
return pointer to mallocated memory
return pointer to static buffer
expect the caller to pass in a buffer and a buffer length, which you just write into
Finally, returning NULL on failure is probably a bad idea; if there is no . then return the whole string for filename, and an empty string for ext. Then the calling code does not have to contort itself with checks for NULL.
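As an illustration of the last option (the caller passes a buffer and its length) combined with the "never return NULL" advice, here is a rough sketch. split_ext is a hypothetical name, and it assumes it is given a bare file name rather than a full path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Writes the part of `name` before its last '.' into `out` (capacity
 * `out_size`, truncating if necessary) and returns a pointer to the
 * extension inside `name`. With no '.', or only a leading '.', `out` gets
 * the whole name and the returned extension is "". Never returns NULL.
 * `name` is assumed to be a bare file name, not a full path. */
const char *split_ext(const char *name, char *out, size_t out_size)
{
    assert(name != NULL && out != NULL && out_size > 0);

    const char *dot = strrchr(name, '.');
    if (dot == NULL || dot == name) {
        snprintf(out, out_size, "%s", name);
        return "";
    }
    /* %.*s prints at most (dot - name) characters of name */
    snprintf(out, out_size, "%.*s", (int)(dot - name), name);
    return dot + 1;
}
```

Nothing is allocated here, so there is nothing for the caller to free, and the returned extension stays valid for as long as name does.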
Here is a routine I use for that problem:
Separates original string into separate strings of path, file_name and extension.
Will work for Windows and Linux, relative or absolute style paths. Will handle directory names with embedded ".". Will handle file names without extensions.
/////////////////////////////////////////////////////////
//
// Example:
// Given path == "C:\\dir1\\dir2\\dir3\\file.exe"
// will return path_ as "C:\\dir1\\dir2\\dir3"
// Will return base_ as "file"
// Will return ext_ as "exe"
//
/////////////////////////////////////////////////////////
void GetFileParts(char *path, char *path_, char *base_, char *ext_)
{
char *base;
char *ext;
char nameKeep[MAX_PATHNAME_LEN];
char pathKeep[MAX_PATHNAME_LEN];
char pathKeep2[MAX_PATHNAME_LEN]; //preserve original input string
char File_Ext[40];
char baseK[40];
int lenFullPath, lenExt_, lenBase_;
char *sDelim={0};
int iDelim=0;
int rel=0, i;
if(path)
{ //determine type of path string (C:\\, \\, /, ./, .\\)
if( (strlen(path) > 1) &&
(
((path[1] == ':' ) &&
(path[2] == '\\'))||
(path[0] == '\\') ||
(path[0] == '/' ) ||
((path[0] == '.' ) &&
(path[1] == '/' ))||
((path[0] == '.' ) &&
(path[1] == '\\'))
)
)
{
sDelim = calloc(5, sizeof(char));
/* // */if(path[0] == '\\') iDelim = '\\', strcpy(sDelim, "\\");
/* c:\\ */if(path[1] == ':' ) iDelim = '\\', strcpy(sDelim, "\\"); // also satisfies path[2] == '\\'
/* / */if(path[0] == '/' ) iDelim = '/' , strcpy(sDelim, "/" );
/* ./ */if((path[0] == '.')&&(path[1] == '/')) iDelim = '/' , strcpy(sDelim, "/" );
/* .\\ */if((path[0] == '.')&&(path[1] == '\\')) iDelim = '\\' , strcpy(sDelim, "\\" );
/* \\\\ */if((path[0] == '\\')&&(path[1] == '\\')) iDelim = '\\', strcpy(sDelim, "\\");
if(path[0]=='.')
{
rel = 1;
path[0]='*';
}
if(!strstr(path, ".")) // if no filename, set path to have trailing delim,
{ //set others to "" and return
lenFullPath = strlen(path);
if(path[lenFullPath-1] != iDelim)
{
strcat(path, sDelim);
path_[0]=0;
base_[0]=0;
ext_[0]=0;
}
}
else
{
nameKeep[0]=0; //works with C:\\dir1\file.txt
pathKeep[0]=0;
pathKeep2[0]=0; //preserves *path
File_Ext[0]=0;
baseK[0]=0;
//Get length of full path
lenFullPath = strlen(path);
strcpy(nameKeep, path);
strcpy(pathKeep, path);
strcpy(pathKeep2, path);
strcpy(path_, path); //capture path
//Get length of extension:
for(i=lenFullPath-1;i>=0;i--)
{
if(pathKeep[i]=='.') break;
}
lenExt_ = (lenFullPath - i) -1;
base = strtok(path, sDelim);
while(base)
{
strcpy(File_Ext, base);
base = strtok(NULL, sDelim);
}
strcpy(baseK, File_Ext);
lenBase_ = strlen(baseK) - lenExt_;
baseK[lenBase_-1]=0;
strcpy(base_, baseK);
path_[lenFullPath -lenExt_ -lenBase_ -1] = 0;
ext = strtok(File_Ext, ".");
ext = strtok(NULL, ".");
if(ext) strcpy(ext_, ext);
else strcpy(ext_, "");
}
memset(path, 0, lenFullPath);
strcpy(path, pathKeep2);
if(rel)path_[0]='.';//replace first "." for relative path
free(sDelim);
}
}
}
Here is an old-school algorithm that will do the trick.
char path[100] = "/home/user/music/thomas.mp3";
int offset_extension, offset_name;
int len = strlen(path);
int i;
for (i = len; i >= 0; i--) {
if (path[i] == '.')
break;
if (path[i] == '/') {
i = len;
break;
}
}
if (i == -1) {
fprintf(stderr,"Invalid path");
exit(EXIT_FAILURE);
}
offset_extension = i;
for (; i >= 0; i--)
if (path[i] == '/')
break;
if (i == -1) {
fprintf(stderr,"Invalid path");
exit(EXIT_FAILURE);
}
offset_name = i;
char *extension, name[100];
extension = &path[offset_extension+1];
memcpy(name, &path[offset_name+1], offset_extension - offset_name - 1);
name[offset_extension - offset_name - 1] = '\0'; /* memcpy does not terminate the string */
Then you have both pieces of information in the variables name and extension
printf("%s %s", name, extension);
This will print:
thomas mp3
I know this is old. But I tend to use strtok for things like this.
/* strtok example */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAX_TOKENS 20 /* Some reasonable values */
#define MAX_STRING 128 /* Easy enough to make dynamic with mallocs */
int main ()
{
char str[] ="/home/user/music/thomas.mp3";
char sep[] = "./";
char collect[MAX_TOKENS][MAX_STRING];
/* Zero the array so every token ends up null-terminated: the strncpy below copies only strlen(pch) bytes and does not add the \0. */
memset(collect, 0, MAX_TOKENS * MAX_STRING);
char * pch = strtok (str, sep);
int ccount = 0;
if(pch != NULL) {
/* collect all separated text */
while(pch != NULL) {
strncpy( collect[ccount++], pch, strlen(pch));
pch = strtok (NULL, sep);
}
}
/* output tokens. */
for(int i=0; i<ccount; ++i)
printf ("Token: %s\n", collect[i]);
return 0;
}
This is a rough example, and it makes it easy to deal with the tokens afterwards. I.e., the last token is the extension, the second to last is the base name, and so on.
I also find it useful for rebuilding paths for different platforms - replace / with \.

remove characters from a c string

gcc 4.4.4 c89
I am reading in from a text file and the text file consists of names in double quotes.
"Simpson, Homer"
etc
However, I want to remove the double quotes from the string.
This is how I have done it, but I am not sure it's the best way.
int get_string(FILE *in, char *temp)
{
char *quote = NULL;
/* Get the first line */
fgets(temp, STRING_SIZE, in);
printf("temp before [ %s ]\n", temp);
/* Find the second quote */
if((quote = strrchr(temp, '"')) == NULL) {
fprintf(stderr, "Text file incorrectly formatted\n");
return FALSE;
}
/* Replace with a nul to get rid of the second quote */
*quote = '\0';
/* Move the pointer to point past the first quote */
temp++;
printf("temp after [ %s ]\n", temp);
return TRUE;
}
Many thanks for any suggestions,
No, this won't work. You are changing the parameter temp, but the calling function will still have an old value. The temp outside the function will point to the opening quote. You ought to move the characters in your buffer.
However I would suggest allocating the buffer in heap and returning a pointer to it, letting the caller free the buffer when needed. This seems to be a cleaner solution. Again, this way you won't rely on the caller to pass a sufficiently large buffer.
In general, robustly reading lines from a text file is not a trivial task in C, which lacks automatic memory-allocating functions. If switching to C++ is possible, I would suggest trying the much simpler C++ getline.
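If you stay with the caller's buffer, "moving the characters" could look roughly like this. It is a sketch with a hypothetical function name, strip_quotes, and it assumes the opening quote is the very first character of the line.

```c
#include <assert.h>
#include <string.h>

/* Removes one leading and one trailing double quote from `s` in place, so
 * the caller's pointer stays valid. Returns 1 if both quotes were found and
 * removed, 0 (leaving `s` untouched) otherwise. */
int strip_quotes(char *s)
{
    assert(s != NULL);

    char *last = strrchr(s, '"');
    if (s[0] != '"' || last == NULL || last == s)
        return 0;

    *last = '\0';   /* drops the closing quote and anything after it, e.g. '\n' */
    memmove(s, s + 1, strlen(s + 1) + 1);   /* shift left, including the '\0' */
    return 1;
}
```

memmove is used rather than strcpy because the source and destination overlap.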
char *foo(char *str, int notme)
{
char *tmp=strdup(str);
char *p, *q;
for(p=str, q=tmp; *p; p++)
{
if((int)*p == notme) continue;
*q=*p;
q++;
}
*q = '\0'; /* terminate the filtered copy */
strcpy(str, tmp);
free(tmp);
return str;
}
A simple, generic way to remove a character.
If all lines look that way, why not simply remove the first and the last char?
quote++; // skip the first char
quote[strlen(quote)-1]='\0'; // remove last char
Don't know if this will help, but here is a simple tokenizer I use:
#include <stdlib.h>
#include <string.h>
int token(char* start, char* delim, char** tok, char** nextpos, char* sdelim, char* edelim) {
// Find beginning:
int len = 0;
char *scanner;
int dictionary[8];
int ptr;
for(ptr = 0; ptr < 8; ptr++) {
dictionary[ptr] = 0;
}
for(; *delim; delim++) {
dictionary[*delim / 32] |= 1 << *delim % 32;
}
if(sdelim) {
*sdelim = 0;
}
for(; *start; start++) {
if(!(dictionary[*start / 32] & 1 << *start % 32)) {
break;
}
if(sdelim) {
*sdelim = *start;
}
}
if(*start == 0) {
if(nextpos != NULL) {
*nextpos = start;
}
*tok = NULL;
return 0;
}
for(scanner = start; *scanner; scanner++) {
if(dictionary[*scanner / 32] & 1 << *scanner % 32) {
break;
}
len++;
}
if(edelim) {
*edelim = *scanner;
}
if(nextpos != NULL) {
*nextpos = scanner;
}
*tok = (char*)malloc(sizeof(char) * (len + 1));
if(*tok == NULL) {
return 0;
}
memcpy(*tok, start, len);
*(*tok + len) = 0;
return len + 1;
}
The parameters are:
char* start, (pointer to the string)
char* delim, (pointer to the delimiters used to break up the string)
char** tok, a reference (using &) to a char* variable that will hold the token
char** nextpos, a reference (using &) to a char* variable that will hold the position after the last token.
char* sdelim, a reference (using &) to a char variable that will hold the value of the start delimiter
char* edelim, a reference (using &) to a char variable that will hold the value of the end delimiter
The last three are optional.
Pass in the start address, use " as the delimiter, and pass a reference to a char * to hold the actual middle string.
The result is a newly allocated string so you have to free it.
int get_string(FILE *in, char *temp)
{
char *tok = NULL; /* not named token, which would shadow the function */
/* Get the first line */
fgets(temp, STRING_SIZE, in);
printf("temp before [ %s ]\n", temp);
/* Extract the text between the quotes */
int length = token(temp, "\"", &tok, NULL, NULL, NULL);
// DO STUFF WITH THE TOKEN
printf("temp after [ %s ]\n", tok);
// DO STUFF WITH THE TOKEN
// FREE IT!!!
free(tok);
return TRUE;
}
The tokenizer is a multipurpose tool that can be used in a crap ton of places, this being a very small example.
Suppose
string="\"Simpson, Homer\""
then
string_without_quotes=string+1;
string_without_quotes[strlen(string)-2]='\0';
ready!

how to remove extension from file name?

I want to drop the last three characters from a file name and keep the rest.
I have this code:
char* remove(char* mystr) {
char tmp[] = {0};
unsigned int x;
for (x = 0; x < (strlen(mystr) - 3); x++)
tmp[x] = mystr[x];
return tmp;
}
Try:
char *remove(char* myStr) {
char *retStr;
char *lastExt;
if (myStr == NULL) return NULL;
if ((retStr = malloc (strlen (myStr) + 1)) == NULL) return NULL;
strcpy (retStr, myStr);
lastExt = strrchr (retStr, '.');
if (lastExt != NULL)
*lastExt = '\0';
return retStr;
}
You'll have to free the returned string yourself. It simply finds the last . in the string and replaces it with a null terminator character. It will handle errors (passing NULL or running out of memory) by returning NULL.
It won't work with things like /this.path/is_bad since it will find the . in the non-file portion, but you could handle this by also doing a strrchr of /, or whatever your path separator is, and ensuring its position is NULL or before the . position.
A more general purpose solution to this problem could be:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// remove_ext: removes the "extension" from a file spec.
// myStr is the string to process.
// extSep is the extension separator.
// pathSep is the path separator (0 means to ignore).
// Returns an allocated string identical to the original but
// with the extension removed. It must be freed when you're
// finished with it.
// If you pass in NULL or the new string can't be allocated,
// it returns NULL.
char *remove_ext (char* myStr, char extSep, char pathSep) {
char *retStr, *lastExt, *lastPath;
// Error checks and allocate string.
if (myStr == NULL) return NULL;
if ((retStr = malloc (strlen (myStr) + 1)) == NULL) return NULL;
// Make a copy and find the relevant characters.
strcpy (retStr, myStr);
lastExt = strrchr (retStr, extSep);
lastPath = (pathSep == 0) ? NULL : strrchr (retStr, pathSep);
// If it has an extension separator.
if (lastExt != NULL) {
// and it's to the right of the path separator.
if (lastPath != NULL) {
if (lastPath < lastExt) {
// then remove it.
*lastExt = '\0';
}
} else {
// Has extension separator with no path separator.
*lastExt = '\0';
}
}
// Return the modified string.
return retStr;
}
int main (int c, char *v[]) {
char *s;
printf ("[%s]\n", (s = remove_ext ("hello", '.', '/'))); free (s);
printf ("[%s]\n", (s = remove_ext ("hello.", '.', '/'))); free (s);
printf ("[%s]\n", (s = remove_ext ("hello.txt", '.', '/'))); free (s);
printf ("[%s]\n", (s = remove_ext ("hello.txt.txt", '.', '/'))); free (s);
printf ("[%s]\n", (s = remove_ext ("/no.dot/in_path", '.', '/'))); free (s);
printf ("[%s]\n", (s = remove_ext ("/has.dot/in.path", '.', '/'))); free (s);
printf ("[%s]\n", (s = remove_ext ("/no.dot/in_path", '.', 0))); free (s);
return 0;
}
and this produces:
[hello]
[hello]
[hello]
[hello.txt]
[/no.dot/in_path]
[/has.dot/in]
[/no]
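Use rindex to locate the last "." character (rindex is a legacy BSD function; strrchr is the standard C equivalent and takes the same arguments). If the string is writable, you can replace the dot with the string terminator char ('\0') and you're done.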
char * rindex(const char *s, int c);
DESCRIPTION
The rindex() function locates the last character matching c (converted to a char) in the null-terminated string s.
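A minimal sketch of that in-place approach, written with strrchr so it builds on any standard C library. The helper name strip_last_dot is made up for illustration, and it assumes the buffer is writable (an array, not a string literal):

```c
#include <string.h>

/* Hypothetical helper: truncates s in place at its last '.'.
   s must be writable. Returns 1 if a dot was found and removed,
   0 otherwise (including when s is NULL). */
int strip_last_dot(char *s)
{
    char *dot;

    if (s == NULL)
        return 0;
    dot = strrchr(s, '.');   /* last occurrence, or NULL if none */
    if (dot == NULL)
        return 0;
    *dot = '\0';             /* cut the string at the dot */
    return 1;
}
```

Called on a buffer holding "archive.tar.gz", this leaves "archive.tar" in the buffer; a name without any dot is left unchanged.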
If you literally just want to remove the last three characters, because you somehow know that your filename has an extension exactly three chars long (and you want to keep the dot):
char *remove_three(const char *filename) {
    size_t len = strlen(filename);
    if (len < 3) return NULL; /* too short to remove three chars */
    char *newfilename = malloc(len-2);
    if (!newfilename) return NULL; /* out of memory */
    memcpy(newfilename, filename, len-3);
    newfilename[len - 3] = 0;
    return newfilename;
}
Or let the caller provide the destination buffer (which they must ensure is long enough):
char *remove_three(char *dst, const char *filename) {
size_t len = strlen(filename);
memcpy(dst, filename, len-3);
dst[len - 3] = 0;
return dst;
}
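If you would rather not trust the caller to size the buffer, one option is to pass the buffer size as well. This is only a sketch; the name remove_three_n and the dstsize parameter are my own additions, not part of the code above:

```c
#include <string.h>

/* Hypothetical variant of remove_three that also takes the size of dst.
   Returns dst on success, or NULL if filename is shorter than three
   characters or the result plus its terminator does not fit in dst. */
char *remove_three_n(char *dst, size_t dstsize, const char *filename)
{
    size_t len = strlen(filename);

    /* the result needs (len - 3) characters plus one terminator byte */
    if (len < 3 || len - 2 > dstsize)
        return NULL;
    memcpy(dst, filename, len - 3);
    dst[len - 3] = '\0';
    return dst;
}
```

With an 8-byte buffer, "file.txt" becomes "file." (the dot is kept, as above); with a 4-byte buffer the call returns NULL instead of writing past the end.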
If you want to generically remove a file extension, that's harder, and should normally use whatever filename-handling routines your platform provides (basename on POSIX, _wsplitpath_s on Windows) if there's any chance that you're dealing with a path rather than just the final part of the filename:
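#include <libgen.h>
#include <stdlib.h>
#include <string.h>
/* dirname() and basename() may modify their argument, so each one
   works on its own copy and filename itself is left untouched.
   dst may need to be longer than filename, for example currently
   "file.txt" -> "./file.txt". For this reason it would be safer to
   pass in a length with dst, and/or allow dst to be NULL in which
   case return the length required */
void remove_extn(char *dst, const char *filename) {
    char *dircopy = strdup(filename);
    char *basecopy = strdup(filename);
    if (!dircopy || !basecopy) { /* out of memory: return an empty string */
        free(dircopy); free(basecopy); dst[0] = 0; return;
    }
    strcpy(dst, dirname(dircopy));
    size_t len = strlen(dst);
    dst[len] = '/';
    dst += len+1;
    strcpy(dst, basename(basecopy));
    free(dircopy);
    free(basecopy);
    char *dot = strrchr(dst, '.');
    /* retain the '.' To remove it do dot[0] = 0 */
    if (dot) dot[1] = 0;
}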
Come to think of it, you might want to pass dst+1 rather than dst to strrchr, since a filename starting with a dot maybe shouldn't be truncated to just ".". Depends what it's for.
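A rough sketch of that leading-dot idea, assuming you already have a bare file name with no directory part. The name strip_ext_keep_hidden is hypothetical, and unlike the code above this version removes the dot as well:

```c
#include <string.h>

/* Hypothetical helper: name is a bare file name (no directory part)
   and is modified in place. A dot at index 0 is not treated as an
   extension separator, so ".bashrc" is left alone. */
void strip_ext_keep_hidden(char *name)
{
    char *dot;

    if (name == NULL || name[0] == '\0')
        return;
    dot = strrchr(name + 1, '.');  /* search from the second character */
    if (dot != NULL)
        *dot = '\0';
}
```

So "file.txt" becomes "file" and ".config.bak" becomes ".config", while ".bashrc" stays as it is. Whether that is the behaviour you want depends on what the names are used for.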
I would try the following algorithm:
last_dot = -1
for each char in str:
if char = '.':
last_dot = index(char)
if last_dot != -1:
str[last_dot] = '\0'
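In C the pseudocode above could look roughly like this. The function name cut_at_last_dot is just a placeholder, and the string has to be writable:

```c
#include <stddef.h>

/* Hypothetical name. One forward pass remembers the index of the
   last '.', then the string is cut there. str must be writable. */
void cut_at_last_dot(char *str)
{
    ptrdiff_t last_dot = -1;
    ptrdiff_t i;

    for (i = 0; str[i] != '\0'; i++) {
        if (str[i] == '.')
            last_dot = i;        /* keep overwriting: the last one wins */
    }
    if (last_dot != -1)
        str[last_dot] = '\0';
}
```

This does the same job as strrchr followed by a write, just spelled out by hand.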
Just replace the dot with the null character ('\0'). If you know that your extension is always 3 characters long, you can just do:
char file[] = "test.png";
file[strlen(file) - 4] = 0;
puts(file);
This will output "test". Also, you shouldn't return a pointer to a local variable. The compiler will also warn you about this.
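Note that strlen(file) - 4 goes wrong if the name is shorter than four characters. A slightly more defensive sketch, under the same "extension is exactly three characters" assumption (strip_three_char_ext is a made-up name):

```c
#include <string.h>

/* Hypothetical helper: removes a ".xyz" suffix (a dot plus exactly
   three characters) in place. Requires at least one character before
   the dot. Returns 1 on success; returns 0 and leaves the name
   unchanged if it does not end that way. */
int strip_three_char_ext(char *file)
{
    size_t len = strlen(file);

    if (len < 5 || file[len - 4] != '.')
        return 0;
    file[len - 4] = '\0';
    return 1;
}
```

It works on the caller's own array, so nothing local is returned, and names such as "a.c" or "test.jpeg" are simply left alone.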
To get paxdiablo's second more general purpose solution to work in a C++ compiler I changed this line:
if ((retStr = malloc (strlen (myStr) + 1)) == NULL)
to:
if ((retStr = static_cast<char*>(malloc (strlen (myStr) + 1))) == NULL)
Hope this helps someone.
This should do the job:
char* remove(char* oldstr) {
    int oldlen = 0;
    while (oldstr[oldlen] != '\0') {
        ++oldlen;
    }
    int newlen = oldlen - 1;
    while (newlen > 0 && oldstr[newlen] != '.') {
        --newlen;
    }
    if (newlen <= 0) {
        newlen = oldlen; // no extension found: keep the whole string
    }
    char* newstr = new char[newlen + 1]; // +1 for the terminator
    for (int i = 0; i < newlen; ++i) {
        newstr[i] = oldstr[i];
    }
    newstr[newlen] = '\0';
    return newstr; // caller must delete[] the result
}
Get location and just copy up to that location into a new char *.
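i = 0;
while (argv[1][i] != '\0') { // get length of filename
    i++; }
n = i; // default: no extension found, keep the whole name
for (; i > -1; i--) { // look for extension working backwards
    if (argv[1][i] == '.') {
        n = i; // index of the extension's dot
        break; } }
memcpy(new_filename, argv[1], n); // new_filename needs room for n + 1 bytes
new_filename[n] = '\0'; // memcpy does not add the terminator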
This is a simple way to change the extension. Note that it cuts at the first dot in the name, not the last.
....
char outputname[255] = "";
sscanf(inputname, "%250[^.]", outputname); // foo.bar => foo
strcat(outputname, ".txt");                // foo => foo.txt
....
With configurable minimum file length and configurable maximum extension length. Returns index where extension was changed to null character, or -1 if no extension was found.
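#include <stdint.h>
#include <string.h>
#include <sys/types.h>
int32_t strip_extension(char *in_str)
{
    static const uint8_t name_min_len = 1;
    static const uint8_t max_ext_len = 4;
    ssize_t len = (ssize_t)strlen(in_str);
    /* Check chars starting at end of string to find last '.', looking
       back at most max_ext_len characters and leaving at least
       name_min_len characters of name */
    for (ssize_t i = len - 1; i >= name_min_len && i >= len - 1 - max_ext_len; i--)
    {
        if (in_str[i] == '.')
        {
            in_str[i] = '\0';
            return (int32_t)i;
        }
    }
    return -1;
}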
I use this code:
void remove_extension(char* s) {
char* dot = 0;
while (*s) {
if (*s == '.') dot = s; // last dot
else if (*s == '/' || *s == '\\') dot = 0; // ignore dots before path separators
s++;
}
if (dot) *dot = '\0';
}
It handles the Windows path convention correctly (both / and \ can be path separators).
