Function strtok() does not work with strcat() (illegal hardware instruction) - c

I'm new to C and I'm writing a program that takes a file path as an argument. I want to get the last element of the path and the rest of the path.
Here is what I wrote:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
char *file = argv[1];
char base[sizeof(file)] = "";
char *tok = strtok(file, "/");
while (tok != NULL)
{
strcat(base, tok);
tok = strtok(NULL, "/");
}
printf("Base folder: %s\n", base);
printf("Last element: %s\n", tok);
return 0;
}
Input: ./getlast /this/is/some/path/file.txt
Expected result:
Base folder: /this/is/some/path
Last element: file.txt
It gives me this error when I concatenate base with tok:
[1] 15245 illegal hardware instruction ./getlast /Users/<myusername>/Desktop/getlast/getlast.c
I keep trying different solutions but I can't figure out what is wrong.
(Sorry for my English.)

Using strtok() for this task would be complicated.
Determining which token is the last one (the file name) means not committing to concatenating a token until the next token has been retrieved from the string.
One could 'hang onto' one pointer and append it when another token is found, but that's confusing.
To find what might be called "the last token", strrchr() is perfectly suited to the task.
I regard "environment vars" (like argv) as read-only, so strdup() makes a working copy that can be altered with impunity.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main( int argc, char *argv[] ) {
    if( argc != 2 ) { // always test before using
        printf( "Bad usage\n" );
        return 1;
    }
    char *copy = strdup( argv[1] ); // temp working copy to manipulate
    if( copy == NULL ) // strdup() allocates, so it can fail
        return 1;
    char *lastSep = strrchr( copy, '/' ); // find LAST slash char
    if( lastSep == NULL )
        printf( "No prefix path: %s\n", argv[1] );
    else {
        *lastSep++ = '\0'; // split into two strings
        printf( "Base folder: %s\n", copy );
        printf( "Last element: %s\n", lastSep );
    }
    free( copy );
    return 0;
}
Update
You can avoid the overhead of making a working copy by capitalising on some pointer arithmetic and printf()'s length specifiers. Here's another way to achieve the same thing without using the heap:
const char *p = "/one/two/three/four.txt";
const char *lastSep = strrchr( p, '/' ); // find LAST slash char
if( lastSep == NULL )
    printf( "No prefix path: %s\n", p );
else {
    printf( "Base folder: %.*s\n", (int)( lastSep - p ), p ); // the precision argument must be an int
    printf( "Last element: %s\n", lastSep + 1 );
}
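For completeness, here is a rough sketch of the 'hang onto one pointer' strtok() approach mentioned above. It also sidesteps the two problems in the question's code: sizeof(file) is the size of a pointer (typically 8 bytes), not the length of the string, so strcat() overflows base; and tok is always NULL once the loop ends, so it cannot be printed as the last element. The name split_path is hypothetical, and the sketch assumes an absolute path, because it writes every directory with a leading '/'.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the original post.
   Writes everything before the last path element into base and the last
   element into last. Both buffers must hold at least strlen(path) + 1 bytes.
   Assumes an absolute path: every directory is written with a leading '/'.
   Returns 0 on success, -1 if the working copy cannot be allocated. */
int split_path(const char *path, char *base, char *last)
{
    base[0] = '\0';
    last[0] = '\0';

    char *copy = malloc(strlen(path) + 1);  /* strtok() modifies its input */
    if (copy == NULL)
        return -1;
    strcpy(copy, path);

    char *held = strtok(copy, "/");         /* token not yet committed */
    char *next;
    while (held != NULL && (next = strtok(NULL, "/")) != NULL) {
        strcat(base, "/");                  /* held was not the last token */
        strcat(base, held);
        held = next;
    }
    if (held != NULL)
        strcpy(last, held);                 /* the final token */

    free(copy);
    return 0;
}
```

Called with "/this/is/some/path/file.txt" and two buffers of strlen(path) + 1 bytes each, this leaves "/this/is/some/path" in base and "file.txt" in last. The strrchr() versions above are still shorter and easier to follow.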

Related

splitting string and counting tokens in c

I have a text file that contains multiple strings that are different lengths that I need to split into tokens.
Is it best to use strtok to split these strings and how can I count the tokens?
Example of strings from the file
Emma Stone#1169876#COMP242#COMP333#COMP336#COMP133#COMP231
Emma Watson#1169875#COMP336#COMP2421#COMP231#COMP338#CCOMP3351
Kevin Hart#1146542#COMP142#COMP242#COMP231#COMP336#COMP331#COMP334
George Clooney#1164561#COMP336#COMP2421#COMP231#COMP338#CCOMP3351
Matt Damon#1118764#COMP439#COMP4232#COMP422#COMP311#COMP338
Johnny Depp#1019876#COMP311#COMP242#COMP233#COMP3431#COMP333#COMP432
Generally, using strtok is a good solution to the problem:
#include <stdio.h>
#include <string.h>
int main( void )
{
char line[] =
"Emma Stone#1169876#COMP242#COMP333#COMP336#COMP133#COMP231";
char *p;
int num_tokens = 0;
p = strtok( line, "#" );
while ( p != NULL )
{
num_tokens++;
printf( "Token #%d: %s\n", num_tokens, p );
p = strtok( NULL, "#" );
}
}
This program has the following output:
Token #1: Emma Stone
Token #2: 1169876
Token #3: COMP242
Token #4: COMP333
Token #5: COMP336
Token #6: COMP133
Token #7: COMP231
However, one disadvantage of using strtok is that it is destructive in the sense that it modifies the string, by replacing the # delimiters with terminating null characters. If you do not want this, then you can use strchr instead:
#include <stdio.h>
#include <string.h>
int main( void )
{
const char *const line =
"Emma Stone#1169876#COMP242#COMP333#COMP336#COMP133#COMP231";
const char *p = line, *q;
int num_tokens = 1;
while ( ( q = strchr( p, '#' ) ) != NULL )
{
printf( "Token #%d: %.*s\n", num_tokens, (int)( q - p ), p ); // the precision argument must be an int
num_tokens++;
p = q + 1;
}
printf( "Token #%d: %s\n", num_tokens, p );
}
This program has identical output to the first program:
Token #1: Emma Stone
Token #2: 1169876
Token #3: COMP242
Token #4: COMP333
Token #5: COMP336
Token #6: COMP133
Token #7: COMP231
Another disadvantage of strtok is that it is neither reentrant nor thread-safe, whereas strchr is both. However, some platforms provide a function strtok_r (it is specified by POSIX), which does not have these disadvantages. But that function still has the disadvantage of being destructive.
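As a minimal sketch of the strtok_r variant, assuming a POSIX platform: the function keeps its position in a caller-supplied pointer instead of hidden static storage. The helper name count_tokens_r is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r() is POSIX, not ISO C */
#include <string.h>

/* Hypothetical helper: counts the tokens in line.
   line is modified in place, just as with strtok(). */
int count_tokens_r(char *line, const char *delims)
{
    char *state;   /* per-call state instead of a hidden static */
    int count = 0;

    for (char *tok = strtok_r(line, delims, &state);
         tok != NULL;
         tok = strtok_r(NULL, delims, &state))
        count++;

    return count;
}
```

Because each call chain has its own state variable, two such loops can be nested or run in different threads without interfering with each other.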
Yes, you should use strtok to split these strings.
As for "how can I count the tokens":
You can simply add a counter inside the while loop and increment it by one in each iteration to get the total number of tokens.
#include <stdio.h>
#include <string.h>
int main(void) {
char string[] = "Hello world this is a simple string";
char *token = strtok(string, " ");
int count = 0;
while (token != NULL) {
count++;
token = strtok(NULL, " ");
}
printf("Total number of tokens = %d", count);
return 0;
}
strtok() is rarely the right tool for anything. In this case, it is unclear whether a sequence of ## is equivalent to a single # and whether a # appearing at the beginning or end of line is to be ignored...
strtok() makes strong assumptions for these cases that may not be the expected behavior.
Furthermore, strtok() modifies its string argument and uses a hidden static state that makes it unsafe in multithreaded programs and prone to programming errors in nested use cases. strtok_r(), where available, solves these issues but the semantics are still somewhat counterintuitive.
For your purpose, you must define precisely what is a token and a separator. If empty tokens are allowed, strtok() is definitely not a solution.
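To illustrate the point about empty tokens, here is a small sketch comparing a strchr()-based field counter, which keeps empty fields, with a strtok()-based one, which silently merges adjacent delimiters. Both helper names are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: counts fields separated by delim, keeping empty ones.
   The string is not modified. delim must not be '\0'. */
size_t count_fields(const char *line, char delim)
{
    size_t count = 1;   /* a line without delimiters is one field */

    for (const char *p = strchr(line, delim); p != NULL; p = strchr(p + 1, delim))
        count++;

    return count;
}

/* For comparison: strtok() treats a run of delimiters as one separator
   and ignores delimiters at the start and end of the line. */
size_t count_tokens(char *line, const char *delims)
{
    size_t count = 0;

    for (char *tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    return count;
}
```

For the input "a##b#", count_fields() reports four fields (two of them empty), while count_tokens() reports only two tokens.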
You can also write your own function to handle this quite trivial split:
char **split(char *str, char **argv, size_t *argc, const char delim)
{
*argc = 0;
if(str && *str)
{
argv[0] = str;
*argc = 1;
while(*str)
{
if(*str == delim)
{
*str = 0;
str++;
if(*str)
{
argv[*argc] = str;
*argc += 1;
continue;
}
}
str++;
}
}
return argv;
}
int main(void)
{
char *argv[10];
size_t argc;
char str[] = "Emma Stone#1169876#COMP242#COMP333#COMP336#COMP133#COMP231";
split(str, argv, &argc, '#');
printf("Number of substrings: %zu\n", argc);
for(size_t i = 0; i < argc; i++)
printf("token [%2zu] = `%s`\n", i, argv[i]);
}
https://godbolt.org/z/b1aarnfWs
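Remarks: like strtok, it requires str to be modifiable, and str will be modified. The caller must also make sure that argv has room for every token, because split() does not check the capacity of the array.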

Dynamic string concatenation with strcat in C

I have run into an issue while using strcat together with realloc: strcat appears to overwrite the destination string.
char *splitStr(char *line) {
char *str_;
str_ = (char *) malloc(1);
char *ptr = strtok(line,"\n");
int a;
while (ptr != NULL) {
if (ptr[0] != '$') {
printf("oncesi %s\n", str_);
a = strlen(ptr) + strlen(str_) + 1;
str_ = realloc(str_, a);
strcat(str_, ptr);
str_[a] = '\0';
printf("sontasi:%s\n", str_);
}
ptr = strtok(NULL, "\n");
}
printf("splitStr %d\n", strlen(str_));
printf("%s", str_);
return str_;
}
My input value is:
*4
$3
200
$4
4814
$7
SUCCESS
$4
3204
I want to split this input value with strtok:
strtok(line,'\n');
and concatenate all lines that do not start with the "$" character into a new string. However, this code gives the following output:
line: *4
oncesi
sontasi:*4
oncesi *4
200tasi:*4
200esi *4
4814asi:*4
4814si *4
SUCCESS:*4
SUCCESS*4
3204ESS:*4
splitStr 25
It seems to overwrite the source string.
Do you have any idea why this could be happening?
The following proposed code:
cleanly compiles
performs the indicated functionality
is slightly reformatted for readability of the output
checks for errors from malloc() and realloc()
shows how to initialize the allocated str buffer, which is the problem in the OP's posted code
uses %zu to print the result of strlen(), which returns a size_t, not an int
does not use trailing underscores on variable names
And now, the proposed code:
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
char *splitStr( char *line )
{
printf("original line: %s\n", line);
char *str = malloc(1);
if( !str )
{
perror( "malloc failed" );
exit( EXIT_FAILURE );
}
str[0] = '\0'; // critical statement
char *token = strtok(line,"\n");
while( token )
{
if( token[0] != '$')
{
char* temp = realloc( str, strlen( token ) + strlen( str ) + 1 );
if( ! temp )
{
perror( "realloc failed" );
free( str );
exit( EXIT_FAILURE );
}
str = temp; // update pointer
strcat(str, token);
printf( "concat result: %s\n", str );
}
token = strtok(NULL, "\n");
}
printf("splitStr %zu\n", strlen(str));
return str;
}
int main( void )
{
char firstStr[] = "$abcd\n$defg\nhijk\n";
char *firstNewStr = splitStr( firstStr );
printf( "returned: %s\n\n\n\n", firstNewStr );
free( firstNewStr );
char secondStr[] = "abcd\ndefg\nhijk\n";
char *secondNewStr = splitStr( secondStr );
printf( "returned: %s\n\n\n\n", secondNewStr );
free( secondNewStr );
}
a run of the proposed code results in:
original line: $abcd
$defg
hijk
concat result: hijk
splitStr 4
returned: hijk
original line: abcd
defg
hijk
concat result: abcd
concat result: abcddefg
concat result: abcddefghijk
splitStr 12
returned: abcddefghijk
Your input contains Windows/DOS end-of-line codings "\r\n".
Since strtok() just replaces '\n' with '\0', the '\r' stays in the string. On output it moves the cursor to the left and additional characters overwrite old characters, at least visually.
Your concatenated string should be OK, however. Count the characters, and don't forget to include a '\r' for each line: "*4\r200\r4814\rSUCCESS\r3204\r" are 25 characters as the output splitStr 25 shows.
Additional notes:
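As others have already said, str_ = (char *) malloc(1); does not initialize the space str_ points to. You need to do this yourself, for example with str_[0] = '\0';.
str_[a] = '\0'; writes one byte past the end of the block returned by realloc(str_, a). strcat() already terminates the string, so that line can simply be removed.
Don't use underscores that way.
You don't need to cast the result of malloc(); it returns a void*, which converts implicitly to char* (and to any other object pointer type).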
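One way to keep the '\r' characters out of the result is to hand both '\r' and '\n' to strtok() as delimiters. The following is only a sketch under that assumption; join_lines is a hypothetical name, and the function tracks the current length itself instead of calling strcat().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical rewrite of splitStr(): concatenates every line of text that
   does not start with '$'. text is modified by strtok().
   Returns a malloc'd string the caller must free, or NULL on allocation failure. */
char *join_lines(char *text)
{
    char *out = malloc(1);
    if (out == NULL)
        return NULL;
    out[0] = '\0';                  /* start with a valid empty string */
    size_t len = 0;

    /* Treating both '\r' and '\n' as delimiters drops the stray '\r'. */
    for (char *tok = strtok(text, "\r\n"); tok != NULL; tok = strtok(NULL, "\r\n")) {
        if (tok[0] == '$')
            continue;

        size_t tok_len = strlen(tok);
        char *tmp = realloc(out, len + tok_len + 1);
        if (tmp == NULL) {
            free(out);
            return NULL;
        }
        out = tmp;
        memcpy(out + len, tok, tok_len + 1);   /* copies the terminator too */
        len += tok_len;
    }
    return out;
}
```

With the input "*4\r\n$3\r\n200\r\n$4\r\n4814\r\n" this returns "*42004814", with no carriage returns left in the string.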

Why is my program crashing when I try to print a char pointer

So I have this program for my C class. It is supposed to implement a function that tells you the character at a given position in a string.
char charAt( char * string, int position );
int main( int argc, char * argv[ ] ) {
int pos = atoi( argv[3] );
if( strcmp( argv[1], "charAt" ) == 0 )
{
printf( "%s", charAt( argv[2], pos ) );
}
}
char charAt( char *string, int position ){
int count = 0;
while( count != position ){
string++;
count++;
}
return *string;
}
When I compile it, it shows no errors. I run it from the command line using
name charAt string 3
It crashes at this line:
printf( "%s", charAt( argv[2], pos) );
Why does it crash when I pass the char pointer to the printf function?
You're referencing argv without checking that argc is high enough to permit those references. What if argc is only 1?
The real problem is using %s to display a single character. That needs to be %c. Using %s is going to treat that as a character pointer, which it isn't, and then your program is deep into undefined behaviour.
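Both points can be put together in a short sketch. The names char_at and print_char_at are hypothetical, and the bounds check is an addition that the original charAt() does not have.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical bounds-checked variant of charAt().
   Stores the character at position in *out and returns 1,
   or returns 0 if position is past the end of the string. */
int char_at(const char *string, size_t position, char *out)
{
    if (position >= strlen(string))
        return 0;
    *out = string[position];
    return 1;
}

/* Prints a single char with %c; %s would expect a pointer to a string. */
void print_char_at(const char *string, size_t position)
{
    char c;

    if (char_at(string, position, &c))
        printf("%c\n", c);
    else
        printf("position %zu is out of range\n", position);
}
```

In main(), argc should be checked to be at least 4 before argv[3] is read, since the program is run as name charAt string 3.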

C - Obtaining directory path

I'm writing a program that when executed in a directory will generate a text file with all of the contents in that directory. I'm getting the directory path from the **argv to main and because I'm using netbeans and cygwin I have to do some string manipulation of the obtained path in my char* get_path(char **argv) function. The directory path size will always vary therefore I'm assigning space with malloc to store it in the memory.
Program snippet:
#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include "dbuffer.h" //my own library
#include "darray.h" //my own library
ARR* get_dir_contents( char* path)
{
DBUFF *buff = NULL;
ARR *car = NULL;
DIR *dir_stream = NULL;
struct dirent *entry = NULL;
dir_stream = opendir(path);
if(opendir(path)==NULL) printf("NULL");
//... more code here
return car;
}
char* get_path(char **argv)
{
char *path = malloc(sizeof(char)* sizeof_pArray( &argv[0][11] ) + 3 );
strcpy(path, "C:");
strcat(path, &argv[0][11]);
printf("%s, sizeof: %d \n",path, sizeof_pArray(path));
return path;
}
int main(int argc, char** argv)
{
char *p = get_path(argv);
ARR *car = get_dir_contents(&p[0]);
//... more code here
return (EXIT_SUCCESS);
}
The problem is that the string that I have doesn't initialize the dir_stream pointer. I suspect it is because of some discrepancy between pointers and string literals but I can't pinpoint what it is exactly. Also the fact that dirent library function expects DIR *opendir(const char *dirname); const char might have something to do with it.
Output:
C:/Users/uk676643/Documents/NetBeansProjects/__OpenDirectoryAndListFiles/dist/Debug/Cygwin_4.x-Windows/__opendirectoryandlistfiles, sizeof: 131
NULL
RUN FAILED (exit value -1,073,741,819, total time: 2s)
There are some things here that can go wrong, so I would suggest doing something like this instead:
char* get_path(char *argv)
{
char *path = malloc(sizeof(char)* strlen(argv) );
if (path != NULL)
{
strcpy(path, "C:");
strcat(path, argv + 11);
printf("%s, sizeof: %zu \n",path, strlen(path)); // strlen() returns a size_t, so use %zu
}
return path;
}
...
char* p = get_path(*argv);
Note: you don't need the extra 3 bytes, since the allocation already includes the 11 bytes you later skip. However, instead of hard-coding the 11-byte offset, you may want to decompose the string and then put it back together so that the code is portable, e.g. by using strtok to split the path and replacing the parts you don't need.
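If you would rather compute the allocation size from the parts that actually end up in the result, a sketch could look like this. with_prefix is a hypothetical helper name; the size is the length of both parts plus one byte for the terminating '\0'.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of rest with prefix in front,
   or NULL on allocation failure. The caller frees the result. */
char *with_prefix(const char *prefix, const char *rest)
{
    size_t prefix_len = strlen(prefix);
    size_t rest_len = strlen(rest);

    char *out = malloc(prefix_len + rest_len + 1);   /* +1 for the '\0' */
    if (out == NULL)
        return NULL;

    memcpy(out, prefix, prefix_len);
    memcpy(out + prefix_len, rest, rest_len + 1);    /* includes the '\0' */
    return out;
}
```

It could be called as with_prefix("C:", argv[0] + 11), keeping the question's assumption that the first 11 characters are to be skipped. That offset is only safe if strlen(argv[0]) is at least 11, so check the length first.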
Could it be a simple confusion about argv? Please insert the following lines
just at the beginning of your main(). Is the output what you expected?
printf("\n argv[0]== %s" , argv[0] );
getchar();
printf("\n argv[1]== %s" , argv[1] );
getchar();
OK, so we work from argv[0]. Please try this for get_path:
char *get_path(char *argv)
{
    // +3: two bytes for the drive letter and colon, one for the terminating '\0'
    char *path = malloc(strlen(argv) + 3);
    if (path != NULL)
    {
        strcpy(path, "C:");
        strcat(path, argv);
        // we get the path and the name of calling program
        printf("\n path and program== %s",path);
        printf("%s, sizeof: %zu \n",path, strlen(path));
        // now remove calling program name, never stepping before the start of path
        for (size_t i = strlen(path); i > 0; i--)
        {
            // we are working in windows, but the path may use '/' as in the posted output
            if (path[i-1] == '\\' || path[i-1] == '/') break;
            path[i-1] = '\0';
        }
    }
    return path;
}

Getting folder from a path

Let's say that I have a path as a string (like this one):
/ROOT/DIRNAME/FILE.TXT
How can I get the parent folder of file.txt (DIRNAME in this case)?
For a path that should have at least one directory in it:
char str[1024]; // arbitrary length. just for this example
char *p;
strcpy(str, "/ROOT/DIRNAME/FILE.TXT"); // just get the string from somewhere
p = strrchr(str, '/');
if (p && p != str)
{
    *p = 0; // cut off the last path element
    p = strrchr(str, '/'); // find the slash in front of the folder name
    if (p)
        printf("folder : %s\n", p+1); // print folder immediately before the last path element (DIRNAME as requested)
    else
        printf("folder : %s\n", str); // print from beginning
}
else
    printf("not a path with at least one directory in it\n");
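If the path must not be modified (for example because it is a string literal), the same idea can be sketched with a pointer and a length instead of writing a '\0' into the string. parent_name is a hypothetical helper name.

```c
#include <string.h>

/* Hypothetical helper: finds the directory name immediately before the last
   path element without modifying path. On success it returns a pointer into
   path and stores the length of the name in *len; it returns NULL if path
   has no parent directory name. */
const char *parent_name(const char *path, size_t *len)
{
    const char *last = strrchr(path, '/');      /* slash before the file name */
    if (last == NULL || last == path)
        return NULL;

    const char *start = last;
    while (start > path && start[-1] != '/')    /* walk back to the previous slash */
        start--;

    *len = (size_t)(last - start);
    return *len > 0 ? start : NULL;
}
```

The result is not null-terminated, so print it with a precision, for example printf("folder : %.*s\n", (int)len, name);. For "/ROOT/DIRNAME/FILE.TXT" the name is DIRNAME with a length of 7.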
Locate last occurrence of / using strrchr. Copy everything from beginning of string to the found location. Here is the code:
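char str[] = "/ROOT/DIRNAME/FILE.TXT";
char *ch = strrchr(str, '/');
if (ch != NULL) {
    size_t len = (size_t)(ch - str) + 1; // include the trailing slash
    char base[80];
    if (len < sizeof base) {
        strncpy(base, str, len);
        base[len] = '\0'; // strncpy does not terminate the copy here
        printf("%s\n", base);
    }
}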
This works purely on the string; it assumes no knowledge of symlinks or other file types.
You can also do it simply using pointers. Just iterate to the end of the path and then back up until you hit a /, replace it with a terminating null character, and then print the string:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (int argc, char *argv[]) {
    if (argc < 2 ) {
        fprintf (stderr, "Error: insufficient input, usage: %s path\n", argv[0]);
        return 1;
    }
    char *path = strdup (argv[1]);
    if (path == NULL) return 1;   /* check before the copy is used */
    char *p = path;
    while (*p != 0) p++;
    while (p > path)              /* never step before the start of the string */
        if (*--p == '/') {
            *p = 0;
            break;
        }
    printf ("\n path = %s\n\n", path);
    free (path);
    return 0;
}
output:
$ ./bin/spath "/this/is/a/path/to/file.txt"
path = /this/is/a/path/to
