Simple way to extract part of a file path? - c

I'm not very good at C, and I always get stuck on simple string manipulation tasks (that's why I love Perl!).
I have a string that contains a file path like "/Volumes/Media/Music/Arcade Fire/Black Mirror.aac". I need to extract the drive name ("Media" or preferably "/Volumes/Media") from that path.
Any help would be greatly appreciated, just as I try to return the favor on the Perl questions!
Jim

I think sscanf could be appropriate:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
void test(char const* path) {
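int len = -1; /* stays -1 unless the input matches all the way to %n */
/* sscanf returns 0 here whether or not the literal text matched, because
   %*[ and %n assign nothing that counts, so test len instead */
sscanf(path, "/Volumes/%*[^/]%n", &len);
if(len >= 0) {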
char *drive = malloc(len + 1);
strncpy(drive, path, len);
drive[len] = '\0';
printf("drive is %s\n", drive);
free(drive);
} else {
printf("match failure\n");
}
}
int main() {
test("/Volumes/Media/Foo");
test("/Volumes/Media");
test("/Volumes");
}
Output:
drive is /Volumes/Media
drive is /Volumes/Media
match failure

I think that you need to be a little more exact in the specification of your problem.
When you say that you want to extract "Media", do you mean everything between the second and third '/' character, or is there a more complex heuristic at work?
Also, is the string in a buffer that's suitable to be modified?
Typically, the way to do this is to use strchr or strstr one or more times to find a pointer to the start of the substring you want to extract (say p) and a pointer to the character after its last character (say q). If the buffer is a temporary one that you don't mind destroying, you can just do *q = 0 and p will point to the required string. Otherwise you need a buffer of at least q - p + 1 chars (the +1 makes room for the null terminator after the q - p interesting characters), e.g. char *buffer = malloc(q - p + 1);. You can then extract the string with memcpy and terminate it yourself, e.g. memcpy(buffer, p, q - p); buffer[q - p] = 0;.
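As a rough sketch of the p/q approach described above, assuming the "drive" is everything before the third '/' in an absolute path. The helper name extract_volume is made up for illustration, and the function copies into a fresh buffer rather than modifying the input:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the path up to (but not
   including) its third '/', or NULL if the path is not absolute, has no
   second component, or allocation fails. The caller frees the result. */
char *extract_volume(const char *path)
{
    const char *p = path;          /* start of the substring to keep */
    const char *q;                 /* one past its last character */
    size_t len;
    char *buffer;

    if (path[0] != '/')
        return NULL;
    q = strchr(path + 1, '/');     /* second '/' */
    if (q == NULL || q[1] == '\0')
        return NULL;
    q = strchr(q + 1, '/');        /* third '/', if there is one */
    if (q == NULL)
        q = path + strlen(path);   /* no third '/': keep the whole path */

    len = (size_t)(q - p);
    buffer = malloc(len + 1);      /* +1 for the terminator */
    if (buffer == NULL)
        return NULL;
    memcpy(buffer, p, len);        /* copy exactly q - p characters */
    buffer[len] = '\0';
    return buffer;
}
```

Calling extract_volume("/Volumes/Media/Music/Arcade Fire/Black Mirror.aac") should give "/Volumes/Media", and a path such as "/Volumes" gives NULL.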

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * extractDriveName(const char *path, char separator, int maxLen)
{
char *outBuffer;
int outBufferSize, i, j;
int sepOcur;
outBufferSize = strlen(path) + 1;
outBufferSize = outBufferSize > maxLen ? maxLen : outBufferSize;
outBuffer = (char *) malloc(outBufferSize);
// Error allocating memory.
if(outBuffer == NULL)
return NULL;
memset(outBuffer, 0, outBufferSize);
// Stop one short of the buffer size so the last byte always stays '\0'.
for(i = 0, sepOcur = 0, j = 0; i < outBufferSize - 1; i++)
{
if(path[i] == separator)
sepOcur ++;
if(sepOcur >= 0 && sepOcur < 3)
outBuffer[j++] = path[i];
else if(sepOcur == 3)
break;
}
// Don't forget to free the returned buffer once you are done with it.
return outBuffer;
}
int main(void)
{
char path [] = "/Volumes/Media/Music/Arcade Fire/Black Mirror.aac";
char * driveName = extractDriveName(path, '/', strlen(path) + 1);
if(driveName != NULL)
{
printf("Path location: '%s'\n", path);
printf("Drive name: '%s'\n", driveName);
free(driveName);
}
else
{
printf("Error allocating memory\n");
}
}
Output:
Path location: '/Volumes/Media/Music/Arcade Fire/Black Mirror.aac'
Drive name: '/Volumes/Media'

I think that the easiest way is to use strtok(). This function will split a string in tokens separated by one of the characters in the separators string.
If your original path is in str and you want the second part in part:
strtok(str, "/"); /* To get the first token */
part=strtok(NULL, "/"); /* To get the second token that you want */
Note that strtok() will change str, so it should not be a const. If you have a const string, you might use strdup(), which is POSIX rather than standard C (until C23) but normally available, to create a copy.
Also note that strtok() is not thread safe.
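Here is a small sketch of the strtok() approach applied to a const path. The function name second_component is invented for this example, and it copies with malloc and memcpy instead of strdup() so that it only relies on standard C:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the second '/'-separated
   component of path ("Media" for "/Volumes/Media/..."), or NULL if there
   is none or allocation fails. The caller frees the result. */
char *second_component(const char *path)
{
    size_t n = strlen(path) + 1;
    char *copy = malloc(n);        /* strtok() needs a writable string */
    char *part;
    char *result = NULL;

    if (copy == NULL)
        return NULL;
    memcpy(copy, path, n);

    if (strtok(copy, "/") != NULL) {       /* first token, e.g. "Volumes" */
        part = strtok(NULL, "/");          /* second token, e.g. "Media" */
        if (part != NULL) {
            result = malloc(strlen(part) + 1);
            if (result != NULL)
                strcpy(result, part);
        }
    }
    free(copy);                    /* part pointed into copy, so free it last */
    return result;
}
```

Because strtok() keeps hidden state between calls, this sketch should not be called from several threads at once.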

Your example makes it appear that you are working in a Macintosh environment. I suspect that there is an Apple API to get the volume on which a particular file resides.
Any reason not to use that?
Edit: Looking at your profile, I suspect I guessed wrong about your environment. I can't help you for Windows. Good luck. I'll leave this here in case anyone is looking for the same answer on a Mac.
From Mac Forums I find "Getting the current working Volume name?" which seems to be the same question. There is a nice discussion there, but they don't seem to come to a single answer.
Google is your friend.
Another possibility: BSD Path to Volume Name & Vice-Versa at CocoaDev Forums.

Related

Scanning data from text file, that doesn't have spacing between each item of data

I have encountered a problem with my homework. I need to scan some data from a text file, to a struct.
The text file looks like this.
012345678;danny;cohen;22;M;danny1993;123;1,2,4,8;Nice person
223325222;or;dan;25;M;ordan10;1234;3,5,6,7;Singer and dancer
203484758;shani;israel;25;F;shaninush;12345;4,5,6,7;Happy and cool girl
349950234;nadav;cohen;50;M;nd50;nadav;3,6,7,8;Engineer very smart
345656974;oshrit;hasson;30;F;osh321;111;3,4,5,7;Layer and a painter
Each item of data to its matching variable.
id = 012345678
first_name = danny
etc...
Now, I can't use fscanf because there is no spacing, and fgets reads the whole line.
I found a solution using %[^;]s, but then I would need to write one block of code and copy and paste it 9 times, once for each item of data.
Is there any other option, without changing the text file, that is similar to the code I would write with fscanf if there were spacing between each item of data?
************* UPDATE **************
Hey, first of all, thanks everyone for the help; I really appreciate it.
I didn't understand all your answers, but here is something I did use.
Here's my code :
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct
{
char *idP, *firstNameP, *lastNameP;
int age;
char gender, *userNameP, *passwordP, hobbies, *descriptionP;
}user;
void main() {
FILE *fileP;
user temp;
char test[99];
temp.idP = (char *)malloc(99);
temp.firstNameP = (char *)malloc(99);
temp.lastNameP = (char *)malloc(99);
temp.age = (int )malloc(4);
temp.gender = (char )malloc(sizeof(char));
temp.userNameP = (char *)malloc(99);
fileP = fopen("input.txt", "r");
fscanf(fileP, "%9[^;];%99[^;];%99[^;];%d;%c", temp.idP,temp.firstNameP,temp.lastNameP,&temp.age, temp.gender);
printf("%s\n%s\n%s\n%d\n%c", temp.idP, temp.firstNameP, temp.lastNameP, temp.age, temp.gender);
fgets(test, 60, fileP); // Just testing where it stop scanning
printf("\n\n%s", test);
fclose(fileP);
getchar();
}
It all works well until I scan the int variable, right after that it doesn't scan anything, and I get an error.
Thanks a lot.
As discussed in the comments, fscanf is probably the shortest option (although fgets followed by strtok, and manual parsing are viable options).
You need to use the %[^;] specifier for the string fields (meaning: a string of characters other than ;), with the fields separated by ; to consume the actual semicolons (which we specifically requested not to be consumed as part of the string field). The last field should be %[^\n] to consume up to the newline, since the input doesn't have a terminating semicolon.
You should also (always) limit the length of each string field read with a scanf family function to one less than the available space (the terminating NUL byte is the +1). So, for example, if the first field is at most 9 characters long, you would need char field1[10] and the format would be %9[^;].
It is usually a good idea to put a single space in the beginning of the format string to consume any whitespace (such as the previous newline).
And, of course you should check the return value of fscanf, e.g., if you have 9 fields as per the example, it should return 9.
So, the end result would be something like:
if (fscanf(file, " %9[^;];%99[^;];%99[^;];%d;%c;%99[^;];%99[^;];%99[^;];%99[^\n]",
s.field1, s.field2, s.field3, &s.field4, …, s.field9) != 9) {
// error
break;
}
(Alternatively, the field with numbers separated by commas could be read as four separate fields as %d,%d,%d,%d, in which case the count would go up to 12.)
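To make the format string concrete, here is a minimal sketch that parses one line of the sample file with sscanf. The struct layout, the field sizes and the name parse_user are assumptions made for illustration. The password is read as a string because the sample data contains non-numeric passwords such as "nadav":

```c
#include <stdio.h>

/* Field sizes are assumptions: each buffer is one byte larger than the
   width used for it in the format string. */
typedef struct {
    char id[10];
    char first_name[100];
    char last_name[100];
    int  age;
    char gender;
    char user_name[100];
    char password[100];
    char hobbies[100];
    char description[100];
} user_record;

/* Parses one ';'-separated record. Returns 1 if all nine fields were
   converted, 0 otherwise. */
int parse_user(const char *line, user_record *u)
{
    int n = sscanf(line,
                   " %9[^;];%99[^;];%99[^;];%d;%c;%99[^;];%99[^;];%99[^;];%99[^\n]",
                   u->id, u->first_name, u->last_name, &u->age, &u->gender,
                   u->user_name, u->password, u->hobbies, u->description);
    return n == 9;   /* sscanf reports how many fields it assigned */
}
```

Note that %c needs the address of the char (&u->gender), and that %[^;] fails on an empty field, so a record with two consecutive semicolons would be rejected by this sketch.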
Here is a simple tokenizer. As I see it, you have more than one delimiter here (; and ,).
str - string to be tokenized
del - string containing delimiters (in your case ";," or ";" only)
allowempty - if true, allows empty tokens when there are two or more consecutive delimiters
The return value is a NULL-terminated table of pointers to the tokens.
#include <stdlib.h>
#include <string.h>
char **mystrtok(const char *str, const char *del, int allowempty)
{
char **result = NULL;
const char *end = str;
size_t size = 0;
int extrachar;
while(*end)
{
if((extrachar = !!strchr(del, *end)) || !*(end + 1))
{
/* add temp variable and malloc / realloc checks */
/* free allocated memory on error */
if(!(!allowempty && !(end - str)))
{
extrachar = !extrachar * !*(end + 1);
result = realloc(result, (++size + 1) * sizeof(*result));
result[size] = NULL;
result[size -1] = malloc(end - str + 1 + extrachar);
strncpy(result[size -1], str, end - str + extrachar);
result[size -1][end - str + extrachar] = 0;
}
str = end + 1;
}
end++;
}
return result;
}
To free the memory allocated by the tokenizer:
void myfree(char **ptr)
{
char **savedptr = ptr;
while(*ptr)
{
free(*ptr++);
}
free(savedptr);
}
The function is simple, but you can use any separators and any number of separators.

Difficulty printing char pointer array

I've been struggling on this problem all day now, and looking at similar examples hasn't gotten me too far, so I'm hoping you can help! I'm working on the programming assignment 1 at the end of CH 3 of Operating Systems Concepts if anyone wanted to know the context.
So the problem is to essentially create a command prompt in c that allows users to input a command, fork and execute it, and save the command in history. The user can enter the command 'history' to see the 10 most recent commands printed out. The book instructed me to store the current command as a char pointer array of arguments, and I would execute the current one using execvp(args[0], args). My professor added other requirements to this, so having each argument individually accessible like this will be useful for those parts as well.
I decided to store the history of commands in a similar fashion using an array of char pointers. So for example if the first command was ls -la and the second command entered was cd.. we would have history[0] = "ls -la" and history[1] = "cd..". I'm really struggling getting this to work, and I'm fairly certain I'm screwing up pointers somewhere, but I just can't figure it out.
In main I can print the first word in the first command (so just ls for ls -la) using arg_history[0] but really can't figure out printing the whole thing. But I know the data's there and I verify it when I add it in (via add_history function) and it's correct! Even worse when I pass it to the get_history function made for printing the history, it prints a bunch of gibberish. I would greatly appreciate any help in understanding why it's doing this! Right now I have a hunch it's something to do with passing pointers incorrectly between functions, but based on what I've been looking at I can't spot the problem!
/**
* Simple shell interface program.
*
* Operating System Concepts - Ninth Edition
* Copyright John Wiley & Sons - 2013
*/
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LINE 80 /* 80 chars per line, per command */
#define HIST_LENGTH 10
void get_input(char *args[], int *num_args, char *history[], int *hist_index);
void get_history(char *history[], int hist_index);
void add_history(char *history[], char *added_command, int *hist_index);
int main(void)
{
char *args[MAX_LINE/2 + 1]; /* command line (of 80) has max of 40 arguments */
char *arg_history[HIST_LENGTH];
int num_args;
int hist_index;
int should_run = 1;
int i;
while (should_run){
printf("osh>");
fflush(stdout);
get_input(args, &num_args, arg_history, &hist_index);
//printf("%s\n", arg_history[0]); //incorrectly prints up to the first space
//printf("%s\n", args[0]) //prints the correct arg from the last command (eg. for 'ls -la' it prints ls for args[0] and -la for args[1])
if (strcmp(args[0], "history") == 0) {
get_history(arg_history, hist_index);
}
}
return 0;
}
void get_input(char *args[], int *num_args, char *history[], int *hist_index) {
char input[MAX_LINE];
char *arg;
fgets(input, MAX_LINE, stdin);
input[strlen(input) - 1] = NULL; // To remove new line character - the compiler doesn't like how I'm doing this
add_history(history, input, hist_index);
arg = strtok(input, " ");
*num_args = 0;
while(arg != NULL) {
args[*num_args] = arg;
*num_args = *num_args + 1;
arg = strtok(NULL, " ");
}
}
void get_history(char *history[], int hist_index) {
int i;
for (i = 0; i < HIST_LENGTH; i++) {
printf("%d %s\n", hist_index, *history);
// prints gibberish
hist_index = hist_index - 1;
if (hist_index < 1) {
break;
}
}
}
void add_history(char *history[], char *added_command, int *hist_index) {
int i;
for (i = HIST_LENGTH-1; i > 0; i--) {
history[i] = history[i-1];
}
history[0] = added_command;
*hist_index = *hist_index + 1;
//printf("%s\n", history[0]); prints correctly
}
Update:
I made the changes suggested by some of the solutions including moving the pointer to input out of the function (I put it in main) and using strcpy for the add_history function. The reason I was having an issue using this earlier was because I'm rotating the items 'up' through the array, but I was accessing uninitialized locations before history was full with all 10 elements. While I was now able to print the arg_history[0] from main, I was still having problems printing anything else (eg. arg_history[1]). But more importantly, I couldn't print from the get_history function which is what I actually needed to solve. After closer inspection I realized hist_index is never given a value before it's used to access the array. Thanks for the help everyone.
input[strlen(input) - 1] = NULL; // To remove new line character - the compiler doesn't like how I'm doing this
Of course it doesn't. There are several things wrong with this. If strlen(input) is 0, for example, then strlen(input) - 1 wraps around to SIZE_MAX (strlen returns an unsigned size_t), and you write far outside the array. In addition, NULL is a null pointer constant, not a character value. You probably meant input[strlen(input) - 1] = '\0';, but a safer solution would be:
input[strcspn(input, "\n")] = '\0';
history[0] = added_command;
*hist_index = *hist_index + 1;
//printf("%s\n", history[0]); prints correctly
This prints correctly because the pointer value added_command, which you assign to history[0], points into input in get_input, and that array is still alive at this point. Once get_input returns, the object that pointer points to no longer exists, so history[0] is left dangling.
You should know you need to use strcpy to assign strings by now, if you're reading a book (such as K&R2E). Before you do that, you need to create a new object of suitable size (e.g. using malloc)...
This is a common problem for people who aren't reading a book... Which book are you reading?
printf("%d %s\n", hist_index, *history);
// prints gibberish
Well, yes, it prints gibberish because the object that *history once pointed to, before get_input returned, was destroyed when get_input returned. A book would teach you this.
See also Returning an array using C for a similar explanation...
Here is a description of strtok(). It modifies the string it is given, writing a '\0' over each delimiter it finds. Because you put a pointer to input in your history list instead of a copy, and then tokenized input, you'd only print the first word.
char *strtok(char *str, const char *delim)
Parameters
str -- The contents of this string are modified and broken into smaller strings (tokens).
delim -- This is the C string containing the delimiters. These may vary from one call to another.
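Both answers come down to storing a copy of the command before strtok() touches input and before input goes out of scope. A minimal sketch of an add_history that does this follows. It is an illustrative variant, not the book's or the asker's code: it takes a count instead of the original hist_index, and it assumes every entry of history starts out as NULL:

```c
#include <stdlib.h>
#include <string.h>

#define HIST_LENGTH 10

/* Stores a private copy of command as the newest entry (index 0), shifting
   older entries up and discarding the oldest one once the list is full.
   Every entry of history must be NULL before the first call.
   Returns 0 on success, -1 if the copy could not be allocated. */
int add_history(char *history[], int *hist_count, const char *command)
{
    int i;
    char *copy = malloc(strlen(command) + 1);

    if (copy == NULL)
        return -1;
    strcpy(copy, command);                 /* the copy outlives the input buffer */

    free(history[HIST_LENGTH - 1]);        /* free(NULL) is a harmless no-op */
    for (i = HIST_LENGTH - 1; i > 0; i--)
        history[i] = history[i - 1];
    history[0] = copy;
    if (*hist_count < HIST_LENGTH)
        (*hist_count)++;
    return 0;
}
```

With char *history[HIST_LENGTH] = {0}; and int count = 0; in main, calling add_history(history, &count, input) before tokenizing input keeps the full command line intact, and a printing loop only needs to visit indices 0 to count - 1.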

Changing the extension of a passed filename

My function is passed a filename of the type
char *myFilename;
I want to change the existing extension to ".sav", or if there is no extension, simply add ".sav" to the end of the file. But I need to consider files named such as "myfile.ver1.dat".
Can anyone give me an idea on the best way to achieve this.
I was considering using a function to find the last "." and remove all characters after it and replace them with "sav", or if no "." is found, simply add ".sav" to the end of the string. But I'm not sure how to do it, as I get confused by the '\0' part of the string and whether strlen returns the length of the whole string including the '\0' or whether I need to add 1 to the string length afterwards.
I want to eventually end up with a filename to pass to fopen().
Maybe something like this:
char *ptrFile = strrchr(myFilename, '/');
ptrFile = (ptrFile) ? ptrFile+1 : myFilename;
char *ptrExt = strrchr(ptrFile, '.');
if (ptrExt != NULL)
strcpy(ptrExt, ".sav");
else
strcat(ptrFile, ".sav");
And then do it the traditional way: remove and rename.
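The snippet above writes into the caller's buffer, so it relies on myFilename having room for the extra characters. As an alternative sketch, the result can be built in a freshly allocated buffer. The name with_sav_extension is invented here, and treating a leading dot (as in ".profile") as part of the name rather than as an extension is an assumption of this example:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of filename with its last
   extension replaced by ".sav", or with ".sav" appended if it has none.
   Only dots after the last '/' count, and a dot that starts the file name
   is treated as part of the name. The caller frees the result. */
char *with_sav_extension(const char *filename)
{
    const char *base = strrchr(filename, '/');
    const char *dot;
    size_t stem_len;
    char *result;

    base = (base != NULL) ? base + 1 : filename;   /* start of the file name */
    dot = strrchr(base, '.');
    if (dot == NULL || dot == base)
        stem_len = strlen(filename);               /* nothing to strip */
    else
        stem_len = (size_t)(dot - filename);       /* keep everything before the dot */

    /* strlen does not count the '\0', hence the explicit +1 */
    result = malloc(stem_len + strlen(".sav") + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, filename, stem_len);
    strcpy(result + stem_len, ".sav");             /* also writes the terminator */
    return result;
}
```

For "myfile.ver1.dat" this should produce "myfile.ver1.sav", and for "myfile" it should produce "myfile.sav".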
Here's something lazy I've whipped up, it makes minimum use of the standard library functions (maybe you'd like something that does?):
#include <stdio.h>
#include <string.h>
// size is the total capacity of the buffer that input points to
void change_type(char* input, const char* new_extension, int size)
{
char* start = input; // remember where the string begins
char* dot = NULL; // last dot seen so far, if any
while(*input != '\0') // walk to the terminator, noting the last dot on the way
{
if(*input == '.')
dot = input;
input++;
}
if(dot == NULL) // no dot found: append at the end of the string instead
dot = input;
// only write if the result, including the terminator, fits in the buffer
if((size_t)(dot - start) + strlen(new_extension) + 1 <= (size_t) size)
strcpy(dot, new_extension);
}
int main()
{
char input[10] = "file";
change_type(input, ".bff", sizeof(input));
printf("%s\n", input);
return 0;
}
And it indeed prints file.bff. Note that the extension is written only if the result fits in the buffer; otherwise the string is left unchanged.
strlen returns the number of characters in the string, not counting the terminating '\0'. Since arrays are indexed from 0,
filename [strlen(filename)]
is the terminating null.
int p;
for (p = strlen (filename) - 1; (p > 0) && (filename[p] != '.'); p--)
;
will loop down to zero if there is no extension and stop at the dot otherwise.

What's the C library function to generate random string?

Is there a library function that creates a random string in the same way that mkstemp() creates a unique file name? What is it?
There's no standard function, but your OS might implement something. Have you considered searching through the manuals? Alternatively, this task is simple enough. I'd be tempted to use something like:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
void rand_str(char *, size_t);
int main(void) {
char str[] = { [41] = '\1' }; // make the last character non-zero so we can test based on it later
rand_str(str, sizeof str - 1);
assert(str[41] == '\0'); // test the correct insertion of string terminator
puts(str);
}
void rand_str(char *dest, size_t length) {
char charset[] = "0123456789"
"abcdefghijklmnopqrstuvwxyz"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
while (length-- > 0) {
// divide by RAND_MAX + 1 so the index can never reach the terminating '\0'
size_t index = (double) rand() / ((double) RAND_MAX + 1) * (sizeof charset - 1);
*dest++ = charset[index];
}
*dest = '\0';
}
This has the neat benefit of working correctly on EBCDIC systems, and being able to accommodate virtually any character set. I haven't added any of the following characters into the character set, because it seems clear that you want strings that could be filenames:
":;?#[\]^_`{|}"
I figured many of those characters could be invalid in filenames on various OSes.
There's no built-in API. On a Unix-like system you may use /dev/urandom, like:
FILE *f = fopen( "/dev/urandom", "r");
if( !f) ...
fread( binary_string, 1, string_length, f);
fclose(f);
Note that this will create binary data, not string data, so you may have to filter it afterwards.
You may also use standard pseudorandom generator rand():
#include <time.h>
#include <stdlib.h>
// In main:
srand(time(NULL));
for( int i = 0; i < string_length; ++i){
string[i] = '0' + rand()%78; // starting on '0', ending on '}' (assuming ASCII)
}
And if you need a truly random string, search for cryptographically secure random number generation, which is one of cryptography's difficult problems and still has no perfect solution :)
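As a sketch of the filtering step mentioned for /dev/urandom above, the raw bytes can be mapped onto a character set. The function name urandom_str and the alphanumeric character set are choices made for this example, and it assumes a system that provides /dev/urandom:

```c
#include <stdio.h>

/* Fills dest with length characters drawn from an alphanumeric set, using
   bytes read from /dev/urandom, and terminates it. dest must have room for
   length + 1 bytes. Returns 0 on success, -1 if the device cannot be read. */
int urandom_str(char *dest, size_t length)
{
    static const char charset[] = "0123456789"
                                  "abcdefghijklmnopqrstuvwxyz"
                                  "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const unsigned nchars = sizeof charset - 1;       /* 62 usable characters */
    const unsigned limit = 256 - (256 % nchars);      /* bytes >= limit are skipped */
    FILE *f = fopen("/dev/urandom", "rb");
    size_t i = 0;
    int c;

    dest[0] = '\0';                                   /* leave a valid string on failure */
    if (f == NULL)
        return -1;
    while (i < length) {
        c = getc(f);
        if (c == EOF) {
            fclose(f);
            dest[0] = '\0';
            return -1;
        }
        if ((unsigned)c < limit)                      /* rejection keeps the choice uniform */
            dest[i++] = charset[(unsigned)c % nchars];
    }
    dest[length] = '\0';
    fclose(f);
    return 0;
}
```

Skipping bytes at or above limit avoids the slight bias a plain modulo would introduce, since 256 is not a multiple of 62.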

Reading formatted strings from file into Array in C

I am new to the C programming language and trying to improve by solving problems from the Project Euler website using only C and its standard libraries. I have covered basic C fundamentals (I think), functions, pointers, and some basic file IO but now am running into some issues.
The question is about reading a text file of first names and calculating a "name score" blah blah, I know the algorithm I am going to use and have most of the program setup but just cannot figure out how to read the file correctly.
The file is in the format
"Nameone","Nametwo","billy","bobby","frank"...
I have searched and searched and tried countless things but cannot seem to read these as individual names into an array of strings (I think that's the right way to store them individually?). I have tried using sscanf/fscanf with %[^\",]. I have tried different combos of those functions and fgets, but my understanding of fgets is that every time I call it, it will get a new line, and this is a text file with over 45,000 characters all on the same line.
I am unsure if I am running into problems with my misunderstanding of the scanf functions, or my misunderstanding with storing an array of strings. As far as the array of strings goes, I (think) I have realized that when I declare an array of strings it does not allocate memory for the strings themselves, something that I need to do. But I still cannot get anything to work.
Here is the code I have now to try to just read in some names I enter from the command line to test my methods.
This code works to input any string up to buffer size(100):
int main(void)
{
int i;
char input[100];
char* names[10];
printf("\nEnter up to 10 names\nEnter an empty string to terminate input: \n");
for(int i = 0; i < 10; i++)
{
int length = 0;
printf("%d: ", i);
fgets(input, 100, stdin);
length = (int)strlen(input);
input[length-1] = 0; // Delete newline character
length--;
if(length < 1)
{
break;
}
names[i] = malloc(length+1);
assert(names[i] != NULL);
strcpy(names[i], input);
}
}
However, I simply cannot make this work for reading in the formatted strings.
PLEASE advise me as to how to read it in with format. I have previously used sscanf on the input buffer and that has worked fine, but I dont feel like I can do that on a 45000+ char line? Am I correct in assuming this? Is this even an acceptable way to read strings into an array?
I apologize if this is long and/or not clear, it is very late and I am very frustrated.
Thank anyone and everyone for helping, and I am looking forward to finally becoming an active member on this site!
There are really two basic issues here:
1. Whether scanning string input is the proper strategy here. I would argue not: while it might work on this task, you are going to run into more complicated scenarios where it too easily breaks.
2. How to handle a 45k string.
In reality you won't run into too many strings of this size, but it is nothing that a modern computer of any capacity can't easily handle. Insofar as this is for learning purposes, learn iteratively.
The easiest first approach is to fread() the entire line/file into an appropriately sized buffer and parse it yourself. You can use strtok() to break up the comma-delimited tokens and then pass the tokens to a function that strips the quotes and returns the word. Add the word to your array.
For a second pass you can do away with strtok() and just parse the string yourself by iterating over the buffer and breaking up the comma tokens yourself.
Last but not least you can write a version that reads smaller chunks of the file into a smaller buffer and parses them. This has the added complexity of handling multiple reads and managing the buffers to account for half-read tokens at the end of a buffer and so on.
In any case, break the problem into chunks and learn with each refinement.
EDIT
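#include <string.h>
#define MAX_STRINGS 5000
#define MAX_NAME_LENGTH 30
// Copies str into newstr without its double quotes; newstr must be large enough.
char* stripQuotes(char *str, char *newstr)
{
char *temp = newstr;
while (*str)
{
if (*str != '"')
{
*temp = *str;
temp++;
}
str++;
}
*temp = '\0'; // terminate the copy
return(newstr);
}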
int main(int argc, char *argv[])
{
char fakeline[] = "\"Nameone\",\"Nametwo\",\"billy\",\"bobby\",\"frank\"";
char *token;
char namebuffer[MAX_NAME_LENGTH] = {'\0'};
char *name;
int index = 0;
char nameArray[MAX_STRINGS][MAX_NAME_LENGTH];
token = strtok(fakeline, ",");
if (token)
{
name = stripQuotes(token, namebuffer);
strcpy(nameArray[index++], name);
}
while (token != NULL)
{
token = strtok(NULL, ",");
if (token)
{
memset(namebuffer, '\0', sizeof(namebuffer));
name = stripQuotes(token, namebuffer);
strcpy(nameArray[index++], name);
}
}
return(0);
}
fscanf(stdin, "%99s", input) reads one token (a run of characters delimited by whitespace) at a time; the width keeps it within the 100-byte input buffer. You can either scan the input until you encounter a specific "end-of-input" string, such as "!", or you can wait for the end-of-file signal, which is achieved by pressing "Ctrl+D" on a Unix console or by pressing "Ctrl+Z" on a Windows console.
The first option:
fscanf(stdin, "%99s", input);
if (input[0] == '!') {
break;
}
// Put input on the array...
The second option:
result = fscanf(stdin, "%99s", input);
if (result == EOF) {
break;
}
// Put input on the array...
Either way, as you read one token at a time, there is no limit on the total size of the input, although each token still has to fit in the input buffer.
Why not search the giant string for quote characters instead? Something like this:
#include <stdio.h>
#include <string.h>
int main(void)
{
char mydata[] = "\"John\",\"Smith\",\"Foo\",\"Bar\"";
char namebuffer[20];
unsigned int i, j;
int begin = 1;
unsigned int beginName, endName;
for (i = 0; i < sizeof(mydata); i++)
{
if (mydata[i] == '"')
{
if (begin)
{
beginName = i;
}
else
{
endName = i;
for (j = beginName + 1; j < endName; j++)
{
namebuffer[j-beginName-1] = mydata[j];
}
namebuffer[endName-beginName-1] = '\0';
printf("%s\n", namebuffer);
}
begin = !begin;
}
}
}
You find the first double quote, then the second, and then read out the characters in between to your name string. Then you process those characters as needed for the problem in question.
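The quote-scanning idea can also be packaged as a function that fills an array of strings, using strchr to find each pair of quotes. This is only a sketch: the name parse_quoted_names, the fixed NAME_LEN, and the choice to truncate over-long names are assumptions of the example:

```c
#include <stddef.h>
#include <string.h>

#define NAME_LEN 32   /* assumed maximum name length, including the terminator */

/* Copies each double-quoted name found in data into names[], up to
   max_names entries, and returns how many were stored. Names longer than
   NAME_LEN - 1 characters are truncated. */
size_t parse_quoted_names(const char *data, char names[][NAME_LEN], size_t max_names)
{
    size_t count = 0;
    const char *open = strchr(data, '"');            /* opening quote */

    while (open != NULL && count < max_names) {
        const char *close = strchr(open + 1, '"');   /* matching closing quote */
        size_t len;

        if (close == NULL)
            break;                                   /* unbalanced quote: stop */
        len = (size_t)(close - open - 1);            /* characters between the quotes */
        if (len >= NAME_LEN)
            len = NAME_LEN - 1;
        memcpy(names[count], open + 1, len);
        names[count][len] = '\0';
        count++;
        open = strchr(close + 1, '"');               /* next opening quote, if any */
    }
    return count;
}
```

Since the whole file is a single line, it could be read into one buffer with fread first and then passed to this function as data.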

Resources