How to get specific data from text line - c

I am trying to get the address from a certain line in a text file. This line contains the following:
Kmart, 7055 East Broadway, Tucson AZ
I am using strcpy and strtok functions to extract the address (7055 East Broadway), but so far I've only been able to extract the name of the store (Kmart) using my code.
char strLine[] holds the line read from the file, and I would like to return the address in char strAddress[].
How would I extract just the address and possibly the city and state?
#define MAX_CHARS_PER_LINE 80
void getAddress(char strAddress[], const char strLine[])
{
char newLine[MAX_CHARS_PER_LINE+1];
strcpy(newLine,strLine);
char* token = strtok(newLine, ",");
strAddress = token;
printf("%s\n",strAddress);
}

Assuming that
the address is the second element in the line, and
the elements are separated by commas,
you can use strtok to get the address as below.
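#include <string.h>

void getAddress(char strAddress[], const char strLine[])
{
    char newLine[MAX_CHARS_PER_LINE + 1];

    /* Work on a bounded copy, because strtok modifies its input. */
    strncpy(newLine, strLine, MAX_CHARS_PER_LINE);
    newLine[MAX_CHARS_PER_LINE] = '\0';
    strAddress[0] = '\0';                  /* empty result if no address is found */

    char *token = strtok(newLine, ",");    /* first field: store name */
    if (token != NULL)
    {
        token = strtok(NULL, ",");         /* second field: address */
    }
    if (token != NULL)
    {
        while (*token == ' ')              /* skip the space after the comma */
        {
            token++;
        }
        strcpy(strAddress, token);         /* caller must supply a large enough buffer */
    }
}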
To get the city and state, just call token = strtok(NULL, ","); one more time. Note that the token starts with the space that follows the comma.
What if I wanted to separate the city-state and just get the city and get the state separately?
This is a more complex job, as there is no comma between the city and the state. You also need to handle city names of more than one word, e.g. New Orleans.
If you can assume that the state is always two characters long, a suggested route is:
Isolate City + State in a string
Remove spaces at the end of the string
The last 2 characters of the string are the state.

Related

string token in a string token

Hi, I'm trying to use strtok to separate the contents of a file.
The text file just has a list of names, which is first read into one char array, data.
I've removed the file-reading part and shown the array for simplicity.
int main()
{
struct Node* head = NULL;
char data[128] = "john smith\nbob jones\nrobert brown";
char *argv [50];
char * token = strtok(data, "\n"); // separates data into lines
while( token != NULL )
{
insertAtBeginning(&head, token); //LL the data gets stored in
token = strtok(NULL, "\n");
}
}
I have managed to split the data into lines, but I also want to split each line (token) into an array on the space character.
So I want argv[0] = "john" and argv[1] = "smith".
This argv array then gets stored in the linked list instead of token.
Thanks, any help will be much appreciated.

Issues with strtok in C

I have the following function in my code to extract names:
void student_init(struct student *s, char *info) {
char *name;
char *name2;
char a[] = { *info };
name = strtok(a, "/");
strcpy(s->first_name, name);
name2 = strtok(NULL, "/");
strcpy(s->last_name, name2);
}
When I run this, I see:
Please enter a student information or enter "Q" to quit.
Daisy/Lee
A student information is read.: Daisy/Lee
Please enter a row number where the student wants to sit.
1
Please enter a column number where the student wants to sit.
2
The seat at row 1 and column 2 is assigned to the student: D . �
?.?. ?.?. ?.?.
?.?. ?.?. D.�.
?.?. ?.?. ?.?.
I am trying to use the strtok function in a C program to split a string on "/" to separate a first and last name and store them in the first_name and last_name members of a student structure. I can get the first name stored in the respective variable, but as you can see from the output above, I am getting a ? symbol where the first initial of the last name should be.
char a[] = { *info };
This is your problem. What this creates is a one-byte character array containing the first character of info and nothing else.
Since strtok requires a string, it's probably going to run off the end of that one-byte array and use whatever happens to be there in memory. That's why you're seeing the first character okay and not much else (though, technically, as undefined behaviour, literally anything is allowed to happen).
Rather than constructing a one-byte array, you should probably just use the string as it was passed in:
name = strtok(info, "/");
The only reason you would make a local copy is if the string you're tokenising was not allowed to be changed (such as if it were a string literal, or if you wanted to preserve it for later use). Since your sample run shows that you're reading into this string, it cannot be a string literal.
And, if you want to preserve it for later, that's probably a cost best incurred by the caller rather than the function (it's wasteful for the function to always make a duplicate when the information as to whether it's needed or not is known only to the caller).
In order to make a copy for tokenising, it's as simple as:
char originalString[] = "pax/diablo";
char *scratchString = strdup(originalString);
if (scratchString != NULL) {
    student_init(s, scratchString);   // s points to the struct student being filled in
    free(scratchString);
} else {
    handleOutOfMemoryIssue();
}
useSafely(originalString);
If your implementation doesn't have strdup (it was POSIX rather than ISO C until C23), it is straightforward to write an equivalent using malloc and strcpy.
In my opinion, a "cleaner" implementation would be along the lines of:
void student_init (struct student *s, char *info) {
// Default both to empty strings.
*(s->first_name) = '\0';
*(s->last_name) = '\0';
// Try for first name, exit if none, otherwise save.
char *field = strtok (info, "/");
if (field == NULL) return;
strcpy (s->first_name, field);
// Try for second name, exit if none, otherwise save.
if ((field = strtok (NULL, "/")) == NULL) return;
strcpy (s->last_name, field);
}

Weird output from strtok

I was having some issues dealing with char*'s from an array of char*'s and used this for reference: Splitting C char array into words
So what I'm trying to do is read in char arrays and split them with a space delimiter so I can do stuff with it. For example if the first token in my char* is "Dog" I would send it to a different function that dealt with dogs. My problem is that I'm getting a strange output.
For example:
INPUT: *cmd = "Dog needs a vet appointment."
OUTPUT: (from print statements) "Doneeds a vet appntment."
I've checked for memory leaks using valgrind, and it reports neither leaks nor other errors.
void parseCmd(char* cmd){ //passing in an individual char* from a char**
char** p_args = calloc(100, sizeof(char*));
int i = 0;
char* token;
token = strtok(cmd, " ");
while (token != NULL){
p_args[i++] = token;
printf("%s",token); //trying to debug
token = strtok(NULL, cmd);
}
free(p_args);
}
Any advice? I am new to C so please bear with me if I did something stupid. Thank you.
In your case,
token = strtok(NULL, cmd);
is not what you should be doing. You instead need:
token = strtok(NULL, " ");
As per the ISO standard:
char *strtok(char * restrict s1, const char * restrict s2);
A sequence of calls to the strtok function breaks the string pointed to by s1 into a sequence of tokens, each of which is delimited by a character from the string pointed to by s2.
The only difference between the first and subsequent calls (assuming, as per this case, you want the same delimiters) should be using NULL as the input string rather than the actual string. By using the input string as the delimiter list in subsequent calls, you change the behaviour.
You can see exactly what's happening if you try the following code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void parseCmd(char* cmd) {
char* token = strtok(cmd, " ");
while (token != NULL) {
printf("[%s] [%s]\n", cmd, token);
token = strtok(NULL, cmd);
}
}
int main(void) {
char x[] = "Dog needs a vet appointment.";
parseCmd(x);
return 0;
}
which outputs (first column will be search string to use next iteration, second is result of this iteration):
[Dog] [Dog]
[Dog] [needs a vet app]
[Dog] [intment.]
The first step worked fine since you were using space as the delimiter and it modified the string by placing a \0 at the end of Dog.
That means the next attempt (with the wrong separator) would use one of the letters from {D,o,g} to split. The first matching letter for that set is the o in appointment, which is why you see needs a vet app. The third attempt finds none of the candidate letters, so you just get back the rest of the string, intment..
token = strtok(NULL, cmd); should be token = strtok(NULL, " ");.
The second argument is for delimiter.
http://man7.org/linux/man-pages/man3/strtok.3.html

Tokenizing multiple String in C with KEIL Compiler

I am writing on a Microcontroller-program using the Keil-Compiler. The program creates several CSV-Like Strings (Logging-Lines). For example "A;001;ERROR;C05;...\n"
To save space I now want to reduce the data by just logging the differences.
Therefore I am saving the last logged line and compare it to the new one. If a value in a column is the same, I want to just omit it. For example:
"A;001;ERROR;C05;...\n" <- previous Log
"A;002;ERROR;C06;...\n" <- new Log
would result in ";002;;C06;...\n"
At first I just included <string.h> and used 'strtok' to step through my CSV-line. Since I need to compare 2 Strings/Lines, I would need to use it simultaneously on 2 different Strings which does not work. So I switched to 'strtok_r' which seems to not work at all:
token1 = strtok_r(m_cActLogLine, ";", pointer1);
while (token1 != NULL) {
token1 = strtok_r(NULL, ";", pointer1);
}
This just gives me strange behaviour. Typically the second call to 'strtok_r' just returns a NULL and the loop is left.
So is there maybe another way of achieving the desired behaviour?
EDIT:
To clarify what I mean, this is what I am currently trying to get to work:
My Input (m_cMeasureLogLine) is "M;0001;001;01;40;1000.00;0.00;1000.00;0.00;360.00;0.00;400.00;24.90;400.00;-9999.00;-9999.00;-9999.00;0;LED;;;;;400.00;34.40;25.41;27.88;29.01;0.00;0.00;0.00;-100.00;0.00;-1000.00;-1000.00;-103.032;-70.192;19;8192.00;0.00;0;"
char m_cActLogLine[MAX_SIZE_PARAM_LINE_TEXT];
char* token1;
char* token2;
char** pointer1;
char** pointer2;
void vLogProtocolMeasureData()
{
strcpy(m_cActLogLine, m_cMeasureLogLine);
token1 = strtok_r(m_cActLogLine, ";", pointer1);
while (token1 != NULL) {
token1 = strtok_r(NULL, ";", pointer1);
}
}
The function is part of a bigger embedded project, so I don't have console output but use the debugger to check the contents of my variables. In the above example, after the first call to 'strtok_r', token1 is 'M', which is correct. After the second call, however (in the loop), token1 becomes 0x00000000 (NULL).
If I use 'strtok' instead:
strcpy(m_cActLogLine, m_cMeasureLogLine);
token1 = strtok(m_cActLogLine, ";");
while (token1 != NULL) {
token1 = strtok(NULL, ";");
}
the loop iterates just fine. But that way I can't process two strings at a time and compare values column-wise.
In string.h the functions are declared as:
extern _ARMABI char *strtok(char * __restrict /*s1*/, const char * __restrict /*s2*/) __attribute__((__nonnull__(2)));
extern _ARMABI char *_strtok_r(char * /*s1*/, const char * /*s2*/, char ** /*ptr*/) __attribute__((__nonnull__(2,3)));
#ifndef __STRICT_ANSI__
extern _ARMABI char *strtok_r(char * /*s1*/, const char * /*s2*/, char ** /*ptr*/) __attribute__((__nonnull__(2,3)));
#endif
You need to pass a pointer to a valid char* as the last parameter of strtok_r(). You're passing pointer1, which is a pointer to a pointer, but it is NULL (it's a globally scoped variable that is never assigned a value), so when strtok_r() tries to store its iteration pointer at the address you passed in, it writes to address 0x00000000.
Try...
char m_cActLogLine[MAX_SIZE_PARAM_LINE_TEXT];
char* token1;
char* pointer1;   /* a real char*, whose address is passed to strtok_r */
void vLogProtocolMeasureData()
{
    strcpy(m_cActLogLine, m_cMeasureLogLine);
    token1 = strtok_r(m_cActLogLine, ";", &pointer1);
    while (token1 != NULL) {
        token1 = strtok_r(NULL, ";", &pointer1);
    }
}

Problem with strtok only saving the first token

I was having a problem with my function yesterday that turned out to be a simple fix, and now I have a different issue with it that is baffling me. The function is a tokenizer:
void tokenizer(FILE *file, struct Set *set) {
int nbytes = 100;
int bytesread;
char *buffer;
char *token;
buffer = (char *) malloc(nbytes + 1);
while((bytesread = getLine(&buffer,&nbytes,file)) != -1) {
token = strtok(buffer," ");
while(token != NULL) {
add(set,token);
token = strtok(NULL," ");
}
}
}
I know that the input(a text file) is getting broken into tokens correctly because in the second while loop I can add printf("%s",token) to display each token. However, the problem is with add. It is only adding the first token to my list, but at the same time still breaks every token down correctly. For example, if my input text was "blah herp derp" I would get
token = blah
token = herp
token = derp
but the list would only contain the first token, blah. I don't believe that the problem is with add because it works on its own, ie I could use
add(set,"blah");
add(set,"herp");
add(set,"derp");
and the result will be a list that contains all three words.
Thank you for any help!
strtok() returns a pointer into the string buffer, and that buffer is overwritten each time you read the next line. You need to strdup() the token and add the copy to your set instead. Don't forget to free() each copy when you clean up the set.

Resources