Separating multiple first and/or last names in C

So I'm working on a small project where I want to take mock data and separate it into structs. But I was thinking about the issue of people with multiple first and/or last names.
I want to write first names the usual way (like "Michael") and last names in all capital letters (like "JAMESON").
But what if I'm reading a name like Michael Daniel VAN DOORNE? I don't know how to separate "Michael Daniel" as the first name and "VAN DOORNE" as the last name. I tried stopping at the first capital letter, but of course I am also capitalizing the first letter of each first name.
Example:
I want to read Michael Daniel VAN DOORNE, and separate it into "Michael Daniel" as firstname, and "VAN DOORNE" as the surname.
sscanf(buffer, "%s %s", firstName, lastName);
Naturally, that wouldn't work. I'm kind of stuck on coming up with a solution for mock names with multiple first and last names.

As you seem to be in total control of the data, I'd rather recommend a different approach:
Use a specific separator character between forename(s) and surname(s). Then you no longer rely on letter case, and the single-character name problem raised in another answer goes away.
The separator should be a character that will never appear in a name, such as a tab (in contrast to a space), '#' or '|'. Even a comma or semicolon should be fine, though the period might appear in abbreviated names and thus should not be used.
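A minimal sketch of the separator approach, assuming '|' is chosen as the separator. The struct person layout, the field sizes, and the function name split_name are hypothetical and only serve the illustration.

```c
#include <string.h>

/* Hypothetical record layout; the field sizes are assumptions. */
struct person {
    char firstName[64];
    char lastName[64];
};

/* Splits "Michael Daniel|VAN DOORNE" at the first '|'.
   Returns 1 on success, 0 if the separator is missing or a part does not fit. */
int split_name(const char *line, struct person *out)
{
    const char *sep = strchr(line, '|');
    if (sep == NULL)
        return 0;

    size_t firstLen = (size_t)(sep - line);
    size_t lastLen = strcspn(sep + 1, "\r\n");   /* ignore a trailing newline left by fgets */
    if (firstLen >= sizeof out->firstName || lastLen >= sizeof out->lastName)
        return 0;

    memcpy(out->firstName, line, firstLen);
    out->firstName[firstLen] = '\0';
    memcpy(out->lastName, sep + 1, lastLen);
    out->lastName[lastLen] = '\0';
    return 1;
}
```

Because the split no longer depends on letter case, the names can be stored in whatever capitalization you like.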

Knowing whether a token is part of a first name or a last name is the challenging bit, but from the sound of it you are in control of the data, so you can either lowercase the first names and uppercase the last names or use some other convention.
Breaking up the string itself is relatively easy with strtok.
I'm assuming that you are reading names line by line and storing each line in buffer.
Use strtok to break buffer into "names".
char *token;
token = strtok(buffer, " "); // the second parameter is the set of delimiter characters
while (token != NULL)
{
    // token[0] only tells the two apart if first names are all lowercase;
    // with names written like "Michael", test token[1] instead
    if (isupper((unsigned char)token[0])) {
        // store that token into your struct (first or last name); allow for multiple
    } else {
        // store it into the other field
    }
    token = strtok(NULL, " "); // keep looping over the string for more names
}

Assuming last names are always written in upper case, scan the string from the end until you find the last lowercase letter.
int i = strlen(buffer) - 1;
while (i > 0 && !islower((unsigned char)buffer[i]))
    i--;
strncpy(firstName, buffer, i + 1);
firstName[i + 1] = '\0'; // strncpy does not add the terminator here
strcpy(lastName, &buffer[i + 2]);

Here's another solution. Read until there are two capitals in a row, or a capital followed by a space. Then use pointer arithmetic to fill firstname and lastname.
char name[] = "Michael Daniel VAN DOORNE";
char *p = name;
char firstname[100] = { 0 };
char lastname[100] = { 0 };
while (*p)
{
if (isupper(p[0]) && (isupper(p[1]) || p[1] == ' '))
{
strcpy(lastname, p);
strncpy(firstname, name, p - name - 1);
break;
}
p++;
}

If you're working with ASCII, here is a charset-specific trick that will help you:
#define TWOUPPER(c0, c1) (!((c0) & 32) && !((c1) & 32))
This works even for single-character last names, because the null terminator has bit 5 (value 32) clear just like an uppercase ASCII letter. Single-character middle names are not taken as the last name, because the space that follows has that bit set and fails the test.
It works for me with the following test cases, comparing every pair of adjacent characters in the string and stopping at the first match:
char test1[100] = "Otto VON BISMARK",
test2[100] = "Johannes Diderik VAN DER WAALS",
test3[100] = "Vincent VAN GOGH",
test4[100] = "Govind A B C D P"; // Only the "P" is counted as the last name here
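The pairwise scan described above might look like the sketch below. The function name surname_start is hypothetical, and the code assumes ASCII input that contains only letters and spaces (digits and most punctuation have bit 5 set, but characters such as '@' or '_' do not).

```c
#include <stddef.h>

#define TWOUPPER(c0, c1) (!((c0) & 32) && !((c1) & 32))

/* Returns the index where the surname starts, or -1 if no pair matches.
   Assumes ASCII input made up of letters and spaces only. */
int surname_start(const char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* s[i + 1] is at worst the terminating '\0', so the read is in bounds */
        if (TWOUPPER(s[i], s[i + 1]))
            return (int)i;
    }
    return -1;
}
```

For "Otto VON BISMARK" this returns 5, so the first name is the 4 characters before the preceding space and the surname is the rest of the string from index 5.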

Related

I need to print out all of the words in the file that begin with an uppercase character

I need to read a file and print all the words in it that start with a capital letter, along with how many times each word occurs. For example, suppose the file contains the text "the Program should Display Files and also Files".
Then the output should be:
Text
Program
Display
Files(2)//This word is written two times

while (!feof(..)) is normally not a good idea; instead write
while (fgets(readLine, sizeof(readLine), fpointer) != NULL)
{
}
ptr seems superfluous in this context. If you want to check the words in the line, shouldn't you move it forward through the line?
Alternatively, use the library function strtok:
for (ptr = strtok(readLine, " "); ptr != NULL; ptr = strtok(NULL, " "))
{
    // now ptr will point to each word in the line, then you just check
    // if the first character is upper case.
}

This sounds like a homework assignment, so I am not going to put the code here. I can give you the steps to take to have a general idea:
Open and read the file
Use something like strtok to split the lines into words
Loop over the words and check the first character (remember that a word is an array of chars). You can check whether it is in the range 65 - 90 ('A' to 'Z') in the ASCII table, or use isupper.
To count words you can create a hashmap in which you store the word with a count as value e.g. {word1: 1, word2: 2}
In the end you go over all the keys in the hashmap and print the key + count.
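The steps above can be sketched as follows. This is only one possible shape: it swaps the hashmap for a small fixed-size array with a linear search, and the names count_capitalized, struct word_count, MAX_WORDS and MAX_LEN are hypothetical. It works on a text buffer, which in a full program you would fill line by line with fgets.

```c
#include <ctype.h>
#include <string.h>

#define MAX_WORDS 100   /* assumed upper bound on distinct words */
#define MAX_LEN   32    /* assumed maximum stored word length, including '\0' */

struct word_count {
    char word[MAX_LEN];
    int count;
};

/* Tallies every word in `text` that starts with an uppercase letter.
   `text` is modified by strtok; longer words are truncated when stored.
   Returns the number of distinct words stored in `table`. */
int count_capitalized(char *text, struct word_count table[], int capacity)
{
    int used = 0;
    for (char *w = strtok(text, " \t\r\n"); w != NULL; w = strtok(NULL, " \t\r\n")) {
        if (!isupper((unsigned char)w[0]))
            continue;

        int i;
        for (i = 0; i < used; i++) {
            if (strcmp(table[i].word, w) == 0)
                break;
        }
        if (i < used) {
            table[i].count++;                      /* seen before */
        } else if (used < capacity) {
            strncpy(table[used].word, w, MAX_LEN - 1);
            table[used].word[MAX_LEN - 1] = '\0';
            table[used].count = 1;
            used++;
        }
    }
    return used;
}
```

Printing is then a loop over the first `used` entries, appending the count in parentheses when it is greater than 1.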

How can I read in (cin) 2 consecutive words in a line of input, that has more than 2 words?

I need to read in values from the user, such as license plate, name, phone number, and service type.
I already got how to read it in if the user uses the return character after each input, ie:
A36 HTY
John Doe
(263)7742336
Bronze
But how can I read these values into my array if they're all on one line? I can read in word by word, but I need to be able to read in both halves of the license plate, and both the first and last name into one spot in the array.
I'd appreciate any help, thanks!

Read the entire line, then check whether there is any white space between the non-white-space characters; if there is, break the string at the white space into separate strings.

You can use isspace.
Use this link for assistance:
https://www.geeksforgeeks.org/isspace-in-c-and-its-application-to-count-whitespace-characters/
In this example, isspace is used to count the number of whitespace characters in a string. You can change the code to save each word in a different string. Since I haven't written C++ for quite some time, I don't want to give you an example that might be incorrect.
// input sentence
char buf[50] = "Geeks for Geeks";
int i = 0, count = 0;
// counting spaces
while (buf[i] != '\0') {
    if (isspace((unsigned char)buf[i]))
        count++;
    i++;
}
// returning number of spaces
return count;

Sscanf not returning what I want

I have the following problem:
sscanf is not returning the way I want it to.
This is the sscanf:
sscanf(naru,
"%s[^;]%s[^;]%s[^;]%s[^;]%f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
"%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]"
"%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]",
&jokeri, &paiva1, &keskilampo1, &minlampo1, &maxlampo1,
&paiva2, &keskilampo2, &minlampo2, &maxlampo2, &paiva3,
&keskilampo3, &minlampo3, &maxlampo3, &paiva4, &keskilampo4,
&minlampo4, &maxlampo4, &paiva5, &keskilampo5, &minlampo5,
&maxlampo5, &paiva6, &keskilampo6, &minlampo6, &maxlampo6,
&paiva7, &keskilampo7, &minlampo7, &maxlampo7);
The string it's scanning:
const char *str = "city;"
"2014-04-14;7.61;4.76;7.61;"
"2014-04-15;5.7;5.26;6.63;"
"2014-04-16;4.84;2.49;5.26;"
"2014-04-17;2.13;1.22;3.45;"
"2014-04-18;3;2.15;3.01;"
"2014-04-19;7.28;3.82;7.28;"
"2014-04-20;10.62;5.5;10.62;";
All of the variables are stored as char paiva1[22] etc; however, the sscanf isn't storing anything except the city correctly. I've been trying to stop each variable at ;.
Any help how to get it to store the dates etc correctly would be appreciated.
Or if there's a smarter way to do this, I'm open to suggestions.

There are multiple problems, but BLUEPIXY hit the first one — the scan-set notation doesn't follow %s.
Your first line of the format is:
"%s[^;]%s[^;]%s[^;]%s[^;]%f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
As it stands, it looks for a space separated word, followed by a [, a ^, a ;, and a ] (which is self-contradictory; the character after the string is a space or end of string).
The first fixup would be to use scan-sets properly:
"%[^;]%[^;]%[^;]%[^;]%f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
Now you have a problem that the first %[^;] scans everything up to the end of string or first semicolon, leaving nothing for the second %[^;] to match. The semicolons have to be matched explicitly:
"%[^;]; %[^;]; %[^;]; %[^;]; %f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
This looks for a string up to a semicolon, then for the semicolon, then optional white space, then repeats for three items. Apart from adding a length to limit the size of string, preventing overflow, these are fine. The %f is OK. The following material looks for an odd sequence of characters again.
However, when the data is looked at, it seems to consist of a city, and then seven sets of 'a date plus three numbers'.
You'd do better with an array of structures (if you've worked with those yet), or a set of 4 parallel arrays, and a loop:
char jokeri[30];
char paiva[7][30];
float keskilampo[7];
float minlampo[7];
float maxlampo[7];
int eoc; // End of conversion
int offset = 0;
char sep;
if (sscanf(str + offset, "%29[^;]%c%n", jokeri, &sep, &eoc) != 2 || sep != ';')
    ...report error...
offset += eoc;
for (int i = 0; i < 7; i++)
{
    if (sscanf(str + offset, "%29[^;];%f;%f;%f%c%n", paiva[i],
               &keskilampo[i], &minlampo[i], &maxlampo[i], &sep, &eoc) != 5 ||
        sep != ';')
        ...report error...
    offset += eoc;
}
See also How to use sscanf() in loops.
Now you have data that can be managed. The set of 29 separately named variables is a ghastly thought; the code using them will be horrid.
Note that the scan-set conversion specifications limit the string to a maximum length one shorter than the size of jokeri and the paiva array elements.
You might legitimately be wondering about why the code uses %c%n and &sep before &eoc. There is a reason, but it is subtle. Suppose that the sscanf() format string is:
"%29[^;];%f;%f;%f;%n"
Further, suppose there's a problem in the data: the semicolon after the third number is missing. The call to sscanf() will report that it made 4 successful conversions, but it doesn't count the %n as an assignment, so you can't tell that sscanf() didn't find a semicolon and therefore did not set eoc at all; the value is left over from a previous call to sscanf(), or simply uninitialized. By using the %c to scan a value into sep, we get 5 returned on success, and we can be sure the %n was successful too. The code checks that the value in sep is in fact a semicolon and not something else.
You might want to consider a space before the semi-colons, and before the %c. They'll allow some other data strings to be converted that would not be matched otherwise. Spaces in a format string (outside a scan-set) indicate where optional white space may appear.
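For reference, here is the same %c%n technique wrapped in a function, using the array-of-structures variant mentioned above. This is a sketch only: struct day and parse_forecast are hypothetical names, and the function stops at the first record that does not match instead of reporting the error in detail.

```c
#include <stdio.h>

struct day {
    char paiva[30];
    float keskilampo, minlampo, maxlampo;
};

/* Parses "city;date;avg;min;max;date;avg;min;max;..." from `str`.
   Returns the number of day records read (at most max_days),
   or -1 if the city field is missing or not followed by ';'. */
int parse_forecast(const char *str, char city[30], struct day days[], int max_days)
{
    int eoc = 0;      /* end of conversion, set by %n */
    int offset = 0;
    char sep;

    if (sscanf(str, "%29[^;]%c%n", city, &sep, &eoc) != 2 || sep != ';')
        return -1;
    offset += eoc;

    int n = 0;
    while (n < max_days &&
           sscanf(str + offset, "%29[^;];%f;%f;%f%c%n", days[n].paiva,
                  &days[n].keskilampo, &days[n].minlampo, &days[n].maxlampo,
                  &sep, &eoc) == 5 &&
           sep == ';') {
        offset += eoc;   /* advance past the record just read */
        n++;
    }
    return n;
}
```

Called with the string from the question and room for 7 records, it fills days[0] through days[6] and leaves the city name in the city buffer.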

I would use strtok function to break your string into pieces using ; as a delimiter. Such a long format string may be a source of problems in future.

C while loop until user enters "quit"

Here's what my program does:
The user types in two names with a space in between, so two strings need to be read. Example input:
John Doe
The strings are then looked up in a char array (this works fine).
The while loop goes on until the user types "stop", and only "stop".
How can I make it stop immediately when "stop" is entered, without waiting for the second string?
The code:
while(bool==false)
{
scanf("%20s%20s", name1, name2);
if(strcmp(name1, "stop")==0)
{
break;
}
// but still the second name has to be entered
rest of code...
}
Thanks for any tips!

I suggest you use fgets to get the input, check for the "stop" string, and then use sscanf to parse the input.
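A minimal sketch of that suggestion, assuming names of at most 20 characters as in the question. The function names parse_names and read_names and the NAMES_* constants are hypothetical.

```c
#include <stdio.h>
#include <string.h>

enum { NAMES_STOP = 0, NAMES_OK = 1, NAMES_BAD = -1 };

/* Classifies one input line (as returned by fgets) and extracts two names.
   The trailing newline in `line` is stripped in place. */
int parse_names(char *line, char name1[21], char name2[21])
{
    line[strcspn(line, "\r\n")] = '\0';
    if (strcmp(line, "stop") == 0)
        return NAMES_STOP;
    return sscanf(line, "%20s %20s", name1, name2) == 2 ? NAMES_OK : NAMES_BAD;
}

/* Reads lines until "stop" or end of input; no second word is needed to stop. */
void read_names(FILE *in)
{
    char line[128], name1[21], name2[21];
    while (fgets(line, sizeof line, in) != NULL) {
        int r = parse_names(line, name1, name2);
        if (r == NAMES_STOP)
            break;
        if (r == NAMES_OK)
            printf("%s %s\n", name1, name2);   /* rest of code goes here */
    }
}
```

Because fgets consumes the whole line, a lone "stop" ends the loop immediately, and a line with only one name is reported as NAMES_BAD rather than blocking for more input.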

You can make use of the scan-set (character class) support provided by scanf.
You could do:
scanf("%s%[^\n]", name, temp);
Here, the first word is mandatory while the second is optional; check the return value (1 or 2) to see whether temp was filled in.
When you input 2 words, temp will have a leading space.
If you want to skip it, you can do:
char *p = temp;
scanf("%s%[^\n]", name, p++);
Here, you can later access your 2 words using name and p.

How to parse this input in C

Right now I am doing an assignment but find it very hard to parse the user input in C. Here is the kind of input the user will enter.
INSERT Alice, 25 Norway Drive, Fitzerald, GA, 40204, 6000.60
Here INSERT is the command (to insert into a linked list)
Alice is a name
25 Norway Drive is an address
Fitzerald is a city
GA is a state
40204 is a zip code
6000.60 is a balance
How can I use scanf or any other method in C to properly take this as input? The biggest problem in front of me is how to ignore these "," and store these values in separate variables of appropriate data types.
Thanks everyone, I have solved the issue. Here is the solution:
pch = strtok(NULL, ",");
pch = substr(pch, 2, strlen(pch)); // substr is my custom function; I believe you can tell from its name what it does.
strcpy(customer->streetAddress, pch);

Fast easy method:
Use fgets() to get the string from the user;
and strtok() to tokenize it.
Edit
After reading your comment:
Use strtok() with only the comma, and then remove trailing and leading spaces from the result.
Edit2
After a test run, I noticed you will get "INSERT Alice" as the first token. So, after all tokens have been extracted, run strtok() again, this time with a space, on the first token extracted. Or, find the space and somehow identify the command and the name from there.
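Put together, the approach from both edits might look like this sketch. The helper names trim and split_command are hypothetical, and the line is assumed to have been read with fgets into a modifiable buffer.

```c
#include <ctype.h>
#include <string.h>

/* Trims leading and trailing whitespace in place and returns the new start. */
static char *trim(char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    char *end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    return s;
}

/* Splits `line` on commas into at most `max` trimmed fields.
   The command word is split off the first field at its first space,
   so "INSERT Alice, ..." yields "INSERT", "Alice", ...
   `line` is modified; the stored pointers point into it. */
int split_command(char *line, char *fields[], int max)
{
    int n = 0;
    for (char *tok = strtok(line, ","); tok != NULL && n < max; tok = strtok(NULL, ",")) {
        tok = trim(tok);
        if (n == 0) {
            char *space = strchr(tok, ' ');
            if (space != NULL && n + 1 < max) {
                *space = '\0';
                fields[n++] = tok;        /* the command */
                tok = trim(space + 1);    /* the name */
            }
        }
        fields[n++] = tok;
    }
    return n;
}
```

The numeric fields can then be converted with strtol and strtod, which also let you detect malformed values.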

If your input data format is fixed you can use something quick and dirty using [s]scanf().
With input of:
INSERT Alice, 25 Norway Drive, Fitzerald, GA, 40204, 6000.60
You might try, if reading from stdin:
char name[80], addr[80], city[80], state[80];
int zip;
double amt;
// %79 limits each string to the buffer size; %lf is required for a double
int res = scanf("INSERT %79[^,], %79[^,], %79[^,], %79[^,], %d, %lf",
                name, addr, city, state, &zip, &amt);
This should return the number of items matched (i.e. 6).

scanf() may be a bit tricky in this situation, assuming that different commands with different parameters can be used. I would probably use fgets() to read in the string first, followed by the use of strtok() to read the first token (the command). At that point you can either continue to use strtok() with "," as the delimiter to read the rest of the tokens in the string, or you could use a sscanf() on the rest of the string (now that you know the format that the rest of the input will be in). sscanf() is still going to be a pain due to the fact that it appears that an unspecified number of spaces would be allowed in the address and possibly town fields.
