Splitting input lines at a comma - c

I am reading the contents of a file into a 2D array. The file is of the type:
FirstName,Surname
FirstName,Surname
etc. This is a homework exercise, and we can assume that everyone has a first name and a surname.
How would I go about splitting the line using the comma so that in a 2D array it would look like this:
char name[100][2];
with
Column1 Column2
Row 0 FirstName Surname
Row 1 FirstName Surname
I am really struggling with this and couldn't find any help that I could understand.

You can use strtok to tokenize your string based on a delimiter, and then strcpy the pointer to the token returned into your name array.
Alternatively, you could use strchr to find the location of the comma, and then use memcpy to copy the parts of the string before and after this point into your name array. This way will also preserve your initial string and not mangle it the way strtok would. It'll also be more thread-safe than using strtok.
Note: a thread-safe alternative to strtok is strtok_r, however that's declared as part of the POSIX standard. If that function's not available to you there may be a similar one defined for your environment.
EDIT: Another way is to use sscanf. However, you won't be able to use the %s format specifier for the first string; you'd instead have to use a scanset specifier that excludes the comma. Since it's homework (and really simple), I'll let you figure that out.
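EDIT2: Also, your array should be char name[2][100] for an array of two strings, each of 100 chars in size. Otherwise, with the way you have it, you'll have an array of 100 strings, each of 2 chars in size. To hold 100 rows of two names each, you would need a third dimension, e.g. char name[100][2][N] for some maximum name length N.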
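As a rough sketch of the strchr and memcpy approach described above (not the only way to do it): the helper name split_name and the NAME_LEN limit are made up for illustration, and the code assumes each line holds exactly one comma-separated pair.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 50   /* assumed maximum field length, including the '\0' */

/* Split "First,Surname" at the first comma into two separate buffers.
   The input line is left untouched.
   Returns 0 on success, -1 if there is no comma or a field does not fit. */
int split_name(const char *line, char first[NAME_LEN], char last[NAME_LEN])
{
    assert(line != NULL && first != NULL && last != NULL);

    const char *comma = strchr(line, ',');
    if (comma == NULL)
        return -1;

    size_t first_len = (size_t)(comma - line);
    size_t last_len = strcspn(comma + 1, "\r\n");  /* stop before a line ending */
    if (first_len >= NAME_LEN || last_len >= NAME_LEN)
        return -1;

    memcpy(first, line, first_len);
    first[first_len] = '\0';
    memcpy(last, comma + 1, last_len);
    last[last_len] = '\0';
    return 0;
}
```

With a declaration such as char name[100][2][NAME_LEN], you could call split_name(line, name[i][0], name[i][1]) for each line read with fgets.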

Related

Inserting "String" variable inside a given String in C

I have a function, which has a String Parameter:
function(char str[3]){
//here i want to insert the string Parameter str
f = open("/d1/d2/d3/test"+str+"/d2.xyz")
}
I am trying to "insert" the String parameter into the given String path. How can I do this in C?
The typical way would be to create a new string by piecing together the three pieces. One way to do this would be the following (shamelessly stolen from @chux's comment):
char buf[1000];
sprintf(buf, "/d1/d2/d3/test%s/d2.xyz", str);
But before you go that route, you need to ensure you really understand the printf family of functions, as they are a common source of security-related errors. For example, my buf size is large enough for your example, but certainly not for a general solution. Instead, the sizes of the input strings would need to be taken into account to ensure the output buffer is large enough.
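One possible way to take the buffer size into account is snprintf, which never writes more than the size you give it and reports how long the result wanted to be. This is only a sketch; the function name build_path is invented here, and the path is the one from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build "/d1/d2/d3/test<str>/d2.xyz" in buf.
   Returns the length of the path on success, -1 if it would not fit in buf. */
int build_path(char *buf, size_t size, const char *str)
{
    assert(buf != NULL && str != NULL);

    int n = snprintf(buf, size, "/d1/d2/d3/test%s/d2.xyz", str);
    /* snprintf returns the length it wanted to write, excluding the '\0',
       so a value >= size means the output was truncated. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

The caller can then pass buf to open or fopen only when the return value is not -1.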

Is it safe to concatenate formatted strings using sprintf()?

I am writing a program that requires a gradual building of a formatted string, to be printed out as the last stage. The string includes numbers that are collected while the string is formed. Thus, I need to add formatted string fragments to the output string.
One straightforward way is to use sprintf() to write the formatted fragment to a temporary string, which is then concatenated to the output string using strcat(), as demonstrated in this answer.
A more sophisticated approach is to point sprintf() to the end of the current output string, when adding the new fragment. This is demonstrated here.
The help page to the MSVC sprintf_s() function (and the other variants of sprintf()) states that:
If copying occurs between strings that overlap, the behavior is
undefined.
Now, technically, using sprintf() to concatenate the fragment to the end of the output string means overwriting the terminating null character, which is considered a part of the first string. So, this action seems to fall under the category of overlapping strings. The technique seems to work well, but is it really safe?
The method in the answer you linked to:
strcat() for formatted strings
is safe against overlapping string issues, but of course it's unsafe in that it's performing unbounded writes using sprintf rather than snprintf.
What's not safe, and what the text about overlapping strings is referring to, is something like:
snprintf(buf, sizeof buf, "%s%s", buf, tail);
Here, overlapping ranges of buf are being used both as an input and an output, and this results in undefined behavior. Don't do it.
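For comparison, here is a sketch of the append-at-the-end technique with bounded writes. The destination starts at the old terminator and no argument points into buf, so nothing overlaps. The name append_number and the "label=value;" fragment format are just illustrative assumptions, and label must not point into buf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Append a "label=value;" fragment at offset *len in buf.
   Returns 0 on success, -1 if the fragment does not fit.
   *len is updated only on success. */
int append_number(char *buf, size_t size, size_t *len, const char *label, int value)
{
    assert(buf != NULL && len != NULL && label != NULL && *len < size);

    /* Write starting at the current terminator, with the remaining space as the bound. */
    int n = snprintf(buf + *len, size - *len, "%s=%d;", label, value);
    if (n < 0 || (size_t)n >= size - *len) {
        buf[*len] = '\0';   /* drop the truncated fragment */
        return -1;
    }
    *len += (size_t)n;
    return 0;
}
```

Keeping the current length in a variable also avoids rescanning the string with strlen on every append.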

iterating string in C, word by word

I just started learning C. What I am trying to do right now is this: I have two strings in which each word is separated by white space, and I have to return the number of matching words in both strings. Is there any function in C with which I can take each word and compare it to every other word in another string? If not, any idea on how I can do that?
Break up the first string into words. You can do this in any number of ways, from looping through the character array and inserting \0 at each space to using strtok.
For each word found, go through the other string using strstr, which checks whether a string occurs in there. Just check the return value from strstr: if it is != NULL, it found it. Note that strstr matches substrings, so you also need to check that the match starts and ends at a word boundary.
I'd not use strtok, but stick with pointer arithmetic, a length comparison, and memcmp to compare strings of equal length.
There are two problems here:
1) splitting each string into words
The strtok() function can split a string into words.
It is a meaningful exercise to imagine how you might write your own equivalent to strtok.
The rosetta project shows both a strtok and a custom method approach to precisely this problem.
I would naturally write my own parser, as it's the kind of code that appeals to me. It could be a fun exercise for you.
2) finding those words in one string that are also in another
If you iterate over each word in one string for each word in another, it has O(n*n) complexity.
If you index the words in one string it will take just O(n) which is substantially quicker (if your input is large enough to make this interesting). It is worth imagining how you might build a hashtable of the words in one string so that you can look for the words in the other.
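Here is a minimal sketch of the simple O(n*n) version, using the pointer arithmetic, length comparison, and memcmp idea mentioned above rather than strtok, so neither string is modified. The function names are made up, words are assumed to be separated by spaces, tabs, or newlines, and a word that appears twice in the first string is counted twice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WS " \t\r\n"   /* characters treated as word separators */

/* Return 1 if the word [word, word+len) occurs as a whole word in text. */
static int contains_word(const char *text, const char *word, size_t len)
{
    const char *p = text + strspn(text, WS);
    while (*p != '\0') {
        size_t n = strcspn(p, WS);          /* length of the current word */
        if (n == len && memcmp(p, word, len) == 0)
            return 1;
        p += n;
        p += strspn(p, WS);                 /* skip the following whitespace */
    }
    return 0;
}

/* Count the words of a that also occur as whole words in b. */
size_t count_matching_words(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t count = 0;
    const char *p = a + strspn(a, WS);
    while (*p != '\0') {
        size_t n = strcspn(p, WS);
        if (contains_word(b, p, n))
            count++;
        p += n;
        p += strspn(p, WS);
    }
    return count;
}
```

Because whole words are compared by length first, "cat" does not match inside "concatenate", which a bare strstr check would report as a hit.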

C: Search and Replace/Delete

Is there a function in C that lets you search for a particular string and delete/replace it? If not, how do I do it myself?
It can be dangerous to do a search and replace in place, unless you are just replacing single chars with another char (i.e. changing all the 'a' to 'b'). The reason is that the replacement could make the string longer than the char array that holds it. It is better to copy the string, replacing as you go, into a new char array that can hold the result. A good C function for finding a substring is strstr(). So you can find your string, copy everything before it to another buffer, add your replacement to the buffer, and repeat.
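The find, copy, replace, repeat loop described above could look roughly like this. It is a sketch, not a library function: the name replace_all and its signature are invented here, and it refuses to write past the destination buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, replacing every occurrence of find with repl.
   Returns 0 on success, -1 if dst (of dst_size bytes) is too small. */
int replace_all(char *dst, size_t dst_size, const char *src,
                const char *find, const char *repl)
{
    assert(dst != NULL && src != NULL && find != NULL && repl != NULL);
    assert(find[0] != '\0');        /* an empty pattern would never advance */

    size_t find_len = strlen(find);
    size_t repl_len = strlen(repl);
    size_t out = 0;                 /* number of bytes written to dst so far */
    const char *hit;

    while ((hit = strstr(src, find)) != NULL) {
        size_t before = (size_t)(hit - src);
        if (out + before + repl_len >= dst_size)
            return -1;
        memcpy(dst + out, src, before);              /* text before the match */
        memcpy(dst + out + before, repl, repl_len);  /* the replacement */
        out += before + repl_len;
        src = hit + find_len;                        /* continue after the match */
    }

    size_t rest = strlen(src);
    if (out + rest >= dst_size)
        return -1;
    memcpy(dst + out, src, rest + 1);                /* tail plus the '\0' */
    return 0;
}
```

Passing an empty string as repl deletes every occurrence instead of replacing it.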
<string.h> is full of string processing functions. Look here for reference:
http://www.edcc.edu/faculty/paul.bladek/c_string_functions.htm
http://www.cs.cf.ac.uk/Dave/C/node19.html

The terminating NULL in an array in C

I have a simple question. Why is it necessary to consider the terminating null in an array of chars (or simply a string) and not in an array of integers? When I want a string to hold 20 characters, I need to declare char string[21];. When I want to declare an array of integers holding 5 digits, then int digits[5]; is enough. What is the reason for this?
You don't have to terminate a char array with a null character if you don't want to, but when you use one to represent a string, you need to, because C represents its strings as null-terminated character arrays. When you use functions that operate on strings (like strlen for the string length, or printf to output a string), those functions will read through the data until a null character is encountered. If one isn't present, they read past the end of the array, and you would likely run into a buffer over-read or a similar access violation/segmentation fault.
In short: that's how C represents string data.
Null terminators are required at the end of strings (or character arrays) because:
Most standard library string functions expect the null character to be there. It's put there in lieu of passing an explicit string length (though some functions require that instead.)
By design, the NUL character (ASCII 0x00) is used to designate the end of strings. Note that it is not an end-of-file marker: EOF is a separate value returned by functions such as getchar.
Technically, if you're doing your own string manipulation with your own coded functions, you don't need a null terminator; you just need to keep track of how long the string is. But, if you use just about anything standardized, it will expect it.
It is only by convention that C strings end in the ASCII NUL character. (That's actually something different from NULL.)
If you like, you can begin your strings with a nul byte, or randomly include nul bytes in the middle of strings. You will then need your own library.
So the answer is: all arrays must allocate space for all of their elements. Your "20 character string" is simply a 21-character string, including the nul byte.
The reason is that it was a design choice of the original implementors. A null-terminated string gives you a way to pass an array into a function without passing the size. With an integer array you must always pass the size. It's a convention of the language, nothing more. You could rewrite every string function in C without using a null terminator, but you would always have to keep track of your array size.
The purpose of null termination in strings is so that the parser knows when to stop iterating through the array of characters.
So, when you use printf with the %s format character, it's essentially doing this:
size_t i = 0;
while (input[i] != '\0') {
    putchar(input[i]);   /* emit one character at a time */
    i++;
}
This concept is commonly known as a sentinel.
It's not about declaring an array that's one-bigger, it's really about how we choose to define strings in C.
C strings by convention are considered to be a series of characters terminated by a final NUL character, as you know. This is baked into the language in the way string literals are interpreted, and is adopted by all the standard library functions like strcpy, printf, etc. Everyone agrees that this is how we'll do strings in C, and that character is there to tell those functions where the string stops.
Looking at your question the other way around, the reason you don't do something similar in your arrays of integers is because you have some other way of knowing how long the array is-- either you pass around a length with it, or it has some assumed size. Strings could work this way in C, or have some other structure to them, but they don't -- the guys at Bell Labs decided that "strings" would be a standard array of characters, but would always have the terminating NUL so you'd know where it ended. (This was a good tradeoff at that time.)
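To make the contrast concrete, here is a small sketch with two hypothetical helpers: one walks an int array and needs the length passed in, the other walks a string and stops at the terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an int array: the element count must be passed in,
   because no element value marks the end of the array. */
long sum_ints(const int *values, size_t count)
{
    assert(values != NULL || count == 0);

    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Count the characters of a string: no length parameter is needed,
   the loop stops at the '\0' sentinel (this is what strlen does). */
size_t count_chars(const char *s)
{
    assert(s != NULL);

    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

For char string[21] = "abcdefghijklmnopqrst";, sizeof string is 21 while count_chars(string) returns 20: the extra element is the terminator. For int digits[5], the caller simply passes 5 alongside the array.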
It's not absolutely necessary to have the character array be 21 elements. It's only necessary if you follow the (nearly always assumed) convention that the twenty characters be followed by a null terminator. There is usually no such convention for a terminator in integer and other arrays.
Because of the technical reasons of how C strings are implemented compared to other conventions.
Actually - you don't have to NUL-terminate your strings if you don't want to!
The only problem is you have to re-write all the string libraries because they depend on them. It's just a matter of doing it the way the library expects if you want to use their functionality.
Just like I have to bring home your daughter at midnight if I wish to date her - just an agreement with the library (or in this case, the father).

Resources