Why no split function in C? [closed] - c

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
There is no Standard function in C to take a string, break it up at whitespace
or other delimiters, and create an array of pointers to char, in one step.
If you want to do that sort of thing, you have to do it yourself, either
completely by hand, or by calling e.g. strspn and strpbrk in a loop,
or by calling strtok in a loop, or by calling strsep in a loop.
I am not asking how to do this. I know how to do this, and there are plenty
of other questions on Stack Overflow about how to do it. What I'm asking is
whether there are any good reasons why there's no such function.
I know the two main reasons, of course: "Because no mainstream compiler/library
ever had one" and "Because the C Standard didn't specify one, either (because
it likes to standardize existing practice)." But are there any other reasons?
(Are there arguments that such a function is an actively bad idea?)
This is usually a lame and pointless sort of question, I know. In this case
I'm fixated on it because convenient splitting is such a massively useful
operation. I wrote my own string splitter within my first year as a
C programmer, I think, and it's been a huge productivity enhancer for me ever
since. There are dozens of questions here on SO every day that could be
answered easily (or that wouldn't even have to be asked) if there were a
standard split function that everyone could use and refer to.
To be clear, the function I'm imagining would have a signature like
int split(char *string, char **argv, int maxargs, const char *delim)
It would break up string into at most maxargs substrings, splitting on one or more characters from delim, placing pointers to the substrings into argv, and modifying string in the process.
And to head off an argument I'm sure someone will make: although it's standard, I do not consider
strtok to be an effective solution. strtok, frankly, sucks. Saying "you don't need a split function,
because strtok exists" is a lot like saying "You don't need printf,
because puts exists." This is not a question about what's theoretically
possible with a given toolset; it's about what's useful and convenient. The more
fundamental issue here, I guess, concerns the ineffable tradeoffs involved
in picking tools that are leverageable and productivity-enhancing and that
"pay their way". (I think it's clear that a nicely encapsulated
string-splitting function would pay its way handsomely, but perhaps
that's just me.)

I will try an answer. I do agree that such a function would be useful. It is often quite useful in the languages that have one.
Basically you are suggesting a very simple built-in wrapper around strtok() or strtok_r(). It would be a less powerful version (since the delimiter can't change during processing) but still useful in some cases.
What I see is that these cases overlap with the use cases of the scanf() family of functions, and with those of getopt() or getsubopt().
Actually I'm not sure that the remaining real use cases are that common.
In real-life, non-trivial cases you would need a true parser or a regex library; in the common specialized cases you already have scanf() or getopt() or even strtok().
Also, functions that modify their input strings, like strtok() or yours, are more or less discouraged these days (experience says they easily lead to trouble).
Most languages that provide a split feature have a real string type, often an immutable one, and support it by creating many individual substrings while leaving the original string intact.
Following that path would lead either to some other API not based on zero-terminated strings (maybe with a start pointer and an end pointer), or to allocated string copies (as when using strdup()). Neither is really satisfying.
In the end, if you add up the not-so-common use in real life, how simple it is to write yourself, and the not-so-simple or intuitive API, it is no wonder that such a function wasn't included in the standard libc.
Basically I would write something like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Split string in place; store at most maxargs token pointers in argv.
   Returns the number of tokens stored. */
int split(char *string, char **argv, int maxargs, const char *delim){
    char *saveptr = NULL;
    int x = 0;
    char *tok = strtok_r(string, delim, &saveptr);
    while(tok && x < maxargs){
        argv[x++] = tok;   /* never writes past argv[maxargs-1] */
        tok = strtok_r(NULL, delim, &saveptr);
    }
    return x;
}
int main(void){
    char *args[10];
    const char *inputs[] = {
        "un deux trois quatre cinq six sept huit neuf dix onze",
        "un deux trois quatre cinq"
    };
    for(int i = 0; i < 2; i++){
        char *str = strdup(inputs[i]);   /* split() needs a writable copy */
        if(!str) return 1;
        int res = split(str, args, sizeof(args)/sizeof(args[0]), " ");
        printf("res = %d\n", res);
        for(int x = 0; x < res; x++){
            printf("%d:%s\n", x, args[x]);
        }
        free(str);
    }
    return 0;
}
What I see looking at the code is that the wanted function is really very simple to write using strtok_r()... and that the call site that uses the result is nearly as complicated as the function itself. In such a case I'd rather inline the function at the call site than call into libc.
But of course you are welcome to write and use your own if you believe it's simpler for you.
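The non-destructive alternative mentioned earlier in this answer (a start pointer plus a length instead of zero-terminated pieces) could look roughly like the sketch below. The names struct span and split_spans are hypothetical and not part of any standard library; only strspn and strcspn are standard.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: one token, described by where it starts and its length. */
struct span {
    const char *start;
    size_t len;
};

/* Hypothetical helper: record up to maxspans tokens of s in out[] without
   modifying s. A run of delimiter characters counts as one separator.
   Returns the number of spans stored. */
size_t split_spans(const char *s, const char *delim,
                   struct span *out, size_t maxspans)
{
    size_t n = 0;

    while (n < maxspans) {
        s += strspn(s, delim);            /* skip leading delimiters */
        if (*s == '\0')
            break;                        /* nothing left but delimiters */
        size_t len = strcspn(s, delim);   /* length of the next token */
        out[n].start = s;
        out[n].len = len;
        n++;
        s += len;
    }
    return n;
}
```

Since the spans are not zero-terminated, a caller would print one with something like printf("%.*s\n", (int)sp.len, sp.start). That extra ceremony at every use is part of why this kind of API feels less natural in C.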

Related

pointer problems in C? function to order inputs, return as string and print in main

I want to write a function that will take two integers, three times, and then return them ordered by the first integer and (for now) print them in main (though eventually I plan/hope to switch to a file-based structure to store and organize data). I think I might have an issue with my pointers, because even when I skip the concatenations (which look like they might be a separate issue), everything I've tried has main print a string (or no string) that never matches the input, even though the print statements suggest all the looped assignments are working properly.
#include <stdio.h>
#include <string.h>
const char * entry()
{
int n;
int level;
char habit1entry[6];
char habit2entry[6];
char habit3entry[6];
for (int c = 0; c< 3; c++){
printf("Habit #\n");
scanf("%d", &n);
printf("Level:\n");
scanf("%d", &level);
switch (n)
{
case 1:;
sprintf(habit1entry, "|%d|%d|\n", n,level);
printf("n = %d\n",n);
printf("%s\n",habit1entry);
continue;
case 2:;
sprintf(habit2entry, "|%d|%d|\n", n,level);
printf("n = %d\n",n);
printf("%s\n",habit2entry);
continue;
case 3:;
sprintf(habit3entry, "|%d|%d|\n", n,level);
printf("n = %d\n",n);
printf("%s\n",habit3entry);
continue;
}
}
strcat(habit2entry,habit3entry);
printf("%s\n",habit2entry);
strcat(habit1entry,habit2entry);
printf("%s\n",habit1entry);
char *fullEntry=habit3entry;
printf("%s\n",fullEntry);
return strdup(&fullEntry[0]);
}
int main(){
const char * dataEntry = entry();
//strcpy(dataEntry,entry());
printf("Data:\n%s",dataEntry);
}
Here's an example of the output (after the correct prints inside the switch cases) for an input of 3 2 1 1 2 2:
"
|2|2|
|1|1|
|2|2|
|2|2|
|��
|2|2|
|��
* stack smashing detected *: ./a.out terminated
Aborted (core dumped) "
p.s. Sorry if this all sounds silly, this is my first C project (and first real Stack Overflow post, plz b gentl). I'm coming from jumping around between Java, Python and Clojure, and I would like to take an operating systems class that allows you to start without knowing C but expects you to pick it up on your own. It's hard finding material that explains C concepts in a scope that matches my background knowledge and my current learning constraints, in terms of time available for taking deep dives through explanations that for me have ended up mostly being either hopelessly esoteric, incredibly case-specific, or overly simplistic/redundant/unhelpful explanations of programming concepts I picked up in other languages. I don't mean to complain or harp on, and it's probably good to get practice with different methods of asking questions and finding answers for problems like these, but the learning curve for understanding things like this outside of a traditional framework sometimes seems more like a wall. (Setting up the compiler/json files involved spending hours only to discover that McAfee was deleting my exes, which I became convinced was a symptom of a virus, only to have the behavior stop after I restarted for a minor routine Windows update, and I have no idea why.) I'm worried that maybe I should revise my approach to avoid wasting too much of my time banging my head against a series of very sturdy walls. Any and all advice is greatly appreciated.
Setting aside the logic of the program, you have plenty of issues there:
You do not provide enough space for the strings.
Your switch tests the user-entered n instead of the loop counter, so it is not really tied to your for loop.
The names of the variables may not matter to you, but they matter to the program (you return habit3entry where you mean habit1entry). Be more careful.
Probably more, but I have forgotten them already.
#include <stdio.h>
#include <string.h>
const char * entry()
{
int n;
int level;
char habit1entry[21] = "";
char habit2entry[14] = "";
char habit3entry[7] = "";
for (int c = 1; c < 4; c++){
printf("Habit #\n");
scanf("%d", &n);
printf("Level:\n");
scanf("%d", &level);
switch (c)
{
case 1:;
sprintf(habit1entry, "|%d|%d|\n", n,level % 10);
printf("n = %d\n",n);
printf("He1: %s\n",habit1entry);
continue;
case 2:;
sprintf(habit2entry, "|%d|%d|\n", n,level % 10);
printf("n = %d\n",n);
printf("He2 = %s\n",habit2entry);
continue;
case 3:;
sprintf(habit3entry, "|%d|%d|\n", n,level % 10);
printf("n = %d\n",n);
printf("He3 = %s\n",habit3entry);
continue;
}
}
strcat(habit2entry,habit3entry);
printf("H2 + H3 = %s\n",habit2entry);
strcat(habit1entry,habit2entry);
printf("H1 + H2 = %s\n",habit1entry);
char *fullEntry=habit1entry;
printf("FE: %s\n",fullEntry);
return strdup(fullEntry);
}
int main(){
const char * dataEntry = entry();
//strcpy(dataEntry,entry());
printf("Data:\n%s",dataEntry);
}
Welcome to the weird and wonderful world of C.
I have not actually compiled and run your program yet; I just had a quick read through and thought I'd give you my first thoughts.
The way your program is written is primed to overflow buffers on the stack. You have three (very small) character arrays defined on the stack, habit1entry, habit2entry and habit3entry, so your sprintf calls will write past the end of them: even with single-digit Habit and Level values, "|%d|%d|\n" produces six characters plus the terminating null, which is seven bytes going into a six-byte array. Habit is otherwise alright because your switch only allows 1, 2 or 3. Your switch does nothing if Habit is anything else.
As a side note: sprintf is not really the function to use in our security-minded world. snprintf is a better choice because it takes the size of the destination buffer. Here the only user-supplied values are two integers, but still, sprintf is not a good habit to cultivate.
Next you strcat your character arrays together, virtually guaranteeing a stack violation, but let's assume this works; you are concatenating 2 and 3 into habit2entry and then 1 and 2 into habit1entry.
Next you are creating a pointer to habit3entry (not habit1entry) and returning a duplicate.
By doing so you are allocating heap memory in a mildly obscure manner. The caller will be responsible for freeing this memory.
I always preferred to explicitly malloc the memory and then strcpy (or memcpy) the data in.
Now when you grep your code, you only have to look for malloc.
Also, someone using the function will notice the malloc, see that you have returned a pointer, and realize that freeing it will now be their problem.
In order to avoid these problems some programmers leave it to the caller to supply a buffer to the function. The reasoning is that a function is supposed to do one thing and one thing only. In this case you are doing two things: you allocate memory and you fill that memory.
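As a rough illustration of the caller-supplied-buffer idea combined with snprintf, here is a minimal sketch. The function name format_entry and its return convention are assumptions made for this example, not something from the question.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "|n|level|\n" into the caller's buffer.
   Returns 0 on success, -1 if the buffer was too small or on error. */
int format_entry(char *buf, size_t size, int n, int level)
{
    int written = snprintf(buf, size, "|%d|%d|\n", n, level);

    /* snprintf returns the length the full output needs (without the
       terminating null), so a value >= size means it was truncated. */
    if (written < 0 || (size_t)written >= size)
        return -1;
    return 0;
}
```

The function only fills memory; the caller decides where that memory lives (stack, heap, static) and can check the return value instead of silently overflowing.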
In your switch statement I noticed that each of your case labels is followed by an empty statement.
The semicolon at the end of that line is not necessary: write "case 1:" not "case 1:;"
You also use continue at the end of each block. This is allowed but "break" is more appropriate.
In this case it will have the same effect, but normally you have more statements after the switch.
Then the difference becomes apparent: continue jumps straight to the next iteration of the loop, while break leaves the switch and carries on with the statement after it.
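The difference can be made visible with a small experiment. The function below is a made-up demo (count_after_switch is not from the question) that counts how often the statement after the switch is reached.

```c
/* Hypothetical demo: returns how many times the statement after the
   switch runs during three loop iterations. */
int count_after_switch(int use_continue)
{
    int reached = 0;

    for (int c = 1; c <= 3; c++) {
        switch (c) {
        case 1:
        case 2:
        case 3:
            if (use_continue)
                continue;   /* skips the rest of the loop body */
            break;          /* leaves only the switch */
        }
        reached++;          /* reached only when the switch ended with break */
    }
    return reached;
}
```

With break the counter is incremented on every iteration; with continue it is never incremented, because control goes straight to the next iteration.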
Hope this gives you some insight.
Good luck.

passing a structure for comparison [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
struct sign_in
{
char password[MAX_NAME_LEN+1];//The password for each player
char name[MAX_NAME_LEN+1];//Name of the people who can sign in
}
//prototype
int compare_names(char*, char*, struct sign_in*);
int compare_names(char*pName,char*pPassCode,struct sign_in *var)
{
int iComparison = 1;
int flag = 1;
int iComparison2 = 1;
int i = 0;
for (i=0;i<6;i++)
{
printf("%s \t %s ", var[0].name,pName );
if(iComparison != 0)
{
iComparison = strcmp(pName,var[i].name);
i++;
}
if(iComparison2 != 0)
{
iComparison2 = strcmp(pPassCode,var[i].password);
i++;
}
printf("%d", iComparison);
printf("%d", iComparison2);
}
}
I have updated my code and attempted to take into account many of the aspects that you guys have recommended and the good news is that it runs now. The bad news is that it still attempts to print some random jargon that I don't understand, it's just a collection of symbols usually. The structure this function compares against has six members so that's the reason for parameters on the first for loop.
The code you've presented is a cornucopia of sloppiness. When programming, that's not really OK.
You forgot the closing curly braces for the struct sign_in definition and the compare_names() function definition
You did not initialize iComparisson to any value. flag is initialized, but iComparisson is not. Also, it's misspelled!
Don't use printf() with a user-input as the format string, there could be a % in there. At the very least do printf("%s", pname). And you probably want a \n in there too.
strcmp() returns a negative value (often -1) when pName sorts before var[i].name, and any nonzero value means the strings differ, so while(iComparisson == 1) does not do what you want
you need to know the length of the var array and stop that loop before you run off the end
strcmp() takes strings, which are pointers. When you call strcmp(*pName, ...) you're dereferencing the pName "pointer to char" to just a "char". It's like getting the first character from the pName string, and then putting that character value where a pointer-to-character value is expected. Not good. The situation with var[i].name is a bit more complicated because name is an array, but get rid of the star, it's not needed for that either.
The second while () loop will loop forever if the iPassCode does not match, you probably want if ()
In your problem description you omit the closing backtick after *var[i].password and the closing double-quote after "invalid type argument of unaray", and you obviously mangled the compiler error message as well. This makes it harder to understand what you wrote and what went wrong.
The iPassCode == var[i].password actually looks fine. It seems rather likely that this isn't the code you had a problem with, due to all the other ridiculous problems in your sample ...
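Taken together, the points above suggest a function shaped roughly like the sketch below. It is only an illustration: the name find_login, the count parameter, and the value chosen for MAX_NAME_LEN are assumptions, since the question does not show them.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN 31   /* assumed value; the question does not show it */

struct sign_in {
    char password[MAX_NAME_LEN + 1];
    char name[MAX_NAME_LEN + 1];
};

/* Returns the index of the entry whose name and password both match,
   or -1 if there is none. count is the number of elements in users,
   so the loop cannot run off the end of the array. */
int find_login(const char *name, const char *password,
               const struct sign_in *users, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* strcmp returns 0 when the two strings are equal */
        if (strcmp(name, users[i].name) == 0 &&
            strcmp(password, users[i].password) == 0)
            return (int)i;
    }
    return -1;
}
```

Both comparisons use the same index i, and the strings are passed as plain pointers with no dereferencing star.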

Input Loop trouble [closed]

Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 7 years ago.
What I've done is ask a user to input how many times the program will loop, it then records values into 3 different arrays. Everything is working great, but what I need it to do is print the elements of one array if the corresponding element in another array meets the requirements. Everything else runs great, I'll post the two arrays that I'm trying to use for this.
char *names[50][32];
char *states[50][2];
i = 0;
while ( i < b) {
if (state[i] = "tx");{
printf("a string %s\n", names[i]);}
i = i + 1;
}
for this : if (state[i] = "tx");{ I've tried with and without quotes and using 116120...
Basically, it asks for people's names and where they live. I can get it to print the array element values for each name (it runs in a loop), but I want it to print only the names of the people who live in tx.
There are a few things wrong with your code. First of all ending an if or for construction with a semicolon is a common mistake when starting in C. Basically it creates an empty if statement followed by a code block. Look at it this way:
if (condition)
; // Does nothing. The if is empty
// Totally unrelated block of code.
{
}
Code blocks are usually useful to create scopes, so although it might seem useless for the compiler to interpret blocks in this way, it actually is not. This also happens in other situations, such as while, for, and so on:
for (int i=0 ; i<n ; ++i)
; // Empty for. Runs `n` loops, but doing nothing
// Unrelated block of code. Runs only once
{
}
The comparison operator is also wrong, you should use == for comparisons, instead of =, which is used for assignments.
Finally, you cannot compare strings in this way. Strings are arrays of characters, and in an expression an array decays to a pointer to its first element. If you compare two pointers (ptr1 == ptr2), it only checks whether they point at the same address in memory. As strings are composed of several characters, they have to be iterated over to be properly compared. Fortunately the standard library already provides a function for this: strcmp.
Fix a typo or two and this is what you get:
char *names[50][32];
char *states[50][2];
i = 0; // Assuming this is declared somewhere else
while ( i < b) {
if (strcmp(states[i], "tx") == 0) {
printf("a string %s\n", names[i]);
}
i = i + 1;
}
You should probably also check the docs for strcmp.
Edit: as this is already the accepted answer, I should also include a fix noted by @dbush. The array for states is clearly missing space for the string terminator, as strings are null-terminated in C. The array for names might or might not suffer from the same issue; it's not clear. The arrays should also hold char rather than char pointers. Either way, both should include an extra byte for storing the terminator:
char names[50][33];
char states[50][3];
Props to @dbush.
The declarations of your arrays don't look correct:
char *names[50][32];
char *states[50][2];
These declare a pair of two-dimensional arrays of char pointers, which is probably not what you want. You probably want these instead:
char names[50][32];
char states[50][3];
These are two-dimensional arrays of characters, or alternately arrays of strings. Note that the states array has space for an extra character for the terminating null.
Now for the if statement:
if (state[i] = "tx");{
Since the ; occurs immediately after the condition, it ends the if statement, so the block in curly braces that follows always runs. Also, = is for assignment, not comparison, but using == is not appropriate either, since that operator compares the strings' addresses, not their contents. You need to use strcmp for string comparisons.
So the fixed code should look like this:
char names[50][32];
char states[50][3];
...
i = 0;
while ( i < b) {
    if (strcmp(states[i],"tx") == 0) {   /* the array is named states, not state */
        printf("a string %s\n", names[i]);
    }
    i = i + 1;
}
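For something that compiles on its own, the same loop can be wrapped in a small function. This is only a sketch: print_residents and its parameters are hypothetical names, and the array shapes are the char names[50][32] and char states[50][3] declarations shown in this answer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints the names whose state equals code and
   returns how many matched. Each state holds 2 letters plus the
   terminating null, hence the [3]. */
int print_residents(char names[][32], char states[][3],
                    int count, const char *code)
{
    int matches = 0;

    for (int i = 0; i < count; i++) {
        if (strcmp(states[i], code) == 0) {   /* 0 means the strings are equal */
            printf("a string %s\n", names[i]);
            matches++;
        }
    }
    return matches;
}
```

Called as print_residents(names, states, b, "tx"), it prints only the names of the people whose state is tx.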

Efficiency of strncpy and code [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
I'm slowly learning and progressing through coding, so I was hoping someone could have a quick look at this function for me and tell me if it appears that I'm on the right track, how I could do it better or where I might be setting myself up for failure. I'm new to the world of C, so please take it easy on me - but be blunt and honest.
void test(char *username, char *password) {
printf("Checking password for %s - pw: %s\n",username,password);
char *query1 = "SELECT password FROM logins WHERE email = '";
char *query2 = "' LIMIT 1";
char *querystring = malloc(strlen(query1) + strlen(username) + strlen(query2) * sizeof(char));
strncpy(querystring,query1,strlen(query1));
strncat(querystring,username,strlen(username));
strncat(querystring,query2,strlen(query2));
printf("Query string: %s\n",querystring);
mysql_query(mysql_con,querystring);
MYSQL_RES *result = mysql_store_result(mysql_con);
int num_fields = mysql_num_fields(result);
int num_rows = mysql_num_rows(result);
if (num_rows != 0) {
MYSQL_ROW row;
printf("Query returned %i results with %i fields\n",num_rows,num_fields);
row = mysql_fetch_row(result);
printf("Password returned: %s\n",row[0]);
int comparison = strncmp(password, row[0], strlen(password));
if (comparison == 0) {
printf("Passwords match!\n");
} else {
printf("Passwords do NOT match!\n");
}
} else {
printf("No such user... Password is invalid");
}
free(querystring);
}
At the moment, it is working... output:
Checking password for jhall#futuresouth.us - pw: 5f4dcc3b5aa765d61d8327deb882cf99
Query string: SELECT password FROM logins WHERE email = 'test#blah.com' LIMIT 1
Query returned 1 results with 1 fields
Password returned: 5f4dcc3b5aa765d61d8327deb882cf99
Passwords match!
called with:
test("test#blah.com","5f4dcc3b5aa765d61d8327deb882cf99");
I'm looking for input on how I could have worked with the strings better, or if there are any unforeseen issues with how I did this. I'm very new to working with data structures in C.
Using strncpy(target, source, strlen(source)) guarantees that the string in target is not null terminated. If perchance malloc() returns zeroed memory, then it will seem to work, but once malloc() returns non-zeroed memory (previously allocated memory), things will go wrong.
The length argument to strncat() is just plain weird; it is the maximum number of characters to append, so it has to be the amount of space left in the target after the current (null-terminated) data, less one for the new terminator. Your usage, quite apart from not having null-terminated strings to work on, does not protect against buffer overflow.
There really isn't a good use case for strncat() IMNSHO, and seldom a good case for strncpy(). If you know how big everything is, you can use memmove() (or memcpy()) instead. If you don't know how big everything is, you don't know whether it is safe to do the copy without truncation.
Your malloc() call is a bit peculiar too: it doesn't allocate enough space for the trailing null, and it only multiplies one of the three terms by sizeof(char), which is inconsistent but otherwise harmless. A lot of the time you will get away with the short allocation because malloc() rounds the size up, but all hell will break loose when you don't get away with it. A tool like valgrind will report abuse of allocated memory.
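When every length is known, the memcpy() approach mentioned above might look like the following sketch. The helper name concat3 is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding a, b and c
   joined together, or NULL if allocation fails. The caller must free() it. */
char *concat3(const char *a, const char *b, const char *c)
{
    size_t la = strlen(a), lb = strlen(b), lc = strlen(c);
    char *out = malloc(la + lb + lc + 1);   /* +1 for the terminating null */

    if (out == NULL)
        return NULL;
    memcpy(out, a, la);
    memcpy(out + la, b, lb);
    memcpy(out + la + lb, c, lc + 1);       /* lc + 1 copies c's terminator too */
    return out;
}
```

In the question's code this would be concat3(query1, username, query2). It only addresses the string handling; pasting user input into a query this way is still unsafe.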
Jonathan's answer explains the problems with that part of the code.
To fix it you can use snprintf instead:
size_t space_needed = strlen(query1) + strlen(username) + strlen(query2) + 1;
char *querystring = malloc(space_needed);
if ( !querystring )
    exit(EXIT_FAILURE);
snprintf(querystring, space_needed, "%s%s%s", query1, username, query2);
Then, even if you calculate the length wrong, at least you don't get a buffer overflow.
To avoid the code duplication here, there is a non-standard function asprintf: you pass it the format and arguments and it yields a pointer to a malloc'd buffer of the right size. Of course, it's possible to write your own version of this function if you don't want to rely on the existence of that function.
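A home-grown version could be sketched along these lines, using two passes of the standard vsnprintf (C99). The name format_alloc is hypothetical, and unlike asprintf it returns the buffer directly rather than through an out-parameter.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for asprintf: formats into a freshly allocated
   buffer of exactly the right size. Returns NULL on failure; the caller
   must free() the result. */
char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);                        /* the list is consumed once per pass */

    int len = vsnprintf(NULL, 0, fmt, ap);   /* first pass: measure only */
    va_end(ap);

    char *buf = NULL;
    if (len >= 0 && (buf = malloc((size_t)len + 1)) != NULL)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);   /* second pass: write */
    va_end(ap2);
    return buf;
}
```

With this, the query could be built as format_alloc("%s%s%s", query1, username, query2), with no separate length calculation to get wrong.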
There's another serious issue here in that your code does not protect against SQL injection. A proper discussion of how to protect against that is probably beyond the scope of this question!

how do remove enclosing brackets from string? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I have a
char s9[7] = "[abcd]";
How do i remove the brackets [] so that
s9 == "abcd"
I have tried
s9 = s9.Substring(1, s9.Length-2);
throws error in cygwin
a2v2.c:42:13: error: request for member ‘Substring’ in something not a structure or union
a2v2.c:42:29: error: request for member ‘Length’ in something not a structure or union
edit:
i realised my error, i am beginner at c and couldnt differentiate between c and C++ code, regards
Someone will correct me if I'm wrong, since the C standard I know is a couple of decades old, but as far as I know, C has no string type with methods like Substring; a string is just a null-terminated array of char, manipulated through pointers and the functions in <string.h>. (There are no member functions at all, for that matter.) Pointers are much more powerful, but much more dangerous in that you can really mess things up if you don't learn your way around them.
The most important thing, if you want to be a C programmer is that you learn C. At the very least, you need to look up "string manipulation C" and read any of the pages that pop up.
There are many ways to do what you want. I think this is one of the faster ones (though it modifies the string you're looking at; if that matters, choose another way):
// trim off the last character
s9[strlen(s9) - 1] = '\0';
// the char * points to the s9 array. +1 makes it look at
// the second element, so then substring is the string you need
char * substring = s9 + 1;
Skipping any checking that the string actually begins and ends with those characters:
int len = strlen(s9);
for ( int i = 0; i < len - 2; ++i )
    s9[i] = s9[i + 1];
s9[len - 2] = '\0';
Or, for this fixed six-character string, with memmove:
memmove( s9, s9 + 1, 4);
s9[4] = 0;
If it is strictly C, then you will need to use more basic functions (a char[] array has little in common with the string class in C++). Some of the functions to use might be:
strchr: Find the position of a character (e.g., strchr( s9, '[')). This assumes that it is not a fixed format you are dealing with. If you know the length and positions, then you could skip this and simply use memmove directly.
memmove: Shift the characters left in the array. In this situation memmove is needed (rather than memcpy or strncpy) because the source and destination overlap.
int len = strlen(s9);
memmove(s9, (s9+1), len-2); /* can handle overlapping strings */
s9[len-2] = 0; /* null terminate */
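Putting the memmove approach together with the check that the brackets are actually there gives something like the sketch below. The function name strip_brackets is made up for this example.

```c
#include <string.h>

/* Hypothetical helper: if s starts with '[' and ends with ']', remove both
   in place and return 1; otherwise leave s untouched and return 0. */
int strip_brackets(char *s)
{
    size_t len = strlen(s);

    if (len < 2 || s[0] != '[' || s[len - 1] != ']')
        return 0;
    memmove(s, s + 1, len - 2);   /* source and destination overlap */
    s[len - 2] = '\0';            /* terminate after the shifted text */
    return 1;
}
```

For char s9[7] = "[abcd]"; a call to strip_brackets(s9) leaves "abcd" in s9, while a string without the enclosing brackets is left as it was.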

Resources