Efficiency of strncpy and code [closed] - c

I'm slowly learning and progressing through coding, so I was hoping someone could have a quick look at this function for me and tell me if it appears that I'm on the right track, how I could do it better or where I might be setting myself up for failure. I'm new to the world of C, so please take it easy on me - but be blunt and honest.
void test(char *username, char *password) {
printf("Checking password for %s - pw: %s\n",username,password);
char *query1 = "SELECT password FROM logins WHERE email = '";
char *query2 = "' LIMIT 1";
char *querystring = malloc(strlen(query1) + strlen(username) + strlen(query2) * sizeof(char));
strncpy(querystring,query1,strlen(query1));
strncat(querystring,username,strlen(username));
strncat(querystring,query2,strlen(query2));
printf("Query string: %s\n",querystring);
mysql_query(mysql_con,querystring);
MYSQL_RES *result = mysql_store_result(mysql_con);
int num_fields = mysql_num_fields(result);
int num_rows = mysql_num_rows(result);
if (num_rows != 0) {
MYSQL_ROW row;
printf("Query returned %i results with %i fields\n",num_rows,num_fields);
row = mysql_fetch_row(result);
printf("Password returned: %s\n",row[0]);
int comparison = strncmp(password, row[0], strlen(password));
if (comparison == 0) {
printf("Passwords match!\n");
} else {
printf("Passwords do NOT match!\n");
}
} else {
printf("No such user... Password is invalid");
}
free(querystring);
}
At the moment, it is working... output:
Checking password for jhall#futuresouth.us - pw: 5f4dcc3b5aa765d61d8327deb882cf99
Query string: SELECT password FROM logins WHERE email = 'test#blah.com' LIMIT 1
Query returned 1 results with 1 fields
Password returned: 5f4dcc3b5aa765d61d8327deb882cf99
Passwords match!
called with:
test("test#blah.com","5f4dcc3b5aa765d61d8327deb882cf99");
I'm looking for input on how I could have worked with the strings better, or if there are any unforeseen issues with how I did this. I'm very new to working with data structures in C.

Using strncpy(target, source, strlen(source)) guarantees that the string in target is not null terminated. If perchance malloc() returns zeroed memory, then it will seem to work, but once malloc() returns non-zeroed memory (previously allocated memory), things will go wrong.
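The length argument to strncat() is just plain weird; it is not the size of the target buffer but the maximum number of characters to append, so it has to be at most the space left in the target after the current (null-terminated) data, minus one for the null that strncat() always adds. Your usage, quite apart from not having null-terminated strings to work on, does not protect against buffer overflow.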
There really isn't a good use case for strncat() IMNSHO, and seldom a good case for strncpy(). If you know how big everything is, you can use memmove() (or memcpy()) instead. If you don't know how big everything is, you don't know whether it is safe to do the copy without truncation.
Your malloc() call is a bit peculiar too: it doesn't allocate enough space for the trailing null, and it only multiplies one of the three terms by sizeof(char), which is inconsistent but otherwise harmless. A lot of the time you will get away with the short allocation because malloc() rounds the size up, but all hell will break loose when you don't get away with it. A tool like valgrind will report abuse of allocated memory.
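To illustrate the memcpy() approach when every length is known, here is a minimal sketch. The helper name concat3 is hypothetical (it is not from the question), and the sketch assumes the three lengths do not overflow size_t when added.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly malloc'd "a + b + c", or NULL on
 * allocation failure. The caller must free() the result. */
char *concat3(const char *a, const char *b, const char *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    size_t la = strlen(a), lb = strlen(b), lc = strlen(c);
    char *out = malloc(la + lb + lc + 1);   /* +1 for the terminating null */
    if (out == NULL)
        return NULL;

    memcpy(out, a, la);
    memcpy(out + la, b, lb);
    memcpy(out + la + lb, c, lc + 1);       /* lc + 1 also copies c's '\0' */
    return out;
}
```

Called as concat3(query1, username, query2), this would build the same query string as the question's code, with each strlen() evaluated only once.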

Jonathan's answer explains the problems with that part of the code.
To fix it you can use snprintf instead:
size_t space_needed = strlen(query1) + strlen(username) + strlen(query2) + 1;
char *querystring = malloc(space_needed);
if ( !querystring )
exit(EXIT_FAILURE);
snprintf(querystring, space_needed, "%s%s%s", query1, username, query2);
Then, even if you calculate the length wrong, at least you don't get a buffer overflow; snprintf truncates the output instead.
To avoid the code duplication here there is a non-standard function asprintf that you pass the arguments and it yields a pointer to a malloc'd buffer of the right size. Of course, it's possible to write your own version of this function if you don't want to rely on the existence of that function.
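A rough sketch of what "writing your own version" might look like, using only standard C99 calls. The name format_alloc is made up for illustration; it is not the real asprintf signature, which returns an int and takes a char ** out-parameter.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the non-standard asprintf(): formats into a
 * freshly malloc'd buffer of exactly the right size. Returns NULL on
 * error; the caller must free() the result. */
char *format_alloc(const char *fmt, ...)
{
    assert(fmt != NULL);

    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);                       /* the list is consumed twice */

    int needed = vsnprintf(NULL, 0, fmt, ap);   /* measure only */
    va_end(ap);

    char *out = NULL;
    if (needed >= 0) {
        out = malloc((size_t)needed + 1);
        if (out != NULL)
            vsnprintf(out, (size_t)needed + 1, fmt, ap2);
    }
    va_end(ap2);
    return out;
}
```

With this, the query could be built as format_alloc("%s%s%s", query1, username, query2), so the length calculation and the formatting can no longer disagree.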
There's another serious issue here in that your code does not protect against SQL injection (see here for explanation). A proper discussion of how to protect against that is probably beyond the scope of this question!


Why no split function in C? [closed]

There is no Standard function in C to take a string, break it up at whitespace
or other delimiters, and create an array of pointers to char, in one step.
If you want to do that sort of thing, you have to do it yourself, either
completely by hand, or by calling e.g. strspn and strpbrk in a loop,
or by calling strtok in a loop, or by calling strsep in a loop.
I am not asking how to do this. I know how to do this, and there are plenty of
other questions on Stack Overflow about how to do it. What I'm asking is if
there are any good reasons why there's no such function.
I know the two main reasons, of course: "Because no mainstream compiler/library
ever had one" and "Because the C Standard didn't specify one, either (because
it likes to standardize existing practice)." But are there any other reasons?
(Are there arguments that such a function is an actively bad idea?)
This is usually a lame and pointless sort of question, I know. In this case
I'm fixated on it because convenient splitting is such a massively useful
operation. I wrote my own string splitter within my first year as a
C programmer, I think, and it's been a huge productivity enhancer for me ever
since. There are dozens of questions here on SO every day that could be
answered easily (or that wouldn't even have to be asked) if there were a
standard split function that everyone could use and refer to.
To be clear, the function I'm imagining would have a signature like
int split(char *string, char **argv, int maxargs, const char *delim)
It would break up string into at most maxargs substrings, splitting on one or more characters from delim, placing pointers to the substrings into argv, and modifying string in the process.
And to head off an argument I'm sure someone will make: although it's standard, I do not consider
strtok to be an effective solution. strtok, frankly, sucks. Saying "you don't need a split function,
because strtok exists" is a lot like saying "You don't need printf,
because puts exists." This is not a question about what's theoretically
possible with a given toolset; it's about what's useful and convenient. The more
fundamental issue here, I guess, concerns the ineffable tradeoffs involved
in picking tools that are leverageable and productivity-enhancing and that
"pay their way". (I think it's clear that a nicely encapsulated
string-splitting function would pay its way handsomely, but perhaps
that's just me.)
I will try an answer. I do agree that such a function would be useful. It is often quite useful in the languages that have one.
Basically you are suggesting a built-in, very simple wrapper around strtok() or strtok_r(). It would be a less powerful version (as we can't change the delimiter while processing) but still useful in some cases.
What I see is that these cases also overlap with the use cases of the scanf() family of functions and of the getopt() or getsubopt() family.
Actually I'm not sure that the remaining real use cases are that common.
In real-life non-trivial cases you would need a true parser or regex library; in the specialized common cases you already have scanf() or getopt() or even strtok().
Also, functions that modify their input strings, like strtok() or yours, are more or less discouraged these days (experience says they easily lead to trouble).
Most languages providing a split feature have a real string type, often an immutable one, and support it by creating many individual substrings while leaving the original string intact.
Following that path would lead either to some other API not based on zero-delimited strings (maybe with a start pointer and an end pointer), or to allocated string copies (like when using strdup()). Neither is really satisfying.
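As an illustration of the start-pointer/end-pointer idea, here is one possible sketch. The struct span type and the split_spans name are invented for this example; nothing like them exists in the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token type: a half-open range [start, end) into the input. */
struct span {
    const char *start;
    const char *end;
};

/* Splits s on any character in delim without modifying s. Stores at most
 * max spans in out and returns the number stored. Runs of delimiters are
 * treated as one separator, as strtok() does. */
size_t split_spans(const char *s, const char *delim,
                   struct span *out, size_t max)
{
    assert(s != NULL && delim != NULL && out != NULL);

    size_t n = 0;
    while (n < max) {
        s += strspn(s, delim);              /* skip leading delimiters */
        if (*s == '\0')
            break;
        size_t len = strcspn(s, delim);     /* length of this token */
        out[n].start = s;
        out[n].end = s + len;
        n++;
        s += len;
    }
    return n;
}
```

Because the tokens are not null-terminated, a caller would print one with something like printf("%.*s", (int)(sp.end - sp.start), sp.start), which shows why this style of API is less convenient in C than in languages with a real string type.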
In the end, if you add up the not-so-common use in real life, how simple it is to write, and the not-that-simple-or-intuitive API, it is no wonder that such a function wasn't included in the standard libc.
Basically I would write something like that:
#include <string.h>
#include <stdio.h>
int split(char *string, char **argv, int maxargs, const char *delim){
char * saveptr = 0;
int x = 0;
char * tok = strtok_r(string, delim, &saveptr);
/* store at most maxargs tokens, so argv[maxargs] is never written */
while(tok && (x < maxargs)){
argv[x++] = tok;
tok = strtok_r(0, delim, &saveptr);
}
/* number of tokens stored in argv */
return x;
}
int main(){
char * args[10];
{
char * str = strdup("un deux trois quatre cinq six sept huit neuf dix onze");
int res = split(str, args, sizeof(args)/sizeof(char*), " ");
printf("res = %d\n", res);
for(int x = 0; x < res ; x++){
printf("%d:%s\n", x, args[x]);
}
}
{
char * str = strdup("un deux trois quatre cinq");
int res = split(str, args, sizeof(args)/sizeof(char*), " ");
printf("res = %d\n", res);
for(int x = 0; x < res ; x++){
printf("%d:%s\n", x, args[x]);
}
}
}
What I see looking at the code is that the wanted function is really very simple to write using strtok_r()... and that the call site that uses the result is nearly as complicated as the function itself. In such a case I'd rather inline the function at the call site than call into libc.
But of course you are welcome to use and write yours if you believe it's simpler for you.

Is this code vulnerable to buffer overflow?

Fortify reported a buffer overflow vulnerability in below code citing following reason -
In this case we are primarily concerned with the case "Depends upon properties of the data that are enforced outside of the immediate scope of the code.", because we cannot verify the safety of the operation performed by memcpy() in abc.cpp
void create_dir(const char *sys_tmp_dir, const char *base_name,
size_t base_name_len)
{
char *tmp_dir;
size_t sys_tmp_dir_len;
sys_tmp_dir_len = strlen(sys_tmp_dir);
tmp_dir = (char*) malloc(sys_tmp_dir_len + 1 + base_name_len + 1);
if(NULL == tmp_dir)
return;
memcpy(tmp_dir, sys_tmp_dir, sys_tmp_dir_len);
tmp_dir[sys_tmp_dir_len] = FN_LIBCHAR;
memcpy(tmp_dir + sys_tmp_dir_len + 1, base_name, base_name_len);
tmp_dir[sys_tmp_dir_len + base_name_len + 1] = '\0';
..........
..........
}
It appears to me a false positive since we are getting the size of data first, allocating that much amount of space, then calling memcpy with size to copy.
But I am looking for good reasons to convince fellow developer to get rid of current implementation and rather use c++ strings. This issue has been assigned to him. He just sees this a false positive so doesn't want to change anything.
Edit I see quick, valid criticism of the current code. Hopefully, I'll be able to convince him now. Otherwise, I'll hold the baton. :)
Take a look at strlen(): it takes an input string but has no upper bound, so it will keep searching until it finds a \0. That's a vulnerability because you then perform memcpy() trusting its result (if it doesn't crash first with an access violation while searching). Imagine:
create_dir((const char*)12345, baseDir, strlen(baseDir));
You tagged both C and C++...if you're using C++ then std::string will protect you from these issues.
It appears to me a false positive since we are getting the size of data first, allocating that much amount of space
This assumption is a problem that matches the warning/error. In your code, you're assuming that malloc successfully allocated the requested memory. If your system has no memory to spare, malloc will fail and return NULL. When you try to memcpy into tmp_dir, you'd be copying to NULL which would be bad news.
You should check to guarantee that the value returned by malloc is not NULL before considering it as a valid pointer.
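Pulling the points from these answers together, a defensive version of the path-building step might look like the sketch below. The name join_path and the sep parameter are assumptions made for illustration (the original code uses the FN_LIBCHAR macro); note that it still has to trust that dir is properly terminated, which is exactly the property a static analyzer cannot verify.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of the path-building step: returns a malloc'd
 * "dir<sep>name" string, or NULL if an argument is NULL, the total size
 * would overflow, or allocation fails. The caller must free() the result. */
char *join_path(const char *dir, const char *name, size_t name_len, char sep)
{
    assert(sep != '\0');                    /* a null separator would truncate */
    if (dir == NULL || name == NULL)
        return NULL;

    size_t dir_len = strlen(dir);           /* still trusts dir's terminator */
    if (dir_len > SIZE_MAX - 2 || name_len > SIZE_MAX - 2 - dir_len)
        return NULL;                        /* size computation would wrap */

    char *out = malloc(dir_len + 1 + name_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, dir, dir_len);
    out[dir_len] = sep;
    memcpy(out + dir_len + 1, name, name_len);
    out[dir_len + 1 + name_len] = '\0';
    return out;
}
```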

how do remove enclosing brackets from string? [closed]

I have a
char s9[7] = "[abcd]";
How do i remove the brackets [] so that
s9 == "abcd"
I have tried
s9 = s9.Substring(1, s9.Length-2);
throws error in cygwin
a2v2.c:42:13: error: request for member ‘Substring’ in something not a structure or union
a2v2.c:42:29: error: request for member ‘Length’ in something not a structure or union
edit:
i realised my error, i am beginner at c and couldnt differentiate between c and C++ code, regards
Someone will correct me if I'm wrong, since the C standard I know is a couple of decades old, but as far as I know, C doesn't offer any standard support for string manipulation, and in fact doesn't even officially have a concept of strings. (Or of object functions, for that matter.) Instead, C uses pointers, which are much more powerful, but much more dangerous in that you can really mess things up if you don't learn your way around them.
The most important thing, if you want to be a C programmer is that you learn C. At the very least, you need to look up "string manipulation C" and read any of the pages that pop up.
There are many ways to do what you want. I think this is one of the faster ones (though it modifies the string you're looking at. If that matters, choose another way):
// trim off the last character
s9[strlen(s9) - 1] = '\0';
// the char * points to the s9 array. +1 makes it look at
// the second element, so then substring is the string you need
char * substring = s9 + 1;
Skipping any checking that the string actually begins and ends with those characters:
int len = strlen(s9);
for ( int i = 0; i < len - 2; ++i )
s9[i] = s9[i + 1];
s9[len - 2] = '\0';
memmove( s9, s9 + 1, 4);
s9[4] = 0;
If it is strictly C, then you will need to use more basic functions (a char[] array has little in common with the string class in C++). Some of the functions to use might be:
strchr: Find the position of a character (e.g., strchr( s9, '[')). This assumes that it is not a fixed format you are dealing with. If you know the length and positions, then you could skip this and simply use memmove directly.
memmove: Shift the characters left in the array. In this situation memmove would be needed (over memcpy or strncpy) because the source and destination overlap.
int len = strlen(s9);
memmove(s9, (s9+1), len-2); /* can handle overlapping strings */
s9[len-2] = 0; /* null terminate */
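The snippets above skip the check that the string really is bracketed. A small sketch that adds it, using a hypothetical helper name strip_brackets:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: if s starts with '[' and ends with ']', removes both
 * in place and returns true; otherwise leaves s untouched and returns false. */
bool strip_brackets(char *s)
{
    assert(s != NULL);

    size_t len = strlen(s);
    if (len < 2 || s[0] != '[' || s[len - 1] != ']')
        return false;

    memmove(s, s + 1, len - 2);             /* regions overlap: use memmove */
    s[len - 2] = '\0';
    return true;
}
```

The argument has to be a writable array such as char s9[7] = "[abcd]"; passing a string literal directly would be undefined behaviour because the function modifies its input.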

Fixing a segmentation fault [closed]

I'm getting a segmentation fault, which I've narrowed down to a for loop in a callback function. It's strange because the program was previously working, now it's not!
struct debuggerth_command debuggerth_protocol[] = { /*
* Note: These strings are NOT null-terminated. The
* strings are 4 bytes long for memory alignment and
* integer-cast comparisons.
*/
{ "run ", debuggerth_startprocess },
{ "stop", 0 },
{ "inp ", 0 },
{ "sig ", 0 },
{ 0, 0 }
};
And this is the code:
int debuggerth_callback (struct libwebsocket_context * context,
struct libwebsocket * wsi,
enum libwebsocket_callback_reasons reason,
void * user,
void * in,
size_t len){
switch (reason) {
case LWS_CALLBACK_RECEIVE:
if (len < 4){
/* send error */
return -1;
}
/* Getting a segmentation fault
* within this loop.
*/
// I used this break to determine where the seg fault starts
// break
int i = 0;
for (; debuggerth_protocol[i].cmd; i++)
if (cmpcmd (debuggerth_protocol[i].cmd, in)) break;
//break;
if (!debuggerth_protocol[i].cmd){
int byteswritten = sprintf
(debuggerth_message,
debuggerth_format,
debuggerth_headers[0],
debuggerth_errors [0]);
libwebsocket_write (wsi, debuggerth_message,
byteswritten,
LWS_WRITE_TEXT);
return -1;
}
break;
This is the string comparison macro:
#define cmpcmd(cmd, str) ((*(int*)(cmd)) == (*(int*)(str)))
Anyone have any ideas?
One idea: relying on the fact that your strings are exactly the size of an int is rather horrendous.
People often try to do clever things like that only to be badly bitten when the underlying assumptions change, such as moving to a platform where the int type is eight bytes.
I'd ditch that macro and rewrite it to use strcmp or strncmp (a).
There's also a couple of other things to do.
First, print out (or use a debugger to examine) all variables before attempting to use them. It may be that in is NULL.
Or maybe you attempt to call the NULL commands like stop or sig, or even if you get a command that's not in your table and you blindly call it when i is equal to 4. These particular possibilities are in code not shown, following the loop, so it's pure, though I'd like to think educated, speculation on my part.
Another possibility is that you're running on an architecture that disallows unaligned access. Some architectures are optimised for accessing on specific boundaries (such as getting 32-bit values from 32-bit aligned addresses) and will run slower if you violate that alignment.
However, some architectures won't allow unaligned access at all, instead giving something like a BUS error if you try.
Since you've now indicated in a comment that you're using ARM, that's almost certainly the case. See here for some more information.
If that's the case, it's even more reason to get rid of the tricky macro and use a more conventional solution.
(a): You may also want to investigate the term "strict aliasing" at some point since this may technically be undefined behaviour.
Given this is running on ARM, I think your problem is that it's doing an unaligned memory access, which will either fail or be quite slow. It is not exactly a seg fault. See this question for example, and as suggested there -Wcast-align will probably flag it as risky. You can turn on a software workaround but that's probably slower than just fixing it in your code.
One option would be to use memcmp which gcc may be able to compile down to something nearly as simple as a word read, in the case that it is aligned.
Another option, if performance is critical, is to unwind the loop into a case statement switching by the first byte of the command. Then just check the following characters are as expected.
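A minimal sketch of the memcmp option follows. The table is deliberately simplified (command text only, no handler pointers) and the names commands and find_command are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CMD_LEN 4u  /* every command is exactly 4 bytes, as in the question */

/* Hypothetical, simplified table: only the command text, no handlers. */
static const char *const commands[] = { "run ", "stop", "inp ", "sig ", NULL };

/* Returns the index of the command matching the first CMD_LEN bytes of in,
 * or -1 if in is NULL, too short, or not in the table. memcmp() has no
 * alignment requirement, unlike the int-cast comparison. */
int find_command(const void *in, size_t len)
{
    if (in == NULL || len < CMD_LEN)
        return -1;

    for (int i = 0; commands[i] != NULL; i++) {
        assert(strlen(commands[i]) == CMD_LEN);   /* sanity-check the table */
        if (memcmp(commands[i], in, CMD_LEN) == 0)
            return i;
    }
    return -1;
}
```

Returning -1 for "not found" also makes it harder to index the table with the sentinel position by accident.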
I looked at some of the changes to my code, as @Jonathan Leffler suggested. This was the change I made:
/* before */
struct debuggerth_command {
char * cmd;
int (*function)(struct debuggerth_session *, char * input);
};
/* after */
struct debuggerth_command {
char cmd[4]; // changed this to an array
int (*function)(struct debuggerth_session *, char * input);
};
So, when I initialized the structure here:
struct debuggerth_command debuggerth_protocol[] = { /*
* Note: These strings are NOT null-terminated. The
* strings are 4 bytes long for memory alignment and
* integer-cast comparisons.
*/
{ "run ", debuggerth_startprocess },
{ "stop", 0 },
{ "inp ", 0 },
{ "sig ", 0 },
{ 0, 0 } /* Zero used to be a pointer value,
* but now it's the first element in a
* 4 byte array
*/
};
Which changed the evaluation of the for loop:
int i = 0;
for (; debuggerth_protocol[i].cmd; i++)
if (cmpcmd (debuggerth_protocol[i].cmd, in)) break;
To always evaluate true, because cmd is now a valid pointer to a 4-byte array - of which, the first value is 0.
I'll remove the macro, since it might not perform well on some architectures. But, couldn't this be fixed with the use of C11's alignas feature?
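For the array version of the struct, the loop condition would have to test the first byte of cmd rather than the array itself. A sketch under that assumption, with a simplified struct (an int id stands in for the handler pointer) and made-up names:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct command {
    char cmd[4];    /* exactly 4 bytes, not null-terminated */
    int  id;        /* hypothetical stand-in for the handler pointer */
};

static const struct command protocol[] = {
    { { 'r', 'u', 'n', ' ' }, 1 },
    { { 's', 't', 'o', 'p' }, 2 },
    { { 0 }, 0 }    /* sentinel: first byte is 0 */
};

/* Returns the id of the matching command, or 0 if none matches. */
int lookup(const void *in, size_t len)
{
    assert(in != NULL);
    if (len < sizeof protocol[0].cmd)
        return 0;

    /* protocol[i].cmd is an array, so it is never a null pointer;
     * test its first byte to find the sentinel instead. */
    for (size_t i = 0; protocol[i].cmd[0] != '\0'; i++)
        if (memcmp(protocol[i].cmd, in, sizeof protocol[i].cmd) == 0)
            return protocol[i].id;
    return 0;
}
```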

Concatenate with memcpy

I'm trying to add two strings together using memcpy. The first memcpy does contain the data, I require. The second one does not however add on. Any idea why?
if (strlen(g->db_cmd) < MAX_DB_CMDS )
{
memcpy(&g->db_cmd[strlen(g->db_cmd)],l->db.param_value.val,strlen(l->db.param_value.val));
memcpy(&g->db_cmd[strlen(g->db_cmd)],l->del_const,strlen(l->del_const));
g->cmd_ctr++;
}
size_t len = strlen(l->db.param_value.val);
memcpy(g->db_cmd, l->db.param_value.val, len);
memcpy(g->db_cmd + len, l->del_const, strlen(l->del_const)+1);
This gains you the following:
Less redundant calls to strlen. Each of those must traverse the string, so it's a good idea to minimize these calls.
The 2nd memcpy needs to actually append, not replace. So the first argument has to differ from the previous call.
Note the +1 in the 3rd arg of the 2nd memcpy. That is for the NUL terminator.
I'm not sure your if statement makes sense either. Perhaps a more sane thing to do would be to make sure that g->db_cmd has enough space for what you are about to copy. You would do that via either sizeof (if db_cmd is an array of characters) or by tracking how big your heap allocations are (if db_cmd was acquired via malloc). So perhaps it would make most sense as:
size_t param_value_len = strlen(l->db.param_value.val),
del_const_len = strlen(l->del_const);
// Assumption is that db_cmd is a char array and hence sizeof(db_cmd) makes sense.
// If db_cmd is a heap allocation, replace the sizeof() with how many bytes you
// asked malloc for.
//
if (param_value_len + del_const_len < sizeof(g->db_cmd))
{
memcpy(g->db_cmd, l->db.param_value.val, param_value_len);
memcpy(g->db_cmd + param_value_len, l->del_const, del_const_len + 1);
}
else
{
// TODO: your buffer is not big enough. handle that.
}
You're not copying the null terminator, you're only copying the raw string data. That leaves your string non-null-terminated, which can cause all sorts of problems. You're also not checking to make sure you have enough space in your buffer, which can result in buffer overflow vulnerabilities.
To make sure you copy the null terminator, just add 1 to the number of bytes you're copying -- copy strlen(l->db.param_value.val) + 1 bytes.
One possible problem is that your first memcpy() call won't necessarily result in a null terminated string since you're not copying the '\0' terminator from l->db.param_value.val:
So when strlen(g->db_cmd) is called in the second call to memcpy() it might be returning something completely bogus. Whether this is a problem depends on whether the g->db_cmd buffer is initialized to zeros beforehand or not.
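If the intent is to append to whatever g->db_cmd already holds, one way to keep the terminator and the bounds straight is sketched below. The helper append_param and its parameters are hypothetical; it assumes buf already contains a null-terminated string (possibly empty) and that cap is the true size of the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: appends val and then delim to the null-terminated
 * string in buf (capacity cap bytes). Returns false and leaves buf
 * unchanged if the result would not fit. */
bool append_param(char *buf, size_t cap, const char *val, const char *delim)
{
    assert(buf != NULL && val != NULL && delim != NULL);

    size_t used = strlen(buf);              /* measured once */
    size_t vlen = strlen(val);
    size_t dlen = strlen(delim);

    if (used >= cap || vlen >= cap - used || dlen >= cap - used - vlen)
        return false;                       /* no room for data plus '\0' */

    memcpy(buf + used, val, vlen);
    memcpy(buf + used + vlen, delim, dlen + 1);   /* +1 copies the '\0' */
    return true;
}
```

The first memcpy leaves the buffer temporarily unterminated, which is fine here because the second one writes the terminator before anything calls strlen() again.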
Why not use the strcat(), which was made to do exactly what you're trying to do with memcpy()?
if (strlen(g->db_cmd) < MAX_DB_CMDS )
{
strcat( g->db_cmd, l->db.param_value.val);
strcat( g->db_cmd, l->del_const);
g->cmd_ctr++;
}
That'll have the advantage of being easier for someone to read. You might think it would be less performant - but I don't think so since you're making a bunch of strlen() calls explicitly. In any case, I'd concentrate on getting it right first, then worry about performance. Incorrect code is as unoptimized as you can get - get it right before getting it fast. In fact, my next step wouldn't be to improve the code performance-wise, it would be to improve the code to be less likely to have a buffer overrun (I'd probably switch to using something like strlcat() instead of strcat()).
For example, if g->db_cmd is a char array (and not a pointer), the result might look like:
size_t orig_len = strlen(g->db_cmd);
size_t result = strlcat( g->db_cmd, l->db.param_value.val, sizeof(g->db_cmd));
result = strlcat( g->db_cmd, l->del_const, sizeof(g->db_cmd));
g->cmd_ctr++;
if (result >= sizeof(g->db_cmd)) {
// the new stuff didn't fit, 'roll back' to what we started with
g->db_cmd[orig_len] = '\0';
g->cmd_ctr--;
}
If strlcat() isn't part of your platform it can be found on the net pretty easily. If you're using MSVC there's a strcat_s() function which you could use instead (but note that it's not equivalent to strlcat() - you'd have to change how the results from calling strcat_s() are checked and handled).
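Where neither strlcat() nor strcat_s() is available, a similar checked append can be approximated with standard C99 snprintf(). This is only a sketch; append_checked is a made-up name, and unlike strlcat() it rolls back on truncation instead of keeping the partial result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends src to the string in dst (capacity cap).
 * Returns 0 on success, or -1 and leaves dst unchanged if src does not
 * fit. Uses only standard snprintf(), not strlcat(). */
int append_checked(char *dst, size_t cap, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t used = strlen(dst);
    if (used >= cap)
        return -1;

    int n = snprintf(dst + used, cap - used, "%s", src);
    if (n < 0 || (size_t)n >= cap - used) {
        dst[used] = '\0';                   /* roll back a truncated append */
        return -1;
    }
    return 0;
}
```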
