basic qsort on string array crashes in qsort() - c

I tried to create some basic code using qsort to sort an array of strings, but it crashes in qsort, according to gdb:
#include <string.h>
#include <stdlib.h>
static int pcmp(const void * a, const void * b)
{
return strcmp(* (char * const *) a, * (char * const *) b);
}
int main()
{
char pn[10][256];
memset(pn, 0, sizeof(char) * 10 * 256);
strcpy(pn[0], "hello");
strcpy(pn[1], "TEST");
strcpy(pn[2], "abc");
strcpy(pn[3], "000000");
qsort(pn, 4, sizeof (char *), pcmp);
}

qsort(pn, 4, sizeof (char *), pcmp);
You tell qsort that what you want to sort is an array of 4 char*, but
char pn[10][256];
actually, pn is an array of 10 char[256]. The two are layout-incompatible: qsort hands the comparator pointers into the first char[256], and the comparator reinterprets the bytes stored there as char* values. That's undefined behaviour, and quite likely to cause a segmentation fault.
To fix it for this special case, you can change your comparison to
static int pcmp(const void * a, const void * b)
{
return strcmp((const char *) a, (const char *) b);
}
and the invocation to
qsort(pn, 4, sizeof pn[0], pcmp);

#include <string.h>
#include <stdlib.h>
static int pcmp(const void * a, const void * b)
{
return strcmp( (const char *) a, (const char *) b);
}
int main()
{
char pn[10][256];
strcpy(pn[0], "hello");
strcpy(pn[1], "TEST");
strcpy(pn[2], "abc");
strcpy(pn[3], "000000");
qsort(pn, 4, sizeof (char [256]), pcmp);
return 0;
}

Related

Strcmp causes segfault

Here is the code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int my_compare(const void * a, const void * b);
int main()
{
char s[][80] =
{ "gxydyv", "gdyvjv", "lfdtvr", "ayfdbk", "sqkpge", "axkoev", "wdjitd", "pyrefu", "mdafyu",
"zdgjjf", "awhlff", "dqupga", "qoprcn", "axjyfb", "hfrgjf", "dvhhhr" };
int i;
puts("#Before:#");
for (i = 0; i < 16; i++)
puts(s[i]);
qsort(s, 16, sizeof *s, my_compare);
putchar('\n');
puts("#After:#");
for (i = 0; i < 16; i++)
puts(s[i]);
return 0;
}
int my_compare(const void *a, const void *b)
{
return strcmp(*(char **)a, *(char **)b);
}
Here is the output:
#Before:#
gxydyv
gdyvjv
lfdtvr
ayfdbk
sqkpge
axkoev
wdjitd
pyrefu
mdafyu
zdgjjf
awhlff
dqupga
qoprcn
axjyfb
hfrgjf
dvhhhr
Segmentation fault
I also notice that the prototype of strcmp is:
int strcmp(const char *s1,const char *s2);
I suppose that the type of a and b in my_compare is "pointer to array-of-char". As a result, *(char **)a is a "pointer to char", which is exactly what strcmp expects.
So where is the problem?
Change:
return strcmp(*(char **) a, *(char **) b);
To:
return strcmp(a,b);
You had an extra level of pointer dereferencing that was incorrect, and that's why you got the segfault. That is, the comparator read the first bytes of each string (char values) as if they were a char pointer and then passed that bogus pointer to strcmp [the cast masked the mistake from the compiler].
Note: no need to cast from void * here.
UPDATE:
In response to your question, yes, because of the way you defined s and the qsort call.
Your original my_compare would have been fine if you had done:
char *s[] = { ... };
And changed your qsort call to:
qsort(s, 16, sizeof(char *), my_compare);
To summarize, here are two ways to do it
int
main()
{
char s[][80] = { ... };
qsort(s, 16, 80, my_compare);
return 0;
}
int
my_compare(const void *a, const void *b)
{
return strcmp(a,b);
}
This is a bit cleaner [uses less space in array]:
int
main()
{
char *s[] = { ... };
qsort(s, 16, sizeof(char *), my_compare);
return 0;
}
int
my_compare(const void *a, const void *b)
{
return strcmp(*(char **) a,*(char **) b);
}
UPDATE #2:
To answer your second question: No
None of these even compile:
return strcmp((char ()[80])a,(char ()[80])b);
return strcmp(*(char ()[80])a,*(char ()[80])b);
return strcmp((char [][80])a,(char [][80])b);
return strcmp(*(char [][80])a,*(char [][80])b);
But, even if they did, they would be logically incorrect. The following does not compile either, but is logically closer to what qsort is passing:
return strcmp((char [80])a,(char [80])b);
But when something defined as char x[80] is passed to a function, it decays to a pointer to its first element, just the same as char *x, so qsort is passing char * [disguised as void *].
A side note: Using char *s[] is far superior. It allows for arbitrary length strings. The other form char s[][80] will actually fail if a given string exceeds [or is exactly] 80 chars.
I think it's important for you to understand:
Arrays are effectively call by reference [the function receives a pointer to the first element, not a copy of the array].
The interchangeability of arrays and pointers.
The following two are equivalent:
char *
strary(char p[])
{
for (; *p != 0; ++p);
return p;
}
char *
strptr(char *p)
{
for (; *p != 0; ++p);
return p;
}
Consider the following [outer] definitions:
char x[] = { ... };
char *x = ...;
Either of these two may be passed to strary and/or strptr in any of the following forms [total of 20]:
strXXX(x);
strXXX(x + 0);
strXXX(&x[0]);
strXXX(x + 1);
strXXX(&x[1]);
Also, see my recent answer here: Issue implementing dynamic array of structures
You can just cast it to a const char *; it should work now:
int my_compare(const void *a, const void *b) {
return strcmp((const char *)a, (const char *)b);
}
And also you should add:
#include <stdlib.h>

Concatenate 3 strings and return a pointer to the new string C

I am wondering if anyone could help me, I am trying to concatenate 3 strings and return a pointer to the new string. I can't seem to figure out how to do this using strncat instead of strcat and strncpy instead of strcpy. I am only learning C, so any help will be greatly appreciated.
char *concatenate(char *a, char *b, char *d) {
char str[80];
strcpy(str, a);
strcat(str, b);
strcat(str, d);
puts(str);
return (NULL);
}
Your approach cannot be used to return the concatenated string: you would return a pointer to a local array that is no longer valid once the function returns. Furthermore, you do not check for buffer overflow.
Here is a quick and dirty version that allocates memory:
#include <stdlib.h>
#include <string.h>
char *concatenate(const char *a, const char *b, const char *c) {
return strcat(strcat(strcpy(malloc(strlen(a) + strlen(b) + strlen(c) + 1),
a), b), c);
}
Here is a more elaborate version using memcpy and testing for malloc failure:
#include <stdlib.h>
#include <string.h>
char *concatenate(const char *a, const char *b, const char *c) {
size_t alen = strlen(a);
size_t blen = strlen(b);
size_t clen = strlen(c);
char *res = malloc(alen + blen + clen + 1);
if (res) {
memcpy(res, a, alen);
memcpy(res + alen, b, blen);
memcpy(res + alen + blen, c, clen + 1);
}
return res;
}
It should be more efficient since it does not perform the extra scans strcpy and strcat do, but only careful benchmarking can prove if it is a real improvement over the simple version above.
If you need to concatenate 3 strings into an existing buffer, a very simple solution is:
char dest[DEST_SIZE];
snprintf(dest, sizeof dest, "%s%s%s", a, b, c);
Many systems (Linux, GNU, BSD) have an asprintf() function defined that allocates memory for the resulting string:
int asprintf(char **strp, const char *fmt, ...);
Using this function, you can concatenate the three strings quite simply:
#define _GNU_SOURCE
#include <stdio.h>
char *concatenate(const char *a, const char *b, const char *c) {
char *p;
return (asprintf(&p, "%s%s%s", a, b, c) >= 0) ? p : NULL;
}
Your str is local to your function.
You could add a fourth parameter to your concatenated string or you could malloc it inside the function, just make sure to free it after use.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *concatenate(char *a, char *b, char *c)
{
size_t size = strlen(a) + strlen(b) + strlen(c) + 1;
char *str = malloc(size);
if (str == NULL) /* allocation failed */
return NULL;
strcpy (str, a);
strcat (str, b);
strcat (str, c);
return str;
}
int main(void) {
char *str = concatenate("bla", "ble", "bli");
if (str != NULL)
printf("%s", str);
free(str);
return 0;
}
Maybe something like that:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char * concatenate(const char *a, const char *b, const char *d)
{
/* calculate the length of the new string */
size_t len = strlen(a) + strlen(b) + strlen(d);
/* allocate memory for the new string */
char* str = malloc(len + 1);
/* concatenate */
strcpy(str, a);
strcat(str, b);
strcat(str, d);
/* return the pointer to the new string
* NOTE: clients are responsible for releasing the allocated memory
*/
return str;
}
int main(void)
{
const char a[] = "lorem";
const char b[] = "impsum";
const char d[] = "dolor";
char* str = concatenate(a, b, d);
printf("%s\n", str);
free(str);
return 0;
}
If you want something even more generic (e.g. concatenate N strings), you can have a look at the implementation of g_strconcat of the glib library here:
https://github.com/GNOME/glib/blob/master/glib/gstrfuncs.c#L563

getting at pointers in an array of struct passed as a void*

I have a function with a signature like qsort:
const char* get_str(int i, const void *base, size_t nmemb, size_t size);
I am passed arrays of pointers to const char*, or arrays of structs whose first field is a pointer to const char*.
What casts do I need to do to extract that pointer in the array element i?
I have tried casting the base as an array of char itself, so I can advance to the right element:
return *(const char**)(((const char*)base)[i * size]);
But the compiler is complaining: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
It looks like you want to implement a type or identification system for structs like so:
#include <stdlib.h>
#include <stdio.h>
const char* get_str(int i, const void *base, size_t nmemb, size_t size); /* defined below */
struct a {
const char *id;
int x;
};
struct b {
const char *id;
double d;
};
union any {
struct a a;
struct b b;
};
int main()
{
struct a a[] = {{"one", 1}, {"two", 2}, {"three", 3}};
struct b b[] = {{"pi", 3.1415}, {"e", 2.71}};
union any any[3];
any[0].a = a[0];
any[1].b = b[0];
any[2].a = a[1];
puts(get_str(1, a, 3, sizeof(*a)));
puts(get_str(1, b, 2, sizeof(*b)));
puts(get_str(1, any, 3, sizeof(*any)));
return 0;
}
In this case, the following works:
const char* get_str(int i, const void *base, size_t nmemb, size_t size)
{
const char *p = base;
const char **pp = (const char**) (p + i * size);
return *pp;
}
This can be written in one line as:
const char* get_str(int i, const void *base, size_t nmemb, size_t size)
{
return *(const char**) ((const char *) base + i * size);
}
But I think that the detour via void pointers is not necessary. You can do the address calculations with the typed array:
puts(*(const char **) (&a[1]));
puts(*(const char **) (&b[1]));
puts(*(const char **) (&any[1]));
If you wrap that in a function:
const char *get_str(const void *str)
{
return *(const char **) str;
}
you get:
puts(get_str(&a[1]));
puts(get_str(&b[1]));
puts(get_str(any + 1));
which is more readable than the qsortish syntax in my opinion.
This works because you access only one element at a known position. The functions bsearch and qsort, however, can't use this technique, because they have to access the array at several indices and hence must be able to do the index calculation themselves.
Find me (well, "yet another" but suppose this parentheses never existed!) a bug in that compiler and I give you a free cookie!
The compiler is right. This
*(const char**)(((const char*)base)[i * size]);
Will type-check to something like (syntax: expression -> type)...
base -> const void*
(const char*)base -> (const char*)const void*
(((const char*)base)[i * size]) -> *(const char*)const void* -> const char // Here's the problem!
(const char**)(((const char*)base)[i * size]) -> (const char**)const char // Now, this is *not* good...
*(const char**)(((const char*)base)[i * size]) -> *(const char**)const char -> const char* // Well, we have now just perfectly dereferenced '~'...
Not very sane, is it?
BTW: You don't give us enough information for me to provide a full solution to your problem. What are those structs you talk about?
Edit: Might this help you (written according to the comments)?
*(const char**)&(((const unsigned char*)base)[size * i])

qsort and strcmp problems when dealing with empty strings and whitespace

I am having a problem getting qsort to work for an array of strings (char * to be exact). I create a compare function that I thought should work for the qsort requirement, but it does not seem to work at all. I also need it to work with whitespace characters and blank strings (ex. ""). Any direction or note on what I am doing wrong would be greatly appreciated.
My relevant source code:
int compareAlphabetically(const void * a,const void * b);
void sortStringArray(char * * arrString, int len){
int size = sizeof(arrString) / sizeof(char *);
if(*arrString != NULL && len > 1)
qsort(arrString, size, sizeof(char *), compareAlphabetically);
}
int compareAlphabetically(const void * a, const void * b)
{
const char *a_p = *(const char **) a;
const char *b_p = *(const char **) b;
return strcmp(a_p,b_p);
}
The function definition is (and it should remain unchanged):
/**
* Takes an array of C-strings, and sorts them alphabetically, ascending.
*
* arrString: an array of strings
* len: length of the array 'arrString'
*
* For example,
* int len;
* char * * strArr = explode("lady beatle brew", " ", &len);
* sortStringArray(strArr, len);
* char * str = implode(strArr, len, " ");
* printf("%s\n", str); // beatle brew lady
*
* Hint: use the <stdlib.h> function "qsort"
* Hint: you must _clearly_ understand the typecasts.
*/
void sortStringArray(char * * arrString, int len);
Wrong size calculation.
size = sizeof(arrString) / sizeof(char *); is likely always 1:
the size of a pointer (char **) divided by the size of a pointer (char *).
Code likely needs to use len:
void sortStringArray(char * * arrString, int len){
if(*arrString != NULL && len > 1) {
// qsort(arrString, size, sizeof(char *), compareAlphabetically);
qsort(arrString, len, sizeof(char *), compareAlphabetically);
}
}
[Edit]
Note: The len > 1 check is not functionally needed for a len of 0 or 1; qsort handles those fine. But as len may be less than 0 and size_t is some unsigned type (and converting a negative int to an unsigned type yields a huge element count, which is a disaster), using len > 1 is prudent.

Convert Char *ptr to Char *ptrArray[] to use in qsort

Working on using qsort.
#include <string.h>
#include <stdlib.h>
#pragma once
int cstring_cmp(const void *a, const void *b);
int main()
{
int count = 0;
char * randomStr = "sdjsn9i3ms;sa;'smsn92;w;''[w0p4;dsmsdf";
char * charArray[] =
{"s","d","j","s","n","9","i","3","m","s",";","s","a",";","'","s","m","s","n"
,"9","2",";","w",";","'","'","[","w","0","p","4",";","d","s","m","s","d","f"};
size_t strings_len = sizeof(charArray) / sizeof(char *);
/*void qsort(void *base, size_t nel,
size_t width, int (*compar)(const void *, const void *));*/
qsort(charArray, strings_len, sizeof(char *), cstring_cmp);
qsort(randomStr, strings_len, sizeof(char *), cstring_cmp);
// Pause at command prompt
system("pause");
return 0;
} // Close function Main
int cstring_cmp(const void *a, const void *b)
{
const char **ia = (const char **)a;
const char **ib = (const char **)b;
return strcmp(*ia, *ib);
}
So obviously my second qsort is not working. Whether that's because my cstring_cmp function can't support the base I give it, or because my base is not in the right format for qsort, is a mystery to me.
My question is: how do I convert char * randomStr to char * charArray[] dynamically, during runtime, on the fly, or whatever cool phrase you can come up with? I've searched around a lot and maybe I was just not asking the right question, so I'm coming to you guys for some real question answering power.
I'm just starting C, so if you could please try not to fry my brain with your answers, me and my brain would appreciate it.
My end goal here is to convert randomStr to the format of charArray, qsort it, then convert it back to the randomStr format so I can do some find and replace things I have set up already.
Any help would be great, thanks.
First of all, if you use char *randomStr = "Stuff" you can't change it: randomStr points to a string literal, and modifying a string literal is undefined behavior. Second, try this:
#include <stdlib.h>
#include <string.h>
int
cmp_fry_brain(const void *a, const void *b)
{
return *((const char *)a) - *((const char *)b);
}
/* This is equivalent to the one above (the compiler will likely emit the
* exact same code).
*/
int
cmp(const void *a, const void *b)
{
const char *x = a;
const char *y = b;
return *x - *y;
}
int
main()
{
char str[] = "This is the end";
qsort(str, strlen(str), 1, cmp_fry_brain);
/* ... */
}
