"programming pearls": Strings of Pearls - c

In Section 15.3, the author shows how to generate random text from an input document and also gives the source code.
qsort(word, nword, sizeof(word[0]), sortcmp);
int sortcmp(char **p, char **q)
{ return wordncmp(*p, *q);
}
I've been confused by the above lines in the source code.
The last argument of qsort is:
int comparator ( const void * elem1, const void * elem2 ).
But the definition of sortcmp is different. In fact, the source code cannot be compiled in my VS2010.

It seems this code was originally compiled with a more forgiving (or less standard-compliant) compiler. The idea seems to be that the canonical void * arguments of the comparator function are interpreted as char ** so that wordncmp(), which is an implementation of lexicographical comparison of up to length n, can be applied to them.
Declaring the function as expected (i.e. taking two const void * arguments) and making the type casts explicit appears to solve the problem (tested with GCC 4.7.0):
int sortcmp(const void *p, const void *q) {
return wordncmp(*(const char **)p, *(const char **)q);
}
I also had to modify the declaration of the wordncmp() function:
int wordncmp(const char *p, const char* q)
{
/*.. Definition unchanged.. */
}
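For reference, a minimal self-contained sketch of the same fix. It assumes the words are plain NUL-terminated strings and uses strcmp as a stand-in for the book's wordncmp; sort_words is a hypothetical helper name, not something from the book.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparator pointers to the array ELEMENTS. The elements
 * here are char *, so each argument really points at a char *. */
static int sortcmp(const void *p, const void *q)
{
    const char *a = *(const char *const *)p;
    const char *b = *(const char *const *)q;
    /* Assumption: strcmp stands in for the book's wordncmp. */
    return strcmp(a, b);
}

/* Hypothetical helper: sorts nword string pointers in place. */
static void sort_words(char **word, size_t nword)
{
    assert(word != NULL);
    qsort(word, nword, sizeof word[0], sortcmp);
}
```

Calling sort_words on the three pointers "pear", "apple", "fig" leaves them in the order apple, fig, pear. Only the pointers move; the strings themselves are never copied.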

Related

Why can `qsort` be called with a compare function with the wrong signature and compile has no warnings

I was working on consolidating a code base (moving a qsort compar function to a new header /library so that it could be shared without being copy/pasta) and noticed something strange in the process.
Here is a demonstrative listing:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
/** One record has three fields.
* Each field contains a NUL-terminated string of length at most 7 characters. */
typedef char Record[3][8];
int main(void)
{
Record database[5] = {0};
strcpy(database[0][0], "ZING");
strcpy(database[0][1], "BOP");
strcpy(database[0][2], "POW");
strcpy(database[1][0], "FIDDLY");
strcpy(database[1][1], "ECHO");
strcpy(database[1][2], "ZOOOM");
strcpy(database[2][0], "AH");
strcpy(database[2][1], "AAAAA");
strcpy(database[2][2], "AH");
strcpy(database[3][0], "BO");
strcpy(database[3][1], "DELTA");
strcpy(database[3][2], "FO");
strcpy(database[4][0], "FRRING");
strcpy(database[4][1], "CRASH");
strcpy(database[4][2], "FOO");
//(gdb) ptype record_compare_field_1
//type = int (char (*)[8], char (*)[8])
int record_compare_field_1();
qsort(database, 5, sizeof(Record), record_compare_field_1);
for (int i = 0; i < 5; i++){
printf("%s\t%s\t%s\n", database[i][0], database[i][1], database[i][2]);
}
}
/* Compares Records at field one. */
int record_compare_field_1(Record rowA, Record rowB)
{
return strcmp(rowA[1], rowB[1]);
}
Compile and run:
$ gcc -Wall main.c
$ ./a.out
AH AAAAA AH
ZING BOP POW
FRRING CRASH FOO
BO DELTA FO
FIDDLY ECHO ZOOOM
It's surprising to me that:
The compiler gives no warnings, even though the compar function passed to qsort does not have the prescribed signature int (*compar)(const void *, const void *). Even in gdb, when I run ptype record_compare_field_1, the signature does not contain const void *.
The output is somehow correct? (Sorting on field one (zero-indexed) results in AAAAA, BOP, CRASH, DELTA, ECHO.)
The questions are:
Why/how does this work? Is this an old-school way of doing this?
If I wanted to change the qsort compar function to use the proper signature, how would I do that? (I have been struggling to come up with the proper casts.)
Thank you!
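The int record_compare_field_1(); declaration does not have a prototype. Function declarators without a prototype are an obsolescent feature in the C17/C18 standard, and C23 removes them (there, empty parentheses mean the same as (void)).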
In the function call qsort(database, 5, sizeof(Record), record_compare_field_1);, the record_compare_field_1 argument has type int (*)() and qsort's compar parameter has type int (*)(const void *, const void *). This is allowed by this rule from C17 6.2.7:
— If only one type is a function type with a parameter type list (a function prototype), the composite type is a function prototype with the parameter type list.
The actual record_compare_field_1 function definition has the prototype int record_compare_field_1(Record, Record) where the Record type is defined by typedef char Record[3][8]. Since array parameters are adjusted to pointers, this is the same as the prototype int record_compare_field_1(char (*)[8], char (*)[8]).
qsort will call the passed-in record_compare_field_1 function through a pointer of the wrong function type, which is undefined behavior. Most C implementations use the same representation for all object pointer types, so in practice you get away with it.
To do it properly, the record_compare_field_1 function could be defined like this:
int record_compare_field_1(const void *a, const void *b)
{
const Record *p_rowA = a;
const Record *p_rowB = b;
return strcmp((*p_rowA)[1], (*p_rowB)[1]);
}
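To see the corrected comparator in context, here is a small sketch built on the Record typedef from the question. The wrapper sort_records_by_field_1 is a hypothetical name added only to make the snippet easy to call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef char Record[3][8];

/* qsort passes pointers to whole Record elements, so each const void *
 * converts to a pointer to a (const) Record. */
static int record_compare_field_1(const void *a, const void *b)
{
    const Record *p_rowA = a;
    const Record *p_rowB = b;
    return strcmp((*p_rowA)[1], (*p_rowB)[1]);
}

/* Hypothetical wrapper around the qsort call from the question. */
static void sort_records_by_field_1(Record *database, size_t count)
{
    assert(database != NULL);
    qsort(database, count, sizeof(Record), record_compare_field_1);
}
```

With this version there is no prototype-less declaration left, so the compiler can check the argument passed to qsort against the expected comparator type.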

Can I use memcmp along with qsort?

I am making a C dynamic array library, kind of. Note that I'm doing it for fun in my free time, so please do not recommend one of the million existing libraries.
I started implementing sorting. The array is of arbitrary element size, defined as struct:
typedef struct {
//[PRIVATE] Pointer to array data
void *array;
//[READONLY] How many elements are in array
size_t length;
//[PRIVATE] How many elements can further fit in array (allocated memory)
size_t size;
//[PRIVATE] Bytes per element
size_t elm_size;
} Array;
I originally prepared this to start with the sort function:
/** sorts the array using provided comparator method
* if method not provided, memcmp is used
* Comparator signature
* int my_comparator ( const void * ptr1, const void * ptr2, size_t type_size );
**/
void array_sort(Array* a, int(*comparator)(const void*, const void*, size_t)) {
if(comparator == NULL)
comparator = &memcmp;
// Sorting algorithm should follow
}
However I learned about qsort:
void qsort (void* base, size_t num, size_t size, int (*compar)(const void*,const void*));
Apparently, I could just pass my internal array to qsort. I could just call that:
qsort (a->array, a->length, a->elm_size, comparator_callback);
But there's a catch - qsort's comparator signature reads as:
int (*compar)(const void*,const void*)
While memcmp's signature is:
int memcmp ( const void * ptr1, const void * ptr2, size_t type_size );
The element size is missing in qsort's callback, meaning I can no longer have a generic comparator function when NULL is passed as callback. I could manually generate comparators up to X bytes of element size, but that sounds ugly.
Can I use qsort (or another built-in sort) along with memcmp? Or do I have to choose between the built-in comparator and the built-in sorting function?
C11 provides you with an (admittedly optional, Annex K) qsort_s function, which is intended to deal with this specific situation. It allows you to pass a user-provided void * value, a context pointer, through from the calling code to the comparator function. The comparator callback in this case has the following signature
int (*compar)(const void *x, const void *y, void *context)
In the simplest case you can pass a pointer to the size value as context
#define __STDC_WANT_LIB_EXT1__ 1
#include <stdlib.h>
...
int comparator_callback(const void *x, const void *y, void *context)
{
size_t elm_size = *(const size_t *) context;
return memcmp(x, y, elm_size);
}
...
qsort_s(a->array, a->length, a->elm_size, comparator_callback, &a->elm_size);
Or it might make sense to pass a pointer to your entire array object as context.
Some *nix-based implementations have been providing a similar qsort_r function for a while, although it is non-standard.
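Since qsort_s is optional and the argument order of qsort_r differs between platforms, the context-pointer idea can be illustrated portably with a hand-written sort. The sketch below is not qsort_s; sort_with_context, ctx_compare_fn and memcmp_with_size are hypothetical names, the algorithm is a simple insertion sort, and the 64-byte element limit is an arbitrary assumption made to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*ctx_compare_fn)(const void *x, const void *y, void *context);

/* Hypothetical stand-in for qsort_s/qsort_r: an insertion sort that forwards
 * `context` untouched to every comparison. O(n^2), for illustration only. */
static void sort_with_context(void *base, size_t count, size_t elm_size,
                              ctx_compare_fn compar, void *context)
{
    unsigned char *bytes = base;
    unsigned char tmp[64];              /* sketch limit on element size */
    assert(base != NULL && elm_size > 0 && elm_size <= sizeof tmp);

    for (size_t i = 1; i < count; i++) {
        /* Lift element i out, shift larger elements right, drop it back in. */
        memcpy(tmp, bytes + i * elm_size, elm_size);
        size_t j = i;
        while (j > 0 && compar(bytes + (j - 1) * elm_size, tmp, context) > 0) {
            memcpy(bytes + j * elm_size, bytes + (j - 1) * elm_size, elm_size);
            j--;
        }
        memcpy(bytes + j * elm_size, tmp, elm_size);
    }
}

/* Default comparator: the element size arrives through the context pointer. */
static int memcmp_with_size(const void *x, const void *y, void *context)
{
    size_t elm_size = *(const size_t *)context;
    return memcmp(x, y, elm_size);
}
```

Because the size travels with each call instead of living in a global, two threads can sort arrays of different element sizes at the same time without interfering.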
A non-thread-safe way is to use a private global variable to pass the size.
static size_t compareSize = 0;
int defaultComparator(const void *p1, const void *p2) {
return memcmp(p1, p2, compareSize);
}
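void array_sort(Array* a, int(*comparator)(const void*, const void*)) {
if(comparator == NULL) {
// Not thread-safe: the size reaches defaultComparator through the global
compareSize = a->elm_size;
comparator = &defaultComparator;
}
// The comparator now has the two-argument type that qsort expects
qsort(a->array, a->length, a->elm_size, comparator);
}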
You can make it thread-safe by making compareSize a thread-local variable (__thread in GCC, or _Thread_local in C11).
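A compact sketch of the thread-local variant, assuming a C11 compiler that supports _Thread_local. sort_bytes is a hypothetical helper that takes the raw buffer directly rather than the Array struct from the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One copy per thread, so concurrent sorts do not clobber each other's size.
 * _Thread_local is C11; GCC and Clang also accept __thread. */
static _Thread_local size_t compare_size = 0;

static int default_comparator(const void *p1, const void *p2)
{
    return memcmp(p1, p2, compare_size);
}

/* Hypothetical helper: sorts `count` elements of `elm_size` bytes each,
 * ordering them by their raw bytes. */
static void sort_bytes(void *array, size_t count, size_t elm_size)
{
    assert(array != NULL && elm_size > 0);
    compare_size = elm_size;   /* read back by default_comparator */
    qsort(array, count, elm_size, default_comparator);
}
```

This is still not reentrant: if a comparison ever triggered another sort_bytes call with a different size on the same thread, the outer sort would see the wrong size.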
The qsort() API is a legacy of simpler times. There should be an extra "environment" pointer passed unaltered from the qsort() call to each comparison. That would allow you to pass the object size and any other necessary context in a thread safe manner.
But it's not there. @BryanChen's method is the only reasonable one.
The main reason I'm writing this answer is to point out that there are very few cases where memcmp will do something useful. There are not many kinds of objects where comparison by lexicographic order of constituent unsigned chars makes any sense.
Certainly comparing structs that way is dangerous because padding byte values are unspecified. Even the equality part of the comparison can fail. In other words,
struct foo { int i; };
void bar(void) {
struct foo a, b;
a.i = b.i = 0;
if (memcmp(&a, &b, sizeof a) == 0) printf("equal!");
}
may - by the C standard - print nothing!
Another example: for something as simple as unsigned ints, you'll get different sort orders for big- vs. little-endian storage order.
unsigned a = 0x0102;
unsigned b = 0x0201;
printf("%s", memcmp(&a, &b, sizeof a) < 0 ? "Less!" : "More!");
will print Less or More depending on the machine where it's running.
Indeed the only object type I can imagine that makes sense to compare with memcmp is equal-sized blocks of unsigned bytes. This isn't a very common use case for sorting.
In all, a library that offers memcmp as a comparison function is doomed to be error prone. Someone will try to use it as a substitute for a specialized comparison that's really the only way to obtain the desired result.
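The byte-order point can be made reproducible on any machine by writing the bytes out explicitly instead of relying on the host's layout. In this sketch all function names are hypothetical; it only shows that memcmp agrees with numeric order when the most significant byte is stored first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store v most-significant byte first (big-endian layout). */
static void store_be32(unsigned char out[4], uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Store v least-significant byte first (little-endian layout). */
static void store_le32(unsigned char out[4], uint32_t v)
{
    assert(out != NULL);
    out[0] = (unsigned char)v;
    out[1] = (unsigned char)(v >> 8);
    out[2] = (unsigned char)(v >> 16);
    out[3] = (unsigned char)(v >> 24);
}

/* Reduce a memcmp result to -1, 0 or 1. */
static int sign(int x)
{
    return (x > 0) - (x < 0);
}

/* Order of a and b as memcmp sees them in each layout. */
static int memcmp_order_be(uint32_t a, uint32_t b)
{
    unsigned char x[4], y[4];
    store_be32(x, a);
    store_be32(y, b);
    return sign(memcmp(x, y, sizeof x));
}

static int memcmp_order_le(uint32_t a, uint32_t b)
{
    unsigned char x[4], y[4];
    store_le32(x, a);
    store_le32(y, b);
    return sign(memcmp(x, y, sizeof x));
}
```

For a = 0x0102 and b = 0x0201, memcmp_order_be returns -1 (matching a < b), while memcmp_order_le returns 1, which is the reversed order a little-endian machine would produce.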

const array const {}

So you can do this:
void foo(const int * const pIntArray, const unsigned int size);
This says that the pointer coming in is read-only and that the integers it points to are read-only.
You can access this inside the function like so:
blah = pIntArray[0]
You can also do the following declaration:
void foo(const int intArray[], const unsigned int size);
It is pretty much the same but you could do this:
intArray = &intArray[1];
Can I write:
void foo(const int const intArray[], const unsigned int size);
Is that correct?
No, your last variant is not correct. What you are trying to do is achieved in C99 by the following new syntax
void foo(const int intArray[const], const unsigned int size);
which is equivalent to
void foo(const int *const intArray, const unsigned int size);
That [const] syntax is specific to C99. It is not valid in C89/90.
Keep in mind that some people consider top-level cv-qualifiers on function parameters "useless", since they qualify a copy of the actual argument. I don't consider them useless at all, but personally I don't encounter too many reasons to use them in real life.
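A short sketch of the C99 form in use. sum_ints is a hypothetical function; the two commented-out statements show what the qualifiers are meant to reject. Note that this syntax is C only and is not accepted by C++ compilers.

```c
#include <assert.h>
#include <stddef.h>

/* C99: `const` inside the brackets qualifies the pointer parameter itself,
 * so this parameter has the same type as `const int *const intArray`. */
static long sum_ints(const int intArray[const], const unsigned int size)
{
    assert(intArray != NULL);
    long total = 0;
    for (unsigned int i = 0; i < size; i++)
        total += intArray[i];
    /* intArray = &intArray[1];   rejected: the pointer itself is const */
    /* intArray[0] = 0;           rejected: the pointed-to ints are const */
    return total;
}
```

For an array holding 1, 2, 3, 4, sum_ints returns 10. The caller's array does not need to be const; the qualifiers only restrict what the function body may do.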
Use cdecl. It gives an error on the second entry. The first one makes it clear that the second const applies to the pointer (the *).
In C/C++, you cannot pass an entire array by value as an argument to a function. You can, however, pass a pointer to the array's first element by specifying the array's name without an index.
For example, this program fragment passes the address of i to func1():
int main(void)
{
int i[10];
func1(i);
.
.
.
}
To receive i, a function called func1() can be defined as
void func1(int x[]) /* unsized array */
{
.
.
}
or
void func1(int *x) /* pointer */
{
.
.
}
or
void func1(int x[10]) /* sized array */
{
.
.
}
source : THE COMPLETE REFERENCE - HERBERT.

pointer from integer w/o cast warning when calling lfind

I'm writing a vector in C. The CVectorSearch function uses bsearch if it's sorted, and lfind if it's unsorted. Why am I getting the warning "assignment makes pointer from integer without a cast" when I'm calling lfind? It seems to work fine even when lfind is being used.
typedef struct
{
void *elements;
int logicalLength;
int allocatedLength;
int elementSize;
} CVector;
typedef void (*CVectorFreeElemFn)(void *elemAddr);
int CVectorSearch(const CVector *v, const void *key,
CVectorCmpElemFn comparefn,
int startIndex, bool isSorted)
{
void * found;
int elemSize = v->elementSize;
int length = v->logicalLength;
void *startAddress = (char*)v->elements + startIndex*elemSize;
if(isSorted)
found = bsearch(key, startAddress, length, elemSize, comparefn);
else
found = lfind(key, startAddress, &length, elemSize, comparefn);
if(found)
return ((char*)found - (char*)v->elements) / elemSize;
else
return -1;
}
edit: Now that I've included search.h I'm getting:
warning: passing argument 3 of 'lfind' from incompatible pointer type
The program is still working correctly, though.
Have you included <search.h>, which declares lfind? If a function is called without a prototype in scope, your compiler may assume it returns int.
The third argument to lfind() is a pointer to size_t not int as you are passing. The size_t type may be of a different size than int on some architectures (particularly x86-64) and it is also unsigned. You have to change the type of the length variable.
I don't think the above answers really solve the issue, as I had this problem too. The real answer, I believe, is the difference between the bsearch prototype and the lfind prototype. Let's take a look:
void *bsearch(const void *key, const void *base, size_t nmemb,
size_t size, int (*compar)(const void *, const void *));
Versus
void *lfind(const void *key, const void *base, size_t *nmemb,
size_t size, int(*compar)(const void *, const void *));
Notice that the third parameter of lfind is a pointer to a size_t, not (as in bsearch) a size_t passed by value.
Just make sure you pass the address of a size_t variable holding the element count and it'll be fine.
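A minimal sketch of a correct lfind call, assuming a POSIX system where <search.h> declares lfind with the prototype quoted above. find_int and compare_ints are hypothetical names, and the example works on a plain int array rather than the CVector from the question.

```c
#include <assert.h>
#include <search.h>   /* lfind: POSIX, not ISO C */
#include <stddef.h>

static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: index of the first match in an unsorted array, or -1. */
static long find_int(const int *elements, size_t length, int key)
{
    assert(elements != NULL);
    size_t count = length;   /* lfind wants a size_t *, not an int * */
    const int *found = lfind(&key, elements, &count,
                             sizeof elements[0], compare_ints);
    return found ? (long)(found - elements) : -1;
}
```

Keeping the count in a size_t variable removes the incompatible-pointer warning and avoids lfind reading more bytes than an int holds on platforms where size_t is wider.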

Problem compiling K&R example

I'm having trouble compiling the example program presented in section 5.11 of the book. I have removed most of the code and left only the relevant stuff.
#define MAXLINES 5000
char *lineptr[MAXLINES];
void qsort1(void *lineptr[], int left, int right, int (*comp)(void *, void *));
int numcmp(char *, char *);
main(int argc, char *argv[]) {
int numeric = 1;
/* ... */
qsort1((void**) lineptr, 0, 100, (int (*)(void*, void*))(numeric ? numcmp : strcmp));
}
void qsort1(void *v[], int left, int right, int (*comp)(void *, void *)) {
/* ... */
}
int numcmp(char *s1, char *s2) {
/* ... */
}
The problem is that the code doesn't compile (I'm using Digital Mars compiler). The error I get is this:
qsort1((void**) lineptr, 0, nlines - 1, (int (*)(void*, void*))(numeric
? numcmp : strcmp));
^
go.c(19) : Error: need explicit cast to convert
from: int (*C func)(char const *,char const *)
to : int (*C func)(char *,char *)
--- errorlevel 1
There must be something wrong with the declarations although I pasted the code from the book correctly. I don't know enough to make the right changes (the section about the function pointers could certainly have been written more extensively).
EDIT: I should have mentioned that I'm reading the ANSI version of the book.
I think the error comes from the fact that old C did not know const yet: strcmp there took two pointers to non-const characters (char *), I think, which could be the reason why it compiled back then but not with your compiler. Nowadays, however, strcmp takes char const * (const char * is the same thing). Change your function prototype to this:
int numcmp(char const*, char const*);
That's a common problem :)
The following line tells qsort to expect a pointer to a function with two void* parameters. Unfortunately, strcmp takes two non-modifiable strings, hence its signature is
int (*comp)(const char*, const char*)
instead of what you have:
int (*comp)(void *, void *)
Change the signature of both qsort1 and numcmp:
qsort1(void *v[], int left, int right, int (*comp)(const void *, const void *))
and:
int numcmp(const char*, const char*)
The standard function pointer expected by qsort() or bsearch() has the prototype:
int comparator(const void *v1, const void *v2);
The qsort1() defined in the code expects:
int comparator(void *v1, void *v2);
The comparator functions defined in the code do not have that prototype, and there is no automatic conversion between different function pointer types.
So, fixes for qsort1() are either:
Introduce a cast: (int (*)(void *, void *)), or
Rewrite the comparators:
int numcmp(void *v1, void *v2)
{
char *s1 = v1;
char *s2 = v2;
...
}
int str_cmp(void *v1, void *v2) // Note new function name!
{
return(strcmp(v1, v2));
}
Obviously, the call to qsort1() would reference str_cmp instead of strcmp. The authors sought to avoid an intermediate function, but ran afoul of the (legitimately) fussier compilers in use nowadays.
The standard version of qsort() would require a bunch of const qualifiers, as in the first version of this answer.
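Putting the const-qualified variant together, here is a sketch in the style of the book's routine. The names cmp_fn, str_cmp, num_cmp and sort_lines are hypothetical, strtod is used for the numeric comparison as an assumption, and the lines are stored in a void * array so that no pointer casts are needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int (*cmp_fn)(const void *, const void *);

/* Wrappers with exactly the type qsort1 expects; no function-pointer casts. */
static int str_cmp(const void *v1, const void *v2)
{
    return strcmp(v1, v2);
}

static int num_cmp(const void *v1, const void *v2)
{
    double d1 = strtod(v1, NULL);
    double d2 = strtod(v2, NULL);
    return (d1 > d2) - (d1 < d2);
}

static void swap(void *v[], int i, int j)
{
    void *temp = v[i];
    v[i] = v[j];
    v[j] = temp;
}

/* Quicksort over v[left..right]; the comparator receives the elements
 * themselves, not pointers to them as with the standard qsort. */
static void qsort1(void *v[], int left, int right, cmp_fn comp)
{
    if (left >= right)
        return;
    swap(v, left, (left + right) / 2);
    int last = left;
    for (int i = left + 1; i <= right; i++)
        if (comp(v[i], v[left]) < 0)
            swap(v, ++last, i);
    swap(v, left, last);
    qsort1(v, left, last - 1, comp);
    qsort1(v, last + 1, right, comp);
}

/* Hypothetical entry point: both operands of ?: now share one type. */
static void sort_lines(void *lines[], int nlines, int numeric)
{
    assert(lines != NULL);
    qsort1(lines, 0, nlines - 1, numeric ? num_cmp : str_cmp);
}
```

Sorting the lines "10", "9", "100" numerically gives 9, 10, 100; sorting them as strings gives 10, 100, 9. Because both wrappers have the same type, the conditional expression compiles without any cast.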
Note that strcmp takes two const arguments, whereas your numcmp does not. Therefore, the two functions' types do not match, and the ? : operator will complain.
Do one of:
change numcmp to match the strcmp prototype in terms of constness
push the (int (*)(void*, void*)) cast inside the ? :, e.g.
numeric ? (int (*)(void*, void*))numcmp : (int (*)(void*, void*))strcmp
It's been a while since I have done any pure C programming, so I'm not certain about the new standard.
However, casting to void ** creates a pointer to a pointer where the function requires a pointer to an array. Sure, they are the same thing internally, but strong type checking will catch that as an error.
Rewrite the qsort to accept ** instead of *[] and you should be OK.
