What is "-1L" / "1L" in C? - c

What do "-1L", "1L" etc. mean in C ?
For example, in ftell reference, it says
... If an error occurs, -1L is returned ...
What does this mean? What is the type of "1L"?
Why not return NULL if an error occurs?

The L specifies that the number is a long type, so -1L is a long set to negative one, and 1L is a long set to positive one.
As for why ftell doesn't just return NULL, it's because NULL is used for pointers, and here a long is returned. Note that 0 isn't used because 0 is a valid value for ftell to return.
Catching this situation involves checking for a non-negative value:
long size;
FILE *pFile;
...
size = ftell(pFile);
if (size > -1L) {
    // size is a valid value
} else {
    // error occurred
}

ftell() returns type long int; the L suffix applied to a literal forces its type to long rather than plain int.
NULL would be wholly incorrect because it is a macro representing a pointer, not an integer. Its value, when interpreted as an integer, may represent a valid file position, while -1 (or any negative value) cannot.
For all intents and purposes you can generally regard the error return simply as -1; the L suffix is not critical to correct operation in most cases due to the implicit conversion rules.
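As a concrete sketch of the pattern above (file_size is a hypothetical helper, not a standard function):

```c
#include <stdio.h>

/* Hypothetical helper: return a file's size in bytes via fseek/ftell,
   or -1L on any error, following the convention described above. */
long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1L;
    if (fseek(f, 0L, SEEK_END) != 0) {
        fclose(f);
        return -1L;
    }
    long size = ftell(f);   /* -1L on error, otherwise the byte offset */
    fclose(f);
    return size;
}
```

Since 0 is the valid position at the start of a stream, only -1L (or another negative value) can serve as the error marker here.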

It means to return the value as a long, not an int.

That means -1 as a long (rather than the default type for numbers, which is an integer)

-1 formatted as a long int is -1L. Why not simply NULL? Because in this function NULL would be a normal result and so cannot signal an error as well. Why would NULL be a normal result? Because NULL == 0, and ftell returns the position in a stream: when you are at the start of the stream, the function returns 0, which is a normal result, not an error. If you compared the return value against NULL to check for an error, you would get a false error whenever you were at the start position in the stream.

Editing today implies more details are still wanted.
Mark has it right. The "L" suffix is long. -1L is thus a long -1.
My favored way to test is different from Mark's and is a matter of preference, not goodness.
if (err >= 0L) {
    // success
} else {
    // error
}
By general habit I do not like looking for explicit -1. If a -2 ever pops up in the future my code will likely not break.
Ever since I started using C, way back in the beginning of C, I noticed most library routines returning int values return 0 for success and -1 on error. Most.
NULL is not normally returned by integer functions as NULL is a pointer value. Besides the clash of types a huge reason for not returning NULL depends on a bit of history.
Things were not clean back when C was being invented, and maybe not even on small systems today. The original K&R C did not guarantee NULL would be zero as is usually the case on CPUs with virtual memory. On small "real memory" systems zero may be a valid address making it necessary for "invalid" addresses to be moved to some other OS dependent location. Such would really be accepted by the CPU, just not generated in the normal scheme of things. Perhaps a very high memory address. I can even see a hidden array called extern const long NULL[1]; allowing NULL to become the address of this otherwise unused array.
Back then you saw a lot of if ( ptr != NULL ) statements rather than if ( ptr ) for people serious about writing portable code.

Related

Assign result of sizeof() to ssize_t

It happened to me that I needed to compare the result of sizeof(x) to an ssize_t.
Of course GCC gave an error (lucky me (I used -Wall -Wextra -Werror)), and I decided to do a macro to have a signed version of sizeof().
#define ssizeof (ssize_t)sizeof
And then I can use it like this:
for (ssize_t i = 0; i < ssizeof(x); i++)
The problem is, do I have any guarantees that SSIZE_MAX >= SIZE_MAX? I imagine that sadly this is never going to be true.
Or at least that sizeof(ssize_t) == sizeof(size_t), which would cut half of the values but would still be close enough.
I didn't find any relation between ssize_t and size_t in the POSIX documentation.
Related question:
What type should be used to loop through an array?
There is no guarantee that SSIZE_MAX >= SIZE_MAX. In fact, it is very unlikely to be the case, since size_t and ssize_t are likely to be corresponding unsigned and signed types, so (on all actual architectures) SIZE_MAX > SSIZE_MAX. Converting an unsigned value to a signed type which cannot hold that value is implementation-defined (and may even raise an implementation-defined signal). So technically, your macro is problematic.
In practice, at least on 64-bit platforms, you're unlikely to get into trouble if the value you are converting to ssize_t is the size of an object which actually exists. But if the object is theoretical (eg sizeof(char[3][1ULL<<62])), you might get an unpleasant surprise.
Note that the only valid negative value of type ssize_t is -1, which is an error indication. You might be confusing ssize_t, which is defined by Posix, with ptrdiff_t, which is defined in standard C since C99. These two types are the same on most platforms, and are usually the signed integer type corresponding to size_t, but none of those behaviours is guaranteed by either standard. However, the semantics of the two types are different, and you should be aware of that when you use them:
ssize_t is returned by a number of Posix interfaces in order to allow the function to signal either a number of bytes processed or an error indication; the error indication must be -1. There is no expectation that any possible size will fit into ssize_t; the Posix rationale states that:
A conforming application would be constrained not to perform I/O in pieces larger than {SSIZE_MAX}.
This is not a problem for most of the interfaces which return ssize_t because Posix generally does not require interfaces to guarantee to process all data. For example, both read and write accept a size_t which describes the length of the buffer to be read/written and return an ssize_t which describes the number of bytes actually read/written; the implication is that no more than SSIZE_MAX bytes will be read/written even if more data were available. However, the Posix rationale also notes that a particular implementation may provide an extension which allows larger blocks to be processed ("a conforming application using extensions would be able to use the full range if the implementation provided an extended range"), the idea being that the implementation could, for example, specify that return values other than -1 were to be interpreted by casting them to size_t. Such an extension would not be portable; in practices, most implementations do limit the number of bytes which can be processed in a single call to the number which can be reported in ssize_t.
ptrdiff_t is (in standard C) the type of the result of the difference between two pointers. In order for subtraction of pointers to be well defined, the two pointers must refer to the same object, either by pointing into the object or by pointing at the byte immediately following the object. The C committee recognised that if ptrdiff_t is the signed equivalent of size_t, then it is possible that the difference between two pointers might not be representable, leading to undefined behaviour, but they preferred that to requiring that ptrdiff_t be a larger type than size_t. You can argue with this decision -- many people have -- but it has been in place since C90 and it seems unlikely that it will change now. (Current standard wording, §6.5.6/9: "If the result is not representable in an object of that type [ptrdiff_t], the behavior is undefined.")
As with Posix, the C standard does not forbid an implementation from defining behaviour which the standard itself leaves undefined, so it would be a mistake to interpret that as forbidding the subtraction of two pointers in very large objects. An implementation is always allowed to define the result of behaviour left undefined by the standard, so it is completely valid for an implementation to specify that if P and Q are two pointers to the same object where P >= Q, then (size_t)(P - Q) is the mathematically correct difference between the pointers even if the subtraction overflows. Of course, code which depends on such an extension won't be fully portable, but if the extension is sufficiently common that might not be a problem.
As a final point, the ambiguity of using -1 both as an error indication (in ssize_t) and as a possibly castable result of pointer subtraction (in ptrdiff_t) is not likely to be present in practice provided that size_t is as large as a pointer. If size_t is as large as a pointer, the only way that the mathematically correct value of P-Q could be (size_t)(-1) (aka SIZE_MAX) is if the object that P and Q refer to is of size SIZE_MAX, which, given the assumption that size_t is the same width as a pointer, implies that the object plus the following byte occupy every possible pointer value. That contradicts the requirement that some pointer value (NULL) be distinct from any valid address, so we can conclude that the true maximum size of an object must be less than SIZE_MAX.
Please note that you can't actually do this.
The largest possible object in x86 Linux is just below 0xB0000000 in size, while SSIZE_T_MAX is 0x7FFFFFFF.
I haven't checked whether read and friends can actually handle the largest possible objects, but if they can, it would work like this:
ssize_t result = read(fd, buf, count);
if (result != -1) {
    size_t offset = (size_t) result;
    /* handle success */
} else {
    /* handle failure */
}
You may find libc is busted. If so, this would work if the kernel is good:
ssize_t result = sys_read(fd, buf, count);
if (result >= 0 || result < -256) {
    size_t offset = (size_t) result;
    /* handle success */
} else {
    errno = (int)-result;
    /* handle failure */
}
ssize_t is a POSIX type; it's not defined as part of the C standard. POSIX defines that ssize_t must be able to handle numbers in the interval [-1, SSIZE_MAX], so in principle it doesn't even need to be a normal signed type. The reason for this slightly weird definition is that the only place ssize_t is used is as the return value for read/write/etc. functions.
In practice it's always a normal signed type of the same size as size_t. But if you want to be really pedantic about your types, you shouldn't use it for other purposes than handling return values for IO syscalls. For a general "pointer-sized" signed integer type C89 defines ptrdiff_t. Which in practice will be the same as ssize_t.
Also, if you look at the official spec for read(), you'll see that for the 'nbyte' argument it says that 'If the value of nbyte is greater than {SSIZE_MAX}, the result is implementation-defined.'. So even if a size_t is capable of representing larger values than SSIZE_MAX, it's implementation-defined behavior to use larger values than that for the IO syscalls (the only places where ssize_t is used, as mentioned). And similar for write() etc.
I'm gonna take this on as an X-Y problem. The issue you have is that you want to compare a signed number to an unsigned number. Rather than casting the result of sizeof to ssize_t, you should check whether your ssize_t value is less than zero. If it is, then you know it is less than your size_t value. If not, then you can cast it to size_t and do the comparison.
For an example, here's a compare function that returns -1 if the signed number is less than the unsigned number, 0 if equal, or 1 if the signed number is greater than the unsigned number:
int compare(ssize_t signed_number, size_t unsigned_number) {
    int ret;
    if (signed_number < 0 || (size_t) signed_number < unsigned_number) {
        ret = -1;
    }
    else {
        ret = (size_t) signed_number > unsigned_number;
    }
    return ret;
}
If all you wanted was the equivalent of < operation, you can go a bit simpler with something like this:
(signed_number < 0 || (size_t) signed_number < unsigned_number)
That line will give you 1 if signed_number is less than unsigned_number and it limits the branching overhead. Just takes an extra < operation and a logical-OR.
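For illustration, that expression can be wrapped in a tiny helper and exercised (sless is a hypothetical name; ssize_t comes from the POSIX header <sys/types.h>):

```c
#include <sys/types.h>   /* ssize_t is POSIX, not standard C */

/* 1 if the signed value is strictly less than the unsigned value,
   without ever converting a negative signed operand to unsigned */
static int sless(ssize_t signed_number, size_t unsigned_number)
{
    return signed_number < 0 || (size_t)signed_number < unsigned_number;
}
```

For example, sless(-1, sizeof buf) is 1 for any buffer, because the negative case is handled before the cast ever happens.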

Valgrind C: Argument of function has a fishy (possibly negative) value

I'm getting this error message in multiple places in my code where I call malloc or realloc. Here is one example.
void* reallocate_array(void* ptr, size_t size)
{
    return realloc(ptr, size);
}
EDIT2: Looks like the problem is in the test case. I can't modify this
char* reallocated = (char*) reallocate_array(allocated, -1);
Here is my solution which got rid of the fishy value error
if ((int)size < 0)
{
    return NULL;
}
I was under the impression that size_t was an unsigned integer, meaning it could never be negative. Could this be bug in Valgrind or is it warning me of a possible wraparound?
EDIT: Valgrind output
==20841== 1 errors in context 1 of 3:
==20841== Argument 'size' of function realloc has a fishy (possibly negative) value: -1
==20841== at 0x4C2BB78: realloc (vg_replace_malloc.c:785)
==20841== by 0x4057B1: reallocate_array (allocation.c:24)
==20841== by 0x402A8A: reallocate_NegativeBytes_Test::TestBody() (tests.cpp:56)
Props to Valgrind: it is quite right that passing a negative actual argument to a parameter of unsigned type is fishy. The result in your particular case will be that the argument is converted to the largest representable value of type size_t, but that may very well be different from what was intended.
I suspect that the conversion to a large, positive unsigned value is indeed different from what was intended by your test case. Inasmuch as the test case expects the memory allocation to fail, the case probably was passing, but not for the reason I suspect its author anticipated. At minimum, it is a bad test case on account of being unclear about what it is intended to test.
As for your solution, it is fishy, too. The standard has this to say about your conversion of (size_t) -1 to type int:
Otherwise, the new type is signed and the value cannot be represented
in it; either the result is implementation-defined or an
implementation-defined signal is raised.
(C2011 6.3.1.3/3)
Implementation-defined behavior and the possibility of a signal is not a comfortable place to hang your hat.
If you insist on validating the value inside the function, then you might consider this test:
if (size & ~(SIZE_MAX >> 1)) {
    // ...
}
That tests whether the most-significant bit of size is set, which it will be if the value was converted from any negative number of a type no wider than size_t.
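Folded into the wrapper from the question, that test might look like this (a sketch; SIZE_MAX comes from <stdint.h>):

```c
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>

/* realloc wrapper that rejects any size whose most-significant bit is
   set: that is what a negative argument converted to size_t looks like */
void *reallocate_array(void *ptr, size_t size)
{
    if (size & ~(SIZE_MAX >> 1))
        return NULL;             /* implausibly large: likely a converted negative */
    return realloc(ptr, size);
}
```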
Myself, however, I would try to get the test case changed or dropped. Use Valgrind's complaints about it to support your argument, if you wish.

unistd.h library's getopt() function

I'm currently trying to make my own version of getopt() function.
But I do not know how it returns a character type as an int type.
Is there any way I can have a look into the source code of the getopt() function?
The source code of getopt() in glibc is here: https://github.com/lattera/glibc/blob/master/posix/getopt.c
Of course there are more implementations you might look at, but this is probably the most popular one. Here's another, from FreeBSD: https://github.com/lattera/freebsd/blob/master/lib/libc/stdlib/getopt.c
The return value of the getopt(3) function is int to allow for an extra value (apart from all the possible chars it returns) to mark the end-of-options condition. This extra value is EOF (as with the getchar(3) function), which must be different from any possible char value.
To deal with this, and with the fact that different C compilers may implement char as either signed or unsigned, both functions return the character value as an unsigned byte in the range 0 to 255 (mapping all the negative values to positive, for example by adding the constant 256 so the negatives fall in the range 128..255; this is only an example, as the language doesn't specify exactly how it is done), and reserve EOF as the value -1.
If you are writing a getopt(3) function to be integrated into your system's standard C library, just check what value is used for EOF (most probably -1) and then implement it so the values returned for your default char type don't conflict or overlap with it.
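The widening itself is a one-liner; here is a sketch of the convention (next_option_char is a made-up helper, not part of any real getopt implementation):

```c
/* Return the next character of s as an int in 0..255, or -1 (a value
   no unsigned char can take) once the string is exhausted. */
static int next_option_char(const char *s)
{
    if (*s == '\0')
        return -1;                 /* end marker, like getopt's -1/EOF */
    return (unsigned char)*s;      /* even a 0xFF byte comes back as 255 */
}
```

The cast through unsigned char is what keeps the returned character values non-negative, so they can never collide with the -1 end marker.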

Discerning whether a value of 255 is genuine data or an error output

I'm using a microcontroller to send unsigned 8-bit data to Matlab. Whenever there is any data loss, Matlab displays a value of 255. The underlying code of the Matlab program that interfaces with the WIN32 USB APIs shows that a value of -1 is returned for a range of errors. Since the data is of the unsigned 8-bit type, a value of -1 would be interpreted as 255, which explains why the latter number is displayed when a transmission error has occurred.
So, how could one tell whether a value of 255 represents genuine data or an error output?
Thanks and cheers!
(This is only a partial answer.)
This sounds similar to the way C's standard character input is done.
The fgetc() function returns an int result, which is either the value EOF (typically -1), if there was an error or there's no more data to read, or the value of the character that was successfully read, treated as an unsigned char and converted to int.
If you store the value returned by fgetc() in a signed char object (note that plain char may be either signed or unsigned), a value of -1 could indicate either that fgetc() returned EOF, or that it successfully read a byte with the value 0xFF. That's the problem with this kind of in-band signalling; it can be difficult to distinguish between an error indication and valid data that happens to look like an error indication.
With fgetc(), there are two ways to resolve this. You can store the result in an int, which means you'll get distinct values for EOF (-1) and for 0xFF (255). Or you can call the feof() and ferror() functions after calling fgetc(); if either returns a true value, you know that the EOF indicated an actual error or end-of-file condition.
You haven't told us enough about the interface between your microcontroller and Matlab to know how you can make this distinction. If there's some other function you can call, something similar to feof() or ferror(), you could call it when you get a -1 or 255 result to determine what that result means. Or, if possible, you might consider modifying the interface you're using so it returns a result bigger than one byte, so that the error indication -1 is distinct from all possible valid data values.
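A sketch of the fgetc()-style separation described above (read_byte is a hypothetical helper; the extra -2 code distinguishes a hard error from end-of-file):

```c
#include <stdio.h>

/* Return the next byte as 0..255, -1 at end-of-file, or -2 on a read
   error; keeping the result in an int means 0xFF never looks like EOF. */
int read_byte(FILE *fp)
{
    int c = fgetc(fp);             /* int, not char */
    if (c == EOF)
        return ferror(fp) ? -2 : -1;
    return c;                      /* a genuine 0xFF byte arrives as 255 */
}
```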
Well, if the function is supposed to return -1 upon failure, there is no way that reasonable output would return 255. If the function can return -1, it's using a signed 8-bit return, not an unsigned one, which means its return range should be -128 to 127. 255 would never be genuine data.

C/GL: Using -1 as sentinel on array of unsigned integers

I am passing an array of vertex indices in some GL code... each element is a GLushort
I want to terminate with a sentinel so as to avoid having to laboriously pass the array length each time alongside the array itself.
#define SENTINEL ( (GLushort) -1 ) // edit thanks to answers below
:
GLushort verts[] = {0, 0, 2, 1, 0, 0, SENTINEL};
I cannot use 0 to terminate as some of the elements have value 0
Can I use -1?
To my understanding this would wrap to the maximum integer GLushort can represent, which would be ideal.
But is this behaviour guaranteed in C?
(I cannot find a MAX_INT equivalent constant for this type, otherwise I would be using that)
If GLushort is indeed an unsigned type, then (GLushort)-1 is the maximum value for GLushort. The C standard guarantees that. So, you can safely use -1.
For example, C89 didn't have SIZE_MAX macro for the maximum value for size_t. It could be portably defined by the user as #define SIZE_MAX ((size_t)-1).
Whether this works as a sentinel value in your code depends on whether (GLushort)-1 is a valid, non-sentinel value in your code.
GLushort is an UNSIGNED_SHORT type, which is typedef'd to unsigned short and which, although C does not guarantee it, OpenGL assumes to have the range 0 to 2^16-1 (Chapter 4.3 of the specification). On practically every mainstream architecture this somewhat dangerous assumption holds true, too (I'm not aware of one where unsigned short has a different size).
As such, you can use -1, but it is awkward because you will have a lot of casts, and if you forget a cast, for example in an if() statement, you can be lucky and get a compiler warning about "comparison can never be true", or you can be unlucky and the compiler will silently optimize the branch out, after which you spend days searching for the reason why your seemingly perfect code executes wrong. Or worse yet, it all works fine in debug builds and only bombs in release builds.
Therefore, using 0xffff as jv42 has advised is much preferable; it avoids this pitfall altogether.
I would create a global constant of value:
const GLushort GLushort_SENTINEL = (GLushort)(-1);
I think this is perfectly elegant as long as signed integers are represented using 2's complement.
I don't remember if that's guaranteed by the C standard, but it is virtually guaranteed for most CPUs (in my experience).
Edit: Apparently this is guaranteed by the C standard...
If you want a named constant, you shouldn't use a const qualified variable as proposed in another answer. They are really not the same. Use either a macro (as others have said) or an enumeration type constant:
enum { GLushort_SENTINEL = -1 };
The standard guarantees that this always is an int (really another name for the constant -1) and that it always will convert to the max value of your unsigned type.
Edit: or you could have it
enum { GLushort_SENTINEL = (GLushort)-1 };
if you fear that on some architectures GLushort could be narrower than unsigned int.
