Can we change the size of size_t in C?

Can we change the size of size_t in C?

No. But why would you even want to do it?

size_t is not a macro. It is a typedef for a suitable unsigned integer type.
size_t is defined in <stddef.h> (and other headers).
On a typical platform it probably is typedef unsigned long long size_t; and you really should not even think about changing it. The Standard Library was compiled with size_t defined the way the implementation defines it. If you change it in your own code, you cannot change the already-compiled Standard Library to match, so you'll get all kinds of errors because your program uses a different size for size_t than the library does. You could no longer safely call malloc(), strncpy(), snprintf(), ...
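For illustration only, an implementation's <stddef.h> might contain something along these lines (the actual underlying type varies by platform and ABI):

/* Illustrative sketch -- the real definition is implementation-specific. */
typedef unsigned long size_t;      /* e.g. on LP64 Unix systems */
/* typedef unsigned int size_t; */ /* e.g. on typical 32-bit targets */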

If you want to fork Linux or NetBSD, then "Yes"
Although you can redefine macros, this one is almost certainly a typedef, not a macro.
If you are defining an environment then it's perfectly reasonable to specify size_t as you like. You will then be responsible for all the C99 standard functions for which conforming code expects size_t.
So, it depends on your situation. If you are developing an application for an existing platform, then the answer is no.
But if you are defining an original environment with one or more compilers, then the answer is yes, but you have your work cut out for you. You will need an implementation of every library routine whose API involves size_t, compiled together with the rest of your code using the new size_t typedef. So, if you fork NetBSD or Linux, perhaps for an embedded system, then go for it. Otherwise, you may well find it "not worth the effort".

Related

C99: Custom implementations of non-standard functions

In my project, I want to use a non-standard library function, which may not be defined on certain systems. In my case, it is strlcpy.
From man strcpy:
Some systems (the BSDs, Solaris, and others) provide the following function:
size_t strlcpy(char *dest, const char *src, size_t size);
...
My system does not implement strlcpy, so I rolled my own. All is well, until compiling on a system that already has strlcpy defined: error: conflicting types for strlcpy.
My question: how can I implement a function that may cause naming conflicts down the road? Can I use some directive like #ifdef some_macro(strlcpy), or am I simply left with renaming strlcpy to my_strlcpy?
Check whether you need to include your own version at all.
Example (the list of systems to exclude will be much longer in practice):
#if !defined(__FreeBSD__)
#include "mystring.h"
#endif
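The fallback implementation itself can live behind that guard. A minimal sketch, modelled on the usual BSD semantics, shown here under the my_strlcpy rename the question mentions (which sidesteps the conflicting-types error entirely):

#include <stddef.h>

/* Copies up to size-1 bytes from src to dst, always NUL-terminates
   when size > 0, and returns strlen(src) so the caller can detect
   truncation by comparing the result against size. */
size_t my_strlcpy(char *dst, const char *src, size_t size)
{
    const char *s = src;
    size_t n = size;

    if (n != 0) {
        while (--n != 0) {
            if ((*dst++ = *s++) == '\0')
                break;
        }
    }
    if (n == 0) {                /* dst was too small */
        if (size != 0)
            *dst = '\0';         /* NUL-terminate anyway */
        while (*s++)             /* walk to the end of src */
            ;
    }
    return (size_t)(s - src - 1);
}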

When to decide to use typedef'd data types or C's built-in standard data types

gcc 4.7.2
c89
Hello,
I am using the Apache Portable Runtime and looking at their typedef's
typedef short apr_int16_t;
typedef int apr_int32_t;
typedef size_t apr_size_t; /* This is basically the same, so what's the point? */
etc.
So what is the point of all this?
When should you decide to use C's built-in standard data types or typedef'd data types?
I just gave an example using the APR. However, I am also speaking generally. There is also the stdint.h header file that provides typedef'd data types.
Many thanks for any suggestions,
In my opinion, it is better to have custom defined data types for the native data types of the system, as it helps in clearly distinguishing the size of the types.
For example: a long may be 32 bit or 64 bit depending on the machine on which your code runs and the way it was built. But if your code specifically needs a 64 bit variable, then naming it uint_64_t or something similar will always help in associating the size clearly.
In such cases, the code be written as:
#ifdef _64BIT_
typedef unsigned long uint_64_t;
#else
typedef unsigned long long uint_64_t;
#endif
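For what it's worth, C99's <stdint.h> already provides fixed-width typedefs of exactly this kind, so where it is available you can use it instead of rolling your own. A small sketch:

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint64_t big = 1;       /* exactly 64 bits on any conforming platform */
    int32_t small = -42;    /* exactly 32 bits */

    big <<= 40;             /* safe: the type is guaranteed wide enough */
    printf("big = %" PRIu64 ", small = %" PRId32 "\n", big, small);
    return 0;
}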
But as suggested by Mehrdad, don't use it "just for kicks". : )
Great question.
So what is the point of all this?
It's meant to be for abstraction, but like anything else, it is sometimes misused/overused.
Sometimes it's necessary for backwards compatibility (e.g. typedef void VOID in Windows), sometimes it's necessary for proper abstraction (e.g. typedef unsigned int size_t), and sometimes it's completely pointless logically, but makes typing easier (e.g. typedef char const *LPCSTR in Windows).
When should you decide to use C's built-in standard data types or typedef'd data types?
If it makes something easier, or if it implements a proper abstraction barrier, use it.
What exactly that means is something you'll just have to learn over time.
But don't use it "just for kicks"!

atoi is a standard function. But itoa is not. Why?

Why this distinction? I ended up with terrible problems after assuming itoa to be in stdlib.h and finally linking a custom version of itoa with a different prototype, which produced some crazy errors.
So, why isn't itoa a standard function? What's wrong with it? And why is the standard partial towards its twin brother atoi?
No itoa has ever been standardised, so to add it to the standard you would need a compelling reason and a good interface.
Most itoa interfaces that I have seen either use a static buffer which has re-entrancy and lifetime issues, allocate a dynamic buffer that the caller needs to free or require the user to supply a buffer which makes the interface no better than sprintf.
An "itoa" function would have to return a string. Since strings aren't first-class objects, the caller would have to pass a buffer + length and the function would have to have some way to indicate whether it ran out of room or not. By the time you get that far, you've created something similar enough to sprintf that it's not worth duplicating the code/functionality. The "atoi" function exists because it's less complicated (and arguably safer) than a full "scanf" call. An "itoa" function wouldn't be different enough to be worth it.
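In portable C the usual replacement is exactly that snprintf call. A minimal sketch (int_to_str is a made-up helper name):

#include <stdio.h>

/* Portable stand-in for itoa: snprintf returns the number of
   characters that would have been written (excluding the NUL),
   so an undersized buffer is detectable by the caller. */
int int_to_str(int value, char *buf, size_t bufsize)
{
    int needed = snprintf(buf, bufsize, "%d", value);
    return (needed >= 0 && (size_t)needed < bufsize) ? 0 : -1;
}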
The itoa function isn't standard, probably because there is no consistent definition of it. Different compiler and library vendors have introduced subtly different versions of it, possibly as an invention to serve as a complement to atoi.
If some non-standard function is widely provided by vendors, the standard's job is to codify it: basically add a description of the existing function to the standard. This is possible if the function has more or less consistent argument conventions and behavior.
Because multiple flavors of itoa are already out there, such a function cannot be added into ISO C. Whatever behavior is described would be at odds with some implementations.
itoa has existed in forms such as:
void itoa(int n, char *s); /* Given in _The C Programming Language_, 1st ed. (K&R1) */
void itoa(int input, void (*subr)(char)); /* Ancient Unix library */
void itoa(int n, char *buf, int radix);
char *itoa(int in, char *buf, int radix);
Microsoft provides it in their Visual C Run Time Library under the altered name: _itoa.
Not only have C implementations historically provided it under differing definitions, C programs also provide a function named itoa for themselves, which is another source of possible clashes.
Basically, the itoa identifier is "radioactive" with regard to standardization as an external name or macro. If such a function is standardized, it will have to be under a different name.

C: Why isn't size_t a C keyword?

sizeof is a C keyword. It returns the size in a type named size_t. However, size_t is not a keyword, but is defined primarily in stddef.h and probably other C standard header files too.
Consider a scenario where you want to create a C program which does not include any C standard headers or libraries. (Like for example, if you are creating an OS kernel.) Now, in such code, sizeof can be used (it is a C keyword, so it is a part of the language), but the type that it returns (size_t) is not available!
Does not this signify some kind of a problem in the C standard specification? Can you clarify this?
It does not literally return a value of some distinct type, since size_t is not a concrete type in itself but rather a typedef for an unspecified built-in type. Typedef identifiers (such as size_t) are completely equivalent to their respective underlying types (and are converted thereto at compile time). If size_t is defined as unsigned int on your platform, then sizeof yields an unsigned int when compiled on your system. size_t is just a handy way of maintaining portability, and stddef.h only needs to be included if you use size_t explicitly by name.
sizeof is a keyword because, despite its name and usage, it is an operator like + or = or < rather than a function like printf() or atoi() or fgets(). A lot of people forget (or just don't know) that sizeof is actually an operator, and it is resolved at compile time rather than at runtime (except when applied to C99 variable-length arrays).
The C language doesn't need size_t to be a usable, consistent language. That's just part of the standard library. The C language needs all operators. If, instead of +, C used the keyword plus to add numbers, you would make it an operator.
Besides, I do semi-implicit recasting of size_ts to unsigned ints (and regular ints, but Kernighan and Ritchie will someday smite me for this) all the time. You can assign the return type of a sizeof to an int if you like, but in my work I'm usually just passing it straight on to a malloc() or something.
Some headers from the C standard are defined for a freestanding environment, i.e. fit for use in, for example, an operating system kernel. They do not define any functions, merely macros and typedefs.
They are float.h, iso646.h, limits.h, stdarg.h, stdbool.h, stddef.h and stdint.h.
When working on an operating system, it isn't a bad idea to start with these headers. Having them available makes many things easier in your kernel. Especially stdint.h will come in handy (uint32_t et al.).
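For example (an illustrative sketch), a kernel-side routine can lean on those headers without any hosted library at all:

#include <stddef.h>   /* size_t, NULL, offsetof */
#include <stdint.h>   /* uint8_t, uint32_t, uintptr_t, ... */

/* Freestanding-safe: only types from the mandatory headers,
   no library calls. */
void zero_region(void *dst, size_t len)
{
    uint8_t *p = dst;
    while (len--)
        *p++ = 0;
}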
Does not this signify some kind of a problem in the C standard specification?
Look up the difference between a hosted implementation of C and a freestanding C implementation. The freestanding (C99) implementation is required to provide headers:
<float.h>
<iso646.h>
<limits.h>
<stdarg.h>
<stdbool.h>
<stddef.h>
<stdint.h>
These headers do not define any functions at all. They define parts of the language that are somewhat compiler specific (for example, the offsetof macro in <stddef.h>, and the variable argument list macros and types in <stdarg.h>), but they can be handled without actually being built into the language as full keywords.
This means that even in your hypothetical kernel, you should expect the C compiler to provide these headers and any underlying support functions - even though you provide everything else.
I think that the main reasons that size_t is not a keyword are:
there's no compelling reason for it to be. The designers of the C and C++ languages have always preferred to have language features implemented in the library when possible and reasonable;
adding keywords to a language can create problems for an existing body of legacy code. This is another reason committees are generally resistant to adding new keywords.
For example, in discussing the next major revision of the C++ standard, Stroustrup had this to say:
The C++0x improvements should be done in such a way that the resulting language is easier to learn and use. Among the rules of thumb for the committee are:
...
Prefer standard library facilities to language extensions
...
There is no reason not to include stddef.h, even if you are working on a kernel - it provides the types, sized for your specific compiler, that any code will need.
Note also that almost all C compilers are self-compiled. The actual compiler code for the sizeof operator will therefore use size_t and reference the same stddef.h file as does user code.
From MSDN:
When the sizeof operator is applied to an object of type char, it yields 1.
Even if you don't have stddef.h available/included and don't know about size_t, using sizeof you can get the size of objects relative to char.
size_t is actually a type, often unsigned int. sizeof is an operator that gives the size of a type. The underlying type that sizeof returns is implementation-specific, not fixed by the C standard; it's just an unsigned integer.
Edit:
To be very clear, you do not need the size_t type in order to use sizeof. I think the answer you're looking for is - Yes, it is inconsistent. However, it doesn't matter. You can still practically use sizeof correctly without having a size_t definition from a header file.
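To illustrate that point with a sketch: sizeof works with no headers included at all, because its result converts implicitly to any integer type wide enough to hold it:

/* No #include needed: sizeof is part of the language itself. */
struct packet { unsigned char header[4]; unsigned int payload; };

unsigned long packet_size(void)
{
    /* The result formally has type size_t, but it converts
       implicitly, so the name size_t never has to appear. */
    return sizeof(struct packet);
}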
size_t is not a keyword by necessity: different architectures often have different sizes for the integral types. For example, a 64 bit machine is likely to have unsigned long long as size_t if it didn't decide to make int a 64 bit datatype.
If you made size_t a type built into the compiler, it would take away the flexibility needed for cross compilation.
Also, sizeof is more like a magic compile-time macro (think of a C++ template), which explains why it is a keyword rather than a defined type.
The simple reason is because it is not a fundamental type. If you look up the C standard you will find that fundamental types include int, char etc but not size_t. Why so? As others have already pointed out, size_t is an implementation specific type (i.e. a type capable of holding the size in number of "C bytes" of any object).
On the other hand, sizeof is an (unary) operator. All operators are keywords.

overflows in size_t additions

I like to have my code warning free for VS.NET and GCC, and I like to have my code 64-bit ready.
Today I wrote a little module that deals with in-memory buffers and provides access to the data via a file-style interface (e.g. you can read bytes, write bytes, seek around, etc.).
As the data type for the current read position and size I used size_t, since that seems to be the most natural choice. It avoids the warnings and ought to work in 64-bit builds as well.
Just in case: My structure looks like this:
typedef struct
{
    unsigned char *m_Data;
    size_t m_CurrentReadPosition;
    size_t m_DataSize;
} MyMemoryFile;
The signedness of size_t seems not to be consistent in practice. A Google code search proved that.
Now I'm in a dilemma: I want to check additions with size_t for overflows because I have to deal with user-supplied data, and third-party libraries will use my code. However, for the overflow check I have to know the signedness. It makes a huge difference in the implementation.
So, how the heck should I write such code in a platform- and compiler-independent way?
Can I check the signedness of size_t at run time or compile time? That would solve my problem. Or maybe size_t wasn't the best idea in the first place.
Any ideas?
EDIT: I'm looking for a solution for the C-language!
Regarding whether size_t is signed or unsigned, and GCC (from an old GCC manual; I'm not sure if it's still there):
There is a potential problem with the size_t type and versions of GCC prior to release 2.4. ANSI C requires that size_t always be an unsigned type. For compatibility with existing systems' header files, GCC defines size_t in stddef.h to be whatever type the system's sys/types.h defines it to be. Most Unix systems that define size_t in sys/types.h define it to be a signed type. Some code in the library depends on size_t being an unsigned type, and will not work correctly if it is signed.
The GNU C library code which expects size_t to be unsigned is correct. The definition of size_t as a signed type is incorrect. We plan that in version 2.4, GCC will always define size_t as an unsigned type, and the 'fixincludes' script will massage the system's sys/types.h so as not to conflict with this.
In the meantime, we work around this problem by telling GCC explicitly to use an unsigned type for size_t when compiling the GNU C library. 'configure' will automatically detect what type GCC uses for size_t and arrange to override it if necessary.
If you want a signed version of size_t, use ptrdiff_t; on some systems there is also a typedef ssize_t.
size_t is an unsigned integral type, according to both the C and C++ standards. Any implementation that has size_t signed is seriously nonconforming and probably has other portability problems as well. Unsigned types are guaranteed to wrap around when overflowing, meaning that you can write tests like if (a + b < a) to find overflow.
size_t is an excellent type for anything involving memory. You're doing it right.
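That if (a + b < a) test packages up neatly into a helper. A minimal sketch (add_size_checked is a made-up name), relying on the wrap-around guarantee for unsigned types:

#include <stdbool.h>
#include <stddef.h>

/* Checked size_t addition: for an unsigned type, a + b compares
   less than a exactly when the sum wrapped around. */
static bool add_size_checked(size_t a, size_t b, size_t *sum)
{
    *sum = a + b;
    return *sum >= a;   /* false means the addition overflowed */
}

Checking before the addition works just as well: with SIZE_MAX from <stdint.h>, overflow occurs exactly when a > SIZE_MAX - b.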
size_t should be unsigned.
It's typically defined as unsigned long.
I've never seen it be defined otherwise. ssize_t is its signed counterpart.
EDIT:
GCC defines it as signed in some circumstances. Compiling in ANSI C mode or with -std=c99 should force it to be unsigned.
For the C language, use IntSafe, also released by Microsoft (not to be confused with the C++ library SafeInt). IntSafe is a set of C language function calls that can perform math and do conversions safely.
Use SafeInt. It is a class designed by Michael Howard and released as open source by Microsoft. It is designed for working with integers where overflow is identified as a risk. All overflows are converted to exceptions and handled. The class is designed to make correct usage easy.
For example :
char CouldBlowUp(char a, char b, char c)
{
    SafeInt<char> sa(a), sb(b), sc(c);
    try
    {
        return (sa * sb + sc).Value();
    }
    catch (SafeIntException err)
    {
        ComplainLoudly(err.m_code);
    }
    return 0;
}
SafeInt is also used a lot internally at Microsoft, in products like Office.
I am not sure if I understand the question exactly, but maybe you can do something like:
temp = value_to_be_added_to;
value_to_be_added_to += value_to_add;
if (temp > value_to_be_added_to)
{
    /* overflow... */
}
Since it will wrap back to a lower value, you can easily check whether it overflowed.
