Difference between size_t and unsigned int? - c

I am so confused about size_t. I have searched on the internet and everywhere mentioned that size_t is an unsigned type so, it can represent only non-negative values.
My first question is: if it is used to represent only non-negative values, why don't we use unsigned int instead of size_t?
My second question is: are size_t and unsigned int interchangeable or not? If not, then why?
And can anyone give me a good example of size_t and briefly its workings?

If it is used to represent only non-negative values, why don't we use unsigned int instead of size_t?
Because unsigned int is not the only unsigned integer type. size_t could be any of unsigned char, unsigned short, unsigned int, unsigned long or unsigned long long, depending on the implementation.
Second question: are size_t and unsigned int interchangeable or not, and if not, then why?
They aren't interchangeable, for the reason explained above ^^.
And can anyone give me a good example of size_t and its brief working ?
I don't quite get what you mean by "its brief working". It works like any other unsigned type (in particular, like the type it's typedeffed to). You are encouraged to use size_t when you are describing the size of an object. In particular, the sizeof operator and various standard library functions, such as strlen(), return size_t.
Bonus: here's a good article about size_t (and the closely related ptrdiff_t type). It reasons very well why you should use it.

There are 5 standard unsigned integer types in C:
unsigned char
unsigned short
unsigned int
unsigned long
unsigned long long
with various requirements for their sizes and ranges (briefly, each type's range is a subset of the next type's range, but some of them may have the same range).
size_t is a typedef (i.e., an alias) for some unsigned type (probably one of the above, but possibly an extended unsigned integer type, though that's unlikely). It's the type yielded by the sizeof operator.
On one system, it might make sense to use unsigned int to represent sizes; on another, it might make more sense to use unsigned long or unsigned long long. (size_t is unlikely to be either unsigned char or unsigned short, but that's permitted.)
The purpose of size_t is to relieve the programmer from having to worry about which of the predefined types is used to represent sizes.
Code that assumes sizeof yields an unsigned int would not be portable. Code that assumes it yields a size_t is more likely to be portable.

size_t has a specific restriction.
Quoting from http://www.cplusplus.com/reference/cstring/size_t/ :
Alias of one of the fundamental unsigned integer types.
It is a type able to represent the size of any object in bytes: size_t is the type returned by the sizeof operator and is widely used in the standard library to represent sizes and counts.
It is not interchangeable with unsigned int because the size of int is specified by the data model. For example LLP64 uses a 32-bit int and ILP64 uses a 64-bit int.

Apart from what the other answers say, it also documents the code: it tells readers that you are talking about the size of objects in memory.

size_t is used to store sizes of data objects, and is guaranteed to be able to hold the size of any data object that the particular C implementation can create. This data type may be smaller (in number of bits), bigger or exactly the same as unsigned int.

size_t is a basic unsigned integer type of the C/C++ language. It is the type of the result returned by the sizeof operator. The type's size is chosen so that it can store the maximum size of a theoretically possible array of any type. On typical 32-bit systems size_t takes 32 bits, and on typical 64-bit systems 64 bits. On such flat-memory platforms a size_t variable is also wide enough to hold a pointer value, although the standard does not guarantee this; pointers to class member functions are a further exception. For storing a pointer it is better to use another unsigned integer type, uintptr_t (its name reflects its capability). size_t and uintptr_t often have the same width, but they are distinct typedefs with different purposes. size_t is usually used for loop counters, array indexing, and address arithmetic.
The maximum possible value of size_t is the constant SIZE_MAX.

In simple words, size_t is both platform- and implementation-dependent, whereas unsigned int is only platform-dependent.

Related

Can I just use unsigned int instead of size_t? [duplicate]

I've got an impression that size_t is unsigned int. But can I just write unsigned int instead of size_t in my code?
size_t is the most correct type to use when describing the sizes of arrays and objects. It's guaranteed to be unsigned and is supposedly "large enough" to hold any object size for the given system. Therefore it is more portable to use for that purpose than unsigned int, which is in practice either 16 or 32 bits on all common computers.
So the most canonical form of a for loop when iterating over an array is actually:
for (size_t i = 0; i < sizeof array / sizeof *array; i++)
{
    do_something(array[i]);
}
And not int i=0; which is perhaps more commonly seen even in some C books.
size_t is also the type returned from the sizeof operator. Using the right type might matter in some situations, for example printf("%u", sizeof obj); is formally undefined behavior, so it might in theory crash printf or print gibberish. You have to use %zu for size_t.
It is quite possible that size_t happens to be the very same type as unsigned long or unsigned long long or uint32_t or uint64_t though.
The type size_t is an unsigned type capable of representing all possible results of sizeof and _Alignof operators.
See https://port70.net/~nsz/c/c11/n1570.html#6.5.3.4p5
The unsigned int is an unsigned variant of ... int that can represent integers from 0 to 65535. It may be capable of representing larger integers.
See https://port70.net/~nsz/c/c11/n1570.html#5.2.4.2.1
They may or may not be the same type. On modern 64-bit machines they are not.

Casting to unsigned without specifying a type

#include <stdio.h>
#include <sys/types.h> /* ssize_t (POSIX) */
int main(void){
    ssize_t a = -1;
    size_t b = (unsigned)a; /* unsigned means unsigned int */
    return 0;
}
a is 8 bytes with all bits set to 1; however, b becomes a 4-byte number when cast to unsigned without doing a proper (unsigned size_t). Why is that? Why doesn't it turn into an 8-byte unsigned variable?
unsigned is short for writing unsigned int, so unsigned and unsigned int are the same type.
unsigned size_t does not exist; size_t is already unsigned.
why doesn't it turn into an 8 byte unsigned variable?
Because ints are commonly 32 bit or 4 bytes.
If you want fixed width integers you can use stdint.h which defines uint32_t, int32_t etc.
The C standard sheds some light on this:
In § 6.7.2p2, Type specifiers, "- unsigned, or unsigned int" are in the same list entry, because they are equivalent type specifiers.
As @marco-a told you, unsigned is shorthand for unsigned int and both are equivalent. sizeof(unsigned int) is commonly 4 bytes, but it is system-dependent. size_t, by contrast, is an unsigned integer type that is guaranteed to be big enough to contain the size of the biggest object the host system can handle. Its width depends on the implementation: on many 32-bit targets it is a typedef (i.e., alias) for unsigned int, while on 64-bit targets it is usually a typedef for unsigned long or unsigned long long. A size_t value is never negative.
Casting a negative value to an unsigned type is not undefined behaviour: the result is well defined and is the value reduced modulo 2^N, where N is the number of bits in the unsigned type. So (unsigned)a with a == -1 yields UINT_MAX, and assigning that to a wider size_t keeps the value UINT_MAX rather than producing SIZE_MAX.
The result therefore depends on the width of the target type of the cast, not on how the architecture or the compiler happens to deal with it.
Anyway, when you write the unsigned keyword without naming an integer type, remember that short, long, and long long are likewise modifiers of the basic int type, which is the default type to substitute. So plain unsigned means unsigned int.

Can I store the yield value of sizeof of type size_t in an unsigned int object?

sizeof is a standard C operator.
sizeof yields the size (in bytes) of its operand as a value of type size_t. Quote from ISO/IEC 9899:2018 (C18), 6.5.3.4/5; phrases surrounded by -- are my additions to clarify the context:
The value of the result of both operators -- (sizeof and _Alignof) -- is implementation-defined, and its type (an unsigned integer type) is size_t, defined in <stddef.h> (and other headers).
Implicitly, if I want my program to be standard-conforming and want to use sizeof, I need to include one of the header files in which size_t is defined, because the value it yields is of type size_t and I want to store the value in an appropriate object.
Of course, any program that is not a toy program would need at least one of these headers anyway, but in a simple program I need to include them explicitly even though I do not need them otherwise.
Can I use an unsigned int object to store the size_t value sizeof yields without an explicit cast?
Like for example:
char a[19];
unsigned int b = sizeof(a);
I compiled that with gcc using the -Wall and -Werror flags, and it had nothing to complain about.
But is that standard-conforming?
It is permissible, but it is your responsibility to ensure that no overflow occurs when storing a value of type size_t in an object of type unsigned int. For unsigned integer types overflow is well-defined behavior.
However, it is bad programming style to use types that were not designed to store values of a wider integer type. This can be a source of hidden bugs.
Usually the type size_t is an alias for the type unsigned long. On some 64-bit systems, unsigned long has the same size as unsigned long long, namely 8 bytes, whereas unsigned int typically occupies 4 bytes.
It is totally conformant, though if you have a very large object (typically 4GB or larger) its size may not fit into an unsigned int. Otherwise there is nothing to worry about.
Having said that, your question and this answer probably have more characters than you would save by not including an appropriate header in a lifetime worth of toy programs.
This is allowed. It is an implicit conversion ("as if by assignment"). See the section labelled "Integer conversions":
A value of any integer type can be implicitly converted to any other integer type. Except where covered by promotions and boolean conversions above, the rules are:
if the target type can represent the value, the value is unchanged
otherwise, if the target type is unsigned, the value 2^b, where b is the number of bits in the target type, is repeatedly subtracted or added to the source value until the result fits in the target type. In other words, unsigned integers implement modulo arithmetic.
In other words, this is always defined behaviour, but if the size is too big to fit in an unsigned int, it will be truncated.
In principle it is ok. The unsigned int can handle almost any sizeof except artificially constructed exotic things.
P.S. I have seen code similar to yours even in Linux kernel modules.

C11 standard guarantees for casting one array type to another

So I have an application where I use a lot of arrays of chars, shorts, ints, and long longs, all unsigned. Rather than allocating space for each and deallocating, my thought is to have a static array of unsigned long longs. I would then cast this as needed as an array of the appropriate type. Is there a way to prove this is compliant with the standard?
I am statically asserting that char, short, int, and long long are of sizes 1, 2, 4, and 8, respectively, and that their alignment requirements do not exceed their sizes. I would like to know if I can prove the validity of my approach with no further static assertions.
EDIT: I thought I'd add that the standard defines object representation as a copy of an object as an array of unsigned char. It seems that this justifies using an unsigned long long array as either that or an unsigned char array, although I cannot absolutely rule out problems associated with using the object representation in context of the object itself rather than in a copy (which is how 6.2.6.1.4 discusses the object representation). This is, however, all I can find, and it does not help at all with the two intermediate integer sizes.
First off, you're not talking about casting arrays. You're talking about casting pointers.
The standard does not guarantee that what you're doing is safe. You can treat an array of unsigned long long as, for example, an array of unsigned char, but there's no guarantee that you can treat it as an array of unsigned int.
Consider a hypothetical implementation with CHAR_BIT==8, sizeof (unsigned int) == 4, and sizeof (unsigned long long) == 8. Assume unsigned int requires strict 4-byte alignment. But the underlying machine has no direct support for 64-bit quantities, so all operations on unsigned long long are done in software. Because of this, the required alignment for unsigned long long is, let's say, 2 bytes.
So an array of unsigned long long might start at an address that's not a multiple of 4 bytes, and therefore you can't safely treat it as an array of unsigned int.
I don't suggest that this is a plausible implementation. Even if 64-bit integers are implemented in software, it would probably make sense for them to have at least 32-bit alignment. But nothing in what I've described violates the standard; the hypothetical implementation could be conforming.
If you're using a C11 compiler (as indicated by the tag on your question), you could statically assert that
_Alignof (unsigned long long) >= _Alignof (unsigned int)
and so forth.
Or you could use malloc during startup to allocate your array, guaranteeing that it's properly aligned for any type.
Stealing an idea from the comments, you could define a union of array types, something like:
#define BYTE_COUNT some_big_number
union arrays {
    unsigned char ca[BYTE_COUNT];
    unsigned short sa[BYTE_COUNT / sizeof (unsigned short)];
    unsigned int ia[BYTE_COUNT / sizeof (unsigned int)];
    unsigned long la[BYTE_COUNT / sizeof (unsigned long)];
    unsigned long long lla[BYTE_COUNT / sizeof (unsigned long long)];
};
Or you could define your arrays for the type of data you want to store in them.

What is the difference between char and unsigned char?

(Edited: changed C/C++ to C)
Please help me find a clear explanation of char and unsigned char in C, especially for transferring data between embedded devices and general PCs (the difference between a buffer of unsigned char and one of plain char).
You're asking about two different languages but, in this respect, the answer is (more or less) the same for both. You really should decide which language you're using though.
Differences:
they are distinct types
it's implementation-defined whether char is signed or unsigned
Similarities:
they are both integer types
they are the same size (one byte, at least 8 bits)
If you're simply using them to transfer raw byte values, with no arithmetic, then there's no practical difference.
The type char is special. It is not an unsigned char or a signed char. These are three distinct types (while int and signed int are the same type). A char might have a signed or unsigned representation.
From 3.9.1 Fundamental types
Plain char, signed char, and unsigned char are three distinct types. A
char, a signed char, and an unsigned char occupy the same amount of
storage and have the same alignment requirements (3.11); that is, they
have the same object representation.

Resources