Type to use to represent a byte in ANSI (C89/90) C? - c

Is there a standards-compliant method to represent a byte in ANSI (C89/90) C? I know that, most often, a char happens to be a byte, but my understanding is that this is not guaranteed to be the case. Also, there is stdint.h in the C99 standard, but what was used before C99?
I'm curious about both 8 bits specifically, and a "byte" (sizeof(x) == 1).

char is always a byte, but it's not always an octet. A byte is the smallest addressable unit of memory (in most definitions); an octet is an 8-bit unit of memory.
That is, sizeof(char) is always 1 for all implementations, but the CHAR_BIT macro in limits.h defines the size of a byte for a platform, and it is not always 8 bits. There are platforms with 16-bit and 32-bit bytes, where char takes up more bits but is still a byte. Since the required range for char is at least -127 to 127 (or 0 to 255), it will be at least 8 bits wide on all platforms.
ISO/IEC 9899:TC3
6.5.3.4 The sizeof operator
...
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. [...]
When applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1. [...]
Emphasis mine.
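As a quick illustration of the two points above, here is a minimal sketch (nothing beyond sizeof and the standard CHAR_BIT macro from limits.h is assumed) that prints both values for whatever implementation compiles it:

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* sizeof(char) is 1 by definition; CHAR_BIT says how wide that byte is */
    printf("sizeof(char) = %lu\n", (unsigned long)sizeof(char));
    printf("CHAR_BIT     = %d\n", CHAR_BIT);
    return 0;
}

On a typical octet-based platform this prints 1 and 8; on a DSP with 16-bit bytes it would print 1 and 16.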

You can always represent a byte (if you mean 8 bits) in an unsigned char. It's always at least 8 bits in size, with all bits making up the value, so an 8-bit value will always fit into it.
If you want exactly 8 bits, I also think you'll have to use platform-dependent ways. POSIX systems seem to be required to support int8_t. That means that on POSIX systems, char (and thus a byte) is always 8 bits.
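For the pre-C99 case asked about in the question, one common pattern (a sketch, assuming it is acceptable to refuse to build on non-octet platforms; the typedef name byte8 is mine, not anything standard) is to check CHAR_BIT at compile time and then simply use unsigned char:

#include <limits.h>

#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes (octets)."
#endif

typedef unsigned char byte8; /* exactly 8 bits wherever the check above passes */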

In ANSI C89/ISO C90 sizeof(char) == 1. However, it is not always the case that 1 byte is 8 bits. If you wish to count the number of bits in 1 byte (and you don't have access to limits.h), I suggest the following:
unsigned int bitnum(void)
{
    unsigned char c = ~0u; /* all bits of one byte set; thank you Jonathan */
    unsigned int v;

    for (v = 0u; c; ++v)
        c &= c - 1u; /* clear the lowest set bit (Kernighan's method) */
    return v;
}
Here we use Kernighan's method to count the number of bits set in c. To better understand the code above (or see others like it), I refer you to "Bit Twiddling Hacks".
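A short usage sketch, assuming it is compiled together with the bitnum() definition above:

#include <stdio.h>

unsigned int bitnum(void); /* defined above */

int main(void)
{
    printf("bits per byte: %u\n", bitnum()); /* typically prints 8 */
    return 0;
}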

Before C99? Platform-dependent code.
But why do you care? Just use stdint.h.
In every implementation of C I have used (from old UNIX to embedded compilers written by hardware engineers to big-vendor compilers) char has always been 8-bit.

You can find pretty reliable macros and typedefs in boost.

I notice that some answers have re-defined the word byte to mean something other than 8 bits.
A byte is 8 bits; however, in some C implementations char is 16 bits (2 bytes) or 8 bits (1 byte). The people calling a byte the 'smallest addressable unit of memory' or some such garbage have lost grasp of the meaning of byte (8 bits).
The reason that some implementations of C have 16-bit chars (2 bytes) and some have 8-bit chars (1 byte), and that there is no standard type called 'byte', is due to laziness.
So, we should use int8_t.

Related

When `char` type is represented on 64 bits how can it keep its range from limits.h

It is possible to find architectures where the char data type is represented on 8 bytes, so 64 bits, the same as long long, and at the same time the Standard requires CHAR_MIN and CHAR_MAX to be bounded -- see 5.2.4.2.1 Sizes of integer types <limits.h> from the Standard ISO 9899.
I cannot figure out why these architectures chose to represent char this way, or how char values are represented in such a large space. So how are char values represented in such a case?
sizeof(char) is 1 all the time. My question is, what is the value of sizeof(long long) and sizeof(int) on such an architecture?
It is possible to find architectures where the char data type is represented on 8 bytes
No. That's because a char is defined to be a byte *). But a byte doesn't necessarily have 8 bits; that's why the term octet is sometimes used to refer to a unit of 8 bits. There are architectures using more than 8 bits in a byte, but I doubt there's one with a 64-bit byte, although this would be theoretically possible.
Another thing to consider is that char (as opposed to many other integer types) isn't allowed to have padding bits, so if you ever found an architecture with 64-bit chars, that would mean CHAR_MIN and CHAR_MAX would be "very large" ;)
*) In fact, a byte is defined to be the unit of memory used to represent an encoded character, which is normally also the smallest addressable unit of the system. 8 bits are common; the Wikipedia article mentions that byte sizes of up to 48 bits have been used. This might not be the best source, but still, finding a 64-bit byte is very unlikely.
It is possible to find architectures where the char data type is represented on 8 bytes,
I don't know of any. BTW, it is not only a matter of architecture, but also of ABI. Also, you don't define what a byte is, and the bit size of chars matters much more.
(IIRC, someone coded a weird implementation of C in Common Lisp on Linux/x86-64 which has 32-bit chars; of course its ABI is not the usual Linux one!)
sizeof(char) is 1 all the time. My question is, what is the value of sizeof(long long) and sizeof(int) on such an architecture?
It would probably also be 1 (assuming char, int, and long long all have 64 bits), unless long long is e.g. 128 bits (which is possible but unusual).
Notice that the C standard imposes minimal bounds and bit sizes (read n1570). E.g. long long could be wider than 64 bits. I have never heard of such C implementations (and I hope that when 128-bit processors become common, C will be dead).
But your question is theoretical. I know of no practical C implementation with 64-bit chars or a long long wider than 64 bits. In practice, assuming that chars are 8 bits (but they could be signed or unsigned, and both exist) is a reasonable, but non-universal, assumption.
Notice that C is not a universal programming language. You won't be able to code a C compiler for a ternary machine like Setun.
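If you want to see how these numbers come out on your own implementation, here is a minimal sketch (it merely reports what the compiler says; on the hypothetical 64-bit-char machine discussed above, all three sizeof values would print as 1):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT          = %d\n", CHAR_BIT);
    printf("sizeof(char)      = %lu\n", (unsigned long)sizeof(char));
    printf("sizeof(int)       = %lu\n", (unsigned long)sizeof(int));
    printf("sizeof(long long) = %lu\n", (unsigned long)sizeof(long long));
    return 0;
}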

sizeof and when a byte is larger than 8 bits?

Since sizeof is an operator, why can we use sizeof(something); like a function call?
When is a byte not 8 bits?
A byte in this context is the same as an unsigned char, and may be larger than 8 bits
And is there a possible that byte is smaller than 8 bits?
Since sizeof is a operater ,why can we use sizof(something); like a function call ?
Well, + is an "operater" (sic!) too, still you can write (1 + 1) and (1) + (1) and ((1) + 1)... it's just normal parenthesizing/grouping.
When byte is not 8 bits?
When you use a platform on which it isn't 8 bits.
And is there a possible that byte is smaller than 8 bits ?
Not on an architecture that aims to host a conforming C implementation. It has happened historically, though: some of the early punch-card machines used 6-bit bytes, for example.
Like most operators, sizeof can be applied to an expression, which can include parentheses. As far as the parser cares, it's pretty much the same as something like x * (b + c), where * applies to (b + c). While you don't see it as often, something like x + (b) is also entirely possible.
The standard requires char to be able to represent at least the range -127 to 127 (if signed) or 0 to 255 (if unsigned). Either range requires at least 8 bits, so no, a char can't be any smaller than 8 bits.
Since sizeof is a operater ,why can we use sizof(something); like a function call ?
Others answered this, but I'll answer anyway. ~ is an operator, so why can we use ~(a) instead of ~a? While fundamentally different, they are still similar in terms of syntax. The exception is that you can also write sizeof(int), with a parenthesized type name; that is a separate form of the sizeof operator, which expands to a compile-time constant.
When byte is not 8 bits ?
A byte in this context is the same as an unsigned char, and may be larger than 8 bits
Some platforms have 9-bit bytes. The C standard requires a minimum of 8 bits per char. Currently, many systems use an 8-bit byte, a.k.a. an "octet".
In a language like Java, char is not 8 bits, so an implementation of C could just as easily define it the same way. You just wouldn't be able to access smaller amounts of data using standard C syntax without bit masks and bit shifts or bit-fields because other data types like short int are defined in terms of char:
Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes.
So if sizeof(short int) is 2, it will have 2 × CHAR_BIT bits. If CHAR_BIT is 16, a short int is a 32-bit integer type.
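A small sketch applying the n × CHAR_BIT rule quoted above to short int (on a common desktop platform this prints 16):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    /* width in bits of short int: its size in bytes times bits per byte */
    printf("short int is %lu bits wide\n",
           (unsigned long)(sizeof(short int) * CHAR_BIT));
    return 0;
}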
There is a difference between the common computer-science definition of a byte, which is 8 bits, and the C definition, where a byte is however many bits the type char has. Since the two are almost always the same, they are commonly treated as equal.
Yes.
It is doubtful that you have a machine where a byte is not 8 bits. In any event, sizeof() takes a type; "byte" is not a type. A type is something like char, short, or int. In C, the smallest type is char, which is one byte. Perhaps I should say "probably one byte".

Is the size of C "int" 2 bytes or 4 bytes?

Does an Integer variable in C occupy 2 bytes or 4 bytes? What are the factors that it depends on?
Most of the textbooks say integer variables occupy 2 bytes.
But when I run a program printing the successive addresses of an array of integers it shows the difference of 4.
I know it's equal to sizeof(int). The size of an int is really compiler-dependent. Back in the day, when processors were 16-bit, an int was 2 bytes. Nowadays, it's most often 4 bytes on 32-bit as well as 64-bit systems.
Still, using sizeof(int) is the best way to get the size of an integer for the specific system the program is executed on.
EDIT: Fixed wrong statement that int is 8 bytes on most 64-bit systems. For example, it is 4 bytes on 64-bit GCC.
This is one of the points in C that can be confusing at first, but the C standard only specifies a minimum range for integer types that is guaranteed to be supported. int is guaranteed to be able to hold -32767 to 32767, which requires 16 bits. In that case, int is 2 bytes. However, implementations are free to go beyond that minimum, and many modern compilers make int 32-bit (which also means 4 bytes pretty ubiquitously).
The reason your book says 2 bytes is most probably because it's old. At one time, this was the norm. In general, you should always use the sizeof operator if you need to find out how many bytes it is on the platform you're using.
To address this, C99 added new types where you can explicitly ask for a certain sized integer, for example int16_t or int32_t. Prior to that, there was no universal way to get an integer of a specific width (although most platforms provided similar types on a per-platform basis).
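A brief sketch of that C99 approach, using <stdint.h> together with the matching <inttypes.h> format macros (the exact-width types are only required to exist where the platform actually has types of those widths):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int16_t a = 12345;      /* exactly 16 bits */
    int32_t b = 1234567890; /* exactly 32 bits */

    printf("a = %" PRId16 ", b = %" PRId32 "\n", a, b);
    printf("sizeof a = %lu, sizeof b = %lu\n",
           (unsigned long)sizeof a, (unsigned long)sizeof b);
    return 0;
}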
There's no specific answer. It depends on the platform. It is implementation-defined. It can be 2, 4 or something else.
The idea behind int was that it was supposed to match the natural "word" size on the given platform: 16 bit on 16-bit platforms, 32 bit on 32-bit platforms, 64 bit on 64-bit platforms, you get the idea. However, for backward compatibility purposes some compilers prefer to stick to 32-bit int even on 64-bit platforms.
The time of 2-byte int is long gone though (16-bit platforms?) unless you are using some embedded platform with 16-bit word size. Your textbooks are probably very old.
The answer to this question depends on which platform you are using.
But irrespective of platform, you can reliably assume at least the following ranges (the bracketed widths are guaranteed minimums, not exact sizes; a compile-time sanity check follows the list):
[8-bit] signed char: -127 to 127
[8-bit] unsigned char: 0 to 255
[16-bit] signed short: -32767 to 32767
[16-bit] unsigned short: 0 to 65535
[32-bit] signed long: -2147483647 to 2147483647
[32-bit] unsigned long: 0 to 4294967295
[64-bit] signed long long: -9223372036854775807 to 9223372036854775807
[64-bit] unsigned long long: 0 to 18446744073709551615
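A sketch of how those guarantees can be checked at compile time with <limits.h>; the long long check is left out because ULLONG_MAX only exists from C99 on, and these #if tests should never fire on a conforming implementation since the ranges above are the standard's minimums:

#include <limits.h>

#if UCHAR_MAX < 255
#error "unsigned char narrower than 8 bits?"
#endif
#if USHRT_MAX < 65535
#error "unsigned short narrower than 16 bits?"
#endif
#if ULONG_MAX < 4294967295UL
#error "unsigned long narrower than 32 bits?"
#endif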
C99 N1256 standard draft
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
The sizes of int and all other integer types are implementation-defined; C99 only specifies:
minimum size guarantees
relative sizes between the types
5.2.4.2.1 "Sizes of integer types <limits.h>" gives the minimum sizes:
1 [...] Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown [...]
UCHAR_MAX 255 // 2^8 − 1
USHRT_MAX 65535 // 2^16 − 1
UINT_MAX 65535 // 2^16 − 1
ULONG_MAX 4294967295 // 2^32 − 1
ULLONG_MAX 18446744073709551615 // 2^64 − 1
6.2.5 "Types" then says:
8 For any two integer types with the same signedness and different integer conversion rank (see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a subrange of the values of the other type.
and 6.3.1.1 "Boolean, characters, and integers" determines the relative conversion ranks:
1 Every integer type has an integer conversion rank defined as follows:
The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has greater rank than T3, then T1 has greater rank than T3.
Does an Integer variable in C occupy 2 bytes or 4 bytes?
That depends on the platform you're using, as well as how your compiler is configured. The only authoritative answer is to use the sizeof operator to see how big an integer is in your specific situation.
What are the factors that it depends on?
Range might be best considered, rather than size. Both will vary in practice, though it's much more fool-proof to choose variable types by range than size as we shall see. It's also important to note that the standard encourages us to consider choosing our integer types based on range rather than size, but for now let's ignore the standard practice, and let our curiosity explore sizeof, bytes and CHAR_BIT, and integer representation... let's burrow down the rabbit hole and see it for ourselves...
sizeof, bytes and CHAR_BIT
The following statement, taken from the C standard (linked to above), describes this in words that I don't think can be improved upon.
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type. The size is determined from the type of the operand.
A clear understanding of that leads us into a discussion about bytes. It's commonly assumed that a byte is eight bits, when in fact CHAR_BIT tells you how many bits are in a byte. That's just another one of those nuances which isn't considered when talking about the common two- (or four-) byte integers.
Let's wrap things up so far:
sizeof => size in bytes, and
CHAR_BIT => number of bits in byte
Thus, depending on your system, sizeof (unsigned int) could be any value greater than zero (not just 2 or 4): if CHAR_BIT is 16, then a single (sixteen-bit) byte has enough bits in it to represent the sixteen-bit integer described by the standard (quoted below). That's not necessarily useful information, is it? Let's delve deeper...
Integer representation
The C standard specifies the minimum precision/range for all standard integer types (and CHAR_BIT, too, fwiw) here. From this, we can derive a minimum for how many bits are required to store the value, but we may as well just choose our variables based on ranges. Nonetheless, a huge part of the detail required for this answer resides here. For example, the following shows that the standard unsigned int requires (at least) sixteen bits of storage:
UINT_MAX 65535 // 2¹⁶ - 1
Thus we can see that unsigned int requires (at least) 16 bits, which is where you get the two bytes (assuming CHAR_BIT is 8)... and later, when that limit was increased to 2³² - 1, people were stating 4 bytes instead. This explains the phenomena you've observed:
Most of the textbooks say integer variables occupy 2 bytes. But when I run a program printing the successive addresses of an array of integers it shows the difference of 4.
You're using an ancient textbook and compiler which is teaching you non-portable C; the author who wrote your textbook might not even be aware of CHAR_BIT. You should upgrade your textbook (and compiler), and strive to remember that I.T. is an ever-evolving field that you need to stay ahead of to compete... Enough about that, though; let's see what other non-portable secrets those underlying integer bytes store...
Value bits are what the common misconceptions appear to be counting. The above example uses an unsigned integer type which typically contains only value bits, so it's easy to miss the devil in the detail.
Sign bits... In the above example I quoted UINT_MAX as being the upper limit for unsigned int because it's a trivial example to extract the value 16 from the comment. For signed types, in order to distinguish between positive and negative values (that's the sign), we need to also include the sign bit.
INT_MIN -32768 // -(2¹⁵)
INT_MAX +32767 // 2¹⁵ - 1
Padding bits... While it's not common to encounter computers that have padding bits in integers, the C standard allows that to happen; some machines (i.e. this one) implement larger integer types by combining two smaller (signed) integer values together... and when you combine signed integers, you get a wasted sign bit. That wasted bit is considered padding in C. Other examples of padding bits might include parity bits and trap bits.
As you can see, the standard seems to encourage considering ranges like INT_MIN..INT_MAX and other minimum/maximum values from the standard when choosing integer types, and discourages relying upon sizes as there are other subtle factors likely to be forgotten such as CHAR_BIT and padding bits which might affect the value of sizeof (int) (i.e. the common misconceptions of two-byte and four-byte integers neglects these details).
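A runtime sketch of that distinction: it counts the value bits of unsigned int by repeatedly halving UINT_MAX and compares the result with the raw storage size; the two numbers differ exactly when unsigned int carries padding bits (the helper name uint_value_bits is mine):

#include <limits.h>
#include <stdio.h>

/* number of value bits in unsigned int: how many times UINT_MAX
   can be halved before it reaches zero */
static unsigned int uint_value_bits(void)
{
    unsigned int bits = 0;
    unsigned int max = UINT_MAX;

    while (max != 0) {
        max >>= 1;
        ++bits;
    }
    return bits;
}

int main(void)
{
    printf("storage bits: %u\n",
           (unsigned int)(sizeof(unsigned int) * CHAR_BIT));
    printf("value bits:   %u\n", uint_value_bits());
    return 0;
}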
The only guarantees are that char must be at least 8 bits wide, short and int must be at least 16 bits wide, and long must be at least 32 bits wide, and that sizeof (char) <= sizeof (short) <= sizeof (int) <= sizeof (long) (same is true for the unsigned versions of those types).
int may be anywhere from 16 to 64 bits wide depending on the platform.
Is the size of C “int” 2 bytes or 4 bytes?
The answer is "yes" / "no" / "maybe" / "maybe not".
The C programming language specifies the following: the smallest addressable unit, known as char and also called a "byte", is exactly CHAR_BIT bits wide, where CHAR_BIT is at least 8.
So one byte in C is not necessarily an octet, i.e. 8 bits. In the past, one of the first platforms to run C code (and Unix) had a 4-byte int - but in total int had 36 bits, because CHAR_BIT was 9!
int is supposed to be the natural integer size for the platform, with a range of at least -32767 ... 32767. You can get the size of int in platform bytes with sizeof(int); when you multiply this value by CHAR_BIT, you will know how wide it is in bits.
While 36-bit machines are mostly dead, there are still platforms with non-8-bit bytes. Just yesterday there was a question about a Texas Instruments MCU with 16-bit bytes, that has a C99, C11-compliant compiler.
On TMS320C28x it seems that char, short and int are all 16 bits wide, and hence one byte. long int is 2 bytes and long long int is 4 bytes. The beauty of C is that one can still write an efficient program for a platform like this, and even do it in a portable manner!
Mostly it depends on the platform you are using. It varies from compiler to compiler. Nowadays, in most compilers an int is 4 bytes.
If you want to check what your compiler is using, you can use sizeof(int).
#include <stdio.h>

int main(void)
{
    printf("%d\n", (int)sizeof(int));
    printf("%d\n", (int)sizeof(short));
    printf("%d\n", (int)sizeof(long));
    return 0;
}
The only thing the C compiler promises is that the size of short must be less than or equal to the size of int, and the size of long must be greater than or equal to the size of int. So if the size of int is 4, then the size of short may be 2 or 4, but not larger than that. The same is true for long and int. (Contrary to what is sometimes claimed, the standard does not forbid short and long from having the same size, although their minimum ranges differ.)
This depends on implementation, but usually on x86 and other popular architectures like ARM ints take 4 bytes. You can always check at compile time using sizeof(int) or whatever other type you want to check.
If you want to make sure you use a type of a specific size, use the types in <stdint.h>
#include <stdio.h>

int main(void) {
    printf("size of int: %d", (int)sizeof(int));
    return 0;
}
This prints 4 on a typical system, but it's machine-dependent.
Is the size of C “int” 2 bytes or 4 bytes?
Does an Integer variable in C occupy 2 bytes or 4 bytes?
C allows "bytes" to be something other than 8 bits per "byte".
CHAR_BIT number of bits for smallest object that is not a bit-field (byte) C11dr §5.2.4.2.1 1
A value other than 8 is increasingly uncommon. For maximum portability, use CHAR_BIT rather than 8. The size of an int in bits in C is sizeof(int) * CHAR_BIT.
#include <limits.h>
printf("(int) Bit size %zu\n", sizeof(int) * CHAR_BIT);
What are the factors that it depends on?
The int bit size is commonly 32 or 16 bits. C specified minimum ranges:
minimum value for an object of type int INT_MIN -32767
maximum value for an object of type int INT_MAX +32767
C11dr §5.2.4.2.1 1
The minimum range for int forces the bit size to be at least 16 - even if the processor was "8-bit". A size like 64 bits is seen in specialized processors. Other values like 18, 24, 36, etc. have occurred on historic platforms or are at least theoretically possible. Modern coding rarely worries about non-power-of-2 int bit sizes.
The computer's processor and architecture drive the int bit size selection.
Yet even with 64-bit processors, the compiler's int size may be 32-bit for compatibility reasons as large code bases depend on int being 32-bit (or 32/16).
This is a good source for answering this question.
But the answer to this question is, in a way, always "Yes. Both."
It depends on your architecture. If you're going to work on a 16-bit machine or smaller, it can't be 4 bytes (= 32 bits). If you're working on a 32-bit or better machine, its length is 32 bits.
To figure it out, get your program ready to output something readable and use the sizeof operator. It returns the size in bytes of your declared data type. But be careful when using this with arrays.
If you declare int t[12];, it will return 12*4 bytes. To get the length of this array, just use sizeof(t)/sizeof(t[0]).
If you are going to build up a function that should calculate the size of an array passed to it, remember that if
#include <stdio.h>

typedef int array[12];

int function(array t) {
    // t has decayed to a pointer (int *), so this is
    // sizeof(int *) / sizeof(int), not the element count 12!
    int size_of_t = sizeof(t) / sizeof(t[0]);
    return size_of_t;
}

int main(void) {
    array t = {1, 1, 1}; // remember: t = [1,1,1,0,...,0]
    int a = function(t); // sending t passes just a pointer, equal to int *t
    printf("%d\n", a);   // prints sizeof(int *)/sizeof(int), e.g. 2 on a
                         // typical 64-bit system, not 12
    return 0;
}
So this won't return what you might expect. If you define an array and try to get its length afterwards, use sizeof. If you pass an array to a function, remember that the passed value is just a pointer to its first element. In the first case, you always know what size your array has. The second case can be handled by defining two functions, at the cost of some performance: define function(array t) and function2(array t, int size_of_t), have function(t) determine the length by some copy-work and send the result to function2, where you can do whatever you want with variable array sizes.

Padding bits in unsigned integers and bitwise operations in C89

I have a lot of code that performs bitwise operations on unsigned integers. I wrote my code with the assumption that those operations were on integers of fixed width without any padding bits. For example an array of 32-bit unsigned integers of which all 32 bits available for each integer.
I'm looking to make my code more portable and I'm focused on making sure I'm C89 compliant (in this case). One of the issues that I've come across is possible padded integers. Take this extreme example, taken from the GMP manual:
However on Cray vector systems it may be noted that short and int are always stored in 8 bytes (and with sizeof indicating that) but use only 32 or 46 bits. The nails feature can account for this, by passing for instance 8*sizeof(int)-INT_BIT.
I've also read about this type of padding in other places. I actually read a post on SO last night (forgive me, I don't have the link, so I'm citing something similar from memory) where, if you have, say, a double with 60 usable bits, the other 4 could be used for padding, and those padding bits could serve some internal purpose so they cannot be modified.
So let's say for example my code is compiled on a platform where an unsigned int type is sized at 4 bytes, each byte being 8 bits, however the most significant 2 bits are padding bits. Would UINT_MAX in that case be 0x3FFFFFFF (1073741823)?
#include <stdio.h>
#include <stdlib.h>

/* padding bits represented by underscores */
int main( int argc, char **argv )
{
    unsigned int a = 0x2AAAAAAA; /* __101010101010101010101010101010 */
    unsigned int b = 0x15555555; /* __010101010101010101010101010101 */
    unsigned int c = a ^ b;      /* ?? __111111111111111111111111111111 */
    unsigned int d = c << 5;     /* ?? __111111111111111111111111100000 */
    unsigned int e = d >> 5;     /* ?? __000001111111111111111111111111 */

    printf( "a: %X\nb: %X\nc: %X\nd: %X\ne: %X\n", a, b, c, d, e );
    return 0;
}
Is it safe to XOR two integers with padding bits?
Wouldn't I XOR whatever the padding bits are?
I can't find this behavior covered in C89.
Furthermore, is the c variable guaranteed to be 0x3FFFFFFF, or, if for example the two padding bits were both on in a or b, would c be 0xFFFFFFFF?
Same question with d and e. Am I manipulating the padding bits by shifting?
I would expect to see this below, assuming 32 bits with the 2 most significant bits used for padding, but I want to know if something like this is guaranteed:
a: 2AAAAAAA
b: 15555555
c: 3FFFFFFF
d: 3FFFFFE0
e: 01FFFFFF
Also are padding bits always the most significant bits or could they be the least significant bits?
EDIT 12/19/2010 5PM EST: Christoph has answered my question. Thanks!
I had also asked (above) whether padding bits are always the most significant bits. This is cited in the rationale for the C99 standard, and the answer is no. I am playing it safe and assuming the same for C89. Here is specifically what the C99 rationale says for §6.2.6.2 (Representation of Integer Types):
Padding bits are user-accessible in an unsigned integer type. For example, suppose a machine uses a pair of 16-bit shorts (each with its own sign bit) to make up a 32-bit int and the sign bit of the lower short is ignored when used in this 32-bit int. Then, as a 32-bit signed int, there is a padding bit (in the middle of the 32 bits) that is ignored in determining the value of the 32-bit signed int. But, if this 32-bit item is treated as a 32-bit unsigned int, then that padding bit is visible to the user’s program. The C committee was told that there is a machine that works this way, and that is one reason that padding bits were added to C99.
Footnotes 44 and 45 mention that parity bits might be padding bits. The committee does not know of any machines with user-accessible parity bits within an integer. Therefore, the committee is not aware of any machines that treat parity bits as padding bits.
EDIT 12/28/2010 3PM EST: I found an interesting discussion on comp.lang.c from a few months ago.
Bitwise Operator Effects on Padding Bits (VelocityReviews reader)
Bitwise Operator Effects on Padding Bits (Google Groups alternate link)
One point made by Dietmar which I found interesting:
Let's note that padding bits are not necessary for the existence of trap representations; combinations of value bits which do not represent a value of the object type would also do.
Bitwise operations (like arithmetic operations) operate on values and ignore padding. The implementation may or may not modify padding bits (or use them internally, e.g. as parity bits), but portable C code will never be able to detect this. Any value (including UINT_MAX) will not include the padding.
Where integer padding might lead to problems is if you use things like sizeof (int) * CHAR_BIT and then try to use shifts to access all these bits. If you want to be portable, either only use (unsigned) char, fixed-size integers (a C99 addition), or determine the number of value bits programmatically. This can be done at compile time with the preprocessor by comparing UINT_MAX against powers of 2, or at runtime by using bit operations (a sketch of the compile-time variant follows below).
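A compile-time sketch of the UINT_MAX comparison just mentioned; it only covers the common 16- and 32-bit cases and deliberately fails otherwise (the macro name UINT_VALUE_BITS is mine):

#include <limits.h>

#if UINT_MAX == 0xFFFFU
#define UINT_VALUE_BITS 16
#elif UINT_MAX == 0xFFFFFFFFU
#define UINT_VALUE_BITS 32
#else
/* add further branches (e.g. a 64-bit case) as needed */
#error "unsigned int has an unusual number of value bits here"
#endif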
edit:
C90 does not mention integer padding at all, but as far as I can tell, 'invisible' preceding or trailing integer padding bits shouldn't violate the standard (I didn't go through all relevant sections to make sure this is really the case, though); there probably are problems with mixed padding and value bits as mentioned in the C99 rationale, because otherwise the standard would not have needed to be changed.
As to the meaning of user-accessible: padding bits are accessible insofar as you can always get at any bit of foo (including padding) by using bit operations on ((unsigned char *)&foo)[…]. Be careful when modifying the padding bits, though: the result won't change the value of the integer, but might create a trap representation nevertheless. In the case of C90, this is implicitly unspecified (as in not mentioned at all); in the case of C99, it's implementation-defined.
This was not what the rationale quotation was about, though: the cited architecture represents 32-bit integers via two 16-bit integers. In the case of unsigned types, the resulting integer has 32 value bits and a precision of 32; in the case of signed integers, it only has a width of 31 bits and a precision of 30: one of the sign bits of the 16-bit integers is used as the sign bit of the 32-bit integer, the other one is ignored, thus creating a padding bit surrounded by value bits. Now, if you access a 32-bit signed integer as an unsigned integer (which is explicitly allowed and does not violate the C99 aliasing rules), the padding bit becomes a (user-accessible) value bit.

Are there machines, where sizeof(char) != 1, or at least CHAR_BIT > 8?

Are there machines (or compilers), where sizeof(char) != 1?
Does the C99 standard say that sizeof(char) on a standard-compliant implementation MUST be exactly 1? If it does, please give me the section number and citation.
Update:
If I have a machine (CPU) which can't address individual bytes (the minimal read is 4 bytes, aligned) but only groups of 4 bytes (uint32_t), can the compiler for this machine define sizeof(char) to be 4? (My understanding is that sizeof(char) will still be 1, but char will have 32 bits, per the CHAR_BIT macro.)
Update2:
But the sizeof result is NOT in BYTES! It is the size of CHAR. And char can be 2 bytes, or (maybe) 7 bits?
Update3:
Ok. All machines have sizeof(char) == 1. But what machines have CHAR_BIT > 8 ?
It is always one in C99, section 6.5.3.4:
When applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1.
Edit: not part of your question, but for interest, from Harbison and Steele's C: A Reference Manual, Third Edition, Prentice Hall, 1991 (pre-C99), p. 148:
A storage unit is taken to be the amount of storage occupied by one character; the size of an object of type char is therefore 1.
Edit: In answer to your updated question, the following question and answer from Harbison and Steele is relevant (ibid, Ex. 4 of Ch. 6):
Is it allowable to have a C implementation in which type char can represent values ranging from -2,147,483,648 through 2,147,483,647? If so, what would be sizeof(char) under that implementation? What would be the smallest and largest ranges of type int?
Answer (ibid, p. 382):
It is permitted (if wasteful) for an implementation to use 32 bits to represent type char. Regardless of the implementation, the value of sizeof(char) is always 1.
While this does not specifically address a case where, say, bytes are 8 bits and char is 4 of those bytes (actually impossible with the C99 definition, see below), the fact that sizeof(char) is always 1 is clear from the C99 standard and Harbison and Steele.
Edit: In fact (this is in response to your Update 2 question), as far as C99 is concerned sizeof(char) is in bytes, from section 6.5.3.4 again:
The sizeof operator yields the size (in bytes) of its operand
so combined with the quotation above, bytes of 8 bits with char as 4 of those bytes is impossible: for C99 a byte is the same as a char.
In answer to your mention of the possibility of a 7-bit char: this is not possible in C99. According to section 5.2.4.2.1 of the standard, the minimum is 8:
Their implementation-defined values shall be equal or greater [my emphasis] in magnitude to those shown, with the same sign.
— number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
— minimum value for an object of type signed char
SCHAR_MIN -127
— maximum value for an object of type signed char
SCHAR_MAX +127
— maximum value for an object of type unsigned char
UCHAR_MAX 255
— minimum value for an object of type char
CHAR_MIN see below
— maximum value for an object of type char
CHAR_MAX see below
[...]
If the value of an object of type char is treated as a signed integer when used in an expression, the value of CHAR_MIN shall be the same as that of SCHAR_MIN and the value of CHAR_MAX shall be the same as that of SCHAR_MAX. Otherwise, the value of CHAR_MIN shall be 0 and the value of CHAR_MAX shall be the same as that of UCHAR_MAX. The value UCHAR_MAX shall equal 2^CHAR_BIT − 1.
There are no machines where sizeof(char) is 4. It's always 1 byte. That byte might contain 32 bits, but as far as the C compiler is concerned, it's one byte. For more details, I'm actually going to point you at the C++ FAQ 26.6. That link covers it pretty well and I'm fairly certain C++ got all of those rules from C. You can also look at comp.lang.c FAQ 8.10 for characters larger than 8 bits.
Upd2: But the sizeof result is NOT in BYTES! It is the size of CHAR. And char can be 2 bytes, or (maybe) 7 bits?
Yes, it is bytes. Let me say it again. sizeof(char) is 1 byte according to the C compiler. What people colloquially call a byte (8 bits) is not necessarily the same as what the C compiler calls a byte. The number of bits in a C byte varies depending on your machine architecture. It's also guaranteed to be at least 8.
The PDP-10 and PDP-11 were.
Update: there are likely no C99 compilers for the PDP-10.
Some models of the Analog Devices 32-bit SHARC DSP have CHAR_BIT=32, and
Texas Instruments DSPs from the TMS320F28xx family reportedly have CHAR_BIT=16.
Update: There is GCC 3.2 for PDP-10 with CHAR_BIT=9
(check include/limits.h in that archive).
