Bitfields bigger than a long long?

Is it possible to declare a bitfield with a very large number of bits, e.g.
struct binfield {
    uber_int field : 991735910442856976773698036458045320070701875088740942522886681;
} wordlist;
Just to clarify, I'm not trying to represent that number in 256 bits; that's how many bits I want to use. Or maybe there aren't that many bits in my computer?

C does not support numeric data types of arbitrary size. You can only use the integer sizes provided by the compiler, and when you want your code to be portable, you had better stick to the minimum guaranteed sizes for the standardized types: char (8 bits), short and int (16 bits), long (32 bits), and long long (64 bits).
But what you can do instead is create a char[]. A char is always exactly one byte, and a byte is at least 8 bits (more than 8 only on some very exotic platforms). So you can use an array of char to store as many bit values as you can afford memory for. However, when you want to use a char array as a bitfield, you will need some boilerplate code to pick out the correct byte and bit.
For example, to get the value of bit n of a char array, use
bitfield[n/8] >> n%8 & 0x1
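For instance, a minimal sketch that wraps that expression in helper functions (the names get_bit/set_bit are just illustrative, not from any library):
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */

/* Treat an array of unsigned char as one flat bitfield. */
static int get_bit(const unsigned char *bitfield, size_t n)
{
    return (bitfield[n / CHAR_BIT] >> (n % CHAR_BIT)) & 0x1;
}

static void set_bit(unsigned char *bitfield, size_t n, int value)
{
    if (value)
        bitfield[n / CHAR_BIT] |= (unsigned char)(1u << (n % CHAR_BIT));
    else
        bitfield[n / CHAR_BIT] &= (unsigned char)~(1u << (n % CHAR_BIT));
}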

How to create a new type with custom bytes in C?

Is it possible to create a new type in C that uses the amount of bytes that I decide?
I know that an int takes 4 bytes, but I need to work with very small numbers, so allocating 4 bytes with malloc for every int is a bit of a waste. I was thinking of creating a new type for numbers that takes only 1 byte... if it's possible.
Is it possible to create a new type in C that uses the amount of bytes that I decide?
Yes, you can declare an array of char (or signed char or unsigned char) of any positive length you like, up to an implementation-dependent (but usually large) limit.
I know that an int takes 4 bytes,
You are mistaken: an int may take four bytes, but the standard does not require that. Its minimum required range can be represented with only two bytes, and some implementations indeed provide two-byte ints. That was more common historically than it is today. Also, implementations can make int larger than four bytes. That's rare as a default, but some compilers provide an option to produce that result.
but I need to work with very small numbers so allocating 4 bytes with malloc for every int is a bit of a waste, I was thinking of creating a new type for numbers that takes only 1 byte... if it's possible.
A one-byte number is a signed char or unsigned char. Technically, plain char also qualifies, but its signedness is implementation-defined, and as a matter of style, it is preferable to reserve its use to character data. Also, technically, char and its signed and unsigned variations may be larger than 8 bits, but you are unlikely ever to run into a C implementation where that is the case, and C anyway offers no smaller data type.
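As a rough sketch of that idea (the array name and count here are made up for illustration), you can allocate one byte per value with int8_t from <stdint.h> instead of a full int:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t count = 1000;                            /* hypothetical number of values */
    int8_t *small = malloc(count * sizeof *small);  /* one byte per number instead of four */
    if (small == NULL)
        return 1;
    small[0] = -5;                                  /* int8_t holds -128..127 */
    printf("%d\n", small[0]);
    free(small);
    return 0;
}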
You can just use char instead of int.
Or you can create a structure; it is the most commonly used custom data type in C.
For example:
struct customStructure {
    char c;
};
A byte type already exists: it is called char, and sizeof(char) == 1, so strictly speaking about data types, char is the smallest amount of memory you can manage through C.
However, if you are talking about bits, that doesn't mean a char is always 8 bits (bytes of 10 or 16 bits are common on DSPs). Given that, the number of bits in a char is given by CHAR_BIT in <limits.h>.
Strictly speaking, an int has at least 2 bytes, according to the C standard, and the actual width of a type is machine-dependent. If you want numeric types with a certain width, I suggest you take a look at those defined in <stdint.h>.
If you really want to use a type with, say, N bytes, you could use char small_numbers[N] and get your hands dirty with some bit twiddling, but the more practical solution would be to simply use the built-in types.
You can use the structures below for bytes and bits, respectively.
typedef struct data_type_bytes {
    unsigned char data;
} bytes;

typedef struct data_type_bits {
    unsigned int heightValidated : 1;
} bits;
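Note that a bit-field member still occupies at least one allocation unit of its declared type, so the two structs above usually differ in size. A quick, compiler-dependent check (a sketch, results will vary):
#include <stdio.h>

typedef struct data_type_bytes { unsigned char data; } bytes;
typedef struct data_type_bits  { unsigned int heightValidated : 1; } bits;

int main(void)
{
    printf("sizeof(bytes) = %zu\n", sizeof(bytes)); /* typically 1 */
    printf("sizeof(bits)  = %zu\n", sizeof(bits));  /* typically 4, the size of unsigned int */
    return 0;
}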

How to declare different size variables

Hi, I want to declare a 12-bit variable in C, or any "unconventional" size variable (a variable whose size is not of the form 2^n). How would I do that? I looked everywhere and I couldn't find anything. If that is not possible, how would you go about saving certain data in its own variable?
Use a bitfield:
struct {
    unsigned int twelve_bits : 12;
} wrapper;
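A small usage sketch: the member behaves like an unsigned integer that wraps modulo 2^12 (4096).
#include <stdio.h>

struct {
    unsigned int twelve_bits : 12;
} wrapper;

int main(void)
{
    wrapper.twelve_bits = 4095;  /* largest 12-bit value */
    wrapper.twelve_bits += 1;    /* wraps around to 0 */
    printf("%u\n", (unsigned)wrapper.twelve_bits);
    return 0;
}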
Unlike Ada, C has no way to specify types with a limited range of values. C relies on predefined types with implementation defined characteristics, but with certain guarantees:
Types short and int are guaranteed by the standard to hold at least 16 bits, you can use either one to hold your 12 bit values, signed or unsigned.
Similarly, type long is guaranteed to hold at least 32 bits and type long long at least 64 bits. Choose the type that is large enough for your purpose.
Types int8_t, int16_t, int32_t, int64_t and their unsigned counterparts defined in <stdint.h> have more precise semantics but might not be available on all systems. Types int_least8_t, int_least16_t, int_least32_t and int_least64_t are guaranteed to be available, as well as similar int_fastXX_t types, but they are not used very often, probably because the names are somewhat cumbersome.
Finally, you can use bit-fields for widths from 1 bit up to the width of their declared type (the standard only requires support up to the width of int, though most compilers also accept 64-bit types), but these are only available as struct or union members. Bit-fields of size one should be declared unsigned, since a 1-bit signed field can only hold 0 and -1.
Data is stored in groups of bytes (8 bits each on virtually all modern machines).
In C, variables can be declared of 1 byte (a char, 8 bits), 2 bytes (a short int, 16 bits on many computers), and 4 bytes (a long int, 32 bits on many computers).
On a more advanced level, you are looking for "bitfields".
See this perhaps: bitfield discussion

How to know the size of primitive types in C in different architectures before programming

We have two kinds of remote systems in our university; we can connect to them remotely and work. I wrote a C program on one of the systems, where the size of a void pointer and of size_t is 8 bytes. But when I connected to the other system, my program started working differently. I wasted a lot of time debugging before I finally found that it was happening because of architecture differences between the two systems.
My questions are:
On what factors does the size of primitive types depend?
How to know the size of primitive types before we start programming?
How to write cross platform code in C?
Question:
On what factors does the size of primitive types depend?
The CPU and the compiler.
Question:
How to know the size of primitive types before we start programming?
You can't. However, you can write a small program to get the sizes of the primitive types.
#include <stdio.h>

int main(void)
{
    printf("Size of short:     %zu\n", sizeof(short));
    printf("Size of int:       %zu\n", sizeof(int));
    printf("Size of long:      %zu\n", sizeof(long));
    printf("Size of long long: %zu\n", sizeof(long long));
    printf("Size of size_t:    %zu\n", sizeof(size_t));
    printf("Size of void*:     %zu\n", sizeof(void*));
    printf("Size of float:     %zu\n", sizeof(float));
    printf("Size of double:    %zu\n", sizeof(double));
    return 0;
}
Question:
How to write cross platform code in C?
Minimize code that depends on the sizes of primitive types.
When exchanging data between platforms, use text files for persistent data as much as possible.
In general, the natural integer size on a processor depends on how many bits the ALU can operate on in a single cycle, though the C standard still requires int to be at least 16 bits.
For example:
i) On the 8051 architecture the data bus is only 8 bits wide, so 8051 compilers implement int as 16 bits (the standard minimum) using multiple 8-bit operations.
ii) On the 32-bit ARM architecture the data bus is 32 bits wide, and int is 32 bits.
You should always refer to the compiler documentation for the correct sizes of data types.
Almost all compilers declare their name/version as a predefined macro; you can use these in your header file like this:
#ifdef COMPILER_1
    typedef char S8;
    typedef int  S16;
    typedef long S32;
    /* ... */
#elif defined(COMPILER_2)
    typedef int  S32;
    typedef long S64;
    /* ... */
#endif
Then in your code you can declare variables like
S32 Var1;
How to write cross platform code in C?
If you need to marshal data in a platform-independent way (e.g. on a filesystem, or over a network), you should be consistent in (at least) these things:
Datatype sizes - Rely on the types from <stdint.h>. For example, if you need a two-byte unsigned integer, use uint16_t.
Datatype alignment/padding - Be aware of how members in a struct are packed/padded. The default alignment of a member may change from one system to another, which means a member may be at different byte offsets, depending on the compiler. When marshalling data, use __attribute__((packed)) (on GCC), or similar.
Byte order - Multi-byte integers can be stored with their bytes in either order: little-endian systems store the least-significant byte at the lowest address/offset, while big-endian systems start with the most-significant. Luckily, everyone has agreed that bytes are sent big-endian over the network. For this, we use htons/ntohs (and htonl/ntohl for 32-bit values) to convert byte order when sending/receiving multi-byte integers over network connections; see the sketch below.
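To tie those three points together, here is a rough sketch (the message layout, field names, and encode_msg are invented for illustration): fixed-width types, a packed struct, and host-to-network byte-order conversion before the bytes go out.
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htons/htonl on POSIX; Windows uses <winsock2.h> */

struct __attribute__((packed)) wire_msg {  /* GCC/Clang packing; MSVC uses #pragma pack */
    uint16_t type;
    uint32_t length;
};

/* Fill a buffer with a platform-independent representation of the message. */
size_t encode_msg(uint8_t *buf, uint16_t type, uint32_t length)
{
    struct wire_msg m;
    m.type   = htons(type);    /* 16-bit host-to-network (big-endian) */
    m.length = htonl(length);  /* 32-bit host-to-network (big-endian) */
    memcpy(buf, &m, sizeof m);
    return sizeof m;           /* 6 bytes thanks to the packed layout */
}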
Question:
On what factors does the size of primitive types depend, and how to know the size of primitive types before we start programming?
Short Answer:
The CPU and the compiler.
Long Answer
To understand primitive types, one has to look at the two groups they fall into:
1. Integer Types
The integer data types range in size from at least 8 bits to at least 32 bits. The C99 standard extends this range to include integer sizes of at least 64 bits. The sizes and ranges listed for these types are minimums; depending on your computer platform, these sizes and ranges may be larger.
signed char : 8-bit integer values in the range of −128 to 127.
unsigned char : 8-bit integer values in the range of 0 to 255.
char : Depending on your system, the char data type is defined as having the same range as either the signed char or the unsigned char data type
short int : 16-bit integer values in the range of −32,768 to 32,767
unsigned short int : 16-bit integer values in the range of 0 to 65,535
int : usually 32-bit integer values in the range of −2,147,483,648 to 2,147,483,647 (the standard only requires 16 bits, but 32 is the norm on modern platforms)
long int : 32-bit integer values in the range of at least −2,147,483,648 to 2,147,483,647 (depending on your system, this data type might be 64-bit)
unsigned long int : 32-bit integer values in the range of at least 0 to 4,294,967,295 (depending on your system, this data type might be 64-bit)
long long int : 64-bit Integer values in the range of −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. (This type is not part of C89, but is both part of C99 and a GNU C extension. )
unsigned long long int: 64-bit integer values in the range of at least 0 to 18,446,744,073,709,551,615 (This type is not part of C89, but is both part of C99 and a GNU C extension. )
2. Real Number Types
float : the float data type is the smallest of the three floating-point types, if they differ in size at all. Its minimum positive value is stored in FLT_MIN, which should be no greater than 1e-37. Its maximum value is stored in FLT_MAX, and should be no less than 1e37.
double : the double data type is at least as large as the float type. Its minimum positive value is stored in DBL_MIN, and its maximum value is stored in DBL_MAX.
long double : the long double data type is at least as large as the double type, and it may be larger. Its minimum positive value is stored in LDBL_MIN, and its maximum value in LDBL_MAX. (These macros come from <float.h>; the short program after this list prints them.)
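For instance, a minimal sketch that prints those limits on your own platform:
#include <stdio.h>
#include <float.h>

int main(void)
{
    printf("float:       %e .. %e\n", FLT_MIN, FLT_MAX);
    printf("double:      %e .. %e\n", DBL_MIN, DBL_MAX);
    printf("long double: %Le .. %Le\n", LDBL_MIN, LDBL_MAX);
    return 0;
}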
Question:
How to write cross platform code in C?
Cross-platform code needs to do a few things:
Use standard C types, not platform-specific types
Use only built-in #ifdef compiler flags, do not invent your own
Try to use reusable, cross-platform "base" libraries to hide platform-specific code
Don't use 3rd-party "Application Frameworks" or "Runtime Environments"
Jonathan Leffler basically covered this in his comments, but you technically cannot know how large primitive types will be across systems/architectures. If a system follows the C standard, then you know you will have a minimum number of bytes for every variable type, but you may be given more than that value.
For example, if I am writing a program that uses a signed long, I can reliably know that I will be given at least 4 bytes and that I can store numbers up to 2,147,483,647; however, some systems could give me more than 4 bytes.
Unfortunately, a developer cannot know ahead of time (without testing) how many bytes a system will return, and thus good code should be dynamic enough to account for this.
The exceptions to this rule are int8_t, int16_t, int32_t, int64_t and their unsigned counterparts (uintN_t). With these variable types you are guaranteed exactly N bits - no more, no less.
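If your code nevertheless relies on a particular size, you can make the assumption explicit so the build fails where it does not hold. A sketch using C11's _Static_assert (the checked assumptions and the item_id alias are just examples):
#include <limits.h>
#include <stdint.h>

/* Fail at compile time if the platform breaks our assumptions. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
_Static_assert(sizeof(void *) == 8, "this code assumes 64-bit pointers");

typedef uint16_t item_id;   /* exactly 16 bits wherever uint16_t exists */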
C has a standard sizeof() operator.
I ask this question of every developer I hire.
My guess is you are doing something like this.
struct A {
    int X;   // 2 or 4 or 8 bytes
    short Y; // 2 bytes
};
On a 32-bit computer you get a structure that is 48 bits: 32 for the int, 16 for the short.
On a 64-bit computer you get a structure that is 80 bits: 64 for the int, 16 for the short.
(Yes, I know, all kinds of esoteric stuff might happen here, but the goal is solve the problem, not to confuse the questioner.)
The problem comes about when you try to use this struct on one machine to read what was written by the other.
You need a structure that will marshal correctly.
struct A {
    long X;  // 4 bytes
    short Y; // 2 bytes
};
Now both sides will read and write the data correctly in most cases, unless you have monkeyed with the flags.
If you are sending stuff across the wire, you must use explicitly sized types such as char, short, and long rather than plain int. If you are not, then you can use int as int and let the compiler figure it out.
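Following the same idea with the <stdint.h> types removes the remaining guesswork about member sizes (a sketch, not the original poster's code); note that alignment padding can still differ, so for on-disk or on-wire formats it is safest to serialize field by field.
#include <stdint.h>

struct A {
    int32_t X;   /* exactly 4 bytes wherever int32_t is provided */
    int16_t Y;   /* exactly 2 bytes; the struct may still carry trailing padding */
};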

What are the byte sizes of basic data types in C

I have a pretty basic question to which I am not able to find concrete answers.
By default, what are the sizes of an int, short, and long in C? When I say int a, is it signed by default?
Also, what is the size of the unsigned versions of the same, i.e. unsigned int, unsigned short, etc.?
I am using Mac OS X and Xcode to compile/run. I tried doing sizeof(int) and it returns 4 bytes for both int and unsigned int. What is the size difference between signed and unsigned?
It's always platform and implementation dependent. Sometimes some types are the same size on one implementation and not on another.
Welcome to step one in developing carefully.
Signed and unsigned values have the same size. In the signed form the most significant bit is the sign, while in the unsigned form the extra bit allows the type to hold greater values. For example, a signed type of 32 bits can hold a value in −(2^31) to 2^31 − 1, and an unsigned type of 32 bits can hold a value of 0 to 2^32 − 1.
The size of each type is system dependent. If you need a variable to be of a specific size, you can use the types from stdint.h, so int32_t will always have 32 bits, for example.
Signed and unsigned ints are the same size. The upper bit is the sign bit.
int is signed by default.
shorts are typically 2 bytes, ints are typically 4, and longs are usually 4 or 8.
All of these are platform dependent and you should use sizeof() to discover them.
Most systems today, including those Intel ones, use two's complement for representing signed integers.
Therefore usually you get sizeof(unsigned int) == sizeof(signed int).
The number of bits for a given type is only loosely defined by the specification and is therefore mostly compiler/platform dependent. Sometimes an int might be 16 bits, other times 32 bits. Read the documentation of your compiler for more information (e.g. gcc), and keep that in mind when writing portable code.
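To see the ranges rather than just the sizes, <limits.h> provides the actual bounds for your platform; for example:
#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("sizeof(int) = %zu, sizeof(unsigned int) = %zu\n",
           sizeof(int), sizeof(unsigned int));            /* always equal */
    printf("int range:          %d .. %d\n", INT_MIN, INT_MAX);
    printf("unsigned int range: 0 .. %u\n", UINT_MAX);
    return 0;
}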

What is this C syntax?

I have no idea what to call it, so I have no idea how to search for it.
unsigned int odd : 1;
Edit:
To elaborate, it comes from this snippet:
struct bitField {
    unsigned int odd : 1;
    unsigned int padding : 15; // to round out to 16 bits
};
I gather this involves bits, but I'm still not all the way understanding.
They are bit-fields. odd and padding will be packed into one unsigned int, where odd typically occupies the lowest bit and padding the next 15 bits (the exact bit layout is implementation-defined).
It's a bitfield - Check the C FAQ.
It's:
1 bit for "odd" (e.g. 1)
15 bits for "padding" (e.g. 000000000000000), giving the 16-bit pattern 0000000000000001
and (potentially) whatever other bits round out the unsigned int. On modern platforms where unsigned int is 32 bits, you'll see another 16 zero bits in memory, even though they are not declared in the struct. (In this case sizeof returns 4.)
Bitfields can save memory but potentially add instructions to computations. In some cases compilers may ignore your bitfield settings. You can't make any assumptions about how the compiler will choose to actually lay out your bit field, and it can depend on the endianness of your platform.
The main thing I use bitfields for is when I know I will be doing a lot of copying of the data, and not necessarily a lot of computation on or reference of the specific fields in the bit field.
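A small sketch to observe the points above with your own compiler (the exact bit layout in memory remains implementation-defined):
#include <stdio.h>

struct bitField {
    unsigned int odd : 1;
    unsigned int padding : 15; // to round out to 16 bits
};

int main(void)
{
    struct bitField b = { .odd = 1, .padding = 0 };
    printf("sizeof(struct bitField) = %zu\n", sizeof b); /* often 4: the size of unsigned int */
    printf("odd = %u\n", (unsigned)b.odd);
    return 0;
}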
