C Programming - Size of 2U and 1024U

I know that the U literal in C means that the value is an unsigned integer. An unsigned integer's size is 4 bytes.
But how big are 2U or 1024U? Does this simply mean 2 * 4 bytes = 8 bytes, for example, or does this notation just mean that 2 (or 1024) are unsigned integers?
My goal is to figure out how much memory will be allocated if I call malloc like this:
int *allocated_mem = malloc(2U * 1024U);
To prove my answer in a short program, I tried this:
printf("Size of 2U: %zu\n", sizeof(2U));
printf("Size of 1024U: %zu\n", sizeof(1024U));
I would have expected the first line to print a size of 2 * 4 bytes = 8 and the second 1024 * 4 bytes = 4096, but the output is always "4".
I would really appreciate an explanation of what 2U and 1024U mean exactly, and how I can check their size in C.

My goal is to figure out how much memory will be allocated if I call malloc like this: int *allocated_mem = malloc(2U * 1024U);
What is difficult about 2 * 1024 == 2048? The fact that they are unsigned literals does not change their value.
An unsigned integer's size is 4 bytes.
You are correct. So 2U takes up 4 bytes, and 1024U takes up 4 bytes, because they are both unsigned integers.
I would have expected the first line to print a size of 2 * 4 bytes = 8 and the second 1024 * 4 bytes = 4096, but the output is always "4".
Why would the value change the size? The size depends only on the type. 2U is of type unsigned int, so it takes up 4 bytes; same as 50U, same as 1024U. They all take 4 bytes.
You are trying to multiply the value (2) by the size of the type. That makes no sense.

How big?
2U and 1024U are the same size: the size of an unsigned int, commonly 32 bits or 4 "bytes". The size of a type is the same throughout a given platform; it does not change with the value.
"I know that the U literal means in c, that the value is a unsigned integer." --> OK, close enough so far.
"An unsigned integers size is 4 bytes.". Reasonable guess yet C requires that unsigned are at least 16-bits. Further, the U makes the constant unsigned, yet that could be unsigned, unsigned long, unsigned long long, depending on the value and platform.
Detail: in C, 2U is not a literal, but a constant. C has string literals and compound literals. The literals can have their address taken, but &2U is not valid C. Other languages call 2U a literal, and have their rules on how it can be used.
My goal is to figure out how much memory will be allocated if I call malloc like this: int *allocated_mem = malloc(2U * 1024U);
Instead, it is better to use size_t than unsigned for sizing, and to check the allocation:
size_t sz = 2U * 1024U;
int *allocated_mem = malloc(sz);
if (allocated_mem == NULL) sz = 0;
printf("Allocation size %zu\n", sz);
(Aside) Be careful with computed sizes. Do your size math with size_t types. 4U * 1024U * 1024U * 1024U could overflow unsigned math, yet may compute as desired with size_t:
size_t sz = (size_t)4 * 1024 * 1024 * 1024;
The following prints the size of the constants, which is likely 32 bits or 4 "bytes", not their values.
printf("Size of 1024U: %zu\n", sizeof(1024U));
printf("Size of 2U: %zu\n", sizeof(2U));

Related

How many chars do I need to print a size_t with sprintf?

I want to do something like this:
char sLength[SIZE_T_LEN];
sprintf(sLength, "%zu", strlen(sSomeString));
That is, I want to print in a buffer a value of type size_t. The question is: what should be the value of SIZE_T_LEN? The point is to know the minimum size required in order to be sure that an overflow will never occur.
sizeof(size_t) * CHAR_BIT (from limits.h) gives the number of bits. For decimal output, use a (rough) lower estimate of 3 bits per digit (4 for hex) to be on the safe side. Don't forget to add 1 for the nul terminator of the string.
So:
#include <limits.h>
#define SIZE_T_LEN ( (sizeof(size_t) * CHAR_BIT + 2) / 3 + 1 )
This yields a value of type size_t itself. On typical 8/16/32/64-bit platforms the + 2 is not required (the minimum possible size of size_t is 16 bits); there the rounding error is already large enough to yield a correctly truncated result.
Note that this gives an upper bound and is fully portable thanks to the use of CHAR_BIT. To get an exact value, you have to use log10(SIZE_MAX) (see JohnBollinger's answer for this). But that yields a float and might be calculated at run-time, while the version above is compile-time evaluated (and the log10 call likely costs more than the few bytes of stack the rough estimate wastes). Unless you have a very RAM-constrained system, that is ok. And on such a system, you should refrain from using stdio anyway.
To be absolutely on the safe side, you might want to use snprintf (but that is not necessary).
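For example, a small sketch combining the macro above with snprintf ("some string" is just a placeholder input):
#include <limits.h>
#include <stdio.h>
#include <string.h>

#define SIZE_T_LEN ( (sizeof(size_t) * CHAR_BIT + 2) / 3 + 1 )

int main(void)
{
    char sLength[SIZE_T_LEN];
    // snprintf truncates instead of overflowing, should the estimate ever be off.
    snprintf(sLength, sizeof sLength, "%zu", strlen("some string"));
    printf("%s\n", sLength);
    return 0;
}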
The exact answer, accounting for space for a string terminator, would be
log10(SIZE_MAX) + 2
where the SIZE_MAX macro is declared in stdint.h. Unfortunately, that's not a compile-time constant (on account of the use of log10()). If you need a constant computed at compile time, then you could use this:
sizeof(size_t) * CHAR_BIT * 3 / 10 + 2
That gives the correct answer for size_t up to at least 256 bits. It's based on the fact that 2^10 is pretty close to 1000. At some large number of bits it will be too small by one, and for much larger numbers of bits it will fall further behind. If you're worried about such large size_t then you could add one or even two to the result.
If you can use snprintf, use:
int len = snprintf (NULL, 0, "%zu", strlen (sSomeString));
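The returned length can then be used to allocate an exactly-sized buffer. A sketch of the full pattern:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t value = strlen("some string"); // placeholder input

    // First call measures the required length without writing anything.
    int len = snprintf(NULL, 0, "%zu", value);
    if (len < 0)
        return 1;

    char *buf = malloc((size_t)len + 1); // +1 for the terminating nul
    if (buf != NULL) {
        snprintf(buf, (size_t)len + 1, "%zu", value);
        printf("%s\n", buf);
        free(buf);
    }
    return 0;
}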
On systems where a char is represented using 8 bits,
If sizeof(size_t) is 2, then, the maximum value is: 65535
If sizeof(size_t) is 4, then, the maximum value is: 4294967295
If sizeof(size_t) is 8, then, the maximum value is: 18446744073709551615
If sizeof(size_t) is 16, then, the maximum value is: 340282366920938463463374607431768211455 (about 3.4028237e+38)
You can use that information to select the maximum size of the string at compile time.
Sample program:
#include <stdio.h>
#ifdef MAX_STRING_SIZE
#undef MAX_STRING_SIZE
#endif
// MAX_STRING_SIZE is 6 when sizeof(size_t) is 2
// MAX_STRING_SIZE is 11 when sizeof(size_t) is 4
// MAX_STRING_SIZE is 21 when sizeof(size_t) is 8
// MAX_STRING_SIZE is 40 when sizeof(size_t) is 16
// MAX_STRING_SIZE is -1 for all else. It will be an error to use it
// as the size of an array.
#define MAX_STRING_SIZE (sizeof(size_t) == 2 ? 6 : sizeof(size_t) == 4 ? 11 : sizeof(size_t) == 8 ? 21 : sizeof(size_t) == 16 ? 40 : -1)
int main()
{
    char str[MAX_STRING_SIZE];
    size_t a = 0xFFFFFFFF;
    sprintf(str, "%zu", a);
    printf("%s\n", str);
    a = 0xFFFFFFFFFFFFFFFF;
    sprintf(str, "%zu", a);
    printf("%s\n", str);
}
Output:
4294967295
18446744073709551615
It will be easy to adapt it to systems where a char is represented using 16 bits.

Buffer size for converting unsigned long to string

In reference to the question and the answer linked here: can I use this method so that the solution will be platform independent?
char *buff = (char*) malloc(sizeof(unsigned long)*8);
sprintf(buff, "%lu", unsigned_long_variable);
Here I am deriving the buffer length from the size of the unsigned long variable. Is this approach correct?
Don't even try to calculate the buffer size.
Start with snprintf, which will tell you safely how many characters are needed. Then you know how many bytes to allocate to print safely.
Since this is a few lines of code that you don't want to repeat again and again, write a function malloc_printf that does exactly what you want: in that function, call snprintf with a NULL destination, then malloc the buffer, sprintf into the malloc'd buffer, and return it. To make it faster, and to often avoid the two snprintf and sprintf calls, you can write into a buffer of 256 chars first, which is often enough.
So your final code would be
char* buff = malloc_printf ("%lu", unsigned_long_variable);
It also does quick, safe and easy string concatenation using the format %s%s, for example.
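A minimal sketch of such a malloc_printf (the 256-char fast path mentioned above is omitted for brevity; the caller is responsible for freeing the returned buffer):
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

char *malloc_printf(const char *fmt, ...)
{
    va_list args;

    // First pass: measure the formatted length.
    va_start(args, fmt);
    int len = vsnprintf(NULL, 0, fmt, args);
    va_end(args);
    if (len < 0)
        return NULL;

    char *buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return NULL;

    // Second pass: actually format into the buffer.
    va_start(args, fmt);
    vsnprintf(buf, (size_t)len + 1, fmt, args);
    va_end(args);
    return buf;
}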
You want to know how many characters are needed to represent the largest possible unsigned long. Correct?
To that end, you are trying to calculate the largest possible unsigned long:
sizeof(unsigned long)*8
That is faulty in several ways. For one, sizeof returns multiples of char, which need not be 8 bits; you should multiply by CHAR_BIT (from <limits.h>) instead. But even that is not necessary, because that very same header already provides the largest possible value -- ULONG_MAX.
Then you're making a mistake: your calculation gives the size of the integer representation of unsigned long in bits. What you want is the size of the string representation in characters. This can be obtained with the log10() function (from <math.h>):
log10( ULONG_MAX )
This gives you a double value that indicates the number of (decimal) digits in ULONG_MAX. It will be a fraction, which you need to round up (1) (ceil() does this for you).
Thus:
#include <math.h>
#include <stdlib.h>
#include <limits.h>
int main(void)
{
    char * buff = malloc( ceil( log10( ULONG_MAX ) ) + 1 );
    //...
}
All in all, this is quite dodgy (I made two mistakes while writing this out, shame on me; if you make mistakes when using this, shame on you). And it requires the math library for something that snprintf( NULL, ... ) can do for you more easily, as indicated by the Q&A you linked to.
(1): log10( 9999 ) gives 3.9999565... for the four-digit number.
The C standard doesn't put an upper limit on the number of bits per char.
If someone constructs a C compiler that uses, for example, 2000 bits per char, the output can overflow the buffer.
Instead of 8 you should use CHAR_BIT from limits.h.
Also, note that you need (slightly less than) 1 char per 3 bits, and you need 1 byte for the string terminator.
So, something like this:
#include <limits.h>
char *buff = malloc(1 + (sizeof(unsigned long) * CHAR_BIT + 2) / 3);
sprintf(buff, "%lu", unsigned_long_variable);
No, this is not the right way to calculate the buffer size.
E.g. for 4 byte unsigned longs you have values up to 2^32-1
which means 10 decimal digits. So your buffer needs 11 chars.
You are allocating 4 * 8 = 32.
The correct formula is
ceil(log10(2^(sizeof(unsigned long) * CHAR_BIT) - 1)) + 1
(log10 denotes the decimal logarithm here)
A good (safe) estimation is:
(sizeof(unsigned long) * CHAR_BIT + 2) / 3 + 1
because log10(2) is less than 0.33.
Short answer:
#define INTEGER_STRING_SIZE(t) (sizeof (t) * CHAR_BIT / 3 + 3)
unsigned long x;
char buf[INTEGER_STRING_SIZE(x)];
int len = snprintf(buf, sizeof buf, "%lu", x);
if (len < 0 || len >= sizeof buf) Handle_UnexpectedOutput();
OP's use of sizeof(unsigned long)*8 is weak. On systems where CHAR_BIT (the number of bits per char) is large (it must be at least 8), sizeof(unsigned long) could be 1. 1*8 chars is certainly too small for 4294967295 (the minimum value for ULONG_MAX).
Concerning sprintf()/snprintf(): given locale issues, in theory code may print additional characters, like 4,294,967,295, and so exceed the anticipated buffer. Unless very tight memory constraints apply, a buffer 2x the anticipated size is recommended.
char buf[ULONG_STRING_SIZE * 2]; // 2x
int len = snprintf(buf, sizeof buf, "%lu", x);
The expected maximum string width of printing some unsigned integer is ceil(log10(unsigned_MAX)) + 1. In the case of unsigned long, the value of ULONG_MAX certainly does not exceed pow(2, sizeof(unsigned long) * CHAR_BIT) - 1, so code could use:
#define LOG10_2 0.30102999566398119521373889472449
#define ULONG_STRING_SIZE (sizeof (unsigned long) * CHAR_BIT * LOG10_2 + 2)
// For greater portability, should use integer math.
#define ULONG_STRING_SIZE (sizeof (unsigned long) * CHAR_BIT / 3 + 2)
// or more precisely
#define ULONG_STRING_SIZE (sizeof (unsigned long) * CHAR_BIT * 28/93 + 2)
The short answer used +3 in case a signed integer was specified.

Data stored with pointers

void *memory;
unsigned int b=65535; //1111 1111 1111 1111 in binary
int i=0;
memory= &b;
for(i=0;i<100;i++){
    printf("%d, %d\n", (char*)memory+i, *((unsigned int * )((char *) memory + i)));
}
I am trying to understand one thing.
(char*)memory+i prints out addresses in the range 2686636 - 2686735,
and when I store 65535 with memory = &b, this should store the number at addresses 2686636 and 2686637,
because every address is just one byte, so 8 binary digits. So when I print out
*((unsigned int * )((char *) memory + i)), this should print 2686636, 255 and 2686637, 255;
instead it prints 2686636, 65535 and 2686637, some random number.
I am trying to implement memory allocation. It is a school project. This should represent memory. One address should be one byte, so the header will be 2686636-2686639 (4 bytes for the size of the block) and 2686640 (1 byte char as the free-or-used memory flag). Can someone explain it to me? Thanks.
Thanks for the answers.
void *memory;
void *abc;
abc=memory;
for(i=0;i<100;i++){
    *(int*)abc=0;
    abc++;
}
*(int*)memory=16777215;
for(i=0;i<100;i++){
    printf("%p, %c, %d\n", (char*)memory+i, *((char *)memory +i), *((char *)memory +i));
}
output is
0028FF94,  , -1
0028FF95,  , -1
0028FF96,  , -1
0028FF97, , 0
0028FF98, , 0
0028FF99, , 0
0028FF9A, , 0
0028FF9B, , 0
I think it works: 255 gives -1 once, 65535 gives -1 twice, and 16777215 gives -1 three times.
In your program it seems that the address of b is 2686636. When you write (char*)memory+i or (char*)&b+i, the pointer points to char, so adding one to it advances by only one memory address, i.e. 2686637, and so on up to 2686735 (i.e. (char*)2686636+99).
Now when you dereference, i.e. *((unsigned int * )((char *) memory + i)), you get the value at that memory address, but you have only given a value to b (whose address is 2686636). All the other memory addresses hold garbage values, which is what you are printing.
So first you have to store some data at the rest of the addresses (2686637 to 2686735).
Good luck, I hope this helps.
I did not mention this in my comments yesterday, but it is obvious that your for loop from 0 to 100 overruns the size of an unsigned integer.
I simply ignored some of the obvious issues in the code and tried to give hints on the actual question you asked (difficult to do more than that on a phone :-)). Unfortunately I did not have time to complete this yesterday. So, with one day's delay, my hints for you.
Try to avoid making assumptions about how big a certain type is (like 2 bytes or 4 bytes). Even if your assumption holds true now, it might change if you switch the compiler or move to another platform. So use sizeof(type) consistently throughout the code. For a longer discussion you might want to take a look at: size of int, long a.s.o. on Stack Overflow. The standard mandates only the ranges a certain type must be able to hold (0-65535 for unsigned int), so only a minimal size for each type. This means that the size of int might be (and typically is) bigger than 2 bytes. Beyond primitive types, sizeof also helps you compute the size of structures, where due to memory alignment and packing the size might differ from what you would "expect" by simply looking at its members (see the small sketch after these hints). So the sizeof operator is your friend.
Make sure you use the correct formatting in printf.
Be careful with pointer arithmetic and casting, since the result depends on the type of the pointer (and obviously on the value of the integer you add):
I.e.
(unsigned int*)memory + 1 != (unsigned char*)memory + 1
(unsigned int*)memory + 1 == (unsigned char*)memory + 1 * sizeof(unsigned int)
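As promised, a quick illustration of the alignment point (a minimal sketch; the exact numbers vary by platform and compiler):
#include <stdio.h>

struct example {
    char c;  // 1 byte
    int  n;  // typically 4 bytes, aligned to a 4-byte boundary
};

int main(void)
{
    // Commonly prints 8, not 5: three padding bytes follow 'c'.
    printf("sizeof(struct example) = %zu\n", sizeof(struct example));
    return 0;
}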
Below is how I would write the code:
//check how big is int on our platform for illustrative purposes
printf("Sizeof int: %d bytes\n", sizeof(unsigned int));
//we initialize b with maximum representable value for unsigned int
//include <limits.h> for UINT_MAX
unsigned int b = UINT_MAX; //0xffffffff (if sizeof(unsigned int) is 4)
//we print out the value and its hexadecimal representation
printf("B=%u 0x%X\n", b, b);
//we take the address of b and store it in a void pointer
void* memory= &b;
int i = 0;
//we loop the unsigned chars starting at the address of b up to the sizeof(b)
//(in our case b is unsigned int) using sizeof(b) is better since if we change the type of b
//we do not have to remember to change the sizeof in the for loop. The loop works just the same
for(i=0; i<sizeof(b); ++i)
{
    //here we kept %d for formatting the individual bytes to represent their value as numbers
    //we cast to unsigned char since char might be signed (so from -128 to 127) on a particular
    //platform and we want to illustrate that the expected (all bytes 1 -> printed value 255) occurs.
    printf("%p, %d\n", (unsigned char *)memory + i, *((unsigned char *) memory + i));
}
I hope you will find this helpful. And good luck with your school assignment; I hope you learned something you can use now and in the future :-).

Blowfish floating point

I have some floating-point data, e.g. "3.9389005e-01", which I want to encrypt with Blowfish.
strtod is used to get a float from the string.
But Blowfish only encrypts integers.
So my plan is to encrypt each FP value as two integers, a mantissa and an exponent, then store them as two separate encrypted values.
Decryption will return the mantissa and exponent, enabling reconstitution of the original FP number.
Is there a neater solution?
Some demo code for blowfish is here.
You can access any pointer as any other pointer. The data may not be meaningful as another data type, but it's possible:
double value = 123.456;
int *ptr = (int *) &value;
Now you have a pointer to a memory area of sizeof(double) bytes (or sizeof(double) / sizeof(int) integers) that can be encrypted.
To get back the double after decryption, you can do e.g.:
double new_value = *((double *) ptr);
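Note that casting a double* to int* formally breaks C's strict-aliasing rules. A portable alternative achieving the same effect is to memcpy the object's bytes into a separate buffer, encrypt that, and memcpy back after decryption; a minimal sketch:
#include <stdio.h>
#include <string.h>

int main(void)
{
    double value = 123.456;
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value); // copy out the raw bytes
    // ... encrypt bytes here, store/transmit, then decrypt ...

    double new_value;
    memcpy(&new_value, bytes, sizeof new_value); // reconstitute the double
    printf("%f\n", new_value);
    return 0;
}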
Declare a union between the double you need to encrypt and a uint8_t array of the same size.
To encrypt, fill in the double and encrypt the bytes. To decrypt, decrypt the bytes and read off the double.
This approach can be extended to any non-byte data type, provided the cipher you use outputs same-size messages.
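A minimal sketch of that union (encrypt_bytes/decrypt_bytes are placeholders for whatever Blowfish routine you use, not a real API, so they are left commented out):
#include <stdint.h>
#include <stdio.h>

union fp_bytes {
    double value;
    uint8_t bytes[sizeof(double)]; // typically 8 bytes, one Blowfish block
};

int main(void)
{
    union fp_bytes u;
    u.value = 3.9389005e-01;
    // encrypt_bytes(u.bytes, sizeof u.bytes); // cipher call goes here
    // ... store or transmit u.bytes ...
    // decrypt_bytes(u.bytes, sizeof u.bytes);
    printf("%.7e\n", u.value);
    return 0;
}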
In case of padding, the faster approach of "get a uint8_t * to whatever your data is" will sometimes work, sometimes not; AES-256 will work with a double of size 8, but block implementations are liable to crash or corrupt data when working with a float of size 4 (they will attempt to read 8 bytes, and write 8 bytes, where only four are actually available). Due to platform and compiler quirks, this may *still* work, because after the float there might be some "memory padding" available.
To be safe, if for example the cipher is padded to 256 bits (32 bytes), you will have to fix the length of the byte array to be likewise padded. One not-so-clean way of doing this is to increase the byte count by a whole padding count:
#include <stdio.h>

typedef struct {
    double a;
    long v;
    // Whatever else.
    // ...
} payload_t;

union {
    payload_t payload;
    unsigned char bytes[sizeof(payload_t)+32]; // 256 bits of margin
} data;

int main(void)
{
    data.payload.a = 3.14159;
    data.payload.v = 123456789;
    // ...
    // Encrypt data.bytes, for a length integer multiple of 32 bytes
    size_t length = ((sizeof(payload_t)+31)/32)*32;
    printf("%zu bytes to encrypt\n", length);
    return 0;
}
Should code want to convert the FP value to a string, then encrypt the string, and later get the exact FP value back, the FP number needs to be converted with sufficient precision. Use "%.*e" and DBL_DECIMAL_DIG (or DBL_DIG + 3 if unavailable).
#include <float.h>
// sign digit . fraction e sign expo \0 CYA
#define FP_BUF_SIZE (1+1+1+ (DBL_DECIMAL_DIG-1) + 1 + 1 + 4 + 1 + 10)
double x;
char buf[FP_BUF_SIZE];
sprintf(buf, "%.*e", DBL_DECIMAL_DIG - 1, x);
Encode(buf);
Ref: Printf width specifier to maintain precision of floating-point value
Alternatively, code could use sprintf(buf, "%a", x);
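For reference, a small sketch of that "%a" round trip; the hexadecimal format is exact, so strtod recovers the identical value:
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    double x = 3.9389005e-01;
    char buf[64]; // ample room for the %a form of a double

    sprintf(buf, "%a", x);        // e.g. "0x1.935...p-2"
    double y = strtod(buf, NULL); // parses the hex form exactly

    printf("%s -> %.7e (round trip %s)\n", buf, y, x == y ? "exact" : "lossy");
    return 0;
}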

How to convert from integer to unsigned char in C, given integers larger than 256?

As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars to write some data to a file (I have to use this function, so I can't just make my own purpose built function that works differently BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However, am I right in thinking that an unsigned char can only have a max value of 256 because it is 1 byte long? Do I therefore need to use 4 unsigned chars for every integer? But casting doesn't seem to work with larger values for the integer. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h>. You can extract the individual bytes of a 32-bit int:
#include <stdint.h>
void
pack32(uint32_t val, uint8_t *dest)
{
    dest[0] = (val & 0xff000000) >> 24;
    dest[1] = (val & 0x00ff0000) >> 16;
    dest[2] = (val & 0x0000ff00) >> 8;
    dest[3] = (val & 0x000000ff);
}
uint32_t
unpack32(uint8_t *src)
{
    uint32_t val;
    val = (uint32_t)src[0] << 24; // cast first: a plain int shifted by 24 could overflow
    val |= (uint32_t)src[1] << 16;
    val |= src[2] << 8;
    val |= src[3];
    return val;
}
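Together with pack32 above, a short sketch of feeding an int array, value by value, to a writer that only accepts unsigned chars (writebytes is a stand-in for the course-provided function, not a real API):
#include <stdio.h>
#include <stdint.h>

// Stand-in for the function the course provides.
void writebytes(unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%02x ", p[i]);
    printf("\n");
}

int main(void)
{
    int values[] = { 0, 17, 4095 };
    uint8_t buf[4];

    for (size_t i = 0; i < sizeof values / sizeof values[0]; i++) {
        pack32((uint32_t)values[i], buf); // 4 bytes per integer, big-endian
        writebytes(buf, sizeof buf);
    }
    return 0;
}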
Unsigned char generally has a size of 1 byte, therefore you can decompose any other type into an array of unsigned chars (e.g. for a 4-byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte into the file.
The following example should write a number (of any data type) to the file. I am not sure it works, since you are forcing the cast to unsigned char * instead of void *.
int homework(unsigned char *foo, size_t size)
{
    // open file for binary writing
    FILE *f = fopen("work.txt", "wb");
    if (f == NULL)
        return 1;
    // writes the data to the file, byte by byte
    fwrite(foo, sizeof(char), size, f);
    fclose(f);
    return 0;
}
I hope the given example at least gives you a starting point.
Yes, you're right: a char/byte holds 8 bits, so that is 2^8 distinct values, which is zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
char* p = (char*)&x;
for (int i = 0; i < sizeof(x); i++)
{
    //Do something with p[i]
}
(This isn't C89 because of the declaration inside the for loop, but whatever... it's more readable. :) )
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
If you have to write an array of integers, then just convert the array into a pointer to unsigned char and run through the array.
int main()
{
    int data[] = { 1, 2, 3, 4, 5 };
    size_t size = sizeof(data)/sizeof(data[0]); // Number of integers.
    unsigned char* out = (unsigned char*)data;
    for(size_t loop = 0; loop < (size * sizeof(int)); ++loop)
    {
        MyProfSuperWrite(out + loop); // Write 1 unsigned char
    }
}
Now, people have mentioned that 4095 will fit in fewer bits than a normal integer. Probably true. Thus you can save space by not writing out the top bits of each integer. Personally, I think this is not worth the effort. The extra code to write the values and process the incoming data is not worth the savings you would get (maybe if the data were the size of the Library of Congress). Rule one: do as little work as possible (it's easier to maintain). Rule two: optimize if asked (but ask why first). You may save space, but it will cost in processing time and maintenance.
The part of the assignment that says integers whose values can be up to 4095 using this function (that only takes unsigned chars) should be giving you a huge hint: 4095 unsigned is 12 bits.
You can store the 12 bits in a 16-bit short, but that is somewhat wasteful of space: you are only using 12 of the 16 bits. Since you are dealing with more than 1 byte in the conversion, you may also need to deal with the endianness of the result. This is the easiest option.
You could also use a bit field or some packed binary structure if you are concerned about space, as sketched below. That is more work.
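For illustration, a small sketch of the packed variant: two 12-bit values sharing three bytes (this layout is one arbitrary choice among several):
#include <stdint.h>
#include <stdio.h>

// Pack two 12-bit values (0..4095) into 3 bytes.
void pack12x2(unsigned a, unsigned b, uint8_t out[3])
{
    out[0] = (uint8_t)(a >> 4);                     // high 8 bits of a
    out[1] = (uint8_t)((a & 0x0F) << 4 | (b >> 8)); // low 4 of a, high 4 of b
    out[2] = (uint8_t)(b & 0xFF);                   // low 8 bits of b
}

void unpack12x2(const uint8_t in[3], unsigned *a, unsigned *b)
{
    *a = (unsigned)in[0] << 4 | in[1] >> 4;
    *b = (unsigned)(in[1] & 0x0F) << 8 | in[2];
}

int main(void)
{
    uint8_t bytes[3];
    unsigned a, b;
    pack12x2(4095, 123, bytes);
    unpack12x2(bytes, &a, &b);
    printf("%u %u\n", a, b); // prints: 4095 123
    return 0;
}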
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5]; // Room for 4095 plus the nul terminator
// Array is the array of integers, and arrayLen is its length
for (i = 0; i < arrayLen; i++)
{
    sprintf (num, "%d", array[i]);
    // Call your function that expects a pointer to chars
    printfunc (num);
}
Without information on the function you are directed to use, regarding its arguments, return value and semantics (i.e. the definition of its behaviour), it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((unsigned char*)array, sizeof(array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether that makes any sense will depend on what theFunction() does.
