How to check a buffer in C?

I have a buffer of size 1500. In that buffer I need to check whether 15 bytes are all zeros or not (from 100 to 115). How can we do this (if we do not use any loop for it)? Data is of type "unsigned char", actually it is an unsigned char array.
Platform : Linux, C, gcc compiler
Will using memcmp() be correct or not? I am reading some data from a smart card and storing it in a buffer. Now I need to check whether the last 15 bytes are all consecutively zero or not.
I mention memcmp() here because I need an efficient approach; the smart card read has already taken some time.
Or would a bitwise comparison be correct? Please suggest.

unsigned char buffer[1500];
...
bool allZeros = true;
for (int i = 99; i < 114; ++i)   /* the 15 bytes at 0-based indices 99..113 */
{
    if (buffer[i] != 0)
    {
        allZeros = false;
        break;
    }
}
Or, without an explicit loop:
static const unsigned char zeros[15] = {0};
...
unsigned char buffer[1500];
...
bool allZeros = (memcmp(&buffer[99], zeros, 15) == 0);

Use a loop. It's the clearest, most accurate way to express your intent. The compiler will optimize it as much as possible. By "optimizing" it yourself, you can actually make things worse.
True story, happened to me a few days ago: I was 'optimizing' a comparison function between two 256-bit integers. The old version used a for loop to compare the 8 32-bit integers that comprised the 256-bit integers, I changed it to a memcmp. It was slower. Turns out that my 'optimization' blinded the compiler to the fact that both buffers were 32-bit aligned, causing it to use a less efficient comparison routine. It had already optimized out my loop anyway.

100 to 115 is not 15 bytes, it is 16 bytes.
Beware that a cast like the following checks only sizeof(unsigned int) bytes (typically 4, not 16), and the unaligned, type-punned access is undefined behaviour in strict C:
if (0 == *((unsigned int*)(buffer + 100))) {
    // only the first sizeof(unsigned int) bytes are zero
}

I implemented it like this:
int is_empty_buffer(unsigned char *buff, size_t size)
{
    return *buff || memcmp(buff, buff + 1, size - 1);
}
If the return value is zero, the buffer is all zeros. (Note the size - 1: each byte is compared with the one after it, so the comparison must stop one byte before the end.)


Vectorize random init and print for BigInt with decimal digit array, with AVX2?

How could I convert my code to AVX2 and get the same result as before?
Is it possible to use __m256i in the LongNumInit, LongNumPrint functions instead of uint8_t *L, or some similar type of variable?
My knowledge of AVX is quite limited; I investigated quite a bit, but I do not understand very well how to transform my code. Any suggestion and explanation is welcome.
I'm really interested in doing this code in AVX2.
void LongNumInit(uint8_t *L, size_t N)
{
    for (size_t i = 0; i < N; ++i) {
        L[i] = myRandom() % 10;
    }
}
void LongNumPrint(uint8_t *L, size_t N, uint8_t *Name)
{
    printf("%s:", Name);
    for (size_t i = N; i > 0; --i)
    {
        printf("%d", L[i-1]);
    }
    printf("\n");
}
int main(int argc, char **argv)
{
    int i, sum1, sum2, sum3, N = 10000, Rep = 50;
    seed = 12345;
    // obtain parameters at run time
    if (argc > 1) { N = atoi(argv[1]); }
    if (argc > 2) { Rep = atoi(argv[2]); }
    // Create Long Nums
    unsigned char *V1 = (unsigned char *)malloc(N);
    unsigned char *V2 = (unsigned char *)malloc(N);
    unsigned char *V3 = (unsigned char *)malloc(N);
    unsigned char *V4 = (unsigned char *)malloc(N);
    LongNumInit(V1, N); LongNumInit(V2, N); LongNumInit(V3, N);
    // Print last 32 digits of Long Numbers
    LongNumPrint(V1, 32, "V1");
    LongNumPrint(V2, 32, "V2");
    LongNumPrint(V3, 32, "V3");
    LongNumPrint(V4, 32, "V4");
    free(V1); free(V2); free(V3); free(V4);
    return 0;
}
The result that I obtain in my initial code is this:
V1:59348245908804493219098067811457
V2:24890422397351614779297691741341
V3:63392771324953818089038280656869
V4:00000000000000000000000000000000
This is a terrible format for BigInteger in general, see https://codereview.stackexchange.com/a/237764 for a code review of the design flaws in using one decimal digit per byte for BigInteger, and what you could/should do instead.
And see Can long integer routines benefit from SSE? for @Mysticial's notes on ways to store your data that make SIMD for BigInteger math practical, specifically partial-word arithmetic where your temporaries might not be "normalized", letting you do lazy carry handling.
But apparently you're just asking about this code, the random-init and print functions, not how to do math between two numbers in this format.
We can vectorize both of these quite well. My LongNumPrintName() is a drop-in replacement for yours.
For LongNumInit I'm just showing a building-block that stores two 32-byte chunks and returns an incremented pointer. Call it in a loop. (It naturally produces 2 vectors per call so for small N you might make an alternate version.)
LongNumInit
What's the fastest way to generate a 1 GB text file containing random digits? generates space-separated random ASCII decimal digits at about 33 GB/s on 4GHz Skylake, including overhead of write() system calls to /dev/null. (This is higher than DRAM bandwidth; cache blocking for 128kiB lets the stores hit in L2 cache. The kernel driver for /dev/null doesn't even read the user-space buffer.)
It could easily be adapted into an AVX2 version of void LongNumInit(uint8_t *L, size_t N ). My answer there uses an AVX2 xorshift128+ PRNG (vectorized with 4 independent PRNGs in the 64-bit elements of a __m256i) like AVX/SSE version of xorshift128+. That should be similar quality of randomness to your rand() % 10.
It breaks that up into decimal digits via a multiplicative inverse to divide and modulo by 10 with shifts and vpmulhuw, using Why does GCC use multiplication by a strange number in implementing integer division?. (Actually using GNU C native vector syntax to let GCC determine the magic constant and emit the multiplies and shifts for convenient syntax like v16u dig1 = v % ten; and v /= ten;)
You can use _mm256_packus_epi16 to pack two vectors of 16-bit digits into 8-bit elements instead of turning the odd elements into ASCII ' ' and the even elements into ASCII '0'..'9'. (So change vec_store_digit_and_space to pack pairs of vectors instead of ORing with a constant, see below)
Compile this with gcc, clang, or ICC (or hopefully any other compiler that understands the GNU C dialect of C99, and Intel's intrinsics).
See https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html for the __attribute__((vector_size(32))) part, and https://software.intel.com/sites/landingpage/IntrinsicsGuide/ for the _mm256_* stuff. Also https://stackoverflow.com/tags/sse/info.
#include <immintrin.h>

// GNU C native vectors let us get the compiler to do stuff like %10 each element
typedef unsigned short v16u __attribute__((vector_size(32)));

// returns p + size of stores.  Caller should use  outpos = f(vec, outpos)
// p must be aligned
__m256i *vec_store_digits(__m256i vec, __m256i *restrict p)
{
    v16u v = (v16u)vec;
    v16u ten = (v16u)_mm256_set1_epi16(10);
    v16u divisor = (v16u)_mm256_set1_epi16(6554);  // ceil((2^16-1) / 10.0)
    v16u div6554 = v / divisor;  // Basically the entropy from the upper two decimal digits: 0..65.
    // Probably some correlation with the modulo-based values, especially dig3, but we do this instead of
    // dig4 for more ILP and fewer instructions total.

    v16u dig1 = v % ten;
    v /= ten;
    v16u dig2 = v % ten;
    v /= ten;
    v16u dig3 = v % ten;
    // dig4 would overlap much of the randomness that div6554 gets

    // __m256i or v16u assignment is an aligned store
    v16u *vecbuf = (v16u *)p;
    // pack 16->8 bits
    vecbuf[0] = (v16u)_mm256_packus_epi16((__m256i)div6554, (__m256i)dig1);
    vecbuf[1] = (v16u)_mm256_packus_epi16((__m256i)dig2, (__m256i)dig3);
    return p + 2;  // always a constant number of full vectors
}
The logic in random_decimal_fill_buffer that inserts newlines can be totally removed because you just want a flat array of decimal digits. Just call the above function in a loop until you've filled your buffer.
Handling small sizes (less than a full vector):
It would be convenient to pad your malloc up to the next multiple of 32 bytes so it's always safe to do a 32-byte load without checking for maybe crossing into an unmapped page.
And use C11 aligned_alloc to get 32-byte aligned storage. So for example, aligned_alloc(32, (size+31) & -32). This lets us just do full 32-byte stores even if N is odd. Logically only the first N bytes of the buffer hold our real data, but it's convenient to have padding we can scribble over to avoid any extra conditional checks for N being less than 32, or not a multiple of 32.
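A sketch of that padded, aligned allocation (the helper name alloc_digits is made up for illustration; aligned_alloc is C11, and its size argument must be a multiple of the alignment, which the rounding guarantees):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Round size up to the next multiple of 32 and get 32-byte-aligned
 * storage, so full-vector loads/stores never touch an unmapped page. */
static uint8_t *alloc_digits(size_t n)
{
    size_t padded = (n + 31) & ~(size_t)31;
    uint8_t *p = aligned_alloc(32, padded);
    if (p)
        memset(p + n, 0, padded - n);   /* scribble-safe padding */
    return p;
}
```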
Unfortunately ISO C and glibc are missing aligned_realloc and aligned_calloc. MSVC does actually provide those: Why is there no 'aligned_realloc' on most platforms? allowing you to sometimes allocate more space at the end of an aligned buffer without copying it. A "try_realloc" would be ideal for C++, which might need to run copy-constructors if non-trivially-copyable objects change address. Non-expressive allocator APIs that force sometimes-unnecessary copying are a pet peeve of mine.
LongNumPrint
Taking a uint8_t *Name arg is bad design. If the caller wants to printf a "something:" string first, they can do that. Your function should just do what printf "%d" does for an int.
Since you're storing your digits in reverse printing order, you'll want to byte-reverse into a tmp buffer and convert 0..9 byte values to '0'..'9' ASCII character values by ORing with '0'. Then pass that buffer to fwrite.
Specifically, use alignas(32) char tmpbuf[8192]; as a local variable.
You can work in fixed-size chunks (like 1kiB or 8kiB) instead of allocating a potentially-huge buffer. You probably want to still go through stdio (instead of calling write() directly and managing your own I/O buffering). With an 8kiB buffer, an efficient fwrite might just pass that on to write() directly instead of memcpy into the stdio buffer. You might want to play around with tuning this, but keeping the tmp buffer comfortably smaller than half of L1d cache will mean it's still hot in cache when it's re-read after you wrote it.
Cache blocking makes the loop bounds a lot more complex but it's worth it for very large N.
Byte-reversing 32 bytes at a time:
You could avoid this work by deciding that your digits are stored in MSD-first order, but then if you did want to implement addition it would have to loop from the end backwards.
Then your function could be implemented with SIMD _mm_shuffle_epi8 to reverse 16-byte chunks, starting from the end of your digit array and writing to the beginning of your tmp buffer.
Or better, use vmovdqu / vinserti128 16-byte loads to feed _mm256_shuffle_epi8 to byte-reverse within lanes, setting up for 32-byte stores.
On Intel CPUs, vinserti128 decodes to a load + ALU uop, but it can run on any vector ALU port, not just the shuffle port. So two 128-bit loads are more efficient than 256-bit load -> vpshufb -> vpermq, which would probably bottleneck on shuffle-port throughput if data were hot in cache. Intel CPUs can do up to 2 loads + 1 store per clock cycle (or in Ice Lake, 2 loads + 2 stores). We'll probably bottleneck on the front-end if there are no memory bottlenecks, so in practice not saturating load+store and shuffle ports. (https://agner.org/optimize/ and https://uops.info/)
This function is also simplified by the assumption that we can always read 32 bytes from L without crossing into an unmapped page. But after a 32-byte reverse for small N, the first N bytes of the input become the last N bytes in a 32-byte chunk. It would be most convenient if we could always safely do a 32-byte load ending at the end of a buffer, but it's unreasonable to expect padding before the object.
#include <immintrin.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>
#include <stdint.h>

// one vector of 32 bytes of digits, reversed and converted to ASCII
static inline
void ASCIIrev32B(void *dst, const void *src)
{
    __m128i hi = _mm_loadu_si128(1 + (const __m128i *)src);  // unaligned loads
    __m128i lo = _mm_loadu_si128((const __m128i *)src);
    __m256i v = _mm256_set_m128i(lo, hi);  // reverse 128-bit hi/lo halves

    // compilers will hoist constants out of inline functions
    __m128i byterev_lane = _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                                        8, 9, 10, 11, 12, 13, 14, 15);
    __m256i byterev = _mm256_broadcastsi128_si256(byterev_lane);  // same in each lane
    v = _mm256_shuffle_epi8(v, byterev);             // in-lane reverse
    v = _mm256_or_si256(v, _mm256_set1_epi8('0'));   // digits to ASCII
    _mm256_storeu_si256((__m256i *)dst, v);          // Will usually be aligned in practice.
}
// Tested for N=32; could be bugs in the loop bounds for other N
// returns bytes written, like fwrite: N means no error, 0 means error in all fwrites
size_t LongNumPrint(uint8_t *num, size_t N)
{
    // caller can print a name if it wants
    const int revbufsize = 8192;   // 8kiB on the stack should be fine
    alignas(32) char revbuf[revbufsize];

    if (N < 32) {
        // TODO: maybe use a smaller revbuf for this case to avoid touching new stack pages
        ASCIIrev32B(revbuf, num);  // the data we want is at the *end* of a 32-byte reverse
        return fwrite(revbuf + 32 - N, 1, N, stdout);
    }

    size_t bytes_written = 0;
    const uint8_t *inp = num + N;  // start with last 32 bytes of num[]
    do {
        size_t chunksize = (inp - num >= revbufsize) ? revbufsize : inp - num;
        const uint8_t *inp_stop = inp - chunksize + 32;  // leave one full vector for the end
        uint8_t *outp = (uint8_t *)revbuf;
        while (inp > inp_stop) {   // may run 0 times
            inp -= 32;
            ASCIIrev32B(outp, inp);
            outp += 32;
        }
        // reverse first (lowest address) 32 bytes of this chunk of num
        // into last 32 bytes of this chunk of revbuf
        // if chunksize%32 != 0 this will overlap, which is fine.
        ASCIIrev32B(revbuf + chunksize - 32, inp_stop - 32);
        bytes_written += fwrite(revbuf, 1, chunksize, stdout);
        inp = inp_stop - 32;
    } while (inp > num);
    return bytes_written;
    // caller can putchar('\n') if it wants
}
// wrapper that prints name and newline
void LongNumPrintName(uint8_t *num, size_t N, const char *name)
{
    printf("%s:", name);
    //LongNumPrint_scalar(num, N);
    LongNumPrint(num, N);
    putchar('\n');
}
// main() included on Godbolt link that runs successfully
This compiles and runs (on Godbolt) with gcc -O3 -march=haswell and produces identical output to your scalar loop for the N=32 that main passes. (I used rand() instead of MyRandom(), so we could test with the same seed and get the same numbers, using your init function.)
Untested for larger N, but the general idea of chunksize = min(ptrdiff, 8k) and using that to loop downwards from the end of num[] should be solid.
We could load (not just store) aligned vectors if we converted the first N%32 bytes and passed that to fwrite before starting the main loop. But that probably either leads to an extra write() system call, or to clunky copying inside stdio. (Unless there was already buffered text not printed yet, like Name:, in which case we already have that penalty.)
Note that it's technically C UB to decrement inp past the start of num. So inp -= 32 instead of inp = inp_stop-32 would have that UB for the iteration that leaves the outer loop. I actually avoid that in this version, but it generally works anyway because I think GCC assumes a flat memory model and de facto defines the behaviour of pointer compares enough. And normal OSes reserve the zero page, so num definitely can't be within 32 bytes of the start of physical memory (so inp can't wrap to a high address). This paragraph is mostly left over from the first, totally untested attempt that I thought was decrementing the pointer farther in the inner loop than it actually was.

How to convert to integer a char[4] of "hexadecimal" numbers [C/Linux]

So I'm working with system calls in Linux. I'm using "lseek" to navigate through the file and "read" to read. I'm also using Midnight Commander to view the file in hexadecimal. The next 4 bytes I have to read are in little-endian, and look like this: "2A 00 00 00". But of course, the bytes can be something like "2A 5F B3 00". I have to convert those bytes to an integer. How do I approach this? My initial thought was to read them into an array of 4 chars, and then to build my integer from there, but I don't know how. Any ideas?
Let me give you an example of what I've tried. I have the following bytes in file "44 00". I have to convert that into the value 68 (4 + 4*16):
char value[2];
read(fd, value, 2);
int i = (value[0] << 8) | value[1];
The variable i is 17480 instead of 68.
UPDATE: Nvm. I solved it. I mixed up the indexes when I shifted. It should've been value[1] << 8 ... | value[0]
General considerations
There seem to be several pieces to the question -- at least how to read the data, what data type to use to hold the intermediate result, and how to perform the conversion. If indeed you are assuming that the on-file representation consists of the bytes of a 32-bit integer in little-endian order, with all bits significant, then I probably would not use a char[] as the intermediate, but rather a uint32_t or an int32_t. If you know or assume that the endianness of the data is the same as the machine's native endianness, then you don't need any other.
Determining native endianness
If you need to compute the host machine's native endianness, then this will do it:
static const uint32_t test = 1;
_Bool host_is_little_endian = *(char *)&test;
It is worthwhile doing that, because it may well be the case that you don't need to do any conversion at all.
Reading the data
I would read the data into a uint32_t (or possibly an int32_t), not into a char array. Possibly I would read it into an array of uint8_t.
uint32_t data;
int num_read = fread(&data, 4, 1, my_file);
if (num_read != 1) { /* ... handle error ... */ }
Converting the data
It is worthwhile knowing whether the on-file representation matches the host's endianness, because if it does, you don't need to do any transformation (that is, you're done at this point in that case). If you do need to swap, note that ntohl() and htonl() are no-ops precisely on big-endian hosts, so they cannot convert little-endian file data on a big-endian machine; an explicit byte swap is what's needed:
if (!host_is_little_endian) {
    data = (data >> 24) | ((data >> 8) & 0xff00)
         | ((data & 0xff00) << 8) | (data << 24);
}
(This assumes that little- and big-endian are the only host byte orders you need to be concerned with. Historically, there have been others, which is why the byte-reorder functions come in pairs, but you are extremely unlikely ever to see one of the others.)
Signed integers
If you need a signed instead of unsigned integer, then you can do the same, but use a union:
union {
    uint32_t u;
    int32_t s;
} data;
(unsigned and signed are keywords, so they cannot be used as member names.) In all of the preceding, use data.u in place of plain data, and at the end, read out the signed result from data.s.
Suppose you point into your buffer:
unsigned char *p = &buf[20];
and you want to see the next 4 bytes as an integer and assign them to your integer, then you can cast it:
int i;
i = *(int *)p;
You just said that p is now a pointer to an int, you de-referenced that pointer and assigned it to i.
However, this depends on the endianness of your platform. If your platform has a different endianness, you may first have to reverse-copy the bytes to a small buffer and then use this technique. For example:
unsigned char ibuf[4];
for (i=3; i>=0; i--) ibuf[i]= *p++;
i = *(int *)ibuf;
EDIT
The suggestions and comments of Andrew Henle and Bodo could give:
unsigned char *p = &buf[20];
int i, j;
unsigned char *pi = (unsigned char *)&i;
for (j = 3; j >= 0; j--) *pi++ = *p++;

// and the other endian:
int i, j;
unsigned char *pi = (unsigned char *)&i + 3;
for (j = 3; j >= 0; j--) *pi-- = *p++;
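A portable alternative to the pointer casts above is to assemble the value with shifts; this works identically on any host endianness and needs no aligned or type-punned access (the helper name read_le32 is made up for illustration):

```c
#include <stdint.h>

/* Assemble a 32-bit value from 4 little-endian bytes. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

For the "44 00 00 00" example from the question, read_le32 yields 0x44, i.e. 68.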

memcpy long long int (casting to char*) into char array

I was trying to split a long long into 8 characters, where the first 8 bits become the first character, the next 8 bits the second, etc.
I was using two methods. First, I shifted and cast the type, and it went well.
But I failed when using memcpy: the result is reversed (the first 8 bits become the last character). Shouldn't the memory be consecutive and in the same order? Or am I messing something up?
void num_to_str(void)
{
    char str[100005] = {0};
    unsigned long long int ans = 0;
    scanf("%llu", &ans);
    for (int j = 0; j < 8; j++) {
        str[j] = (unsigned char)(ans >> (56 - 8 * j));
    }
    printf("%s\n", str);
    return;
}
This work great:
input : 8102661169684245760
output : program
However, the following doesn't act as I expected.
void num_to_str(void)
{
    char str[100005] = {0};
    unsigned long long int ans = 0;
    scanf("%llu", &ans);
    memcpy(str, (char *)&ans, 8);
    for (int i = 0; i < 8; i++)
        printf("%c", str[i]);
    return;
}
This work unexpectedly:
input : 8102661169684245760
output : margorp
PS:I couldn't even use printf("%s" , str) or puts(str)
I assume that the first character was stored as '\0'
I am a beginner, so I'll be grateful if someone can help me out
The order of bytes within a binary representation of a number within an encoding scheme is called endianness.
In a big-endian system bytes are ordered from the most significant byte to the least significant.
In a little-endian system bytes are ordered from the least significant byte to the most significant one.
There are other byte orders, but they are considered esoteric nowadays, so you won't find them in practice.
If you run your program on a little-endian system (e.g. x86), you get exactly the result you observed.
You can read more:
https://en.wikipedia.org/wiki/Endianness
You may wonder why anyone sane would design and use a little-endian system, where bytes are reversed from the order we humans are used to (we write digits in big-endian order). But there are advantages. You can read about some of them here: The reason behind endianness?

Determine the ranges of char by direct computation in C89 (do not use limits.h)

I am trying to solve the Ex 2-1 of K&R's C book. The exercise asks to, among others, determine the ranges of char by direct computation (rather than printing the values directly from the limits.h). Any idea on how this should be done nicely?
Ok, I throw my version in the ring:
unsigned char uchar_max = (unsigned char)~0;
// min is 0, of course
signed char schar_min = (signed char)(uchar_max & ~(uchar_max >> 1));
signed char schar_max = (signed char)(0 - (schar_min + 1));
It does assume 2's complement for signed and the same size for signed and unsigned char. While the former I just define, the latter I'm sure can be deduced from the standard, as both are char and have to hold all encodings of the "execution charset". (What would that imply for variable-length encodings like UTF-8?)
It is straightforward to derive a 1's complement and a sign/magnitude version from this. Note that the unsigned version is always the same.
One advantage is that it runs completely with char types and no loops, etc., so it will still be performant on 8-bit architectures.
Hmm ... I really thought this would need a loop for signed. What did I miss?
Assuming that the type will wrap intelligently1, you can simply start by setting the char variable to be zero.
Then increment it until the new value is less than the previous value.
The new value is the minimum, the previous value was the maximum.
The following code should be a good start:
#include <stdio.h>

int main(void)
{
    char prev = 0, c = 0;
    while (c >= prev) {
        prev = c;
        c++;
    }
    printf("Minimum is %d\n", c);
    printf("Maximum is %d\n", prev);
    return 0;
}
1 Technically, overflowing a variable is undefined behaviour and anything can happen, but the vast majority of implementations will work. Just keep in mind it's not guaranteed to work.
In fact, the difficulty in working this out in a portable way (some implementations had various different bit-widths for char and some even used different encoding schemes for negative numbers) is probably precisely why those useful macros were put into limits.h in the first place.
You could always try the ol' standby, printf...
let's just strip things down for simplicity's sake.
This isn't a complete answer to your question, but it will check to see if a char is 8-bit--with a little help (yes, there's a bug in the code). I'll leave it up to you to figure out how.
#include <stdio.h>

#DEFINE MMAX_8_BIT_SIGNED_CHAR 127

main()
{
    char c;
    c = MAX_8_BIT_SIGNED_CHAR;
    printf("%d\n", c);
    c++;
    printf("%d\n", c);
}
Look at the output. I'm not going to give you the rest of the answer because I think you will get more out of it if you figure it out yourself, but I will say that you might want to take a look at the bit shift operator.
There are 3 relatively simple functions that can cover both the signed and unsigned types on both x86 & x86_64:
/* signed data type low storage limit */
long long limit_s_low (unsigned char bytes)
{ return -(1ULL << (bytes * CHAR_BIT - 1)); }

/* signed data type high storage limit */
long long limit_s_high (unsigned char bytes)
{ return (1ULL << (bytes * CHAR_BIT - 1)) - 1; }

/* unsigned data type high storage limit */
unsigned long long limit_u_high (unsigned char bytes)
{
    if (bytes < sizeof (long long))
        return (1ULL << (bytes * CHAR_BIT)) - 1;
    else
        return ~0ULL;   /* all bits set */
}
With CHAR_BIT generally being 8.
The smart way: simply compute sizeof() of your variable, and you know it's that many times larger than whatever has sizeof() == 1, usually char. Given that, you can use math to calculate the range. This doesn't work if you have odd-sized types, like 3-bit chars or something.
The try-hard way: put 0 in the type, and increment until it doesn't increment anymore (it wraps around or stays the same, depending on the machine). Whatever the number before that was, that's the max. Do the same for the min.

How to convert from integer to unsigned char in C, given integers larger than 256?

As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars to write some data to a file (I have to use this function, so I can't just make my own purpose built function that works differently BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However am I right in thinking that an unsigned char can only have a max value of 256 because it is 1 byte long? I therefore need to use 4 unsigned chars for every integer? But casting doesn't seem to work with larger values for the integer. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h>. You could extract the individual bytes of a 32-bit int:
#include <stdint.h>

void pack32(uint32_t val, uint8_t *dest)
{
    dest[0] = (val & 0xff000000) >> 24;
    dest[1] = (val & 0x00ff0000) >> 16;
    dest[2] = (val & 0x0000ff00) >> 8;
    dest[3] = (val & 0x000000ff);
}

uint32_t unpack32(const uint8_t *src)
{
    uint32_t val;
    val = (uint32_t)src[0] << 24;   /* the cast avoids shifting into the sign bit of int */
    val |= (uint32_t)src[1] << 16;
    val |= src[2] << 8;
    val |= src[3];
    return val;
}
Unsigned char generally has a size of 1 byte, therefore you can decompose any other type into an array of unsigned chars (e.g. for a 4-byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte into the file.
The following example should write a number (of any data type) to the file. I am not sure if it works, since you are forcing the cast to unsigned char * instead of void *.
int homework(unsigned char *foo, size_t size)
{
    // open file for binary writing
    FILE *f = fopen("work.txt", "wb");
    if (f == NULL)
        return 1;
    // write the data to the file, byte by byte
    fwrite(foo, sizeof(char), size, f);
    fclose(f);
    return 0;
}
I hope the given example at least gives you a starting point.
Yes, you're right: a char/byte typically holds 8 bits, which gives 2^8 distinct values, i.e. zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
char *p = (char *)&x;
for (int i = 0; i < sizeof(x); i++)
{
    // Do something with p[i]
}
(Declaring the loop variable inside the for statement requires C99, but it's more readable. :) )
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
If you have to write an array of integers then just convert the array into a pointer to char then run through the array.
int main()
{
    int data[] = { 1, 2, 3, 4, 5 };
    size_t size = sizeof(data) / sizeof(data[0]);   // Number of integers.
    unsigned char *out = (unsigned char *)data;
    for (size_t loop = 0; loop < (size * sizeof(int)); ++loop)
    {
        MyProfSuperWrite(out + loop);   // Write 1 unsigned char
    }
}
Now people have mentioned that 4096 will fit in fewer bits than a normal integer. Probably true. Thus you could save space and not write out the top bits of each integer. Personally I think this is not worth the effort: the extra code to write the values and process the incoming data is not worth the savings you would get (maybe if the data were the size of the Library of Congress). Rule one: do as little work as possible (it's easier to maintain). Rule two: optimize if asked (but ask why first). You may save space, but it will cost in processing time and maintenance.
The part of the assignment saying integers whose values can be up to 4095 using this function (that only takes unsigned chars) should be giving you a huge hint: 4095 unsigned is 12 bits.
You can store the 12 bits in a 16-bit short, but that is somewhat wasteful of space: you are only using 12 of the 16 bits. Since you are dealing with more than 1 byte in the conversion, you may need to deal with the endianness of the result. This is the easiest option.
You could also do a bit field or some packed binary structure if you are concerned about space. More work.
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5];   // Room for "4095" plus the terminating '\0'
// array is the array of integers, and arrayLen is its length
for (int i = 0; i < arrayLen; i++)
{
    sprintf(num, "%d", array[i]);
    // Call your function that expects a pointer to chars
    printfunc(num);
}
Without information on the function you are directed to use regarding its arguments, return value and semantics (i.e. the definition of its behaviour) it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((unsigned char *)array, sizeof(array));
or
theFunction((unsigned char *)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((unsigned char *)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether that makes any sense will depend on what theFunction() does.
