What's the proper way to copy a char array of a given size to an integer in C?

Suppose I have a char array and an associated length: Arr and Len. Not a string, a char array. There is no null terminator. Yet I have to copy the array data into an integer of type int64_t. Here's how it's done, and for the purpose of this question I'm assuming Len will not exceed 8:
int64_t Word = 0;
memcpy(&Word, Arr, Len);
Is this actually the proper way to do this? I am copying memory, but is there a faster way to do it inline, for example? So Word can be register?
The problem with a type pun is that it assumes Arr has 8 bytes allocated. No, Arr has at most 8 bytes allocated. It could have 5, so casting Arr to an int64_t * and then dereferencing it could try to access three illegal bytes at the end, resulting in a segfault.
Is the proper way to do what I describe a memcpy() call, or is there a faster or better way?

Since you specify that Len is at most 8, I'll assume little-endian storage, i.e., the least-significant byte is at Arr[0].
If Len were fixed at 8, the compiler might be able to replace memcpy with a single load from memory. That also depends on whether the platform can do unaligned reads (if the compiler can't prove alignment), and a byte-order conversion, where one is needed, may involve something like the bswap instruction on x86-64.
Because Len is a run-time value, the compiler will likely generate an actual call to memcpy, and the overhead of the call itself is not trivial. All things considered, it's probably best just to handle this in an endian-independent way using byte arithmetic. The code assumes 8-bit bytes, which seems consistent with your question.
uint64_t Word = 0;
while (Len--)
    Word = (Word << 8) | (unsigned char)Arr[Len]; /* cast avoids sign extension when char is signed */
On more exotic platforms, where (CHAR_BIT > 8), you can replace the right-hand side of the OR expression with (Arr[Len] & 0xff). In fact, this is optimised away on platforms with 8-bit (normative) bytes, so you might as well add it for completeness. Or just keep these issues in mind.
There are platforms with legal C implementations where char, short, int are 32-bit values, for example. These are quite common in the embedded world.
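The byte-arithmetic loop above could be wrapped in a small helper. This is only a sketch under stated assumptions: the name load_le64 is made up for illustration, Len is assumed to be at most 8, and Arr[0] is assumed to hold the least-significant byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the question): assemble up to 8 bytes,
   least-significant byte first, into a 64-bit value. Assumes len <= 8. */
static uint64_t load_le64(const char *arr, size_t len)
{
    uint64_t word = 0;
    while (len--) {
        /* Go through unsigned char so a negative char is not sign-extended;
           the mask only matters on platforms where CHAR_BIT > 8. */
        word = (word << 8) | ((unsigned char)arr[len] & 0xffu);
    }
    return word;
}
```

Because the value is built with shifts rather than by copying memory, the result does not depend on the byte order of the machine running the code.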

Related

Can I cast pointers like this?

Code:
unsigned char array_add[8]={0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00};
...
if ((*((uint32_t*)array_add)!=0)||(*((uint32_t*)array_add+1)!=0))
{
...
}
I want to check if the array is all zero. So naturally I thought of casting the address of an array, which also happens to be the address of the first member, to an unsigned int 32 type, so I'll only need to do this twice, since it's a 64 bit, 8 byte array. Problem is, it was successfully compiled but the program crashes every time around here.
I'm running my program on an 8bit microcontroller, cortex-M0.
How wrong am I?
In theory this could work but in practice there is a thing you aren't considering: aligned memory accesses.
If a uint32_t requires aligned memory access (e.g., to 4 bytes), then casting an array of unsigned char, which has a 1-byte alignment requirement, to a uint32_t* produces a pointer that may be misaligned for uint32_t access.
According to documentation:
There is no support for unaligned accesses on the Cortex-M0 processor. Any attempt to perform an unaligned memory access operation results in a HardFault exception.
In practice this is just dangerous and fragile code which invokes undefined behavior in certain circumstances, as pointed out by Olaf.
To test multiple bytes at once, code could use memcmp().
How fast this is depends mostly on the compiler, as an optimizing compiler may simply emit code that does a quick 8-byte compare at once (or two 4-byte compares). Even memcmp() might not be too slow on an 8-bit processor. Profiling the code helps.
Take care with micro-optimizations; too often they are not an efficient use of a coder's time compared with more significant optimizations.
unsigned char array_add[8] = ...
const unsigned char array_zero[8]={0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00};
if (memcmp(array_zero, array_add, 8) == 0) ...
Another method uses a union. Be careful not to assume whether add.array8[0] is the most or least significant byte.
union {
    uint8_t array8[8];
    uint64_t array64;
} add;
// The code below checks whether all 8 elements of add.array8[] are zero.
if (add.array64 == 0)
In general, focus on writing clear code and reserve such small optimizations to very select cases.
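If a single 64-bit comparison is still wanted, one alignment-safe alternative to the pointer cast is to copy the bytes into a uint64_t with memcpy first. This is a sketch, not something the answers above proposed: the name is_all_zero8 is hypothetical, and it assumes uint64_t exists and occupies exactly 8 bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if all 8 bytes are zero, else 0.
   memcpy has no alignment requirement, unlike dereferencing a uint32_t *. */
static int is_all_zero8(const unsigned char bytes[8])
{
    uint64_t value;
    memcpy(&value, bytes, sizeof value);
    return value == 0;
}
```

A value of zero has the same representation in either byte order, so endianness does not matter for this particular test.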
I am not sure, but if your array has 8 bytes, you could read it through its base address as a long long and compare that to 0. That should solve your problem of checking whether the array is all 0.
Edit 1: After Olaf's comment, I would say replace long long with int64_t. However, why not use a simple loop to iterate over the array and check? 8 chars is all you need to compare.
Edit 2: The other approach could be to OR all elements of the array and then compare the result with 0. If all are 0, the OR will be zero. I do not know whether CMP or OR will be faster. Please refer to the Cortex-M0 docs for the exact CPU cycle counts; however, I would expect CMP to be slower.
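The OR approach from Edit 2 might look like the following sketch. The function name is_all_zero is made up for illustration; it reads one byte at a time, so it has no alignment requirement.

```c
#include <stddef.h>

/* Hypothetical helper: OR every byte together and test the result once.
   Returns 1 if all n bytes are zero, else 0. */
static int is_all_zero(const unsigned char *bytes, size_t n)
{
    unsigned char acc = 0;
    for (size_t i = 0; i < n; i++)
        acc |= bytes[i];
    return acc == 0;
}
```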

What does casting char* do to a reference of an int? (Using C)

In my course for intro to operating systems, our task is to determine if a system is big or little endian. There's plenty of results I've found on how to do it, and I've done my best to reconstruct my own version of a code. I suspect it's not the best way of doing it, but it seems to work:
#include <stdio.h>
int main() {
int a = 0x1234;
unsigned char *start = (unsigned char*) &a;
int len = sizeof( int );
if( start[0] > start[ len - 1 ] ) {
//biggest in front (Little Endian)
printf("1");
} else if( start[0] < start[ len - 1 ] ) {
//smallest in front (Big Endian)
printf("0");
} else {
//unable to determine with set value
printf( "Please try a different integer (non-zero). " );
}
}
I've seen this line of code (or some version of) in almost all answers I've seen:
unsigned char *start = (unsigned char*) &a;
What is happening here? I understand casting in general, but what happens if you cast an int to a char pointer? I know:
unsigned int *p = &a;
assigns the memory address of a to p, and that you can affect the value of a through dereferencing p. But I'm totally lost with what's happening with the char and more importantly, not sure why my code works.
Thanks for helping me with my first SO post. :)
When you cast between pointers of different types, the result is generally implementation-defined (it depends on the system and the compiler). There are no guarantees that you can dereference the pointer or that it is correctly aligned, etc.
But for the special case when you cast to a pointer to character, the standard actually guarantees that you get a pointer to the lowest addressed byte of the object (C11 6.3.2.3 §7).
So the compiler will implement the code you have posted in such a way that you get a pointer to the lowest-addressed byte of the int. As we can tell from your code, that byte may contain different values depending on endianness.
If you have a 16-bit CPU, the char pointer will point at memory containing 0x12 in case of big endian, or 0x34 in case of little endian.
For a 32-bit CPU, the int would contain 0x00001234, so you would get 0x00 in case of big endian and 0x34 in case of little endian.
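A compact version of the same test, as a sketch: it stores the value 1 and looks at the lowest-addressed byte. The name is_little_endian is hypothetical, and the check assumes sizeof(unsigned int) is greater than 1, which holds on common desktop platforms but is not guaranteed everywhere.

```c
/* Hypothetical helper: returns 1 if the lowest-addressed byte of an
   unsigned int holds its least-significant bits, 0 otherwise.
   Assumes sizeof(unsigned int) > 1. */
static int is_little_endian(void)
{
    unsigned int probe = 1;
    const unsigned char *first = (const unsigned char *)&probe;
    return first[0] == 1;
}
```

Using 1 as the probe value avoids the "unable to determine" branch, because exactly one byte of the object is non-zero.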
If you dereference an integer pointer, you will get 4 bytes of data (this depends on the compiler; assuming gcc on a typical platform). But if you want only one byte, cast that pointer to a character pointer and dereference it; you will get one byte of data. Casting tells the compiler to read that many bytes instead of the original data type's size.
Values stored in memory are a set of '1's and '0's which by themselves do not mean anything. Datatypes are used for recognizing and interpreting what the values mean. So lets say, at a particular memory location, the data stored is the following set of bits ad infinitum: 01001010 ..... By itself this data is meaningless.
A pointer (other than a void pointer) contains 2 pieces of information. It contains the starting position of a set of bytes, and the way in which the set of bits are to be interpreted. For details, you can see: http://en.wikipedia.org/wiki/C_data_types and references therein.
So if you have
a char *c,
a short int *i,
and a float *f
which look at the bits mentioned above, c, i, and f hold the same address, but *c takes the first 8 bits and interprets them in a certain way. So you can do things like printf("The character is %c", *c). On the other hand, *i takes the first 16 bits and interprets them in a certain way. In this case, it is meaningful to say printf("The number is %d", *i). Again, for *f, printf("The number is %f", *f) is meaningful.
The real differences come when you do math with these. For example,
c++ advances the pointer by 1 byte,
i++ advances it by sizeof(short) bytes (typically 2),
and f++ advances it by sizeof(float) bytes (typically 4).
More importantly, for
(*c)++, (*i)++, and (*f)++ the algorithm used for doing the addition is totally different.
In your question, when you cast from one pointer type to another, you already know that the algorithm you are going to use for manipulating the bits present at that location will be easier if you interpret those bits as an unsigned char rather than an unsigned int. The same operators +, -, etc. will act differently depending upon what datatype the operators are looking at. If you have worked on Physics problems wherein doing a coordinate transformation has made the solution very simple, then this is the closest analog to that operation. You are transforming one problem into another that is easier to solve.
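The point about pointer arithmetic can be checked directly: the distance in bytes between p and p + 1 is the size of the pointed-to type. The function names below are made up for this sketch.

```c
#include <stddef.h>

/* Each function returns the number of bytes between p and p + 1
   for one element type, measured through char pointers. */
static size_t stride_char(void)
{
    char a[2];
    return (size_t)((a + 1) - a);
}

static size_t stride_short(void)
{
    short a[2];
    return (size_t)((char *)(a + 1) - (char *)a);
}

static size_t stride_float(void)
{
    float a[2];
    return (size_t)((char *)(a + 1) - (char *)a);
}
```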

Is Using 'sizeof(char)' When Dynamically Allocating A 'char' Redundant?

When dynamically allocating chars, I've always done it like this:
char *pCh = malloc(NUM_CHARS * sizeof(char));
I've recently been told, however, that using sizeof(char) is redundant and unnecessary because, "by definition, the size of a char is one byte," so I should/could write the above line like this:
char *pCh = malloc(NUM_CHARS);
My understanding is the size of a char depends on the native character set that is being used on the target computer. For example, if the native character set is ASCII, a char is one byte (8 bits), and if the native character set is UNICODE a char will necessarily require more bytes (> 8 bits).
To provide maximum portability, wouldn't it be necessary to use sizeof(char), as malloc simply allocates 8-bit bytes? Am I misunderstanding malloc and sizeof(char)?
Yes, it is redundant since the language standard specifies that sizeof (char) is 1. This is because that is the unit in which things are measured, so of course the size of the unit itself must be 1.
Life becomes strange with units defined in terms of themselves; that simply doesn't make any sense. Many people seem to "want" to assume that "there are 8-bit bytes, and sizeof tells me how many such there are in a particular value". That is wrong; that's simply not how it works. It's true that there can be platforms with characters larger than 8 bits; that's why we have CHAR_BIT.
Typically you always "know" when you're allocating characters anyway, but if you really want to include sizeof, you should really consider making it use the pointer, instead:
char *pCh = malloc(NUM_CHARS * sizeof *pCh);
This "locks" the unit size of the thing being allocated to the pointer that is used to store the result of the allocation. These two types should match; if you ever see code like this:
int *numbers = malloc(42 * sizeof (float));
that is a huge warning signal; by using the pointer from the left-hand side in the sizeof you make that type of error impossible which I consider a big win:
int *numbers = malloc(42 * sizeof *numbers);
Also, it's likely that if you change the name of the pointer, the malloc() won't compile which it would if you had the name of the (wrong) basic type in there. There is a slight risk that if you forget the asterisk (and write sizeof numbers instead of sizeof *numbers) you'll not get what you want. In practice (for me) this seems to never happen, since the asterisk is pretty well established as part of this pattern, to me.
Also, this usage relies on (and emphasizes) the fact that sizeof is not a function, since no ()s are needed around the pointer de-referencing expression. This is a nice bonus, since many people seem to want to deny this. :)
I find this pattern highly satisfying and recommend it to everyone.
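As a sketch of the pattern in a complete function (make_numbers is a hypothetical name, and zero-filling is only there to give the example something to do):

```c
#include <stdlib.h>

/* Hypothetical helper: allocate and zero-fill `count` ints.
   Returns NULL on allocation failure; the caller frees the result. */
static int *make_numbers(size_t count)
{
    /* The element size comes from the pointer itself, so changing the
       type of `numbers` automatically changes the allocation size. */
    int *numbers = malloc(count * sizeof *numbers);
    if (numbers != NULL) {
        for (size_t i = 0; i < count; i++)
            numbers[i] = 0;
    }
    return numbers;
}
```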
The C99 draft standard section 6.5.3.4 The sizeof operator paragraph 3 states:
When applied to an operand that has type char, unsigned char, or signed char,
(or a qualified version thereof) the result is 1. [...]
In the C11 draft standard it is paragraph 4 but the wording is the same. So NUM_CHARS * sizeof(char) should be equivalent to NUM_CHARS.
We can see from the definition of byte in 3.6 that it is a:
addressable unit of data storage large enough to hold any member of the basic character
set of the execution environment
and Note 2 says:
A byte is composed of a contiguous sequence of bits, the number of which is implementation defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit.
The C specification states that sizeof(char) is 1, so as long as you are dealing with conforming implementations of C it is redundant.
The size unit used by malloc is the same. malloc(120) allocates space for 120 char.
A char must be at least 8 bits, but may be larger.
sizeof(char) will always return 1, so it doesn't matter whether you use it or not; the result will not change. You may be confusing this with Unicode wide characters, which typically occupy more than one byte, but they have a different type, wchar_t, so you should use sizeof in that case.
If you are working on a system where a byte is defined to have 16 bits, then sizeof(char) would still return 1 as this is what the underlying architecture would allocate. 1 Byte with 16 bits.
Allocation sizes are always measured in units of char, which has size 1 by definition. If you are on a 9-bit machine, malloc understands its argument as a number of 9-bit bytes.
sizeof(char) is always 1, but not because char is always one byte (it needn't be), but rather because the sizeof operator returns the object/type size in units of char.
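The relationship between sizeof, bytes, and bits can be written down in a few lines. This sketch assumes a C11 compiler for _Static_assert, and bits_in_bytes is a name invented here.

```c
#include <limits.h>
#include <stddef.h>

/* sizeof(char) is 1 by definition, whatever CHAR_BIT happens to be. */
_Static_assert(sizeof(char) == 1, "guaranteed by the C standard");

/* Hypothetical helper: number of bits in n bytes, where a byte is the
   unit that sizeof and malloc count in (CHAR_BIT bits, at least 8). */
static size_t bits_in_bytes(size_t n)
{
    return n * CHAR_BIT;
}
```

On a platform with 16-bit bytes, malloc(120) would still allocate 120 chars; only the value of CHAR_BIT, and therefore the bit count, would differ.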

How do I force the program to use unaligned addresses?

I've heard reads and writes of aligned int's are atomic and safe, I wonder when does the system make non malloc'd globals unaligned other than packed structures and casting/pointer arithmetic byte buffers?
[X86-64 linux] In all of my normal cases, the system always chooses integer locations that don't get word torn, for example, two byte on one word and the other two bytes on the other word. Can any one post a program/snip (C or assembly) that forces the global variable to unaligned address such that the integer gets torn and the system has to use two reads to load one integer value ?
When I print the below program, the addresses are close to each other such that multiple variables are within 64bits but never once word tearing is seen (smartness in the system or compiler ?)
#include <stdio.h>
int a;
char b;
char c;
int d;
int e = 0;
int isaligned(void *p, int N)
{
if (((int)p % N) == 0)
return 1;
else
return 0;
}
int main()
{
printf("processor is %d byte mode \n", sizeof(int *));
printf ( "a=%p/b=%p/c=%p/d=%p/f=%p\n", &a, &b, &c, &d, &e );
printf ( " check for 64bit alignment of test result of 0x80 = %d \n", isaligned( 0x80, 64 ));
printf ( " check for 64bit alignment of a result = %d \n", isaligned( &a, 64 ));
printf ( " check for 64bit alignment of d result = %d \n", isaligned( &e, 64 ));
return 0;}
Output:
processor is 8 byte mode
a=0x601038/b=0x60103c/c=0x60103d/d=0x601034/f=0x601030
check for 64bit alignment of test result of 0x80 = 1
check for 64bit alignment of a result = 0
check for 64bit alignment of d result = 0
How does a read of a char happen in the above case ? Does it read from 8 byte aligned boundary (in my case 0x601030 ) and then go to 0x60103c ?
Memory access granularity is always word size isn't it ?
Thx.
1) Yes, there is no guarantee that unaligned accesses are atomic, because [at least sometimes, on certain types of processors] the data may be written as two separate writes - for example if you cross over a memory page boundary [I'm not talking about 4KB pages for virtual memory, I'm talking about DDR2/3/4 pages, which is some fraction of the total memory size, typically 16Kbits times whatever the width is of the actual memory chip - which will vary depending on the memory stick itself]. Equally, on other processors than x86, you get a trap for reading unaligned memory, which would either cause the program to abort, or the read be emulated in software as multiple reads to "fix" the unaligned read.
2) You could always make an unaligned memory region by something like this:
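/* Two extra bytes: the data starts at offset 2, so the buffer must hold that offset plus number values. */
char *ptr = malloc(sizeof(long long) * number + 2);
long long *unaligned = (long long *)&ptr[2]; /* deliberately misaligned */
for (i = 0; i < number; i++)
    temp = unaligned[i];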
By the way, your alignment check checks if the address is aligned to 64 bytes, not 64 bits. You'll have to divide by 8 to check that it's aligned to 64 bits.
3) A char is a single byte read, and the address will be on the actual address of the byte itself. The actual memory read performed is probably for a full cache-line, starting at the target address, and then cycling around, so for example:
0x60103d is the target address, so the processor will read a cache line of 32 bytes, starting at the 64-bit word we want: 0x601038 (and as soon as that's completed the processor goes on to the next instruction - meanwhile the next read will be performed to fill the cacheline), then cacheline is filled with 0x601020, 0x601028, 0x601030. But should we turn the cache off [if you want your 3GHz latest x86 processor to be slightly slower than a 66MHz 486, disabling the cache is a good way to achieve that], the processor would just read one byte at 0x60103d.
4) Not on x86 processors, they have byte addressing - but for normal memory, reads are done on a cacheline basis, as explained above.
Note also that "may not be atomic" is not at all the same as "will not be atomic", so you'll probably have a hard time making it go wrong at will. You really need to get all the timings of two different threads just right, and straddle cachelines, straddle memory page boundaries, and so on to make it go wrong. It will happen when you don't want it to happen, but trying to make it go wrong can be darn hard [trust me, I've been there, done that].
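An alignment check that works in bytes, as point 2 suggests, could be sketched as follows. The name is_aligned is hypothetical, and the code assumes that converting a pointer to uintptr_t yields a plain numeric address, which is true on x86-64 Linux but is not guaranteed by the C standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of n bytes, else 0.
   For 64-bit alignment pass n = 8, not 64. Assumes n is non-zero. */
static int is_aligned(const void *p, size_t n)
{
    return ((uintptr_t)p % n) == 0;
}
```

Using uintptr_t instead of int also avoids truncating the pointer on a platform where pointers are 8 bytes and int is 4.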
It probably doesn't, outside of those cases.
In assembly it's trivial. Something like:
.org 0x2
myglobal:
.word SOME_NUMBER
But on Intel, the processor can safely read unaligned memory. It might not be atomic, but that might not be apparent from the generated code.
Intel, right? The Intel ISA has single-byte read/write opcodes. Disassemble your program and see what it's using.
Not necessarily - you might have a mismatch between memory word size and processor word size.
1) This answer is platform-specific. In general, though, the compiler will align variables unless you force it to do otherwise.
2) The following will require two reads to load one variable when run on a 32-bit CPU:
uint64_t huge_variable;
The variable is larger than a register, so it will require multiple operations to access. You can also do something similar by using packed structures:
struct __attribute__ ((packed)) unaligned   /* GCC/Clang extension */
{
    char buffer[2];
    int unaligned;      /* lands at offset 2, so it is misaligned */
    char buffer2[2];
} sample_struct;
3) This answer is platform-specific. Some platforms may behave like you describe. Some platforms have instructions capable of fetching a half-register or quarter-register of data. I recommend examining the assembly emitted by your compiler for more details (make sure you turn off all compiler optimizations first).
4) The C language allows you to access memory with byte-sized granularity. How this is implemented under the hood and how much data your CPU fetches to read a single byte is platform-specific. For many CPUs, this is the same as the size of a general-purpose register.
The C standard guarantees that malloc(3) returns a memory area that satisfies the strictest alignment requirements, so this just can't happen in that case. If data is unaligned, it is probably read or written in pieces (that depends on the exact guarantees the architecture provides).
On some architectures unaligned access is allowed, on others it is a fatal error. When allowed, it is normally much slower than aligned access; when not allowed, the compiler must take the pieces and splice them together, and that is slower still.
Characters (really bytes) are normally allowed to have any byte address. The instructions working with bytes just get/store the individual byte in that case.
No, memory access is according to the width of the data. But real memory access is in terms of cache lines (read up on CPU cache for this).
Non-aligned objects can never come into existence without you invoking undefined behavior. In other words, there is no sequence of actions, all having well-defined behavior, which a program can take that will result in a non-aligned pointer coming into existence. In particular, there is no portable way to get the compiler to give you misaligned objects. The closest thing is the "packed structure" many compilers have, but that only applies to structure members, not independent objects.
Further, there is no way to test alignedness in portable C. You can use the implementation-defined conversions of pointers to integers and inspect the low bits, but there is no fundamental requirement that "aligned" pointers have zeros in the low bits, or that the low bits after conversion to integer even correspond to the "least significant" bits of the pointer, whatever that would mean. In other words, conversions between pointers and integers are not required to commute with arithmetic operations.
If you really want to make some misaligned pointers, the easiest way to do it, assuming alignof(int)>1, is something like:
char buf[2*sizeof(int)+1];
int *p1 = (int *)buf, *p2 = (int *)(buf+sizeof(int)+1);
It's impossible for both buf and buf+sizeof(int)+1 to be simultaneously aligned for int if alignof(int) is greater than 1. Thus at least one of the two (int *) casts gets applied to a misaligned pointer, invoking undefined behavior, and the typical result is a misaligned pointer.
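If the goal is only to read an int that sits at an odd offset inside a byte buffer, a well-defined alternative is to copy it out with memcpy instead of forming a misaligned int *. This is a different technique from the cast shown above, offered as a sketch; read_int_at is a made-up name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read an int stored at an arbitrary byte offset.
   No misaligned int pointer is ever created, so the behavior is defined
   as long as buf + offset points at sizeof(int) readable bytes. */
static int read_int_at(const unsigned char *buf, size_t offset)
{
    int value;
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```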

Is this program compatible on both big and little endian systems?

I wrote a small program which reverses a string and prints it to screen:
void ReverseString(char *String)
{
char *Begin = String;
char *End = String + strlen(String) - 1;
char TempChar = '\0';
while (Begin < End)
{
TempChar = *Begin;
*Begin = *End;
*End = TempChar;
Begin++;
End--;
}
printf("%s",String);
}
It works perfectly in Dev C++ on Windows (little endian).
But I have a sudden doubt of its efficiency. If you look at this line:
while (Begin < End)
I am comparing the address of the beginning and end. Is this the correct way?
Does this code work on a big endian OS like Mac OS X ?
Or am I thinking the wrong way ?
I have got several doubts which I mentioned above.
Can anyone please clarify ?
Your code has no endianness-related issues. There's also nothing wrong with the way you're comparing the two pointers. In short, your code's fine.
Endianness is defined as the order of significance of the bytes in a multi-byte primitive type. So if your int is big-endian, that means the first byte (i.e. the one with the lowest address) of an int in memory contains the most significant bits of the int, and so on to the last/least significant. That's all it means. When we say a system is big-endian, that generally means that all of its pointer and arithmetic types are big-endian, although there are some odd special cases out there. Endian-ness doesn't affect pointer arithmetic or comparison, or the order in which strings are stored in memory.
Your code does not use any multi-byte primitive types[*], so endian-ness is irrelevant. In general, endian-ness only becomes relevant if you somehow access the individual bytes of such an object (for example by casting a pointer to unsigned char*, writing the memory to a file or over the network, and the like).
Supposing a caller did something like this:
int x = 0x00010203; // assuming sizeof(int) == 4 and CHAR_BIT == 8
ReverseString((char *)&x);
Then their code would be endian-dependent. On a big-endian system, they would pass you an empty string, since the first byte would be 0, so your code would leave x unchanged. On a little-endian system they would pass you a three-byte string, since the first three bytes would be 0x03, 0x02, 0x01 and the fourth byte 0, so your code would change x to 0x00030201.
[*] well, the pointers are multi-byte, on OSX and on pretty much every C implementation. But you don't inspect their storage representations, you just use them as values, so there's no opportunity for behavior to differ according to endianness.
As far as I know, endianness does not affect a char * as each character is a single byte and forms an array of characters. Have a look at http://www.ibm.com/developerworks/aix/library/au-endianc/index.html?ca=drs-
The effect will be seen in multi byte data types like int.
As long as you manipulate whole type T objects (which is what you do with type T being char) you just can't run into endianness problems.
You could run into them if you for example tried to manipulate separate bytes within a larger type (an int for example) but you don't do anything like that. This is why endianness problems are impossible in your code, period.
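One detail unrelated to endianness: for an empty string, String + strlen(String) - 1 computes a pointer before the start of the array, which the C standard does not define. A variant that returns early for short strings might look like this sketch (reverse_in_place is a hypothetical name, and printing is left to the caller):

```c
#include <string.h>

/* Hypothetical variant of ReverseString: reverses s in place.
   Strings of length 0 or 1 are left untouched, so no pointer
   before the start of the array is ever computed. */
static void reverse_in_place(char *s)
{
    size_t len = strlen(s);
    if (len < 2)
        return;
    char *begin = s;
    char *end = s + len - 1;
    while (begin < end) {
        char tmp = *begin;
        *begin++ = *end;
        *end-- = tmp;
    }
}
```

It works on whole char objects only, so its behavior is the same on big-endian and little-endian systems.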

Resources