Confusion about a memory alignment example - c

While reading some posts to learn about memory alignment, I ran into a question about a good answer by dan04 to "What is aligned memory allocation?".
Reading the example he gives,
 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
|       |       |  words
The problem is that on some CPU architectures, the instruction to load a 4-byte integer from memory only works on word boundaries. So your program would have to fetch each half of b with separate instructions.
Why can't (Can it?) read the 4 bytes(a word, assume 32bits) directly that contains b?
For example, if I want b:
 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
    |       |      a word (assume it's 32 bits, get b directly)
Read 1 word starting from address 2.
If I want a:
 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
|       |          a word
Read 1 word starting from address 0, keep the first 2 bytes and discard the last 2 bytes.
If I want c and d:
 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
        |       |  a word
Read 1 word starting from address 4, keep the last 2 bytes and discard the first 2 bytes.
Then it seems alignment is not needed, which is definitely incorrect.
I must have misunderstood something or be missing some other knowledge; please help correct me.

"Why can't (Can it?) read the 4 bytes(a word, assume 32bits) directly that contains b?"
The answer is already in the text you quoted right above. The key is "on word boundaries". That is not the same as "in word size". That is, those CPUs can read a full word only from an address of exactly N*wordwidth, not from N*wordwidth+2.
A word boundary (only relevant on the platforms mentioned) is an exact multiple of the word width: 0, 4, 8, 12... but not 2, 6, 10...
Picking up your phrasing from comment, yes.
Those CPUs can only read from address 0, 4, 8, 12, 16 and so on.
E.g. one word from addresses 0-3, one word from address 4-7.
(Note the added 12.)
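The word-boundary rule above can be sketched in C. This is only an illustration under assumptions: a 4-byte word, and the helper names is_word_aligned and load_u32 are made up for this example. Copying through memcpy is a common portable way to read a 4-byte value from an address that may not be aligned; on a strict-alignment CPU the compiler is then free to split the access into several loads, which is the "separate instructions" cost the quoted answer describes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if addr is a multiple of the (assumed) 4-byte word size. */
int is_word_aligned(const void *addr)
{
    return ((uintptr_t)addr % 4u) == 0;
}

/* Hypothetical helper: reads a 32-bit value from any address.
 * memcpy makes no alignment assumption, so this is valid C even when
 * addr is not on a word boundary. */
uint32_t load_u32(const void *addr)
{
    uint32_t value;
    assert(addr != NULL);
    memcpy(&value, addr, sizeof value);
    return value;
}
```

With the layout from the question, the buffer start (address 0) passes is_word_aligned, while the start of b (address 2) does not, so b has to go through something like load_u32 rather than a plain word load.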

Related

How to process array data in chunks where last chunk may be a partial size

I have an array of integers and I am trying to send it in sub-blocks from one ESP32 to another.
With this code I get output like this:
output:
1 2 3 4 5
6 7 8 9 10
11 12 0 0 0
the expected output:
1 2 3 4 5
6 7 8 9 10
11 12
How can I change the esp_now_send call to get the expected output? How can I deal with the last sub-block if it has fewer than 5 numbers?
The code needs to send only the data that is actually available. The general approach is to send full sub-blocks until the last one, which may be partial. Simple arithmetic on how much data is left tells you how much the current iteration should send.
The code changes would be:
Change siz to be the real number of entries in the array: siz = sizeof(data)/sizeof(data[0]).
Change rang in the function call to `(ind + rang <= siz ? rang : siz - ind)`. That is, the size passed to the function call depends on how much data is left.
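A minimal sketch of that loop follows. The question's original code is not shown here, so everything except the names ind, rang and siz is an assumption: chunk_len, send_in_chunks and print_chunk are hypothetical helpers, and print_chunk merely stands in for the real transmit call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Number of elements the sub-block starting at index ind should carry. */
size_t chunk_len(size_t ind, size_t rang, size_t siz)
{
    assert(ind <= siz);
    return (ind + rang <= siz) ? rang : siz - ind;
}

/* Stand-in for the real send call: prints one sub-block per line. */
void print_chunk(const int *chunk, size_t count)
{
    for (size_t i = 0; i < count; i++)
        printf("%d ", chunk[i]);
    printf("\n");
}

/* Sends data in sub-blocks of at most rang elements; returns the
 * number of sub-blocks sent. The last one may be shorter. */
size_t send_in_chunks(const int *data, size_t siz, size_t rang,
                      void (*send)(const int *chunk, size_t count))
{
    size_t chunks = 0;
    assert(data != NULL && send != NULL && rang > 0);
    for (size_t ind = 0; ind < siz; ind += rang) {
        send(&data[ind], chunk_len(ind, rang, siz));
        chunks++;
    }
    return chunks;
}
```

For 12 integers and rang = 5 this produces sub-blocks of 5, 5 and 2 elements. If the real send function expects a length in bytes, the count would need to be multiplied by sizeof(data[0]).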

File in C, read some integers and put in array

In C, I want to read a list of integers from a file, but here is the problem.
I must first read two numbers, and these two numbers are the dimensions of my array.
My doubt is: after I read these two numbers, will the file point to the next int?
Example: (there are 3 arrays of integers in this file)
3 4
11 -1 1 -12
0 -2 12 2
-8 4 4 7
2 3
8 -8 1
6 -3 -3
3 2
1 1
3 4
-1 8
first array is:
DIMENSION [3][4]
11 -1 1 -12
0 -2 12 2
-8 4 4 7
After you open a file for reading, each read moves the cursor (the file position). So once you have read the two dimensions from the first line, the cursor is just past them and the next read continues with the second line.
To read an array after getting its dimensions, I would use malloc to create the two-dimensional array for the values.
After each array block you can call free and then malloc again, or call realloc, to read the next one.
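As a rough sketch of that approach, assuming the file layout from the question (a "rows cols" pair followed by rows*cols integers): read_matrix is a hypothetical helper name, and for brevity it stores each array in one flat malloc'd block rather than a true two-dimensional array, so element (i, j) lives at m[i * cols + j].

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads "rows cols" and then rows*cols integers from fp.
 * Returns a malloc'd row-major block that the caller must free,
 * or NULL at end of file or on malformed input. */
int *read_matrix(FILE *fp, int *rows, int *cols)
{
    assert(fp != NULL && rows != NULL && cols != NULL);

    /* Each successful fscanf advances the file position, so the
     * values that follow the dimensions are read next. */
    if (fscanf(fp, "%d %d", rows, cols) != 2 || *rows <= 0 || *cols <= 0)
        return NULL;

    size_t count = (size_t)*rows * (size_t)*cols;
    int *m = malloc(count * sizeof *m);
    if (m == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++) {
        if (fscanf(fp, "%d", &m[i]) != 1) {
            free(m);
            return NULL;
        }
    }
    return m;
}
```

Calling read_matrix in a loop until it returns NULL, and freeing each block after use, walks through all the arrays in the file one after another.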

Why do I need to use type** to point to type*?

I've been reading Learn C The Hard Way for a few days, but here's something I want to really understand. Zed, the author, wrote that char ** is for a "pointer to (a pointer to char)", and saying that this is needed because I'm trying to point to something 2-dimensional.
Here is what's exactly written in the webpage
A char * is already a "pointer to char", so that's just a string. You however need 2 levels, since names is 2-dimensional, that means you need char ** for a "pointer to (a pointer to char)" type.
Does this mean that I have to use a variable that can point to something 2-dimensional, which is why I need two **?
Just a little follow-up, does this also apply for n dimension?
Here's the relevant code
char *names[] = { "Alan", "Frank", "Mary", "John", "Lisa" };
char **cur_name = names;
No, that tutorial is of questionable quality. I wouldn't recommend to continue reading it.
A char** is a pointer-to-pointer. It is not a 2D array.
It is not a pointer to an array.
It is not a pointer to a 2D array.
The author of the tutorial is likely confused because there is a wide-spread bad and incorrect practice saying that you should allocate dynamic 2D arrays like this:
// BAD! Do not do like this!
int** heap_fiasco;
heap_fiasco = malloc(X * sizeof(int*));
for(int x=0; x<X; x++)
{
    heap_fiasco[x] = malloc(Y * sizeof(int));
}
This is however not a 2D array, it is a slow, fragmented lookup table allocated all over the heap. The syntax of accessing one item in the lookup table, heap_fiasco[x][y], looks just like array indexing syntax, so therefore a lot of people for some reason believe this is how you allocate 2D arrays.
The correct way to allocate a 2D array dynamically is:
// correct
int (*array2d)[Y] = malloc(sizeof(int[X][Y]));
You can tell that the first is not an array because memcpy(heap_fiasco, heap_fiasco2, sizeof(int[X][Y])) will crash and burn. The items are not allocated in adjacent memory, so the copy treats a table of X pointers as if it were X*Y ints.
Similarly, memcpy(heap_fiasco, heap_fiasco2, sizeof(*heap_fiasco)) does not do what you want either, but for another reason: sizeof(*heap_fiasco) is the size of a single pointer, not of an array.
By contrast, memcpy(array2d, array2d_2, sizeof(*array2d)) works, because array2d points to real arrays: it copies one whole row of Y ints, and sizeof(int[X][Y]) as the size copies the entire 2D array.
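To make the contiguous allocation concrete, here is a small sketch. The dimensions X = 3 and Y = 4 and the helper names alloc_2d and copy_2d are assumptions chosen for illustration. It shows that the whole array comes from one malloc, is released with one free, and can be copied with one memcpy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { X = 3, Y = 4 };  /* illustrative dimensions, not from the answer */

/* Allocates one contiguous X-by-Y block. The return type is
 * "pointer to array of Y ints", so result[x][y] indexes normally. */
int (*alloc_2d(void))[Y]
{
    return malloc(sizeof(int[X][Y]));
}

/* Copies the whole 2D array in one call; this is only valid because
 * all X*Y ints are adjacent in memory. */
void copy_2d(int (*dst)[Y], int (*src)[Y])
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(int[X][Y]));
}
```

Because the rows are adjacent, the address of row 1 is exactly Y * sizeof(int) bytes after the address of row 0, which is not guaranteed for the pointer-table version.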
Pointers took me a while to understand. I strongly recommend drawing diagrams.
Please have a read and understand this part of the C++ tutorial (at least with respect to pointers the diagrams really helped me).
Telling you that you need a pointer to a pointer to char for a two dimensional array is a lie. You don't need it but it is one way of doing it.
Memory is sequential. If you want to put 5 chars (letters) in a row like in the word hello you could define 5 variables and always remember in which order to use them, but what happens when you want to save a word with 6 letters? Do you define more variables? Wouldn't it be easier if you just stored them in memory in a sequence?
So what you do is you ask the operating system for 5 chars (and each char just happens to be one byte) and the system returns to you a memory address where your sequence of 5 chars begins. You take this address and store it in a variable which we call a pointer, because it points to your memory.
The problem with pointers is that they are just addresses. How do you know what is stored at that address? Is it 5 chars or is it a big binary number that is 8 bytes? Or is it a part of a file that you loaded? How do you know?
This is where the programming language like C tries to help by giving you types. A type tells you what the variable is storing and pointers too have types but their types tell you what the pointer is pointing to. Hence, char * is a pointer to a memory location that holds either a single char or a sequence of chars. Sadly, the part about how many chars are there you will need to remember yourself. Usually you store that information in a variable that you keep around to remind you how many chars are there.
So when you want to have a 2 dimensional data structure how do you represent that?
This is best explained with an example. Let's make a matrix:
1 2 3 4
5 6 7 8
9 10 11 12
It has 4 columns and 3 rows. How do we store that?
Well, we can make 3 sequences of 4 numbers each. The first sequence is 1 2 3 4, the second is 5 6 7 8 and the third and last sequence is 9 10 11 12. So if we want to store 4 numbers we will ask the system to reserve 4 numbers for us and give us a pointer to them. These will be pointers to numbers. However, since we need 3 such sequences, we will also ask the system for 3 pointers, one per sequence, and what we get back for those is a pointer to pointers to numbers.
And that's how you end up with the proposed solution...
The other way to do it would be to realize that you need 4 times 3 numbers and just ask the system for 12 numbers to be stored in a sequence. But then how do you access the number in row 2 and column 3? This is where maths comes in but let's try it on our example:
1 2 3 4
5 6 7 8
9 10 11 12
If we store them next to each other they would look like this:
offset from start: 0 1 2 3 4 5 6 7 8 9 10 11
numbers in memory: [1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]
So our mapping is like this:
row | column | offset | value
1 | 1 | 0 | 1
1 | 2 | 1 | 2
1 | 3 | 2 | 3
1 | 4 | 3 | 4
2 | 1 | 4 | 5
2 | 2 | 5 | 6
2 | 3 | 6 | 7
2 | 4 | 7 | 8
3 | 1 | 8 | 9
3 | 2 | 9 | 10
3 | 3 | 10 | 11
3 | 4 | 11 | 12
And we now need to work out a nice and easy formula for converting a row and column to an offset... I'll come back to it when I have more time... Right now I need to get home (sorry)...
Edit: I'm a little late but let me continue. To find the offset of each of the numbers from a row and column you can use the following formula:
offset = (row - 1) * 4 + (column - 1)
If you notice the two -1's here and think about it you will come to understand that it is because our row and column numberings start with 1 that we have to do this and this is why computer scientists prefer zero based offsets (because of this formula). However with pointers in C the language itself applies this formula for you when you use a multi-dimensional array. And hence this is the other way of doing it.
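The formula can be checked against the mapping table with a few lines of C. This is a sketch; offset_1based and offset_0based are hypothetical helper names, and the column count is passed in rather than hard-coded as 4.

```c
#include <assert.h>
#include <stddef.h>

/* Offset for 1-based row and column numbers, as in the table above. */
size_t offset_1based(size_t row, size_t column, size_t columns)
{
    assert(row >= 1 && column >= 1 && column <= columns);
    return (row - 1) * columns + (column - 1);
}

/* The same mapping with 0-based indices: the two "- 1" terms vanish. */
size_t offset_0based(size_t row, size_t column, size_t columns)
{
    assert(column < columns);
    return row * columns + column;
}
```

For row 2, column 3 of the 4-column matrix, offset_1based gives 6, and the value stored at offset 6 is 7, matching the table.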
From your question, what I understand is that you are asking why you need char ** for a variable that is initialized from something declared as *names[]. When you write names[], that is array syntax, and an array used in an expression is converted to a pointer to its first element (the array itself is not a pointer).
So when you write char *names[], you declare an array whose elements are pointers to char. The array converts to a pointer to its first element, and that element is itself a char *, so you get a pointer to a pointer. That is why the compiler will not complain if you write
char ** cur_name = names ;
In the line above you are declaring a pointer to a character pointer and initializing it with the address of the first element of the array.
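A short sketch of how such a char ** is typically used with the names array from the question. The function total_length is a made-up example, not something from the tutorial: it steps the pointer from one char * element to the next and dereferences it once to reach each string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Walks n strings through a char ** and returns their total length. */
size_t total_length(char **cur_name, size_t n)
{
    size_t total = 0;
    assert(cur_name != NULL);
    for (size_t i = 0; i < n; i++) {
        total += strlen(*cur_name);  /* *cur_name is a char *, one string */
        cur_name++;                  /* advance to the next char * element */
    }
    return total;
}
```

Note that sizeof names is the size of five pointers, while sizeof cur_name is the size of one pointer, which is one visible difference between the array and the pointer it converts to.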

Matrices - memory

Let's say that I have a matrix A=[];
I want to know if there is any way to represent it in a way where only the filled blocks must occupy memory and remaining must not, e.g.:
A = 1 0 0
0 1 0
0 0 1
Now, every block would take 1 bit of memory to store the matrix,
hence I would like to know is it possible to store matrix as:
A = 1
1
1
and the empty spaces must not occupy any memory at all. Is there any file format to represent a matrix in such a way?
No. You're dealing with bits. It would take MORE memory to store a list of the "filled" bits than it would to simply store the bits. e.g. for a simple 1x8 matrix:
0 1 2 3 4 5 6 7 <---bit-wise addresses
m = [0,1,0,0,0,1,1,1]
could be stored as a SINGLE byte of memory, at a storage ratio of 1 bit per bit.
To store just the locations of the SET bits would take 4 bytes. If all of the bits were set, you'd need 8 bytes to store those locations. So now you've gone from a constant 1 byte requirement to a variable 0 -> 8 bytes.
You could devise a scheme that stores information about the positions in a list, but that would consume more memory than it saves. So, at least here, no.

inet_pton() counterpart for link layer address

I have two problems related to my implementation:
1. I need a function that converts a given link-layer address from text to its standard format, similar to inet_pton() at the network layer, which converts a given IP address from text to the standard IPv4/IPv6 format.
2. Is there any difference between a link-layer address and a 48-bit MAC address (in the case of IPv6 specifically)? If not, then a link-layer address should always be 48 bits long, if I am not wrong.
Thanks in advance. Please excuse me if I am missing something trivial.
EDIT :
OK, I am now clear on the difference between a link-layer address and an Ethernet MAC address. There are several types of data-link-layer addresses, and the Ethernet MAC address is just one of them.
This raises one more problem. As I said in my first question, I need to convert a link-layer address given on the command line to its standard form. The solution provided here works for Ethernet MAC addresses only.
Isn't there a standard function for this purpose? What I would like to do is create an application where the user enters values for the different options present in an ICMP router advertisement message, as stated in RFC 4861.
Option Formats
Neighbor Discovery messages include zero or more options, some of
which may appear multiple times in the same message. Options should
be padded when necessary to ensure that they end on their natural
64-bit boundaries. All options are of the form:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type | Length | ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
~ ... ~
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Fields:
Type 8-bit identifier of the type of option. The
options defined in this document are:
Option Name Type
Source Link-Layer Address 1
Target Link-Layer Address 2
Prefix Information 3
Redirected Header 4
MTU 5
Length 8-bit unsigned integer. The length of the option
(including the type and length fields) in units of
8 octets. The value 0 is invalid. Nodes MUST
silently discard an ND packet that contains an
option with length zero.
4.6.1. Source/Target Link-layer Address
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     Type      |    Length     |    Link-Layer Address ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Fields:
Type
1 for Source Link-layer Address
2 for Target Link-layer Address
Length The length of the option (including the type and
length fields) in units of 8 octets. For example,
the length for IEEE 802 addresses is 1
[IPv6-ETHER].
Link-Layer Address
The variable length link-layer address.
The content and format of this field (including
byte and bit ordering) is expected to be specified
in specific documents that describe how IPv6
operates over different link layers. For instance,
[IPv6-ETHER].
One more thing: I am not very comfortable with C++, so can you please provide a C alternative?
Thanks.
As for your first question, it's not that hard to write, and since a MAC address is represented as a 6-byte array you don't need to take machine dependencies (such as endianness) into account.
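#include <cstddef>
#include <string>
using std::string;

// Parses "aa:bb:cc:dd:ee:ff" (lowercase hex digits assumed) into mac[0..5].
void str2MAC(string str, char* mac) {
    // The first five bytes are each followed by a ':' separator.
    for (int i = 0; i < 5; i++) {
        string b = str.substr(0, str.find(':'));
        str = str.substr(str.find(':') + 1);
        mac[i] = 0;
        for (std::size_t j = 0; j < b.size(); j++) {
            mac[i] *= 0x10;
            mac[i] += (b[j] > '9' ? b[j] - 'a' + 10 : b[j] - '0');
        }
    }
    // The last byte is whatever remains after the fifth ':'.
    mac[5] = 0;
    for (std::size_t i = 0; i < str.size(); i++) {
        mac[5] *= 0x10;
        mac[5] += (str[i] > '9' ? str[i] - 'a' + 10 : str[i] - '0');
    }
}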
About your second question: IP (and IPv6 specifically) is a network-layer protocol that sits above the link layer, so it is independent of the link layer.
If by link layer you mean Ethernet, then yes, an Ethernet address is always 48 bits, but other link-layer protocols exist that may use other formats.
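Since the question asks for a C alternative, here is one possible sketch for the Ethernet case only. It is not a standard function: mac_pton and build_sllao are hypothetical names, and the parser relies on sscanf, so it is more lenient than a strict validator (for example, it accepts one-digit bytes). The second helper fills the 8-byte Source Link-Layer Address option quoted above (type 1, length 1 in units of 8 octets) for a 48-bit address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parses "aa:bb:cc:dd:ee:ff" into 6 bytes.
 * Returns 0 on success, -1 if the text does not hold exactly 6 bytes. */
int mac_pton(const char *text, uint8_t mac[6])
{
    unsigned int b[6];
    char extra;  /* only filled if trailing characters follow the address */
    assert(text != NULL && mac != NULL);
    if (sscanf(text, "%2x:%2x:%2x:%2x:%2x:%2x%c",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &extra) != 6)
        return -1;
    for (int i = 0; i < 6; i++)
        mac[i] = (uint8_t)b[i];
    return 0;
}

/* Hypothetical helper: builds the 8-byte Source Link-Layer Address
 * option for an Ethernet address (type 1, length 1 unit of 8 octets). */
void build_sllao(const uint8_t mac[6], uint8_t option[8])
{
    assert(mac != NULL && option != NULL);
    option[0] = 1;               /* Type: Source Link-Layer Address */
    option[1] = 1;               /* Length: 1 * 8 octets */
    memcpy(&option[2], mac, 6);  /* the 48-bit address itself */
}
```

Other link layers would need their own parser and a different option length, which is why a single generic function for every link-layer address format is hard to provide.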
