I'm new to C and learning it for a class right now. We are currently working on a little project and we are supposed to use pointer arithmetic to access arrays as opposed to the standard [] way.
For some reason, I can use it just fine on the first loop (see code) but when I use it in the second, it doesn't produce the same outcome as if I were to use the standard [] way.
for (int i = 0; i < size; i++) {
for (int j = 0; j < size; j++) {
int num = *(*array+i)+j;
//Irrelevant code
}
}
for (int i = 0; i < size; i++) {
for (int j = 0; j < size; j++) {
int num = array[j][i]; // Error comes if I do *(*array+j)+i;
//Irrelevant code
}
}
I don't know if I am missing something here but why would calling the array using pointer arithmetic be different between the 2 loops?
The equivalence between subscripts and pointer notation is:
a[i] == *(a + i)
You are using (*a + i) in place of the correct *(a + i).
I believe your first set of loops should read:
for (int i = 0; i < size; i++) {
for (int j = 0; j < size; j++) {
int spot = *(*(board+i)+j);
for (int k = j + 1; k < size; k++) {
if (spot == *(*(board + i) + k) && spot > 0) {
return 0;
}
}
}
}
However, since you've not provided an MCVE (Minimal, Complete, Verifiable Example; or MRE, or whatever name SO now uses) or an SSCCE (Short, Self-Contained, Correct Example; the same idea by a different name), I can't easily test the code.
Also, now you know why it is better to use the explicit subscript notation; it is a lot harder to get it wrong.
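To make the difference concrete, here is a small sketch that puts the two expressions side by side. The `SIZE` macro, the function names, and the pointer-to-array parameter type are assumptions for illustration (the question does not show how `array` is declared); the same parsing applies if `array` is an `int **`.

```c
#define SIZE 3  /* hypothetical dimension, for illustration only */

/* Correct translation of array[i][j]: step to row i, dereference it,
   step to column j, dereference again. */
int get_correct(int (*array)[SIZE], int i, int j)
{
    return *(*(array + i) + j);
}

/* The expression from the question. *array is row 0, so this reads
   array[0][i] and then adds j to the value found there. */
int get_buggy(int (*array)[SIZE], int i, int j)
{
    return *(*array + i) + j;
}
```

So the expression from the question never leaves row 0, and the `+ j` is applied to the element's value rather than to an address.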
So, I'll preface this by saying I'm fairly new to pointers and dynamic allocation. I am currently trying to read a file that contains a 3x3 matrix of ints into a 2D array. I've tried debugging my code, and what I notice is that it reads my first 2 values but then begins to put random garbage into my 2D array. I assume that I am storing my ints incorrectly and that there is a flaw in my logic, but however much I think about it, I can't find where it could be incorrect, as the indices move through [0][0], [0][1], etc.
Here is my code for reference. Thanks, I'd appreciate just some guidance on how I can troubleshoot this problem for this specific case and future issues.
#include <stdio.h>
#include <stdlib.h>
int main() {
FILE* fpM1;
fpM1 = fopen("m1.txt", "r");
int i, j, row1 = 2, col1 = 2;
int* ptrM1 = (int* )malloc(9 * sizeof(int));
if (fpM1 != NULL) {
for (i = 0; i < row1; i++) {
for (j = 0; j < col1; j++) {
fscanf(fpM1, "%d", ((ptrM1 + i) + j));
}
}
for (i = 0; i < row1; i++)
for (j = 0; j < col1; j++) {
{
printf(" %d", *((ptrM1 + i) + j));
}
}
}
free(ptrM1);
fclose(fpM1);
return 0;
}
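Your for loops end too early: with row1 = 2 and col1 = 2 they only visit two rows and two columns of the 3x3 matrix. You should use <= instead of <. After that, everything seems to work perfectly.
You might also print a newline at the end of each pass of the outer for loop when printing the array. It makes the output easier to read: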
for (i = 0; i < row1; i++) {
for (j = 0; j < col1; j++) {
printf(" %d", *(ptrM1 + i));
}
printf("\n");
}
Also, you don't need the doubled braces before the first printf statement. C lets you open a nested block anywhere, but you shouldn't do that without a reason.
EDIT: That alone doesn't actually work, because (ptrM1 + i) + j maps different (i, j) pairs to the same element. Reading from the file should be done this way:
fscanf(fpM1, "%d", ptrM1 + (i * (col1 + 1) + j));
and printing:
printf(" %d", ptrM1[i * (col1 + 1) + j]);
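Putting the pieces together, here is a minimal sketch of the reading loop with the flat index computed as `i * cols + j`. The function name `read_matrix` and its parameters are made up for illustration; it assumes `rows` and `cols` hold the real dimensions (3 and 3 here) rather than the largest index, which lets the loops use the usual `<`.

```c
#include <stdio.h>

/* Reads rows * cols integers from fp into the flat buffer m, row by row.
   Returns the number of values actually read. */
int read_matrix(FILE *fp, int *m, int rows, int cols)
{
    int count = 0;
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            /* element (i, j) lives at offset i * cols + j */
            if (fscanf(fp, "%d", m + (i * cols + j)) != 1) {
                return count;   /* stop on short or malformed input */
            }
            count++;
        }
    }
    return count;
}
```

Checking the return value of fscanf, and of fopen before it, also catches a missing or truncated file before garbage ends up in the array.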
I have just been given an assignment to re-write the following C function, to help the ARM compiler produce more efficient assembly code. Does anyone know how to do this?
void some_function(int *data)
{
int i, j;
for (i = 0; i < 64; i++)
{
for (j = 0; j < 64; j++)
data[j + 64*i] = (i + j)/2;
}
}
First (as Jonathan Leffler mentioned) the compiler is likely to do so good a job already that trying to optimise by writing specific C code is usually commercially questionable, i.e. you lose more money via development time than you can make by slightly faster code.
But sometimes it is worth it; let's assume it is the case here.
If you do optimise, do so while measuring. It is quite possible to write code that ends up slower, because hand-tuning can subtly defeat optimisations the compiler would otherwise apply. Also, whether and how much an optimisation helps depends on the environment, so you need to measure in every environment you care about.
Ok, after that wise-cracking, here is code in which I demonstrate optimisations as proposed in comments, one of them by Jonathan Leffler:
/* Jonathan Leffler */
void some_function(int *data)
{
int i, j;
int k = 0;
for (i = 0; i < 64; i++)
{
for (j = 0; j < 64; j++)
{
data[k++] = (i + j)/2;
}
}
}
/* Yunnosch 1, loop unrolling by 2 */
void some_function(int *data)
{
int i, j;
for (i = 0; i < 64; i++)
{
for (j = 0; j < 64; j+=2)
{
data[j + 64*i] = (i + j )/2;
data[j + 1 + 64*i] = (i + j+1)/2;
}
}
}
/* Yunnosch 1 and Jonathan Leffler */
void some_function(int *data)
{
int i, j;
int k=0; /* Jonathan Leffler */
for (i = 0; i < 64; i++)
{
for (j = 0; j < 64; j+=2) /* Yunnosch */
{
data[k++] = (i + j )/2;
data[k++] = (i + j+1)/2; /* Yunnosch */
}
}
}
/* Yunnosch 2, avoiding the /2, including Jonathan Leffler */
/* Well, duh. This is harder than I thought...
I admit that this is NOT tested, I want to demonstrate the idea.
Everybody feel free to help the very grateful me with fixing errors. */
void some_function(int *data)
{
int i, j;
int k=0;
for (i = 0; i < 32; i++) /* magic numbers I normally avoid, 32 is 64/2 */
{
for (j = 0; j < 32; j++)
{
data[k ] = (i + j);
data[k+1 ] = (i + j);
data[k +64] = (i + j);
data[k+1+64] = (i + j +1);
k+=2;
}
k+=64;
}
}
The last version is based on the following observable 2x2 group pattern in the desired result, as seen in a 2D interpretation:
00 11 ...
01 12 ...
11 22 ...
12 23 ...
.. ..
.. ..
.. ..
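Since that last version is explicitly untested, one way to gain some confidence in the 2x2 pattern is to check it against the output of the original loop. The sketch below does that; the function names are made up for illustration, and the pattern it checks is the one drawn above: block (bi, bj) holds s, s on its first row and s, s+1 on its second, with s = bi + bj.

```c
/* Reference fill, the same computation as the function in the question. */
void fill_reference(int *data)
{
    for (int i = 0; i < 64; i++)
        for (int j = 0; j < 64; j++)
            data[j + 64 * i] = (i + j) / 2;
}

/* Returns 1 if every 2x2 block follows the pattern, 0 otherwise. */
int blocks_match_pattern(const int *data)
{
    for (int bi = 0; bi < 32; bi++) {
        for (int bj = 0; bj < 32; bj++) {
            /* top-left cell of block (bi, bj) */
            const int *p = data + 64 * (2 * bi) + 2 * bj;
            int s = bi + bj;
            if (p[0] != s || p[1] != s || p[64] != s || p[65] != s + 1)
                return 0;
        }
    }
    return 1;
}
```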
Optimizing C code to generate "more efficient assembly code" for a specific compiler/processor is something you normally shouldn't do. Write clear and easy to understand C code and let the compiler do the optimization.
Even if you make all kinds of tricks with the C code and end up with "more efficient assembly code" for your specific compiler/processor, it may turn out that a simple compiler upgrade may ruin the whole thing and you'll have to change the C code again.
For something as simple as your code, write it in assembler code from the start. But be aware that you'll have to be a real expert in that processor/assembly language to beat a decent compiler.
Anyway... If we want to guess, this is my guess:
void some_function(int *data)
{
int i, j, x;
for (i = 0; i < 64; i++)
{
// Handle even i-values
x = i/2;
for (j = 0; j < 64; j += 2)
{
*data = x;
++data;
*data = x;
++data;
++x; // Increment after writing to data twice
}
++i;
// Handle odd i-values
x = i/2;
for (j = 0; j < 64; j += 2)
{
*data = x;
++data;
++x; // Increment after writing to data once
*data = x;
++data;
}
}
}
The idea is 1) to replace the array-indexing with pointer increments and 2) to replace the (i+j)/2 with integer increments.
I have not done any measurement so I can't say for sure that this will be a good solution. I'll leave that to OP.
Same idea as above, but with a few more tweaks (proposed by @user3386109):
void some_function(int *data)
{
for (int i = 0; i < 32; i++)
{
// even row (row 2*i): the output is in matched pairs
int value = i;
for (int j = 0; j < 32; j++)
{
*data++ = value;
*data++ = value++;
}
// odd row (row 2*i + 1): the output starts with a singleton,
// followed by matched pairs, and ends with a singleton
value = i;
*data++ = value++;
for (int j = 0; j < 31; j++)
{
*data++ = value;
*data++ = value++;
}
*data++ = value;
}
}
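Whichever variant you try, it is worth confirming that it still writes exactly what the original wrote before you measure anything. Here is a minimal sketch of such a check, using the pointer-increment version as the candidate; the names `fill_plain`, `fill_incremental` and `outputs_agree` are made up for illustration.

```c
#include <string.h>

#define N 64

/* Straightforward version from the question, used as the reference. */
void fill_plain(int *data)
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            data[j + N * i] = (i + j) / 2;
}

/* Pointer-increment version: two rows per outer iteration, no division. */
void fill_incremental(int *data)
{
    for (int i = 0; i < N / 2; i++) {
        int value = i;
        for (int j = 0; j < N / 2; j++) {      /* even row: matched pairs */
            *data++ = value;
            *data++ = value++;
        }
        value = i;
        *data++ = value++;                     /* odd row: leading singleton */
        for (int j = 0; j < N / 2 - 1; j++) {
            *data++ = value;
            *data++ = value++;
        }
        *data++ = value;                       /* odd row: trailing singleton */
    }
}

/* Returns 1 when both versions write identical contents. */
int outputs_agree(void)
{
    static int expected[N * N];
    static int actual[N * N];
    fill_plain(expected);
    fill_incremental(actual);
    return memcmp(expected, actual, sizeof expected) == 0;
}
```

This only establishes correctness. Whether either version is actually faster on a given ARM target still has to be measured there.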
So I have this function, which takes an array of structs and sorts it alphabetically, as shown:
void string_rearrangement(Employee payroll[], int size)
{
//declaring temporary struct
Employee temp;
int i = 0, j = 0;
for (i = 0; i <= size; i++)
{
for (j = i + 1; j <= size; j++)
{
if (strcmp(payroll[i].name, payroll[j].name) >0)
{
temp = payroll[i];
payroll[i] = payroll[j];
payroll[j] = temp;
}
}
}
}
It works. I've tested it. Here's a screenshot of the output I get when I print the names: https://i.imgur.com/YhUzUa0.png
But now I wanted to change it so it would give me the reverse alphabetical order. I thought this would just be changing the strcmp condition to <0 as shown:
if (strcmp(payroll[i].name, payroll[j].name) <0)
But when I do that, I get some crazy output like this: https://i.imgur.com/JdI0v8b.png
The only thing I changed between the code was the >0 so I'm not sure why this is happening.
for (i = 0; i <= size; i++)
{
for (j = i + 1; j <= size; j++)
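You should almost certainly use < instead of <= here. Assuming size is the number of elements, the loops with <= read and write payroll[size], which is one past the last element, so you’re running over the array bounds and into undefined behaviour. You simply got incredibly lucky that your code worked before you changed the sort order.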
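For reference, here is a sketch of the reverse-order sort with the bounds corrected. The `Employee` definition is a stand-in (the question does not show the real one), and the function name is made up; it assumes `size` is the number of elements in the array.

```c
#include <string.h>

/* Hypothetical record type; the question does not show the real Employee. */
typedef struct {
    char name[32];
} Employee;

/* Sorts payroll[0 .. size-1] into reverse alphabetical order by name. */
void sort_names_descending(Employee payroll[], int size)
{
    for (int i = 0; i < size - 1; i++) {
        for (int j = i + 1; j < size; j++) {      /* j stops at size - 1 */
            if (strcmp(payroll[i].name, payroll[j].name) < 0) {
                Employee temp = payroll[i];
                payroll[i] = payroll[j];
                payroll[j] = temp;
            }
        }
    }
}
```

With the loops kept inside the array, flipping the comparison between > 0 and < 0 is all it takes to switch the order.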
This question contains a JavaScript example but it could possibly be relevant for other languages as well.
I have a 2D binary array (values are set to 1 and 0 only). I want to perform an action which toggles all values, meaning turn every 0 into 1 and every 1 into 0.
Which is a better way to do it:
1)
for(var i = 0; i < rowsNum; i++)
{
for(var j = 0; j < colNum; j++)
{
if(arr[i][j] == 0)
{
arr[i][j] = 1;
}
else
{
arr[i][j] = 0;
}
}
}
or
2)
for(var i = 0; i < rowsNum; i++)
{
for(var j = 0; j < colNum; j++)
{
arr[i][j] = 1 - arr[i][j];
}
}
I would like to know if there's a generic method which is best for most cases. Also, specifically regarding JS, is there a better way to do it than these 2 methods?
I would go for the second way of doing it, or I would use the xor operation, like this:
for (var i = 0; i < rows; i++) {
for (var j = 0; j < cols; j++) {
arr[i][j] ^= 1;
}
}
The thing is, if statements can translate into branching instructions, which can be slow due to branch mispredictions. However, the performance gain in an example like this will barely show, and if it makes the code less readable, then it's not worth it. Optimize last, and only if it's absolutely necessary.
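Since the question mentions other languages, here is the same branch-free toggle sketched in C. The function name and the flat, row-by-row storage of the grid are assumptions for illustration; it relies on every cell already being 0 or 1.

```c
/* Flips every cell of a rows x cols grid of 0/1 values stored row by row. */
void toggle_all(int *cells, int rows, int cols)
{
    for (int i = 0; i < rows * cols; i++) {
        cells[i] ^= 1;   /* 0 -> 1, 1 -> 0; same result as 1 - cells[i] here */
    }
}
```

Applying it twice gives back the original grid, which makes for an easy sanity check.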
Which of these optimizations is better and in what situation? Why?
Intuitively, I am getting the feeling that loop tiling will in general
be a better optimization.
What about for the below example?
Assume a cache which can only store about 20 elements in it at any time.
Original Loop:
for(int i = 0; i < 10; i++)
{
for(int j = 0; j < 1000; j++)
{
a[i] += a[i]*b[j];
}
}
Loop Interchange:
for(int i = 0; i < 1000; i++)
{
for(int j = 0; j < 10; j++)
{
a[j] += a[j]*b[i];
}
}
Loop Tiling:
for(int k = 0; k < 1000; k += 20)
{
for(int i = 0; i < 10; i++)
{
for(int j = k; j < min(1000, k+20); j++)
{
a[i] += a[i]*b[j];
}
}
}
The first two versions in your question behave about the same. Things would really change in the following two cases:
CASE 1:
for(int i = 0; i < 10; i++)
{
for(int j = 0; j < 1000; j++)
{
b[i] += a[i][j];
}
}
Here you are accessing the matrix "a" as follows: a[0][0], a[0][1], a[0][2], .... Suppose the matrix is stored in memory column by column: a[0][0], a[1][0], a[2][0], ... (the first row of the first column, followed by the second row of the first column, and so on). Note that C itself stores arrays row by row, so in C the roles of the two cases are swapped. Imagine your cache could only store 5 elements and your matrix is 6x6. The first "pack" of elements brought into the cache would be a[0][0] to a[4][0]. Your first access causes no further cache miss, since a[0][0] is now in the cache, but the second one does: a[0][1] is not in the cache! The hardware then brings in the pack of elements a[0][1] to a[4][1]. Then you do the third access, a[0][2], which is out of cache again. Another cache miss!
As you can conclude, case 1 is not a good solution for the problem. It causes lots of cache misses, which we can avoid by changing the code to the following:
CASE 2:
for(int i = 0; i < 10; i++)
{
for(int j = 0; j < 1000; j++)
{
b[i] += a[j][i];
}
}
Here, as you can see, we are accessing the matrix as it's stored in memory. Consequently it's much better (faster) than case 1.
As for the third piece of code you posted, loop tiling and loop unrolling are optimizations that in most cases the compiler performs automatically. There is a very interesting post on Stack Overflow explaining these two techniques.
Hope it helps! (sorry about my english, I'm not a native speaker)
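For C specifically, two-dimensional arrays are laid out row by row, so the cache-friendly order is the one whose inner loop varies the last subscript. The sketch below shows both traversal orders over the same matrix. The sizes and function names are made up for illustration, and on a matrix this small the difference would not be measurable; it only shows the shape of the two loops.

```c
#define ROWS 4   /* hypothetical sizes, small enough to check by hand */
#define COLS 5

/* Inner loop walks along a row: consecutive addresses in C's layout. */
long sum_row_order(int m[ROWS][COLS])
{
    long total = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            total += m[i][j];
    return total;
}

/* Interchanged loops: the inner loop jumps COLS elements at a time,
   so successive accesses are strided rather than adjacent. */
long sum_column_order(int m[ROWS][COLS])
{
    long total = 0;
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            total += m[i][j];
    return total;
}
```

Both functions compute the same sum; only the order in which memory is touched differs, and that order is exactly what loop interchange changes.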