Why array indexes are zero based in most programming languages? - arrays

C++, C#, C, D, Java,... are zero based.
Matlab is the only language I know that begin at 1.

Arrays are zero based in c and c++ as the represent the offset from the beginning of the list of the item.
These two lines have identical result in c.
anArray[3] = 4;
*(anArray +3) = 4;
The first is the standard indexer the second takes the pointer adds three to id and then dereffrences it. Which is the same as the indexer.

Well, consider Dijkstra's famous article, Why numbering should start at zero. He argues that numbering should start at 0 because it means that the valid indexes into an array can be described as 0 <= i < N. This is clearly more appealing than 1 <= i < N + 1, on an aesthetic level.
(One could ask, "why not say 0 < i <= N", but he argues against that, too, again for aesthetic reasons.)

I guess because arrays use pointer arithmetic to refer to some value. Basically arrays have contiguous memory and if you want to refer to 5th element (a[4]) then a + 4 * size of int is performed
Say if you start with 1 then to refer to 5th element you will have to do something like a + (5-1) * size of int

I guess it has mostly historical reasons, new languages just try to use the existing convention which programmers are familiar with.
Older languages from which this rule originated were close to the metal, and an index is really the distance from the starting element, hence 0 makes sense for the first element.

Probably "C" got it because it is more efficient. To calculate address of item in 0-based array it is enough to multiple Index by ItemSize, for 1-based array you have to calculate (Index-1)*ItemSize. "C" and then "C++" where most popular languages, so new languages have to follow same rules, it helps to avoid mistakes for those who use C/C++.
But this question seems to be offtopic and i guess it will be closed by moderator.
P.S. In Delphi/Pascal strings are 1-based, but for arrays you have to provide range and so you can use what you like.

Because there are 10 integers 0..9

Related

Theory of arrays in Z3: (1) model is difficult to understand, (2) do not know how to implement functions and (3) difference with sequences

Following to the question published in How expressive can we be with arrays in Z3(Py)? An example, I expressed the following formula in Z3Py:
Exists i::Integer s.t. (0<=i<|arr|) & (avg(arr)+t<arr[i])
This means: whether there is a position i::0<i<|arr| in the array whose value a[i] is greater than the average of the array avg(arr) plus a given threshold t.
The solution in Z3Py:
t = Int('t')
avg_arr = Int('avg_arr')
len_arr = Int('len_arr')
arr = Array('arr', IntSort(), IntSort())
phi_1 = And(0 <= i, i< len_arr)
phi_2 = (t+avg_arr<arr[i])
phi = Exists(i, And(phi_1, phi_2))
s = Solver()
s.add(phi)
print(s.check())
print(s.model())
Note that, (1) the formula is satisfiable and (2) each time I execute it, I get a different model. For instance, I just got: [avg_a = 0, t = 7718, len_arr = 1, arr = K(Int, 7719)].
I have three questions now:
What does arr = K(Int, 7719)] mean? Does this mean the array contains one Int element with value 7719? In that case, what does the K mean?
Of course, this implementation is wrong in the sense that the average and length values are independent from the array itself. How can I implement simple avg and len functions?
Where is the i index in the model given by the solver?
Also, in which sense would this implementation be different using sequences instead of arrays?
(1) arr = K(Int, 7719) means that it's a constant array. That is, at every location it has the value 7719. Note that this is truly "at every location," i.e., at every integer value. There's no "size" of the array in SMTLib parlance. For that, use sequences.
(2) Indeed, your average/length etc are not related at all to the array. There are ways of modeling this using quantifiers, but I'd recommend staying away from that. They are brittle, hard to code and maintain, and furthermore any interesting theorem you want to prove will get an unknown as answer.
(3) The i you declared and the i you used as the existential is completely independent of each other. (Latter is just a trick so z3 can recognize it as a value.) But I guess you removed that now.
The proper way to model such problems is using sequences. (Although, you shouldn't expect much proof performance there either.) Start here: https://microsoft.github.io/z3guide/docs/theories/Sequences/ and see how much you can push it through. Functions like avg will need a recursive definition most likely, for that you can use RecAddDefinition, for an example see: https://stackoverflow.com/a/68457868/936310
Stack-overflow works the best when you try to code these yourself and ask very specific questions about how to proceed, as opposed to overarching questions. (But you already knew that!) Best of luck..

Dynamically indexing an array in C

Is it possible to create arrays based of their index as in
int x = 4;
int y = 5;
int someNr = 123;
int foo[x][y] = someNr;
dynamically/on the run, without creating foo[0...3][0...4]?
If not, is there a data structure that allow me to do something similar to this in C?
No.
As written your code make no sense at all. You need foo to be declared somewhere and then you can index into it with foo[x][y] = someNr;. But you cant just make foo spring into existence which is what it looks like you are trying to do.
Either create foo with correct sizes (only you can say what they are) int foo[16][16]; for example or use a different data structure.
In C++ you could do a map<pair<int, int>, int>
Variable Length Arrays
Even if x and y were replaced by constants, you could not initialize the array using the notation shown. You'd need to use:
int fixed[3][4] = { someNr };
or similar (extra braces, perhaps; more values perhaps). You can, however, declare/define variable length arrays (VLA), but you cannot initialize them at all. So, you could write:
int x = 4;
int y = 5;
int someNr = 123;
int foo[x][y];
for (int i = 0; i < x; i++)
{
for (int j = 0; j < y; j++)
foo[i][j] = someNr + i * (x + 1) + j;
}
Obviously, you can't use x and y as indexes without writing (or reading) outside the bounds of the array. The onus is on you to ensure that there is enough space on the stack for the values chosen as the limits on the arrays (it won't be a problem at 3x4; it might be at 300x400 though, and will be at 3000x4000). You can also use dynamic allocation of VLAs to handle bigger matrices.
VLA support is mandatory in C99, optional in C11 and C18, and non-existent in strict C90.
Sparse arrays
If what you want is 'sparse array support', there is no built-in facility in C that will assist you. You have to devise (or find) code that will handle that for you. It can certainly be done; Fortran programmers used to have to do it quite often in the bad old days when megabytes of memory were a luxury and MIPS meant millions of instruction per second and people were happy when their computer could do double-digit MIPS (and the Fortran 90 standard was still years in the future).
You'll need to devise a structure and a set of functions to handle the sparse array. You will probably need to decide whether you have values in every row, or whether you only record the data in some rows. You'll need a function to assign a value to a cell, and another to retrieve the value from a cell. You'll need to think what the value is when there is no explicit entry. (The thinking probably isn't hard. The default value is usually zero, but an infinity or a NaN (not a number) might be appropriate, depending on context.) You'd also need a function to allocate the base structure (would you specify the maximum sizes?) and another to release it.
Most efficient way to create a dynamic index of an array is to create an empty array of the same data type that the array to index is holding.
Let's imagine we are using integers in sake of simplicity. You can then stretch the concept to any other data type.
The ideal index depth will depend on the length of the data to index and will be somewhere close to the length of the data.
Let's say you have 1 million 64 bit integers in the array to index.
First of all you should order the data and eliminate duplicates. That's something easy to achieve by using qsort() (the quick sort C built in function) and some remove duplicate function such as
uint64_t remove_dupes(char *unord_arr, char *ord_arr, uint64_t arr_size)
{
uint64_t i, j=0;
for (i=1;i<arr_size;i++)
{
if ( strcmp(unord_arr[i], unord_arr[i-1]) != 0 ){
strcpy(ord_arr[j],unord_arr[i-1]);
j++;
}
if ( i == arr_size-1 ){
strcpy(ord_arr[j],unord_arr[i]);
j++;
}
}
return j;
}
Adapt the code above to your needs, you should free() the unordered array when the function finishes ordering it to the ordered array. The function above is very fast, it will return zero entries when the array to order contains one element, but that's probably something you can live with.
Once the data is ordered and unique, create an index with a length close to that of the data. It does not need to be of an exact length, although pledging to powers of 10 will make everything easier, in case of integers.
uint64_t* idx = calloc(pow(10, indexdepth), sizeof(uint64_t));
This will create an empty index array.
Then populate the index. Traverse your array to index just once and every time you detect a change in the number of significant figures (same as index depth) to the left add the position where that new number was detected.
If you choose an indexdepth of 2 you will have 10² = 100 possible values in your index, typically going from 0 to 99.
When you detect that some number starts by 10 (103456), you add an entry to the index, let's say that 103456 was detected at position 733, your index entry would be:
index[10] = 733;
Next entry begining by 11 should be added in the next index slot, let's say that first number beginning by 11 is found at position 2023
index[11] = 2023;
And so on.
When you later need to find some number in your original array storing 1 million entries, you don't have to iterate the whole array, you just need to check where in your index the first number starting by the first two significant digits is stored. Entry index[10] tells you where the first number starting by 10 is stored. You can then iterate forward until you find your match.
In my example I employed a small index, thus the average number of iterations that you will need to perform will be 1000000/100 = 10000
If you enlarge your index to somewhere close the length of the data the number of iterations will tend to 1, making any search blazing fast.
What I like to do is to create some simple algorithm that tells me what's the ideal depth of the index after knowing the type and length of the data to index.
Please, note that in the example that I have posed, 64 bit numbers are indexed by their first index depth significant figures, thus 10 and 100001 will be stored in the same index segment. That's not a problem on its own, nonetheless each master has his small book of secrets. Treating numbers as a fixed length hexadecimal string can help keeping a strict numerical order.
You don't have to change the base though, you could consider 10 to be 0000010 to keep it in the 00 index segment and keep base 10 numbers ordered, using different numerical bases is nonetheless trivial in C, which is of great help for this task.
As you make your index depth become larger, the amount of entries per index segment will be reduced
Please, do note that programming, especially lower level like C consists in comprehending the tradeof between CPU cycles and memory use in great part.
Creating the proposed index is a way to reduce the number of CPU cycles required to locate a value at the cost of using more memory as the index becomes larger. This is nonetheless the way to go nowadays, as masive amounts of memory are cheap.
As SSDs' speed become closer to that of RAM, using files to store indexes is to be taken on account. Nevertheless modern OSs tend to load in RAM as much as they can, thus using files would end up in something similar from a performance point of view.

Context-free grammar in C

I have an assignment to make a program in C that displays a number (n < 50) of valid, context-free grammar strings using the following context-free grammar:
S -> AA|0
A -> SS|1
I had few concepts of how to do it, but after analyzing them more and more, none of them were right.
For now, I'm planning to make an array and randomly change [..., A, ...] for [..., S, S, ...] or [..., 1, ...] until there are only 0s and 1s and then check whether the same thing was already randomly generated.
I'm still not convinced if that is the right approach, and I still don't know exactly how to do that or where to keep the final words because the basic form will be an array of chars of different length. Also, in C, is a two dimensional array of chars equal to an array of strings?
Does this make any sense, and is it a proper way to do it? Or am I missing something?
You can simply make a random decision every time you need to decide on something. For example:
function A():
if (50% random chance)
return "1"
else
return concat(S(), S())
function S():
if (50% random chance)
return "0"
else
return concat(A(), A())
Calling S() multiple times give me these outputs:
"0"
"00110110100100101111010111111111001111101011100100011000000110101110000110101110
10001000110001111100011000101011000001101111000110110011101010111111111011010011
10000000101111100100011011010000000101000111110010001000101001100110100111111111
1001010011"
"11"
"10010010101111010111101"
All valid strings for your grammar. Note that you may need to tweak a little the random chances. This sample has a high probability to generate very small strings like "11".
Try to think of the context-free grammar as a set of rules that allow you to generate new strings in a language. For example, the first rule:
S -> AA | 0
How could you generate a word S in this language? One way is with a function that generates, at random, either the string "0" or two A words, concatenated.
Similarly, to implement the second rule:
A -> SS | 1
write a function that generates, at random, either "1" or two S words concatenated.
You asked several questions...
Regarding The question: BTW in C, is two dimensional array of chars equal to array of strings?
Yes.
Here are ways to declare arrays of strings, each example shows varying flexibility in terms of usage:
char **ArrayOfStrings; //most flexible declaration -
//pointer to pointer, can use `calloc()` or `malloc()` to create memory for
//any number of strings of any length (all strings will have same length)
or
char *ArrayOfStrings[10]; //somewhat flexible -
//pointer to array of 10 strings, again can use `c(m)alloc()` to allocate memory for
//each string to have any lenth (all strings will have same length)
or
ArrayOfStrings[5][10]; //Not flexible - (but still very useful)
//2 dimensional array of 5 strings, each with space for up to 9 chars + '\0'
//Note: In C, by definition, strings must always be NULL terminated.
Note: Although each of these forms are valid, and very useful when used correctly, It is good to be aware there are differences in the way each will behave in practice. (read the link for a good discussion on that)

The idea of C array in matlab

Is it possible to apply the idea of array in C to MATLAB
For example, if we have
Double array[10];
and if we want to assign a value we write for example
Array[5]=2;
Is there any way to write equivalent one in MATLAB ?
I'm not sure what you mean by "Is it possible to apply the idea of array in C to MATLAB". An array is just a 1D list of numbers (or other data types). MATLAB primarily is designed to work with matrices (MATLAB is short for Matrix laborartory) and an array or vector is simply a special case of a matrix. So I guess the answer to your question is yes, if I have understood correctly.
To initialise arrays or matrices in MATLAB we use zeros or ones:
>> array = zeros(1,5)
array =
0 0 0 0 0
We can then index elements of the array in the same way as C:
>> array(3) = 3
array =
0 0 3 0 0
Note, however, that MATLAB array indexing is one based whereas C arrays are zero based.
This article describes matrix/array indexing in MATLAB.
You can define your own class, override the [] operator.
I described the mechanism in Here
Since it is a custom function, you might as well change the 1-based indexing to 0-based indexing.
Regarding the constructor, I doubt that you can do it.
Anyhow, why would you want to do it?
You will confuse all of the Matlab users, and cause havoc.
When in Rome, do as Romans do.
Yes you can. Arrays are used in C and MATLAB and they can be used for the same functions. Except, please keep in mind the array-indexing of C and MATLAB are different.
The first element of a C array has an index of zero. i.e. in X = [10 20 30 40], x[0] will return 10. But in MATLAB, this will give an error. To access the number 10, you have to use the expression x[1] in MATLAB.
There is no indexing operator []. You must use () for indexing array.
If you write
x = 1:10;
x[2]
then you'll get the following error
x[2]
|
Error: Unbalanced or unexpected parenthesis or bracket.

changing the index of array

so far, i m working on the array with 0th location but m in need to change it from 0 to 1 such that if earlier it started for 0 to n-1 then now it should start form 1 to n. is there any way out to resolve this problem?
C arrays are zero-based and always will be. I strongly suggest sticking with that convention. If you really need to treat the first element as having index 1 instead of 0, you can wrap accesses to that array in a function that does the translation for you.
Why do you need to do this? What problem are you trying to solve?
Array indexing starts at zero in C; you cannot change that.
If you've specific requirements/design scenarios that makes sense to start indexes at one, declare the array to be of length n + 1, and just don't use the zeroth position.
Subtract 1 from the index every time you access the array to achieve "fake 1-based" indexing.
If you want to change the numbering while the program is running, you're asking for something more than just a regular array. If things only ever shift by one position, then allocate (n+1) slots and use a pointer into the array.
enum { array_size = 1000 };
int padded_array[ array_size + 1 ];
int *shiftable_array = padded_array; /* define pointer */
shiftable_array[3] = 5; /* pointer can be used as array */
some_function( shiftable_array );
/* now we want to renumber so element 1 is the new element 0 */
++ shiftable_array; /* accomplished by altering the pointer */
some_function( shiftable_array ); /* function uses new numbering */
If the shift-by-one operation is repeated indefinitely, you might need to implement a circular buffer.
You can't.
Well in fact you can, but you have to tweak a bit. Define an array, and then use a pointer to before the first element. Then you can use indexes 1 to n from this pointer.
int array[12];
int *array_starts_at_one = &array[-1]; // Don't use index 0 on this one
array_starts_at_one[1] = 1;
array_starts_at_one[12] = 12;
But I would advise against doing this.
Some more arguments why arrays are zero based can be found here. Infact its one of the very important and good features of the C programming language. However you can implement a array and start indexing from 1, but that will really take a lot of effort to keep track off.
Say you declare a integer array
int a[10];
for(i=1;i<10;i++)
a[i]=i*i;
You need to access all arrays with the index 1. Ofcourse you need to declare with the size (REQUIRED_SIZE_NORMALLY+1).
You should also note here that you can still access the a[0] element but you have to ignore it from your head and your code to achieve what you want to.
Another problem would be for the person reading your code. He would go nuts trying to figure out why did the numbering start from 1 and was the 0th index used for some hidden purpose which unfortunately he is unaware of.

Resources