Context-free grammar in C

I have an assignment to make a program in C that displays a number (n < 50) of valid, context-free grammar strings using the following context-free grammar:
S -> AA|0
A -> SS|1
I had a few ideas on how to do it, but after analyzing them further, none of them were right.
For now, I'm planning to make an array and randomly change [..., A, ...] for [..., S, S, ...] or [..., 1, ...] until there are only 0s and 1s and then check whether the same thing was already randomly generated.
I'm still not convinced if that is the right approach, and I still don't know exactly how to do that or where to keep the final words because the basic form will be an array of chars of different length. Also, in C, is a two dimensional array of chars equal to an array of strings?
Does this make any sense, and is it a proper way to do it? Or am I missing something?

You can simply make a random decision every time you need to decide on something. For example:
function A():
    if (50% random chance)
        return "1"
    else
        return concat(S(), S())
function S():
    if (50% random chance)
        return "0"
    else
        return concat(A(), A())
Calling S() multiple times gives me these outputs:
"0"
"00110110100100101111010111111111001111101011100100011000000110101110000110101110
10001000110001111100011000101011000001101111000110110011101010111111111011010011
10000000101111100100011011010000000101000111110010001000101001100110100111111111
1001010011"
"11"
"10010010101111010111101"
All of these are valid strings for your grammar. Note that you may need to tweak the random chances a little. This sample has a high probability of generating very small strings like "11".
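For the C side of the assignment, the pseudocode above could be sketched roughly as follows. The names gen_S, gen_A, random_word, MAX_DEPTH and WORD_BUF_SIZE are made up for this sketch. The depth cap is an added assumption that is not part of the pseudocode: once the recursion reaches MAX_DEPTH it always picks the terminal, so a word never exceeds 2^MAX_DEPTH characters and always fits the buffer.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_DEPTH 6                           /* assumed cap on recursion depth */
#define WORD_BUF_SIZE ((1 << MAX_DEPTH) + 1)  /* at most 2^MAX_DEPTH symbols + '\0' */

void gen_A(char *out, size_t *len, int depth);

/* S -> AA | 0 : append one random S-word to out. */
void gen_S(char *out, size_t *len, int depth)
{
    if (depth >= MAX_DEPTH || rand() % 2 == 0) {
        out[(*len)++] = '0';
    } else {
        gen_A(out, len, depth + 1);
        gen_A(out, len, depth + 1);
    }
}

/* A -> SS | 1 : append one random A-word to out. */
void gen_A(char *out, size_t *len, int depth)
{
    if (depth >= MAX_DEPTH || rand() % 2 == 0) {
        out[(*len)++] = '1';
    } else {
        gen_S(out, len, depth + 1);
        gen_S(out, len, depth + 1);
    }
}

/* Fill word (at least WORD_BUF_SIZE bytes) with one random word derived
   from S and return its length. */
size_t random_word(char *word)
{
    size_t len = 0;
    assert(word != NULL);
    gen_S(word, &len, 0);
    word[len] = '\0';
    return len;
}
```

Seed the generator once with srand, then call random_word in a loop. Every result is a valid word, but duplicates still have to be filtered out before printing.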

Try to think of the context-free grammar as a set of rules that allow you to generate new strings in a language. For example, the first rule:
S -> AA | 0
How could you generate a word S in this language? One way is with a function that generates, at random, either the string "0" or two A words, concatenated.
Similarly, to implement the second rule:
A -> SS | 1
write a function that generates, at random, either "1" or two S words concatenated.

You asked several questions...
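Regarding the question: BTW in C, is a two-dimensional array of chars equal to an array of strings?
Yes, a two-dimensional array of chars can serve as an array of strings, although it is not the same type as an array of char pointers.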
Here are ways to declare arrays of strings; the examples differ in how flexible they are to use:
char **ArrayOfStrings; // most flexible declaration:
// pointer to pointer; use `calloc()` or `malloc()` to create memory for
// any number of strings of any length (each string is allocated separately)
or
char *ArrayOfStrings[10]; // somewhat flexible:
// array of 10 pointers to char; again use `calloc()` or `malloc()` to allocate memory
// for each string, so each can have any length
or
char ArrayOfStrings[5][10]; // not flexible (but still very useful):
// 2-dimensional array of 5 strings, each with space for up to 9 chars + '\0'
// Note: in C, by definition, strings must always be null-terminated ('\0').
Note: Although each of these forms is valid, and very useful when used correctly, it is good to be aware that there are differences in the way each will behave in practice. (read the link for a good discussion on that)
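For the question's own case (fewer than 50 words made of 0s and 1s), the fixed two-dimensional form is probably enough to hold the final words and to check for repeats. A minimal sketch, assuming a cap of 64 characters per word. add_unique, MAX_WORDS and MAX_LEN are names invented here, not anything from the assignment:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 50   /* assumed: the assignment asks for n < 50 words */
#define MAX_LEN   64   /* assumed upper bound on word length */

/* Copy word into the next free row unless it is already stored, is too
   long, or the table is full. Returns 1 if stored, 0 otherwise. */
int add_unique(char words[][MAX_LEN + 1], size_t *count, const char *word)
{
    assert(words != NULL && count != NULL && word != NULL);
    if (*count >= MAX_WORDS || strlen(word) > MAX_LEN)
        return 0;
    for (size_t i = 0; i < *count; i++)
        if (strcmp(words[i], word) == 0)
            return 0;                 /* duplicate, skip it */
    strcpy(words[*count], word);      /* fits: length was checked above */
    (*count)++;
    return 1;
}
```

The caller declares char words[MAX_WORDS][MAX_LEN + 1] and a size_t count set to 0, then passes each generated word to add_unique until count reaches n.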

Related

Swift String.Index vs transforming the String to an Array

In the Swift documentation, they say they use String.Index to index strings, as different characters can take a different amount of memory.
But I saw a lot of people transforming a String into an array, var a = Array(s), so they can index by Int instead of String.Index (which is definitely easier).
So I wanted to test for myself whether it's exactly the same for all Unicode characters:
let cafeA = "caf\u{E9}" // eAcute
let cafeB = "caf\u{65}\u{301}" // combinedEAcute
let arrayCafeA = Array(cafeA)
let arrayCafeB = Array(cafeB)
print("\(cafeA) is \(cafeA.count) character \(arrayCafeA.count)")
print("\(cafeB) is \(cafeB.count) character \(arrayCafeB.count)")
print(cafeA == cafeB)
print("- A scalar")
for scalar in cafeA.unicodeScalars {
    print(scalar.value)
}
print("- B scalar")
for scalar in cafeB.unicodeScalars {
    print(scalar.value)
}
And here is the output:
café is 4 character 4
café is 4 character 4
true
- A scalar
99
97
102
233
- B scalar
99
97
102
101
769
And sure enough, as mentioned in the documentation, strings are just an array of Character, and the grapheme cluster is handled within the Character object. So why don't they index it by Int? What's the point of creating and using String.Index?
In a String, the byte representation is packed, so there's no way to know where the character boundaries are without traversing the whole string from the start.
When converting to an array, this traversal is done once, and the result is an array of characters that are equidistantly spaced out in memory, which is what allows constant-time subscripting by an Int index. Importantly, the array is preserved, so many subscripting operations can be done upon the same array, requiring only one traversal of the String's bytes, for the initial unpacking.
It is possible to extend String with a subscript that indexes it by an Int, and you see it come up often on SO, but that's ill-advised. The standard library programmers could have added it, but they purposely chose not to, because it obscures the fact that every indexing operation requires a separate traversal of the String's bytes, which is O(string.count). All of a sudden, innocuous code like this:
for i in 0..<string.count { // i is an Int, used with such an Int subscript
    print(string[i]) // Looks O(1), but is actually O(string.count)!
}
becomes quadratic.
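The same cost shows up in any packed, variable-width encoding. As a rough illustration in C rather than Swift, and assuming UTF-8 bytes and code points (a simplification, since a Swift Character is a whole grapheme cluster, which is harder still), finding the n-th code point means walking from the start. utf8_at is a hypothetical helper, not a library function:

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the first byte of the n-th code point (0-based) in
   the UTF-8 string s, or NULL if s has fewer than n + 1 code points.
   Assumes s is valid UTF-8. Every call walks from the start of s. */
const char *utf8_at(const char *s, size_t n)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Bytes of the form 10xxxxxx continue the previous code point. */
        if (((unsigned char)*s & 0xC0) != 0x80) {
            if (n == 0)
                return s;
            n--;
        }
    }
    return NULL;
}
```

Calling utf8_at(s, i) for every i in a loop is the quadratic pattern described above. Walking one pointer forward through the string once is linear.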

Simulating an appendable array in C...Kinda

I'm trying to write C code that does what a chunk of Python code I have written does.
I tried to keep all its lines simple, but there still turns out to be some stuff I wrote that C cannot do.
My code will take an array of coordinates and replace/add items to that array over time.
For example:
[[[0,1]],[[2,1],[1,14]],[[1,1]]] ==> [[[0,1]],[[2,1],[1,14],[3,2]],[[1,1]]]
or
[[[0,1]],[[2,1],[1,14]],[[1,1]]] ==> [[[0,1]],[[40]],[[1,1]]]
I think this is impossible in C, but how about instead using strings to represent the lists so they can be added to? Like this:
[['0$1$'],['2$1$1$14$'],['1$1$']] ==> [['0$1$'],['2$1$1$14$3$2$'],['1$1$']]
and
[['0$1$'],['2$1$1$14$'],['1$1$']] ==> [['0$1$'],['40$'],['1$1$']]
In my code, I know each array in the array is either one or more pairs of numbers or just one number so this method works for me.
Can C do this? If so, please provide an example.
If you know that both the length of a string and the number of said strings won't exceed a certain value, you can do this:
char Strings[NUMBER_OF_STRINGS][MAX_STRING_LENGTH + 1]; // + 1 for the null terminator
It would then be good practice to zero all this memory:
for (size_t i = 0; i < NUMBER_OF_STRINGS; i++)
    memset(Strings[i], 0, MAX_STRING_LENGTH + 1);
And if you want to append a string, use strcat:
strcat(Strings[i], SourceString);
A safer (though slightly more costly since you need to call strlen which walks the entire string) solution would be:
strncat(Strings[i], SourceString, MAX_STRING_LENGTH - strlen(Strings[i]));
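For the '$'-separated format from the question, a hedged sketch of a bounded append might look like this. append_pair and the two size values are hypothetical, and snprintf is swapped in for strncat here so that the number formatting and the bounds check happen in one call:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NUMBER_OF_STRINGS 3    /* assumed sizes for this sketch */
#define MAX_STRING_LENGTH 63

/* Append "x$y$" to list, which must be a null-terminated string in a
   buffer of MAX_STRING_LENGTH + 1 bytes. Returns 1 on success, or 0 if
   the pair would not fit (list is then left unchanged). */
int append_pair(char list[MAX_STRING_LENGTH + 1], int x, int y)
{
    assert(list != NULL);
    size_t used = strlen(list);
    size_t room = MAX_STRING_LENGTH + 1 - used;   /* includes the '\0' */
    int n = snprintf(list + used, room, "%d$%d$", x, y);
    if (n < 0 || (size_t)n >= room) {
        list[used] = '\0';     /* undo a truncated write */
        return 0;
    }
    return 1;
}
```

Replacing a whole list, as in the second example from the question, is then a plain strcpy(Strings[i], "40$").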

Storing one value for each item in 2-Dimensional array

Luckily I came up with a decent title, describing what I was curious about.
While this is really hard for me to explain, I am doing my best.
I tried storing values in a 3D array like this:
char arr[10][10][1];
To copy a string I have to do it in arr[y][x] (and sadly I can't do it in just arr[y]), but then, for a reason still unknown to me, I could overflow the buffer with arr[8][8][8]. Maybe it is because of the size of char**, but anyway.
I couldn't find a slot to store a character for each item (x and y).
I tried it the other way:
char arr[1][10][10];
Assuming that I have 1 item * x and y.
To store a string, I have to do it in arr[0][y], which means the 3rd cell will be a character from the string.
So, to summarize, I am trying to store one value for each character in x and y.
Do I really need 4D array for this?
Additional clarification:
I am aware what 1D and 2D arrays are for. Seems I can't understand the 3D array.
I thought that I can store an additional item for each character at y or x.
Example:
char arr[y][x][z];
Where y is the line, x is the column and z is the additional item that applies to all the characters.
A string is an array of characters. An array of strings is therefore an array of arrays of characters. Why you think you need the 3rd dimension, I have no idea.
When you allocate a multi-dimensional array statically, you must specify the maximum number of items that it can contain. In this case, you must specify how many bytes long the string is allowed to be, including one byte for null termination. This is the right-most [] in the expression, in your case 1 byte.
So you haven't actually allocated any memory at all to store a string: 1 byte is enough to store the null termination and nothing else. This is why you get a crash/seg fault when you attempt [x][y][z] when z is any other value than 0. And you cannot store anything meaningful there either.
Size of char** has absolutely nothing to do with this whatsoever. Pointers are not arrays.
I'd strongly suggest that you study this C FAQ about pointers and arrays.
Now what you probably want to do is something like this:
char string_array [10][20+1]; // 10 strings each containing 20 letters + null
strcpy(string_array[0], "hello");
strcpy(string_array[1], "world");
...
printf("%s\n", string_array[0]);
printf("%s\n", string_array[1]);
...
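If the goal really is one extra value attached to every (line, column) position, one option that avoids a third dimension is a 2D array of small structs. This is only a sketch of that reading of the question. struct cell, grid_put and the 10x10 size are assumptions for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 10
#define COLS 10

struct cell {
    char ch;     /* the character at this (row, column) */
    char extra;  /* the additional value kept for that character */
};

/* Write text into row y, tagging every written cell with extra.
   Stops at the end of the row and returns the number of cells written. */
size_t grid_put(struct cell grid[ROWS][COLS], size_t y, const char *text, char extra)
{
    assert(grid != NULL && text != NULL && y < ROWS);
    size_t x = 0;
    for (; x < COLS && text[x] != '\0'; x++) {
        grid[y][x].ch = text[x];
        grid[y][x].extra = extra;
    }
    return x;
}
```

Note that a row of cells is not a null-terminated string, so it cannot be passed to strcpy or printf("%s") directly.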
There is no need for the 3rd dimension as such.
You can use, for example, a[x][y];
and you can also hold the strings through an array of pointers, *a[];
Command-line arguments are a good illustration: *argv[] is used to store a number of strings from the command line (strictly, it is an array of pointers to strings rather than a true 2D array). It shows well how such a collection of strings is used.
For further reference you can have a look at this: http://www.tutorialspoint.com/cprogramming/c_multi_dimensional_arrays.htm

Why are array indexes zero-based in most programming languages?

C++, C#, C, D, Java, ... are zero-based.
Matlab is the only language I know that begins at 1.
Arrays are zero-based in C and C++ because the index represents the item's offset from the beginning of the array.
These two lines have an identical result in C.
anArray[3] = 4;
*(anArray + 3) = 4;
The first is the standard indexer; the second takes the pointer, adds three to it, and then dereferences it, which is the same as the indexer.
Well, consider Dijkstra's famous article, Why numbering should start at zero. He argues that numbering should start at 0 because it means that the valid indexes into an array can be described as 0 <= i < N. This is clearly more appealing than 1 <= i < N + 1, on an aesthetic level.
(One could ask, "why not say 0 < i <= N", but he argues against that, too, again for aesthetic reasons.)
I guess it is because arrays use pointer arithmetic to refer to a value. Arrays occupy contiguous memory, so to refer to the 5th element (a[4]) the address a + 4 * sizeof(int) is computed.
If indexing started at 1, then referring to the 5th element would require something like a + (5 - 1) * sizeof(int).
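This offset arithmetic can be checked directly in C. A small sketch, where byte_offset and same_element are hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>

/* Distance in bytes from the start of the array to element i. */
size_t byte_offset(const int *a, size_t i)
{
    assert(a != NULL);
    return (size_t)((const char *)&a[i] - (const char *)a);
}

/* a[i] and *(a + i) name the same object, so their addresses are equal. */
int same_element(const int *a, size_t i)
{
    assert(a != NULL);
    return &a[i] == a + i;
}
```

For an int array a, byte_offset(a, 0) is 0 and byte_offset(a, 4) is 4 * sizeof(int), so the first element sits at distance zero from the start.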
I guess it has mostly historical reasons, new languages just try to use the existing convention which programmers are familiar with.
Older languages from which this rule originated were close to the metal, and an index is really the distance from the starting element, hence 0 makes sense for the first element.
Probably "C" got it because it is more efficient. To calculate the address of an item in a 0-based array it is enough to multiply Index by ItemSize; for a 1-based array you have to calculate (Index-1)*ItemSize. "C" and then "C++" were the most popular languages, so new languages tend to follow the same rules, which helps those who use C/C++ avoid mistakes.
But this question seems to be off-topic, and I guess it will be closed by a moderator.
P.S. In Delphi/Pascal strings are 1-based, but for arrays you have to provide range and so you can use what you like.
Because there are 10 integers 0..9

An array of length 4-20?

I'd like for my array to be of a set length using a simple format. Please, let me know how this is done.
What I already have:
arr[100]
Pseudocode: what I would like to have:
arr[4-20] or arr[$min_int THROUGH $max_int]
Additional detail edit: The int should be within the range array = (4, 20). The input may contain leading zeros. I'd like to keep the length of the array restricted (i.e., to 9 or 10 characters).
Arrays simply do not work this way in C. You will need to implement it yourself by only looping through valid indices (and wasting memory in the process) or by using a data structure better suited to the job, like a map (which you will have to find in a library or write yourself as it does not exist in the language).
#define ARRMINIDX 4
#define ARRMAXIDX 20
int arrmem[ARRMAXIDX+1-ARRMINIDX];
#define arr(x) arrmem[(x)-ARRMINIDX] // maps index 4..20 onto slot 0..16
// process elements of arr
for( int i = ARRMINIDX; i <= ARRMAXIDX; i++ )
    dosomething(arr(i));
OTOH, this may not be what you want at all, given your comment
I want an array with 0-1 elements: a limited int or limited "numeric
int"--string mimicking an int.
which I can't make heads or tails of in this context. Are you saying that you want a string of 4-20 chars that represents an integer?
