How to check if "set" in C

If I allocate a C array like this:
int array[ 5 ];
Then, set only one object:
array[ 0 ] = 7;
How can I check whether all the other keys ( array[1], array[2], …) are storing a value? (In this case, of course, they aren't.)
Is there a function like PHP's isset()?
if ( isset(array[ 1 ]) ) ...

There is nothing like this in C. The contents of a fixed-size array are always "set", in the sense that every element always holds some value. You could, however, fill in a special value to mark an element as uninitialized, e.g.
// make sure this value isn't really used.
#define UNINITIALIZED 0xcdcdcdcd
int array[5] = {UNINITIALIZED, UNINITIALIZED, UNINITIALIZED, UNINITIALIZED, UNINITIALIZED};
array[0] = 7;
if (array[1] != UNINITIALIZED) {
...

You can't.
Their values are all indeterminate (thus effectively random).
You could explicitly zero out all values to start with so you at least have a good starting point. But using magic numbers to detect if an object has been initialized is considered bad practice (but initializing variables is considered good practice).
int array[ 5 ] = {0};
But if you want to explicitly check if they have been explicitly set (without using magic numbers) since creation you need to store that information in another structure.
int array[ 5 ] = {0}; // Init all to 0
int setFlags[ 5 ] = {0}; // Init all to 0 (false); named setFlags so it does not clash with isSet()
int getVal(int index) {return array[index];}
int isSet(int index) {return setFlags[index];}
void setVal(int index,int val) {array[index] = val; setFlags[index] = 1; }

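In C, all the elements hold some value (garbage, if you never initialized them) from the moment the array is allocated. So you cannot really have a function like the one you are asking for.
However, you can fill the array with a standard default value, such as 0 using memset() or INT_MIN using a loop (memset() fills byte by byte, so it cannot store INT_MIN), and then write your own isset() function that checks for that value.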

I don't know PHP, but one of two things is going on here:
the PHP array is actually a hash map (awk does that)
the PHP array is being filled with nullable types
In either case there is a meaningful concept of "not set" for the values of the array. On the other hand, a C array of a built-in type has some value in every cell at all times. If the array is uninitialized and is automatic or was allocated on the heap, those values may be random, but they exist.
To get the PHP behavior, do one of the following:
Implement (or find a library with) a hash map and use it instead of an array.
Make it an array of structures which include an isNull field.
Initialize the array to some sentinel value in all cells.
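The second option above (an array of structures that carry their own flag) could look roughly like the sketch below. The names opt_int, opt_clear and opt_assign are made up for illustration, and the flag is called is_set rather than isNull, so its meaning is inverted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "nullable int": the value plus a flag saying whether it was assigned. */
struct opt_int {
    bool is_set;
    int  value;
};

/* Mark every cell as unset. */
void opt_clear(struct opt_int *cells, size_t n)
{
    assert(cells != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        cells[i].is_set = false;
        cells[i].value = 0;
    }
}

/* Store a value and record that the cell is now set. */
void opt_assign(struct opt_int *cell, int value)
{
    assert(cell != NULL);
    cell->value = value;
    cell->is_set = true;
}
```

With this, the PHP-style test becomes a plain field check such as cells[1].is_set, at the cost of some extra memory per element.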

One solution perhaps is to use a separate array of flags. When you assign one of the elements, set the flag in the boolean array.
You can also use pointers. You can use null pointers to represent data which has not been assigned yet. I made an example below:
int * p_array[3] = {NULL,NULL,NULL};
p_array[0] = malloc(sizeof(int));
*p_array[0] = (int)0;
p_array[2] = malloc(sizeof(int));
*p_array[2] = (int)4;
for (int x = 0; x < 3; x++) {
if (p_array[x] != NULL) {
printf("Element at %i is assigned and the value is %i\n",x,*p_array[x]);
}else{
printf("Element at %i is not assigned.\n",x);
}
}
You could make a function which allocates the memory and sets the data and another function which works like the isset function in PHP by testing for NULL for you.
I hope that helps you.
Edit: Make sure the memory is deallocated once you have finished. Another function could be used to deallocate certain elements or the entire array.
I've used NULL pointers before to signify data has not been created yet or needs to be recreated.

An approach I like is to make 2 arrays, one a bit-array flagging which indices of the array are set, and the other containing the actual values. Even in cases where you don't need to know whether an item in the array is "set" or not, it can be a useful optimization. Zeroing a 1-bit-per-element bit array is a lot faster than initializing an 8-byte-per-element array of size_t, especially if the array will remain sparse (mostly unfilled) for its entire lifetime.
One practical example where I used this trick is in a substring search function, using a Boyer-Moore-style bad-character skip table. The table requires 256 entries of type size_t, but only the ones corresponding to characters which actually appear in the needle string need to be filled. A 1kb (or 2kb on 64-bit) memset would dominate cpu usage in the case of very short searches, leading other implementations to throw around heuristics for whether or not to use the table. But instead, I let the skip table go uninitialized, and used a 256-bit bit array (only 32 bytes to feed to memset) to flag which entries are in use.
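A minimal sketch of that bit-array idea follows. It is not the code from the search function described above; the type sparse_table and the table_* functions are hypothetical names chosen for illustration. Only the small bitmap is cleared, and a value is read only after its bit has been checked.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 256  /* assumed: one entry per 8-bit character value */

struct sparse_table {
    unsigned char used[(TABLE_SIZE + CHAR_BIT - 1) / CHAR_BIT]; /* 1 bit per entry */
    size_t value[TABLE_SIZE];                  /* deliberately left uninitialized */
};

/* Only the bitmap is cleared; value[] is not touched. */
void table_init(struct sparse_table *t)
{
    memset(t->used, 0, sizeof t->used);
}

/* Nonzero if the entry for key has been assigned. */
int table_isset(const struct sparse_table *t, unsigned key)
{
    assert(key < TABLE_SIZE);
    return (t->used[key / CHAR_BIT] >> (key % CHAR_BIT)) & 1;
}

/* Store v under key and flag the entry as in use. */
void table_set(struct sparse_table *t, unsigned key, size_t v)
{
    assert(key < TABLE_SIZE);
    t->value[key] = v;
    t->used[key / CHAR_BIT] |= (unsigned char)(1u << (key % CHAR_BIT));
}

/* Returns the stored value, or fallback if the entry was never set. */
size_t table_get(const struct sparse_table *t, unsigned key, size_t fallback)
{
    return table_isset(t, key) ? t->value[key] : fallback;
}
```

On a platform with 8-bit bytes, table_init clears 32 bytes instead of the 1 KB or 2 KB that zeroing value[] would take.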

Related

Modifying swift [CChar] arrays in C functions without returning value

I started learning C and wanted to try some of the Swift-C interoperability.
I have a small C function which reads me a file and concatenates some useful letters into a char* variable. After some testing, I cannot find a way to pass my obtained char* data back to swift. I have written a small dummy code to illustrate what I am trying to achieve.
var letters: [CChar] = []
functionWithArray(&letters)
print("Back in swift: \(letters)")
And the C function is:
void functionWithArray(char* letters) {
int arrayLenght = 5;
int testLenght = 10; // Expand array to this value (testing)
int currentArrayPosition = 0; //Keep track of the assigned values
letters = malloc(sizeof(char)*arrayLenght);
while (currentArrayPosition < testLenght) {
if (currentArrayPosition == arrayLenght) {
arrayLenght++;
letters = realloc(letters, sizeof(char)*arrayLenght);
}
letters[currentArrayPosition] = *"A";
++currentArrayPosition;
}
printf("End of C function: %s\n", letters);
}
I get this as an output:
End of C function: AAAAAAAAAA
Back in swift: []
Program ended with exit code: 0
As you can see, inside the C function I've got the desired result, but back in swift I could not find a way to obtain the modified array. I do not return letters directly with the function because I need to return more values from that function. I'm new to C so please be kind.
There are two main issues with your approach here — one in C and one in Swift:
In C, function parameters are passed by value, and are effectively mutable local variables. That means that when functionWithArray receives char *letters, letters is a local variable containing a pointer value to the buffer of letters in memory. Importantly, that means that letters is assignable, but not in the way that you think:
letters = malloc(sizeof(char)*arrayLenght);
allocates an entirely new buffer through malloc, and assigns the newly-created pointer value to your local letters variable. Before the assignment, letters is a pointer to the buffer you were getting from Swift; after, to an unrelated buffer in memory. These two buffers are completely unrelated to one another, and because letters is just a local variable, this assignment is not propagated in any way outside of the function.
Note that this is just a rule of C: as you learn more C, you'll likely discover that in order to assign a variable from inside of a function to outside of a function, you need to wrap the variable in another layer of pointers and write through that pointer (e.g., you would need to receive char **letters and assign *letters = malloc(...) to have any effect on a variable being passed in — and the variable couldn't be passed in directly, but rather, its address would need to be passed in).
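As a rough illustration of that extra layer of pointers in plain C, here is a sketch with a hypothetical helper named fill_letters (it is not part of the original code). The caller passes the address of its own char * variable, and the function assigns through it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: writes a new buffer through the caller's pointer.
 * Returns 0 on success, -1 on allocation failure. */
int fill_letters(char **out, size_t count)
{
    assert(out != NULL);
    char *buf = malloc(count + 1);      /* +1 for the terminating '\0' */
    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < count; i++)
        buf[i] = 'A';
    buf[count] = '\0';
    *out = buf;   /* assignment through the pointer is visible to the caller */
    return 0;
}
```

A C caller would declare char *letters = NULL, call fill_letters(&letters, 10), and later release the buffer with free(letters).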
However, you can't generally make use of this fact because,
The implicit conversion of an Array<T> to an UnsafeMutablePointer<T> (e.g. [CChar] → UnsafeMutablePointer<CChar> in Swift == char * in C) does not allow you to assign an entirely new buffer to the array instance. You can write into the contents of the buffer by writing to pointer values, but you cannot allocate a new block of memory and reassign the contents of the array to that new block
Instead, you'll need to either:
1. Have functionWithArray return an entirely new array and length from C — you mention this isn't possible for functionWithArray specifically because of the other values it needs to return, but theoretically you can also create a C struct which wraps up all of the return values together and return one of those instead
2. Rewrite functionWithArray to receive an array and a length, and pre-reserve enough space in the array up-front to fill it appropriately:
var letters: [CChar] = []
letters.reserveCapacity(/* however much you need */)
functionWithArray(&letters, letters.capacity)
In functionWithArray, don't reassign letters, but instead fill it up to the capacity given to you with results. Of course, this will only work if you know in Swift ahead of time how much space functionWithArray will need, which you might not
Alternatively, you can also use Array.init(unsafeUninitializedCapacity:initializingWith:) to combine these operations by having Array preallocate some space, and you can pass in the inout UnsafeMutableBufferPointer<CChar> to C where you can allocate memory if you need to and assign to the buffer pointer, then write out to the inout Int how many array elements you allocated and initialized. This does also require a capacity, though, and is a more complicated solution
Of these two approaches, if functionWithArray really does need to dynamically reallocate memory and grow the buffer, then (1) is likely going to be easier.
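For approach (1), the C side might be shaped like the sketch below. The struct letters_result, its fields, and the function make_letters are assumptions made for illustration; the status field stands in for whatever other values the real function needs to return.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical result type: bundles the buffer with the other outputs. */
struct letters_result {
    char  *letters;  /* malloc'd, NUL-terminated; the caller must free() it */
    size_t length;   /* number of characters, excluding the terminator */
    int    status;   /* 0 on success, nonzero on allocation failure */
};

struct letters_result make_letters(size_t count)
{
    struct letters_result r = { NULL, 0, -1 };
    size_t capacity = 5;
    char *buf = malloc(capacity);
    if (buf == NULL)
        return r;

    for (size_t i = 0; i < count; i++) {
        if (i + 1 >= capacity) {          /* keep room for the terminator */
            char *grown = realloc(buf, capacity * 2);
            if (grown == NULL) {
                free(buf);
                return r;
            }
            buf = grown;
            capacity *= 2;
        }
        buf[i] = 'A';
    }
    assert(count < capacity);
    buf[count] = '\0';

    r.letters = buf;
    r.length = count;
    r.status = 0;
    return r;
}
```

The Swift caller would then read the fields of the returned struct, copy the characters out (for example with String(cString:)), and call free on the letters pointer once it is done with it.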

Is it good programming practice in C to use first array element as array length?

Because in C the array length has to be stated when the array is defined, would it be acceptable practice to use the first element as the length, e.g.
int arr[9]={9,0,1,2,3,4,5,6,7};
Then use a function such as this to process the array:
int printarr(int *ARR) {
for (int i=1; i<ARR[0]; i++) {
printf("%d ", ARR[i]);
}
}
I can see no problem with this but would prefer to check with experienced C programmers first. I would be the only one using the code.
Well, it's bad in the sense that you have an array where the elements do not all mean the same thing. Storing metadata with the data is not a good thing. Just to extrapolate your idea a little bit: we could use the first element to denote the element size and then the second for the length. Try writing a function utilizing both ;)
It's also worth noting that with this method, you will have problems if the array is bigger than the maximum value an element can hold, which for char arrays is a very significant limitation. Sure, you can solve it by using the first two elements. And you can also use casts if you have floating point arrays. But I can guarantee you that you will run into hard-to-trace bugs due to this. Among other things, endianness could cause a lot of issues.
And it would certainly confuse virtually every seasoned C programmer. This is not really a logical argument against the idea as such, but rather a pragmatic one. Even if this was a good idea (which it is not) you would have to have a long conversation with EVERY programmer who will have anything to do with your code.
A reasonable way of achieving the same thing is using a struct.
struct container {
int *arr;
size_t size;
};
int arr[10];
struct container c = { .arr = arr, .size = sizeof arr/sizeof *arr };
But in any situation where I would use something like above, I would probably NOT use arrays. I would use dynamic allocation instead:
const size_t size = 10;
int *arr = malloc(sizeof *arr * size);
if(!arr) { /* Error handling */ }
struct container c = { .arr = arr, .size = size };
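However, do be aware that if you compute the size with sizeof arr/sizeof *arr when arr is a pointer instead of an array, you get the size of the pointer divided by the size of one element rather than the element count, and you're in for "interesting" results.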
You can also use flexible arrays, as Andreas wrote in his answer
In C you can use flexible array members. That is you can write
struct intarray {
size_t count;
int data[]; // flexible array member needs to be last
};
You allocate with
size_t count = 100;
struct intarray *arr = malloc( sizeof(struct intarray) + sizeof(int)*count );
arr->count = count;
That can be done for all types of data.
It makes the use of C-arrays a bit safer (not as safe as the C++ containers, but safer than plain C arrays).
Unfortunately, C++ does not support this idiom in the standard.
Many C++ compilers provide it as an extension, but it is not guaranteed.
On the other hand, this C flexible array member idiom may be more explicit and perhaps more efficient than C++ containers, as it does not use an extra indirection and/or need two allocations (think of new vector<int>).
If you stick to C, I think this is a very explicit and readable way of handling variable length arrays with an integrated size.
The only drawback is that the C++ guys do not like it and prefer C++ containers.
It is not bad (I mean it will not invoke undefined behavior or cause other portability issues) when the elements of the array are integers, but instead of writing the magic number 9 directly you should have the compiler calculate the length of the array to avoid typos.
#include <stdio.h>
int main(void) {
int arr[9]={sizeof(arr)/sizeof(*arr),0,1,2,3,4,5,6,7};
for (int i=1; i<arr[0]; i++) {
printf("%d ", arr[i]);
}
return 0;
}
Only a few datatypes are suitable for that kind of hack. Therefore, I would advise against it, as this will lead to inconsistent implementation styles across different types of arrays.
A similar approach is used very often with character buffers where in the beginning of the buffer there is stored its actual length.
Dynamic memory allocation in C also uses this approach that is the allocated memory is prefixed with an integer that keeps the size of the allocated memory.
However, in general this approach is not suitable for arrays. For example, a character array can be much larger than the maximum positive value (127) that is guaranteed to fit in an object of type char. Moreover, it is difficult to pass a sub-array of such an array to a function. Most functions designed to deal with arrays will not work in such a case.
A general approach to declaring a function that deals with an array is to declare two parameters. The first has a pointer type and points to the initial element of an array or sub-array, and the second specifies the number of elements in the array or sub-array.
C also allows you to declare functions that accept variable length arrays, whose sizes are specified at run time.
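Both conventions can be sketched as follows. The function names sum_ints and sum_grid are invented for this example. Note that variable length array parameters came in with C99, and C11 made their support optional, so the second function assumes a compiler that provides them.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer plus element count: works for whole arrays and sub-arrays alike. */
long sum_ints(const int *data, size_t count)
{
    assert(data != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += data[i];
    return total;
}

/* Variably modified parameter: the dimensions are passed first,
 * then used in the array declarator. */
long sum_grid(size_t rows, size_t cols, int grid[rows][cols])
{
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        total += sum_ints(grid[r], cols);
    return total;
}
```

Passing a sub-array is then just a matter of adjusting the arguments, for example sum_ints(arr + 1, 3) to sum three elements starting at index 1.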
It is suitable in rather limited circumstances. There are better solutions to the problem it solves.
One problem with it is that if it is not universally applied, you will have a mix of arrays that use the convention and arrays that don't, with no way of telling which is which. For arrays used to carry strings, for example, you have to continually pass &arr[1] in calls to the standard string library, or define a new string library that uses "Pascal string" rather than "ASCIZ string" conventions (such a library would be more efficient, as it happens).
In the case of a true array rather than simply a pointer to memory, sizeof(arr) / sizeof(*arr) will yield the number of elements without having to store it in the array in any case.
It only really works for integer type arrays, and for char arrays it would limit the length to a rather short one. It is not practical for arrays of other object types or data structures.
A better solution would be to use a structure:
typedef struct
{
size_t length ;
int* data ;
} intarray_t ;
Then:
int data[9] ;
intarray_t array = { sizeof(data) / sizeof(*data), data } ;
Now you have an array object that can be passed to functions and retains the size information, and the data member can be accessed directly for use in third-party or standard library interfaces that do not accept the intarray_t. Moreover, the type of the data member can be anything.
Obviously, NO is the answer.
Most programming languages have predefined functions that come with the variable type. Why not use them?
In your case it is more suitable to use a count/length function instead of testing the first value.
An if clause sometimes takes more time than a predefined function.
At first glance it seems OK to store the counter, but imagine you have to update the array. You will have to do two operations: one to insert the element and another to update the counter. Two operations means two values that must be kept in sync.
For static arrays it might be OK to store the counter and then the list, but for dynamic ones, NO NO NO.
On the other hand, please read up on basic programming concepts; you will find that your idea is a bad one that does not comply with programming principles.

C Array Memory Error?(Beginner)

When I try to print all the values in the array(which should be zero?), it starts printing 0's but at the end prints wonky numbers:
"(printing zeros)...0,0,0,0,0,0,0,1810432,0,1809600,0,1809600,0,0,0,5,0,3907584..."
When I extend the array, only at the end do the numbers start to mess up. Is this a memory limitation or something? Very confused, would greatly appreciate if anyone could help a newbie out.
Done in CS50IDE, not sure if that changes anything
int main()
{
int counter [100000];
for(int i = 0; i < 100000; i++)
{
printf("%i,", counter[i]);
}
}
Your array isn't initialized. You simply declare it but never actually set it. In C (and C++, Objective-C) you need to manually set a starting value. Unlike Python, Java, JavaScript or C# this isn't done for you...
which should be zero?
The above assertion is incorrect.
auto variables (variables declared within a block without the static keyword) are not initialized to any particular value when they are created; their value is indeterminate. You can't rely on that value being 0 or anything else.
static variables (declared at file scope or with the static keyword) are initialized to 0 or NULL, depending on type.
You can initialize all of the elements of the array to 0 by doing
int counter [100000] = {0};
If there are fewer elements in the initializer than there are elements in the array, then the extra elements are initialized as though they were static - 0 or NULL. So the first element is being explicitly initialized to 0, and the remaining 99999 elements are implicitly initialized to 0.
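The two zero-initialization rules described above can be checked with a small sketch like the following. The helper names count_nonzero, check_automatic and check_static are made up for illustration, and the array size matches the one in the question.

```c
#include <assert.h>
#include <stddef.h>

#define COUNT 100000

static int file_scope_counter[COUNT];   /* static storage: zero-initialized */

/* Returns the number of nonzero elements in data[0..n-1]. */
size_t count_nonzero(const int *data, size_t n)
{
    assert(data != NULL);
    size_t nonzero = 0;
    for (size_t i = 0; i < n; i++)
        if (data[i] != 0)
            nonzero++;
    return nonzero;
}

/* Automatic array with an initializer: first element explicit, the rest implicit. */
size_t check_automatic(void)
{
    int counter[COUNT] = {0};
    return count_nonzero(counter, COUNT);
}

/* Static array without any initializer at all. */
size_t check_static(void)
{
    return count_nonzero(file_scope_counter, COUNT);
}
```

Both checks should report zero nonzero elements. Removing the initializer from the automatic array would leave its contents indeterminate, which is the situation in the question.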
This happens because you reserved 100000*4 = 400000 bytes of memory (assuming a 4-byte int) but didn't write anything to it (didn't initialize it).
So garbage is printed when you read a memory location that hasn't been written to yet. Zeros aren't printed because C favors speed: the compiler doesn't spend time writing to 100000 integer locations you never asked it to initialize, and a developer is expected never to read memory that they haven't written to or allocated. If you try printing:
printf("%d\n", counter[100000]);
This may also print a garbage value, even though you never allocated that element. Reading past the end of the array is undefined behavior; unlike Java, C and C++ don't check bounds or raise an error when you do this.
Try it yourself
for (int i=0; i<100000; i++) {
counter[i] = i;
printf("%d\n", counter[i]);
}
Now only the numbers 0, 1, 2, 3, ..., 99999 will be printed on the screen.
When you declare a local array in C, its elements are not set to zero by default. Instead, the array holds whatever data last occupied that location in memory, which could be anything.
The fact that the first portion of the array contained zeros is just a coincidence.
This beginning state of an array is referred to as an "uninitialized" array, as you have not provided any initial values for the array. Before you can use the array, it should be "initialized", meaning that you specify a default value for each position.

Array size calculation without using inbuilt functions

How do I find the size of an integer array without using any inbuilt (standard) functions? Here's my attempt at it:
int fun(int a[25],int ele)
{
int flag=0,i=0;
while(a[i]!=NULL)
{
flag++;
i++;
}
return flag;
}
The most common way of sending data around in arrays is by null-terminating the arrays. (However, this may not work for you if, for example, 0 is a valid integer to have in your array. In this case, you might want to use -1, for example.)
int array_len(int *arr)
{
const int TERMINATOR = 0; // or -1, as the case may be
int i = 0;
while (arr[i] != TERMINATOR)
i++;
return i;
}
However, a better method is probably just sending not only an array, but an array and a length whenever passing around data. That way, you don't need to keep calling functions like this to get array lengths in your various functions.
You can't.
The behaviour on going past the bounds of the array is undefined.
You could model the array with some sort of value acting as a terminator, but that's hardly practical. Pass the size as an extra parameter, or if you really want to have just one argument, use a struct.
C does not store the size of the array with it. In C strings a null terminator ('\0') is used to determine the length of the string, but this is a convention. Either pass the size as an argument to the function, or choose a value that is considered the end of the array and search for it.
In your while loop condition:
while(a[i]!=NULL) // replace NULL with a value that is unique and never used in your array
Use -1 or something.
You might think of using '\0' instead of NULL, but then the loop will also stop at any 0 in the middle of your array (so if 0 can appear in the array, don't use '\0').

Check if 2d pointer array has user defined value in C?

Sample code:
float** a;
a = (float**) malloc(numNodes * sizeof(float*));
for(int i=0; i<numNodes; i++)
{
a[i] = (float*)malloc((numNodes-1) * sizeof(float));
}
I am creating a dynamic 2d array above. Before populating, I noticed that each block in the array already holds this value: -431602080.000000 and not NULL. Why is this?
There are situations where not all spaces within the array are used.
So, my query is simple, Is there an elegant way to check if the each block has this default value or a user defined value?
Thanks in advance.
The content of memory allocated with malloc (as well as of uninitialized variables allocated on the stack) is indeterminate, so it may very well be anything. Usually you get space filled with zeroes (because the OS blanks memory pages that were used by other processes) or residues of the previous use of those memory pages (this is often the case if the memory page belonged to your process), but this is what happens under the hood; the C standard does not give any guarantees.
So, in general there's no "default value" and no way to check if your memory has been changed; however you can init the memory blocks you use with magic values that you're sure that will not be used as "real data", but it'll be just a convention internal to your application.
Luckily, for floating point variables there are several magic values like quiet NaN you can use for this purpose; in general you can use the macro NAN defined in <math.h> to set a float to NaN.
By the way, you shouldn't read uninitialized floats and doubles, since the usual format they are stored in (IEEE 754) contains some magic values (like the signaling NaN) that can raise arithmetic exceptions when they are used, so if your uninitialized memory happens to contain such a bit pattern your application may crash.
C runtimes are not required to initialize any memory that you didn't initialize yourself, and the values it holds are essentially random garbage left over from the last time that memory was used. You will have to set the elements to a known value explicitly first, or use calloc.
Extending the good answer of Matteo Italia:
The code of initialization of a single array would look like:
float* row;
row = malloc( numNodes*sizeof(float) );
for (int i=0; i<numNodes; ++i) {
row[i] = nanf(""); // set a Not-a-Number magic value of type float (nanf() takes a string argument)
}
(I'll leave it up to you to change this for your multi-dimensional array)
Then somewhere:
float value = ...; // read the array
if (isnan(value)) {
// not initialized
} else {
// initialized - do something with this
}
One thing is important to remember: NaN == NaN will yield false, so it's best to use isnan(), not == to test for the presence of this value.
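Putting the pieces together for the two-dimensional case, a sketch could look like this. The function names grid_create, grid_isset and grid_destroy are invented for illustration, and the NAN macro from <math.h> is used in place of nanf(). It assumes NaN is never a legitimate data value in your application.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Allocates rows x cols floats, every cell preset to NaN ("not yet set"). */
float **grid_create(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    float **g = malloc(rows * sizeof *g);
    if (g == NULL)
        return NULL;
    for (size_t r = 0; r < rows; r++) {
        g[r] = malloc(cols * sizeof *g[r]);
        if (g[r] == NULL) {            /* undo the partial allocation */
            while (r > 0)
                free(g[--r]);
            free(g);
            return NULL;
        }
        for (size_t c = 0; c < cols; c++)
            g[r][c] = NAN;
    }
    return g;
}

/* A cell counts as set once it holds anything other than NaN. */
int grid_isset(float **g, size_t r, size_t c)
{
    return !isnan(g[r][c]);
}

/* Frees every row and then the row table itself. */
void grid_destroy(float **g, size_t rows)
{
    for (size_t r = 0; r < rows; r++)
        free(g[r]);
    free(g);
}
```

For the array in the question this would be called as grid_create(numNodes, numNodes - 1), and any cell can later be tested with grid_isset before it is used.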
In C, automatic variables don't get initialized automatically. You need to explicitly set your variable to 0, if that's what you want.
The same is true for malloc, which doesn't initialize the space it allocates on the heap. You can use calloc if you want it initialized:
a = malloc( numNodes*sizeof(float*) ); // no need to initialize this
for (int i=0; i<numNodes; i++) {
a[i] = calloc( numNodes-1, sizeof(float) );
}
Before populating, I noticed that each block in the array already holds this value: -431602080.000000 and not NULL. Why is this?
malloc() doesn't initialize the memory which it allocates. You need to use calloc() if you want 0 initialization
void *calloc(size_t nelem, size_t elsize);
The calloc() function allocates unused space for an array of nelem elements each of whose size in bytes is elsize. The space shall be initialized to all bits 0.
