Int-stream in C


I'm implementing a function in C where I convert a byte[] to an int[]. The problem is that the length of the int[] depends on the contents of the byte[] (not just the length of the byte[]), so I won't know the total length of the int[] until I've iterated over the entire byte[]. I'm therefore looking for some form of int-stream or dynamically growing int-list which I can write to and then convert to an int[] once I'm done writing all the ints. My C experience is a bit limited at the moment, so I'm not really sure what's considered best practice for this kind of problem. Any suggestions?

The easiest method would be to allocate the int[] to be the same length (number of elements) as the byte[], and when you're done and know the size, call realloc to shrink it.
This assumes, of course, that interpreting the data would never create more integers than there are bytes in the stream.
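A minimal sketch of the over-allocate-then-shrink idea. The conversion rule here (zero bytes are skipped, every other byte becomes one int) and the name bytes_to_ints are made up for illustration; the only property that matters is that the output can never be longer than the input.

```c
#include <stdlib.h>

/* Converts n bytes to ints under a made-up rule: zero bytes are skipped,
 * every other byte becomes one int. Returns a malloc'd array (caller frees)
 * and stores the element count in *out_len. Returns NULL on allocation
 * failure or when no ints were produced. */
int *bytes_to_ints(const unsigned char *bytes, size_t n, size_t *out_len)
{
    *out_len = 0;
    if (n == 0)
        return NULL;

    /* Worst case: one int per input byte. */
    int *out = malloc(n * sizeof *out);
    if (out == NULL)
        return NULL;

    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (bytes[i] != 0)
            out[count++] = bytes[i];
    }

    if (count == 0) {
        free(out);
        return NULL;
    }

    /* Shrink to the final size. If realloc fails, the original block is
     * still valid, so keep using it. */
    int *shrunk = realloc(out, count * sizeof *out);
    if (shrunk != NULL)
        out = shrunk;

    *out_len = count;
    return out;
}
```

The caller owns the returned block and releases it with free() when done.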

There are a few ways of doing this I can think of.
I'm assuming, based on your question, that the transformation of your char[] to the corresponding int[] is expensive, which is why you want to avoid performing that calculation twice (once to determine the size, and again to populate the contents).
So, here's how I would go about it:
First, is there a maximum size you can associate with the transformation? For example, is there a maximum 2-to-1 size difference? (For each char in the char[], can it create "up to X" ints?)
If this is the case, and memory usage isn't an issue (you're not super constrained), go ahead and allocate the maximum size, populate it as you perform your translation, and realloc when you're done to shrink your memory footprint.
If this is not the case, you're in tougher waters, and should look to non-contiguous schemes - such as a linked list. Once you've performed your translation and built your linked list, you can then allocate space for your array, and visit each element in the linked list to populate the array.
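The linked-list route could look roughly like the sketch below. The names struct node, list_append and list_to_array are illustrative, and the caller is assumed to count the elements while appending.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Appends value to the list; *tail tracks the last node so appends are O(1).
 * Returns 0 on success, -1 on allocation failure. */
int list_append(struct node **head, struct node **tail, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = NULL;
    if (*tail != NULL)
        (*tail)->next = n;
    else
        *head = n;
    *tail = n;
    return 0;
}

/* Copies up to count list values into a new array and frees the list.
 * Returns NULL if count is 0 or allocation fails (the list is freed
 * in either case). */
int *list_to_array(struct node *head, size_t count)
{
    int *arr = count ? malloc(count * sizeof *arr) : NULL;
    size_t i = 0;
    while (head != NULL) {
        struct node *next = head->next;
        if (arr != NULL && i < count)
            arr[i++] = head->value;
        free(head);
        head = next;
    }
    return arr;
}
```

This costs one small allocation per int, so it is usually slower than growing a single buffer; it only pays off when no sensible upper bound exists and reallocation is undesirable.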

First, inspect byte[] to determine the resulting int[] size. Then use malloc() to allocate the appropriately sized int[] structure.
#include <stdlib.h>
...
// imagine that the resulting int[] size depends on the sum of the bytes
int j, size = 0;
for (j = 0; byte[j]; ++j)
    size += byte[j];
/* malloc takes a size in bytes, so scale by the element size */
int *int_array = malloc (size * sizeof *int_array);
for (j = 0; j < size; ++j)
    int_array [j] = whatever;

First, if you can use C++, then you can just use a std::vector, which is a dynamically sized array. Otherwise, you'll have to first iterate through your byte array to determine what the int array size should be, then dynamically allocate the int array. Second, C doesn't have a byte type, so the type normally used is char (or unsigned char for raw bytes).
#include <stdlib.h>
char byte_array[ size ];
int i, int_size = 0;
int *int_array;
for ( i = 0; i < size; i++ ) {
    int_size += f( byte_array[i] );
}
/* malloc expects a byte count, so multiply by the element size */
int_array = malloc( int_size * sizeof *int_array );
where f() is some function you write that looks at one element of the byte array to help determine how large the int array should be.

Related

looping over an array in C

Say I want to loop over an array, so I used a basic for loop and accessed each element in it with the index but what happens if I don't know how long my array is?
#include <stdio.h>
#include <stdlib.h>
int main(){
int some_array[] = {2,3,5,7,2,17,2,5};
int i;
for (i=0;i<8;i++){
printf("%d\n",some_array[i]);
}
return 0;
}
This is just a simple example but if I don't know how big the array is, then how can I place a correct stopping argument in the loop?
In Python this is not needed since the StopIteration exception kicks in, but how can I implement it in C?

Just do like this:
for (i=0; i<sizeof(some_array)/sizeof(some_array[0]); i++){
printf("%d\n",some_array[i]);
}
But do beware. It will not work if you pass the array to a function. If you want to use it in a function, then write the function so that you also pass the size as argument. Like this:
void foo(int *arr, size_t size);
And call it like this:
foo(some_array, sizeof(some_array)/sizeof(some_array[0]));
But if you have a function that just takes a pointer, there is absolutely no standard way to find out the size of the array it points into. You have to keep track of that yourself.
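As a small sketch of the "pass the size along" convention: sum_ints and the ARRAY_LEN macro are illustrative names, not anything from the standard library.

```c
#include <stddef.h>

/* The function cannot recover the length from the pointer,
 * so the caller passes it explicitly. */
long sum_ints(const int *arr, size_t size)
{
    long total = 0;
    for (size_t i = 0; i < size; i++)
        total += arr[i];
    return total;
}

/* Element count of a true array; must not be applied to a pointer. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))
```

In the scope where some_array is declared, the call would be sum_ints(some_array, ARRAY_LEN(some_array)).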

You have to know the size of the array. That's one of the most important rules of C programming. You, the programmer, are always responsible for knowing how large your array is. Sure, if you have a stack array or a static array, you can do this:
int array[size];
int size_of_array = sizeof array / sizeof *array;
for (int i = 0; i < size_of_array; i++) {
// do something with each array[i]
}
But as you can see, you needed the variable size in the first place. So what's the point of trying to discover the size if you were forced to know it already?
And if you try to pass this array to any function
some_function(array);
you have to pass the size of the array too, because once the array is no longer in the same function that declared it, there is no mechanism to find its size again (unless the contents of the array indicate the size somehow, such as storing the number of elements in array[0] or using a sentinel to let you count the number of elements).
void some_function(int *array) {
/* Iterate over the elements until a sentinel is found.
* In this example, the sentinel is a negative number.
* Sentinels vary from application to application and
* implicitly tell you the size of the array.
*/
for (int i = 0; array[i] >= 0; i++) {
// do something with array[i]
}
}
And if it is a dynamically-allocated array, then you need to explicitly declare the number of elements anyway:
int size = 10;
int *array = malloc(sizeof *array * size);
So, to summarize, you must always know the size of the array. There is no such thing in C as iterating over an array whose size you don't know.

You can use sizeof() to get the size of the array in bytes then divide the result by the size of the data type:
size_t n = sizeof(some_array)/sizeof(some_array[0]);

In general, you can calculate the size of the array with:
sizeof(ArrayName)/sizeof(ArrayType)
but this does not work with dynamically allocated arrays, because there you only hold a pointer and sizeof yields the size of the pointer.

Add item to empty array in C and getting array length

I've made many attempts at solving this problem but failed every time.
I have an array
char *array[1024] = {};
Now I would like to add an item to the array and would also access the items by numbers
For example:
array[0] would be the first item
array[1] would be the second
array[2] would be the third item
But also I would like to know how many items are in the array so I could use something like
for(int i = 0; i <= totalitemsinarray; i++) {
print(array[i]);
}

You cannot change the size of an array in C. You can however allocate a sufficiently large array and then fill it up with entries. First, declare an array with a sufficient size, say, 1024.
char *array[1024];
Then declare a variable fill that counts the number of used slots in array. Initialize it to 0 as 0 slots are used in the beginning. Then, each time you insert an item, increment fill:
array[fill++] = ...;
...
array[fill++] = ...;
Make sure that you never attempt to insert more than 1024 items into the array, C doesn't check that for you.
For a more flexible approach, use malloc() to allocate memory for the array and then enlarge it with realloc() whenever it is full. If you increase the array size in exponential steps (say, multiplying by Φ = 0.5 + 0.5 √5 ≈ 1.62), this runs in O(1) amortised time per entry inserted.
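A sketch of the fixed-capacity variant with an explicit bounds check, since C will not check it for you. struct string_list, MAX_ITEMS and string_list_add are illustrative names.

```c
#include <stddef.h>

#define MAX_ITEMS 1024

struct string_list {
    const char *items[MAX_ITEMS];
    size_t fill;              /* number of used slots */
};

/* Stores s in the next free slot. Returns 0 on success, -1 if full. */
int string_list_add(struct string_list *list, const char *s)
{
    if (list->fill >= MAX_ITEMS)
        return -1;
    list->items[list->fill++] = s;
    return 0;
}
```

To visit the stored items, loop with i < list.fill. Note the strict comparison: the i <= total form from the question would read one slot past the last item.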

There is no way to do what you're asking directly with C. One option could be if you knew that only certain values were valid. For example, you have an array of char *s so often people use NULL as a flag/invalid value. In that case you could initialize your array to have all NULLs and use that to know the size of the array:
char *array[1024];
memset(array, 0, sizeof(array));
/* .... */
for (int i = 0; i < sizeof(array)/sizeof(char*); i++) {
if (array[i]) {
printf("%s\n", array[i]);
}
}

char *array[1024] = {};
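First, that is an array of 1024 char pointers (strings). With the = {} initializer, which is standard in C23 and a common compiler extension before that, every element starts out as a null pointer. Without an initializer, a local array would hold plain garbage, so if you don't plan to set all elements you may want to nullify the array.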
For the matter of storing the values and the count you might want to have a look at structs. For example:
typedef struct elem {
int count;
char *value;
} elem;
Then elem.count would be the number and elem.value would be the value accordingly.
And then initialize them in a for loop.

The only really valid way to approach this is to dynamically grow the array. Allocate the array on the heap, and manage two counts: 1. the count of currently used elements, and 2. the count of elements for which you currently have memory allocated. Something like this:
//the setup
size_t arrayLength = 0, allocatedSize = 8;
int* array = malloc(sizeof(*array) * allocatedSize);
//grow the array -> first check that we have space to add an element
if(arrayLength == allocatedSize) {
array = realloc(array, sizeof(*array) * (allocatedSize *= 2));
assert(array);
}
assert(arrayLength < allocatedSize);
//grow the array -> add an element
array[arrayLength++] = ...;
You see, the realloc() call is not too much hassle, but it will protect you from bugs when the requirements change. My experience is that any fixed limit in the code, as insanely large as it may seem to be, will eventually be exceeded, and miserable failure will result. The only safeguard is to use as much memory as needed everywhere.
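The same idea can be wrapped in a small struct so that the two counts travel with the pointer. This is only a sketch; struct int_vec and int_vec_push are illustrative names, and the doubling factor and initial capacity of 8 are arbitrary choices.

```c
#include <stdlib.h>

struct int_vec {
    int *data;
    size_t length;     /* elements in use */
    size_t capacity;   /* elements allocated */
};

/* Appends value, doubling the capacity when full.
 * Returns 0 on success, -1 on allocation failure (vec is left intact). */
int int_vec_push(struct int_vec *vec, int value)
{
    if (vec->length == vec->capacity) {
        size_t new_cap = vec->capacity ? vec->capacity * 2 : 8;
        /* Use a temporary so the old block is not leaked on failure. */
        int *tmp = realloc(vec->data, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        vec->data = tmp;
        vec->capacity = new_cap;
    }
    vec->data[vec->length++] = value;
    return 0;
}
```

Start from struct int_vec v = {NULL, 0, 0}; (realloc on a null pointer behaves like malloc), push as many ints as needed, then use v.data and v.length as the finished array and release it with free(v.data).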

To know the size of an array in c

I am learning the C language. I want to know the size of an array inside a function. This function receives a pointer to the first element of the array. I don't want to send the size value like a function parameter.
My code is:
#include <stdio.h>
void ShowArray(short* a);
int main (int argc, char* argv[])
{
short vec[] = { 0, 1, 2, 3, 4 };
short* p = &vec[0];
ShowArray(p);
return 0;
}
void ShowArray(short* a)
{
short i = 0;
while( *(a + i) != NULL )
{
printf("%hd ", *(a + i) );
++i;
}
printf("\n");
}
My code doesn't show any number. How can I fix it?
Thanks.

Arrays in C are simply ways to allocate contiguous memory locations and are not "objects" as you might find in other languages. Therefore, when you allocate an array (e.g. int numbers[5];) you're specifying how much physical memory you want to reserve for your array.
However, that doesn't tell you how many valid entries you have in the (conceptual) list for which the physical array is being used at any specific point in time.
Therefore, you're required to keep the actual length of the "list" as a separate variable (e.g. size_t numbers_cnt = 0;).
I don't want to send the size value like a function parameter.
Since you don't want to do this, one alternative is to use a struct and build an array type yourself. For example:
struct int_array_t {
int *data;
size_t length;
};
This way, you could use it in a way similar to:
struct int_array_t array;
array.data = // malloc for array data here...
array.length = 0;
// ...
some_function_call(array); // send the "object", not multiple arguments
Now you don't have to write: some_other_function(data, length);, which is what you originally wanted to avoid.
To work with it, you could simply do something like this:
void display_array(struct int_array_t array)
{
size_t i;
printf("[");
for(i = 0; i < array.length; ++i)
printf("%d, ", array.data[i]);
printf("]\n");
}
I think this is a better and more reliable alternative than another suggestion of trying to fill the array with sentinel values (e.g. -1), which would be more difficult to work with in non-trivial programs (e.g. understand, maintain, debug, etc) and, AFAIK, is not considered good practice either.
For example, your current array is an array of shorts, which would mean that the proposed sentinel value of -1 can no longer be considered a valid entry within this array. You'd also need to zero out everything in the memory block, just in case some of those sentinels were already present in the allocated memory.
Lastly, as you use it, it still wouldn't tell you what the actual length of your array is. If you don't track this in a separate variable, then you'll have to calculate the length at runtime by looping over all the data in your array until you come across a sentinel value (e.g. -1), which is going to impact performance.
In other words, to find the length, you'd have to do something like:
size_t len = 0;
while(arr[len] != -1) ++len; // this is O(N)
printf("Length is %zu\n", len);
The strlen function already suffers from this performance problem, having a time complexity of O(N), because it has to scan the entire string until it finds the terminating null character ('\0') to return the length.
Relying on sentinel values is also unsafe and has produced countless bugs and security vulnerabilities in C and C++ programs, to the point where even Microsoft recommends banning their use as a way to help prevent more security holes.
I think there's no need to create this kind of problem. Compare the above, with simply writing:
// this is O(1), does not rely on sentinels, and makes a program safer
printf("Length is %zu\n", array.length);
As you add/remove elements into array.data you can simply write array.length++ or array.length-- to keep track of the actual amount of valid entries. All of these are constant-time operations.
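You should also keep the maximum size of the array (what you used in malloc) around so that you can make sure that array.length never goes beyond said limit. Otherwise you would write past the end of the allocation, which is undefined behavior and often shows up as a segfault or as silent memory corruption.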

One way is to use a terminator that is distinct from any value in the array. For example, you want to pass an array of ints. You know that you never use the value -1. So you can use that as your terminator:
#define TERM (-1)
void print(int *arr)
{
for (; *arr != TERM; ++arr)
printf("%d\n", *arr);
}
But this approach is usually not used, because the sentinel could be a valid number. So normally, you will have to pass the length.
You can't use sizeof inside of the function, because as soon as you pass the array, it decays into a pointer to the first element. Thus, sizeof arr will be the size of a pointer on your machine.
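The decay can be observed directly. In this sketch, size_seen_by_callee is an illustrative helper that simply reports what sizeof yields for its parameter.

```c
#include <stddef.h>

/* Inside the function, a is a pointer, so sizeof a is the size of a
 * pointer, not the size of the caller's array. */
size_t size_seen_by_callee(short *a)
{
    return sizeof a;
}
```

For short vec[] = { 0, 1, 2, 3, 4 }; the caller sees sizeof vec == 5 * sizeof(short), while size_seen_by_callee(vec) returns sizeof(short *), whatever that is on the platform.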

#include <stdio.h>
void ShowArray(short* a);
int main (int argc, char* argv[])
{
short vec[] = { 0, 1, 2, 3, 4 };
short* p = &vec[0];
ShowArray(p);
return 0;
}
void ShowArray(short* a)
{
short i = 0;
short j;
j = sizeof(*a) / sizeof(short);
while( i < j )
{
printf("%hd ", *(a + i) );
++i;
}
printf("\n");
}
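Not sure if this will work, though; give it a try (I don't have a PC at the moment). Note that sizeof(*a) is the size of a single short, so j evaluates to 1 and the loop prints only the first element.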

how to deal with large 2D arrays

i have a 2D array of size 5428x5428 size.and it is a symmetric array. but while compiling it gives me an error saying that array size too large. can anyone provide me a way?

This array is too large for the program's stack memory; that's your error.
int main()
{
double arr[5428][5428]; // 8bytes*5428*5428 = 224MB
// ...
// use arr[y][x]
// ...
// no memory freeing needed
}
Use dynamic array allocation:
int main()
{
int i;
double ** arr;
arr = (double**)malloc(sizeof(double*)*5428);
for (i = 0; i < 5428; i++)
arr[i] = (double*)malloc(sizeof(double)*5428);
// ...
// use arr[y][x]
// ...
for (i = 0; i < 5428; i++)
free(arr[i]);
free(arr);
}
Or allocate plain array of size MxN and use ptr[y*width+x]
int main()
{
double * arr;
arr = (double*)malloc(sizeof(double)*5428*5428);
// ...
// use arr[y*5428 + x]
// ...
free(arr);
}
Or use combined method:
int main()
{
int i;
double * arr[5428]; // sizeof(double*)*5428 = 20Kb of stack for x86
for(i = 0; i < 5428; i++)
arr[i] = (double*)malloc(sizeof(double)*5428);
// ...
// use arr[y][x]
// ...
for(i = 0; i < 5428; i++)
free(arr[i]);
}
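One more variant, sketched under the assumption that the row width is known at compile time: a pointer to fixed-width rows gives a single heap block that can still be indexed as arr[y][x]. COLS, row_t and alloc_rows are illustrative names.

```c
#include <stdlib.h>

#define COLS 5428

/* One row of the matrix: COLS doubles stored contiguously. */
typedef double row_t[COLS];

/* Allocates rows * COLS zero-initialized doubles as one heap block.
 * The result can be indexed as m[y][x]. Returns NULL on failure. */
row_t *alloc_rows(size_t rows)
{
    return calloc(rows, sizeof(row_t));
}
```

For the full matrix this would be row_t *arr = alloc_rows(5428);, followed by a single free(arr) at the end.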

When arrays get large, there are a number of solutions. The one that is good for you depends heavily on what you are actually doing.
I'll list a few to get you thinking:
Buy more memory.
Move your array from the stack to the heap.
The stack has tighter size limitations than the heap.
Simulate portions of the array (you say yours is symmetric, so just under 1/2 of the data is redundant).
In your case, the array is symmetric, so instead of using an array, use a "simulated array"
int getArray(array, col, row);
void setArray(array, col, row, value);
where array is a data structure that only holds the lower-left half and the diagonal. getArray(..) then checks whether the column is greater than the row, and if it is, it returns getArray(array, row, col) (note the reversed arguments). This leverages the symmetric property of the array without the need to actually hold both symmetric sides.
Simulate the array using a list (or tree or hash table) of "only the value holding items"
This works very well for sparse arrays, as you no longer need to allocate memory to hold large numbers of zero (or empty) values. In the event that someone "looks up" a non-set value, your code "discovers" no value set for that entry, and then returns the "zero" or empty value without it actually being stored in your array.
Again without more details, it is hard to know what kind of solution is the best approach.
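The symmetric "simulated array" idea above might be sketched as follows, storing only the lower triangle and the diagonal in one packed block. struct sym_matrix and the sym_* functions are illustrative names standing in for getArray/setArray.

```c
#include <stdlib.h>

struct sym_matrix {
    size_t n;        /* the matrix is n x n */
    double *data;    /* n * (n + 1) / 2 stored values */
};

/* Maps (row, col) to an index into the packed lower triangle,
 * swapping the coordinates when the entry lies above the diagonal. */
static size_t sym_index(size_t row, size_t col)
{
    if (col > row) {
        size_t tmp = row;
        row = col;
        col = tmp;
    }
    return row * (row + 1) / 2 + col;
}

/* Allocates zeroed storage. Returns 0 on success, -1 on failure. */
int sym_init(struct sym_matrix *m, size_t n)
{
    m->n = n;
    m->data = calloc(n * (n + 1) / 2, sizeof *m->data);
    return m->data ? 0 : -1;
}

double sym_get(const struct sym_matrix *m, size_t row, size_t col)
{
    return m->data[sym_index(row, col)];
}

void sym_set(struct sym_matrix *m, size_t row, size_t col, double value)
{
    m->data[sym_index(row, col)] = value;
}
```

For n = 5428 this stores 14,734,306 doubles (about 112 MB) instead of the roughly 224 MB of the full array. No bounds checking is done here; the caller must keep row and col below n.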

When you create local variables, they go on the stack, which is of limited size. You're blowing through that limit.
You want your array to live outside the stack. The heap can draw on all the virtual memory your system has, i.e. gigs and gigs on a modern system. There are two ways to manage that. One is to dynamically allocate the array as in k06a's answer; use malloc() or your platform-specific allocator function (e.g. GlobalAlloc() on Windows). The second is to declare the array as a global or module static variable, outside of any function, which places it in static storage rather than on the stack.
Using a global or static has the disadvantage that this memory will be allocated for the entire lifetime of your program. Also, pretty much everybody hates globals on principle. On the other hand, you can use the two-dimensional array syntax, "array[x][y]" and the like, to access array elements... easier than doing array[x + y * width], plus you don't have to remember whether you're supposed to be doing "x + y * width" or "x * height + y" .

C Program: regular versus ragged character-string arrays

I'm trying to write more efficient code in a C program, and I need some help getting my pointers and assignments correct. I've shown two methods below, each using the following declarations and strncpy:
int kk, arraysize;
char person_name[100] = "";
char * array_person_name, * array_param;
...
strncpy(person_name, "John Smith", 100);
arraysize = <this value is downloaded from database>;
...
Method A (rectangular array):
array_person_name = malloc( sizeof(char) * arraysize *100 );
array_param = malloc( sizeof(char) * arraysize * 2 );
for (kk = 0; kk < arraysize; kk++) {
strncpy(array_person_name[kk], person_name, 100);
strncpy(array_param[kk], "bt", 2);
}
Method B (ragged array):
for (kk = 0; kk < arraysize; kk++) {
array_person_name[kk] = &person_name;
array_param[kk] = "bt";
}
Notice that the arrays I'm trying to create place the same value into each element of the array. Method A is an (rectangular) array of arraysize elements, each element itself being an array of 100 characters. Method B attempts not to waste storage space by creating an (ragged) array of arraysize elements, where each element is a pointer-to-char.
QUESTION 1: Am I allocating memory (e.g. malloc) correctly in Method A?
QUESTION 2: Does the syntax look correct for Method B?
QUESTION 3: How do I allocate memory for the arrays in method B?
QUESTION 4: Am I correct that Method B is generally preferred?

You are pretty far off here. 1:yes, 2:no, 3:no, 4:yes. I'm not going to do it all, but here are a few hints.
You need space to store the strings and space to store pointers to the strings (the latter isn't strictly necessary for Method A). The first will have type char*, the second will have type char**.
For Method A, you are allocating the string storage correctly, but you need to allocate the storage for the string pointers correctly (hint: you need arraysize instances of a char* pointer). It then gets initialized to pointers which differ from each other by 100 characters.
For Method B, there is no easy way of allocating space to store the strings, as you don't know how much space you'll need. You could iterate through all the strings once just to count their length, or do one malloc per string, or use a fixed size chunk and allocate more when you run out.
Method B uses the same string storage pointer array as Method A. You need to assign the string pointers into the array once you know where they will go.
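To make the difference concrete, here is a rough sketch of both layouts for the case in the question, where every element holds the same name. NAME_LEN, name_row, make_rectangular and make_ragged are illustrative names.

```c
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 100

/* One fixed-width slot of NAME_LEN chars. */
typedef char name_row[NAME_LEN];

/* Method A: one block of count fixed-width rows; each row holds a copy. */
name_row *make_rectangular(const char *name, size_t count)
{
    name_row *rows = malloc(count * sizeof *rows);
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++) {
        strncpy(rows[i], name, NAME_LEN - 1);
        rows[i][NAME_LEN - 1] = '\0';   /* strncpy may not terminate */
    }
    return rows;
}

/* Method B: an array of count pointers that all refer to the same string.
 * No string storage is allocated; name must outlive the array. */
const char **make_ragged(const char *name, size_t count)
{
    const char **ptrs = malloc(count * sizeof *ptrs);
    if (ptrs == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        ptrs[i] = name;
    return ptrs;
}
```

Method A costs count * 100 bytes and gives every element its own modifiable copy. Method B costs count pointers, but changing the shared string changes every element at once. If the elements had to hold different strings, Method B would also need storage for each string, as described above.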