Creating a File of Random Size [1...500] KB - c

Foreword: I have simplified the problem into its key functionalities, so if it sounds weird it is because this is a small aspect of the whole program.
Problem:
I want to create something like 100 text files: I'll loop and use my loop counter to name the files.
Then, I want to populate each file with random strings. I use my String struct defined below for this. I want to fill each file to a size between 1 KB and 500 KB.
struct String // And yes I am using my own String library.
{
char *c;
int length;
int maxLength;
};
Let's assume I have the file opened (probably at the moment I create it, so it is empty). Now I would check something like this.
int range = Random.Range(0,500);
I would get a number that would predetermine the file size. So if range == 100 then the file would be populated with 100KB of "data".
I would first have my string created.
// Maybe making this 100 chars would help?
String *s1 = makeString("abcdefghijklmnopqrstuvwxyz");
How would I figure out how many times I have to write my String s1 into the file to make it the size of range? Preferably before writing to the file; I wouldn't want to write first, then check, then write again.
And how would I get a random integer value in C? I am used to Random.Range in C#.

To keep it simple, it is best to make your string length a divisor of 1 KB (1024 bytes), for example 32 or 64 characters, so you don't have to deal with fractions.
After that you can do as @naitoon mentioned above and write the string (range*1024)/s1->length times, assuming each character of your string is 1 byte long.
As for the random integer, you can call the standard library function rand(), which returns an integer between 0 and RAND_MAX, where RAND_MAX is at least 32767. Call srand() once at startup (for example with the current time) so that each run produces a different sequence.
To keep the random number within your range, take the return value modulo the size of the range. Note that rand() % 500 yields 0 to 499, so add 1 to get a value from 1 to 500:
range = rand() % 500 + 1;
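Both steps can be sketched in C as follows, assuming each character is one byte and the chunk length divides 1024 evenly. The helper names `random_in_range`, `repeats_needed` and `fill_file` are made up for illustration, and a plain C string stands in for the asker's String struct.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: random integer in [lo, hi], inclusive.
   Uses rand(), so call srand() once at startup. The modulo introduces
   a slight bias, which is usually acceptable for test data. */
int random_in_range(int lo, int hi)
{
    return lo + rand() % (hi - lo + 1);
}

/* Number of whole chunks needed to reach target_kb kilobytes.
   Assumes chunk_len divides 1024 evenly (e.g. 32, 64, 128). */
size_t repeats_needed(size_t target_kb, size_t chunk_len)
{
    return target_kb * 1024 / chunk_len;
}

/* Writes chunk to path `repeats` times.
   Returns the number of bytes written, or -1 on error. */
long fill_file(const char *path, const char *chunk, size_t repeats)
{
    FILE *fp = fopen(path, "wb");
    size_t len = strlen(chunk);
    long written = 0;

    if (fp == NULL)
        return -1;
    for (size_t i = 0; i < repeats; i++) {
        if (fwrite(chunk, 1, len, fp) != len) {
            fclose(fp);
            return -1;
        }
        written += (long)len;
    }
    if (fclose(fp) != 0)
        return -1;
    return written;
}
```

With a 32-byte chunk, a call such as `fill_file("file7.txt", chunk, repeats_needed(random_in_range(1, 500), 32))` computes the repeat count before anything is written, so no write-then-check loop is needed.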

Related

Dynamically indexing an array in C

Is it possible to create arrays based on their index, as in
int x = 4;
int y = 5;
int someNr = 123;
int foo[x][y] = someNr;
dynamically/on the run, without creating foo[0...3][0...4]?
If not, is there a data structure that allow me to do something similar to this in C?
No.
As written, your code makes no sense at all. You need foo to be declared somewhere, and then you can index into it with foo[x][y] = someNr;. But you can't just make foo spring into existence, which is what it looks like you are trying to do.
Either create foo with the correct sizes (only you can say what they are), int foo[16][16]; for example, or use a different data structure.
In C++ you could use a map<pair<int, int>, int>.
Variable Length Arrays
Even if x and y were replaced by constants, you could not initialize the array using the notation shown. You'd need to use:
int fixed[3][4] = { someNr };
or similar (extra braces, perhaps; more values perhaps). You can, however, declare/define variable length arrays (VLA), but you cannot initialize them at all. So, you could write:
int x = 4;
int y = 5;
int someNr = 123;
int foo[x][y];
for (int i = 0; i < x; i++)
{
for (int j = 0; j < y; j++)
foo[i][j] = someNr + i * (x + 1) + j;
}
Obviously, you can't use x and y as indexes without writing (or reading) outside the bounds of the array. The onus is on you to ensure that there is enough space on the stack for the values chosen as the limits on the arrays (it won't be a problem at 3x4; it might be at 300x400 though, and will be at 3000x4000). You can also use dynamic allocation of VLAs to handle bigger matrices.
VLA support is mandatory in C99, optional in C11 and C18, and non-existent in strict C90.
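The dynamic allocation of VLAs mentioned above can be sketched as follows. This is one possible shape, assuming a compiler with VLA support; `make_matrix` is a hypothetical helper name, not something from the original answer.

```c
#include <stdlib.h>

/* Allocates a rows x cols matrix on the heap as one contiguous block and
   fills every cell with `value`. The caller views the result as
   int (*)[cols] and releases it with a single free().
   Returns NULL if the allocation fails. Requires VLA support. */
void *make_matrix(size_t rows, size_t cols, int value)
{
    int (*m)[cols] = malloc(rows * sizeof *m);

    if (m == NULL)
        return NULL;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            m[i][j] = value;
    return m;
}
```

A caller would write `int (*foo)[cols] = make_matrix(rows, cols, 123);` and then use `foo[i][j]` exactly as with a stack VLA, without the stack-size concern for large dimensions.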
Sparse arrays
If what you want is 'sparse array support', there is no built-in facility in C that will assist you. You have to devise (or find) code that will handle that for you. It can certainly be done; Fortran programmers used to have to do it quite often in the bad old days when megabytes of memory were a luxury and MIPS meant millions of instructions per second and people were happy when their computer could do double-digit MIPS (and the Fortran 90 standard was still years in the future).
You'll need to devise a structure and a set of functions to handle the sparse array. You will probably need to decide whether you have values in every row, or whether you only record the data in some rows. You'll need a function to assign a value to a cell, and another to retrieve the value from a cell. You'll need to think what the value is when there is no explicit entry. (The thinking probably isn't hard. The default value is usually zero, but an infinity or a NaN (not a number) might be appropriate, depending on context.) You'd also need a function to allocate the base structure (would you specify the maximum sizes?) and another to release it.
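One minimal shape such a structure and its functions could take is sketched below. All names are hypothetical, cells are kept in an unsorted growable list, and lookup is a linear search chosen for brevity rather than speed; a real implementation would more likely use a hash table or per-row lists.

```c
#include <stdlib.h>

/* One explicitly stored cell; cells never set read as default_value. */
struct SparseCell { size_t row, col; int value; };

struct SparseArray {
    struct SparseCell *cells;
    size_t count, capacity;
    int default_value;
};

/* Allocates an empty sparse array; returns NULL on failure. */
struct SparseArray *sparse_create(int default_value)
{
    struct SparseArray *sa = calloc(1, sizeof *sa);
    if (sa != NULL)
        sa->default_value = default_value;
    return sa;
}

void sparse_destroy(struct SparseArray *sa)
{
    if (sa != NULL) {
        free(sa->cells);
        free(sa);
    }
}

/* Linear search; returns NULL when the cell has no explicit entry. */
static struct SparseCell *sparse_find(const struct SparseArray *sa,
                                      size_t row, size_t col)
{
    for (size_t i = 0; i < sa->count; i++)
        if (sa->cells[i].row == row && sa->cells[i].col == col)
            return &sa->cells[i];
    return NULL;
}

int sparse_get(const struct SparseArray *sa, size_t row, size_t col)
{
    const struct SparseCell *c = sparse_find(sa, row, col);
    return c != NULL ? c->value : sa->default_value;
}

/* Returns 0 on success, -1 if memory runs out. */
int sparse_set(struct SparseArray *sa, size_t row, size_t col, int value)
{
    struct SparseCell *c = sparse_find(sa, row, col);

    if (c != NULL) {
        c->value = value;
        return 0;
    }
    if (sa->count == sa->capacity) {
        size_t cap = sa->capacity ? sa->capacity * 2 : 8;
        struct SparseCell *grown = realloc(sa->cells, cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        sa->cells = grown;
        sa->capacity = cap;
    }
    sa->cells[sa->count++] = (struct SparseCell){ row, col, value };
    return 0;
}
```

With this sketch, the question's `foo[x][y] = someNr` becomes `sparse_set(foo, 4, 5, 123)`, and only cells that were actually assigned take up memory.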
An efficient way to build a dynamic index for an array is to create a second, initially empty array of the same data type as the array being indexed.
Let's imagine we are using integers for the sake of simplicity. You can then extend the concept to any other data type.
The ideal index depth depends on how much data there is to index; the index should end up with roughly as many slots as the data has elements.
Let's say you have 1 million 64-bit integers in the array to index.
First of all you should sort the data and eliminate duplicates. That's easy to achieve with qsort() (the C standard library sorting function) and a duplicate-removal function such as
/* Copies the distinct values of an already sorted array into unique_arr
   and returns how many values were copied. */
uint64_t remove_dupes(const uint64_t *sorted_arr, uint64_t *unique_arr, uint64_t arr_size)
{
    uint64_t i, j = 0;
    if (arr_size == 0)
        return 0;
    unique_arr[j++] = sorted_arr[0];   /* the first value is always kept */
    for (i = 1; i < arr_size; i++)
    {
        if (sorted_arr[i] != sorted_arr[i-1])
            unique_arr[j++] = sorted_arr[i];
    }
    return j;
}
Adapt the code above to your needs. Once the unique values have been copied to the output array, you can free() the input array. The function makes a single pass over the data, so it is fast.
Once the data is ordered and unique, create an index with a length close to that of the data. It does not need to be of an exact length, although sticking to powers of 10 will make everything easier in the case of integers.
uint64_t* idx = calloc(pow(10, indexdepth), sizeof(uint64_t));
This will create an empty index array.
Then populate the index. Traverse the array to index just once, and every time the leading digits (as many as the index depth) change, record the position where the new prefix was first detected.
If you choose an indexdepth of 2 you will have 10² = 100 possible values in your index, typically going from 0 to 99.
When you detect the first number that starts with 10 (say 103456), you add an entry to the index. If 103456 was found at position 733, the index entry would be:
idx[10] = 733;
The first entry beginning with 11 goes in the next index slot. If the first number beginning with 11 is found at position 2023:
idx[11] = 2023;
And so on.
When you later need to find a number in the original array of 1 million entries, you don't have to iterate over the whole array. You only need to look up where the first number with the same two leading digits is stored: entry idx[10] tells you where the first number starting with 10 is. You can then iterate forward from there until you find your match.
In my example I used a small index, so each index segment covers about 1000000/100 = 10000 entries, and a search scans on the order of thousands of entries.
If you enlarge the index to somewhere close to the length of the data, the number of iterations tends to 1, making searches very fast.
What I like to do is write a simple routine that works out the ideal index depth from the type and length of the data to index.
Please note that in the example above, 64-bit numbers are indexed by their first indexdepth significant digits, so 10 and 100001 end up in the same index segment. That is not a problem in itself, but such a segment is no longer contiguous in a numerically sorted array, and everyone has their own tricks for dealing with that. Treating the numbers as fixed-length hexadecimal strings can help keep a strict numerical order.
You don't have to change the base, though: you could treat 10 as 0000010 so that it falls in the 00 index segment and base-10 numbers stay ordered. Working with different numerical bases is nonetheless trivial in C, which helps a great deal with this task.
As you increase the index depth, the number of entries per index segment decreases.
Please note that programming, especially at a lower level as in C, is in large part about understanding the trade-off between CPU cycles and memory use.
Creating the proposed index reduces the number of CPU cycles needed to locate a value, at the cost of using more memory as the index grows. This is nonetheless the way to go nowadays, as massive amounts of memory are cheap.
As SSD speeds get closer to those of RAM, storing indexes in files is worth considering. Nevertheless, modern OSs tend to cache as much as they can in RAM, so using files would give similar performance.
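The scheme can be sketched as follows, using the zero-padded, fixed-width reading of each number so that segments stay contiguous in a numerically sorted array. The names `build_index`, `indexed_find` and `NOT_PRESENT` are invented for this sketch, and it assumes the data is sorted, free of duplicates, and that every value has at most `width` decimal digits with `depth` no larger than `width`.

```c
#include <stdint.h>
#include <stdlib.h>

#define NOT_PRESENT SIZE_MAX  /* marks an index segment with no entries */

/* Integer power of ten; avoids the rounding risks of pow(). */
static uint64_t pow10u(unsigned n)
{
    uint64_t p = 1;
    while (n-- > 0)
        p *= 10;
    return p;
}

/* Builds the index: idx[b] holds the position of the first value whose
   zero-padded `width`-digit form starts with the `depth`-digit prefix b.
   Returns NULL on allocation failure; free() the result. */
size_t *build_index(const uint64_t *data, size_t n,
                    unsigned width, unsigned depth)
{
    size_t buckets = (size_t)pow10u(depth);
    uint64_t divisor = pow10u(width - depth);
    size_t *idx = malloc(buckets * sizeof *idx);

    if (idx == NULL)
        return NULL;
    for (size_t b = 0; b < buckets; b++)
        idx[b] = NOT_PRESENT;
    for (size_t i = 0; i < n; i++) {
        size_t b = (size_t)(data[i] / divisor);
        if (idx[b] == NOT_PRESENT)
            idx[b] = i;           /* first position with this prefix */
    }
    return idx;
}

/* Returns the position of key in data, or NOT_PRESENT. Only the segment
   selected by the key's prefix is scanned. */
size_t indexed_find(const uint64_t *data, size_t n, const size_t *idx,
                    unsigned width, unsigned depth, uint64_t key)
{
    size_t i;

    if (key >= pow10u(width))
        return NOT_PRESENT;       /* key has too many digits */
    i = idx[key / pow10u(width - depth)];
    if (i == NOT_PRESENT)
        return NOT_PRESENT;
    for (; i < n && data[i] <= key; i++)
        if (data[i] == key)
            return i;
    return NOT_PRESENT;
}
```

Because the data is sorted, the scan can stop as soon as it passes the key, so a lookup never leaves the segment it started in by more than one element.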

Storing large numbers from user input into an array of integers

I am currently working on a C project that requires the creation, storage and mathematical usage of numbers that are too large to be put into normal variable types. To do this, we were instructed to represent numbers as a sequence of digits stored in an array of integers. I use a struct defined as so:
struct BigInt {
int val[300000];
int size;
};
(I know I can dynamically allocate memory, and that that is
preferable, however this is how I am most comfortable doing it, it has
worked perfectly fine so far and this is how the professor instructed us to do it.)
I then define member A:
struct BigInt A={0};
I can generate and store, then add, subtract and multiply random numbers with this, and they can have any number of digits up to 300000 (far more than I will ever need to account for). For example, if the number 1432 was generated and stored into BigInt A, A.size would be 4 and A.val[2] would be 3.
Now I need to create a way to store user input into this type. For example, the user needs to be able to type in 50! and have the result stored in the struct type I have created. How would I go about doing this?
The only ways that I could think of would be to store the user input as a string and then have the math in that string be executed multiple times, each time storing a different digit, or to read the numbers straight off of stdin, but I don't know if either of those is even possible or would solve my problem.
You can try reading the input as a string, as follows:
char s[300001];
scanf("%300000s", s); // the field width keeps scanf from overflowing s
A.size = strlen(s);
for(int i = 0; i < A.size; i++){
A.val[i] = s[i] - '0';
}
This should solve your problem, although storing one digit per int is not an efficient representation for big integers.
Sorry for my previous answer. To solve this in C, you need to read the input into an array of chars, one digit per element.
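A slightly more defensive version of the same idea is sketched below. It assumes the digits are stored most significant first, as in the question's 1432 example; `bigint_from_string` and `MAX_DIGITS` are hypothetical names, and the struct is repeated only to keep the sketch self-contained.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_DIGITS 300000

struct BigInt {
    int val[MAX_DIGITS];
    int size;
};

/* Parses a decimal string such as "1432" into out, most significant digit
   first, so "1432" gives size 4 and val[2] == 3. Leading zeros are dropped.
   Returns 0 on success, or -1 if the text is empty, contains a non-digit,
   or has more than MAX_DIGITS digits (out may then be partly written). */
int bigint_from_string(struct BigInt *out, const char *text)
{
    size_t n = 0;

    while (text[0] == '0' && text[1] != '\0')
        text++;                       /* keep a single "0" for zero */
    if (text[0] == '\0')
        return -1;
    for (; text[n] != '\0'; n++) {
        if (!isdigit((unsigned char)text[n]) || n >= MAX_DIGITS)
            return -1;
        out->val[n] = text[n] - '0';
    }
    out->size = (int)n;
    return 0;
}
```

Note that this sketch rejects input such as "50!". Supporting that would mean detecting the trailing `!`, parsing the leading number as an ordinary int, and building the result with the existing multiplication routine.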

Arduino: Generating 100 int random array results in first 10 integers of 0

I am using an Arduino UNO with 32.000 bytes of storage.
While writing my program I made a small function that filled an array with random numbers.
This function uses a variable I defined at the top of the script as:
int mode1[100];
And this is the function that fills the array above. randomSeed takes a number for its seed, which is provided by analogRead(0), with the 0 meaning pin 0 on my Arduino board.
void fillArray(int aSize){
  if (aSize == 100){
    randomSeed(analogRead(0));
    for (int i=0; i < aSize; i++){
      mode1[i] = random(1, aSize);
    }
  }
}
When I call the fillArray function like fillArray(100); it will generate my integers. I then read them on my computer via this piece of code:
Serial.println("Filled array");
for (int i=0; i <= 99; i++){Serial.println((int)mode1[i]);}
Everything seemed to work fine but I noticed that the first 10 integers that my function will generate are always 0. My main problem is that I don't know how to troubleshoot this because the script gives me no errors. My question therefore is: What is the cause of the first 10 integers of that array always being 0?
My possible explanation is that the analogRead function does something strange. (Currently it just has a pin in the A0 slot with nothing connected to it, which should work.) I also know it's not the storage capacity, since 32.000 bytes is enough to store 100 integers in an array, and my script is only 5000 bytes.
I'm a bit stuck on this because I do not know what is causing the problem; any help on the subject would be appreciated.
I cannot comment (not enough reputation yet), but did you check the number of lines in your output? Something else in your code may be sending data before you start your log. You may want to write lines of the form "mode1[xxx]= yyy", where xxx is the index and yyy the value stored at that index, to make sure that you are looking at your array content.
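The labelled output suggested above could be produced with a small formatting helper, sketched here in plain C. `format_entry` is a made-up name; on the Arduino the filled buffer would then be passed to Serial.println.

```c
#include <stdio.h>

/* Formats one array element as "mode1[<index>]= <value>" into buf.
   Returns the number of characters written (excluding the terminator),
   or -1 if buf is too small to hold the whole line. */
int format_entry(char *buf, size_t buf_size, int index, int value)
{
    int n = snprintf(buf, buf_size, "mode1[%d]= %d", index, value);
    return (n < 0 || (size_t)n >= buf_size) ? -1 : n;
}
```

Printing the index next to each value makes it obvious whether a run of zeros really comes from `mode1[0]` to `mode1[9]` or from unrelated output that arrived earlier.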

How do i determine the size of an array from a character read from a file?

I have to allocate a dynamic array. I know how many columns the array will have, but I don't know how many rows; all I have is a number in a .txt file. I have tried the following code, but I am not sure it will work:
int x = (int)fgetc(file)-48;
Since the ASCII value of 0 is 48, I assumed that I needed to cast the character read from the file in order to be able to use it as my row count.
I assume I should be able to allocate the 2D array as follows:
m = (int **)malloc(x*sizeof(int*));
for (i=0;i<x;i++)
{
m[i] = (int*)malloc(10*sizeof(int));
}
Am I correct? Any help will be highly appreciated.
You can design a list and dynamically insert your rows.
First off, fgetc() returns an int, so casting it to int does nothing. Second, fgetc() reads only one character at a time, so x will only ever hold a single-digit number.
Your array allocation looks correct, but you can also declare the columns as an array of 10 int * on the stack and then allocate each column dynamically as m[i] = (int*)malloc(x*sizeof(int)); for i from 0 to 9.
Do I understand correctly that your file looks like
327 // number of lines
1 2 3 // line 1
33 44 55 // line 2
... repeats until 327 lines have been printed, all with 3 elements? Note that the line breaks would be optional and could be any whitespace.
The canonical way to read a number from text in C is scanf, or fscanf when reading from a file as you are. Like printf, scanf takes a weird-looking format string as its first parameter. Each basic type has a letter associated with it; for integers it's d or i. Note that %i also accepts octal and hexadecimal prefixes, so an input such as 010 is read as 8, which makes %d the safer choice for plain decimal input. These letters are prefixed with a %. So to read an integer, you would write fscanf(file, "%d", &lines); if lines is an int holding the number of lines. (Prefer a descriptive name over x, for readability.)
The way you allocate your array is correct (provided x holds the number of lines and 10 is the known line length). One style issue is that the 10 should be #defined as a macro so that you can use e.g. malloc(LINE_LEN*sizeof(int)). That helps if the number ever changes and (in a real-world program) references to it are scattered over several source files.
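Reading the count and allocating the rows could be combined roughly as below. This is a sketch under the assumption that the file starts with the row count as a decimal number; `read_and_allocate` is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>

#define LINE_LEN 10  /* known number of columns, as in the question */

/* Reads the row count from the start of `file` and allocates that many
   rows of LINE_LEN ints. Stores the count in *rows_out and returns the
   matrix, or returns NULL if the count is missing, not positive, or
   memory runs out. */
int **read_and_allocate(FILE *file, int *rows_out)
{
    int rows;
    int **m;

    if (fscanf(file, "%d", &rows) != 1 || rows <= 0)
        return NULL;
    m = malloc((size_t)rows * sizeof *m);
    if (m == NULL)
        return NULL;
    for (int i = 0; i < rows; i++) {
        m[i] = malloc(LINE_LEN * sizeof *m[i]);
        if (m[i] == NULL) {
            while (i-- > 0)       /* undo the rows allocated so far */
                free(m[i]);
            free(m);
            return NULL;
        }
    }
    *rows_out = rows;
    return m;
}
```

Unlike fgetc(), fscanf with %d consumes all the digits of the number, so a count such as 327 is read correctly.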
If this is just a little program and the array isn't inordinately large and does not need to live longer than the function call (which, in the case of main(), may be long enough in any case), the easiest option is to use a variable length array, provided you use a reasonably modern compiler:
#define LINE_LEN 10
int lineCount;
scanf("%d", &lineCount);
int m[lineCount][LINE_LEN];
// fill from file
If you compile with gcc you'll probably need to specify "-std=c99" as a command line option for that.

An array of length 4-20?

I'd like for my array to be of a set length using a simple format. Please, let me know how this is done.
What I already have:
arr[100]
Pseudocode: what I would like to have:
arr[4-20] or arr[$min_int THROUGH $max_int]
Additional detail edit: The int should be within the range array = (4, 20). The input may contain leading zeros. I'd like to keep the length of the array restricted (i.e., to 9 or 10 characters).
Arrays simply do not work this way in C. You will need to implement it yourself by only looping through valid indices (and wasting memory in the process) or by using a data structure better suited to the job, like a map (which you will have to find in a library or write yourself as it does not exist in the language).
#define ARRMINIDX 4
#define ARRMAXIDX 20
int arrmem[ARRMAXIDX+1-ARRMINIDX];
#define arr(x) arrmem[(x)-ARRMINIDX]
// process elements of arr
for( i = ARRMINIDX; i <= ARRMAXIDX; i++ )
dosomething(arr(i));
OTOH, this may not be what you want at all, given your comment:
I want an array with 0-1 elements: a limited int or limited "numeric
int"--string mimicking an int.
which I can't make heads or tails of in this context. Are you saying that you want a string of 4-20 chars that represents an integer?
