Deshuffle text with C

I wanted to get an opinion here. Would it be possible to write a C program that deshuffles a text file? What do I mean by that? Say I have the following data in a textfile:
1 X
4 T
3 Z
2 L
And I wanted to deshuffle it and output another file as so:
1 X
2 L
3 Z
4 T
Such that all of the data following the number is preserved with the actual index number. Note that I have 40,500 shuffled entries, so that should probably be taken into account, since it could take a long time if the program needs to loop through all of the entries for each entry. Also, I only used letters for illustration; the actual data files contain floats rather than letters. Sorry if this causes any confusion.
So, bottom line, would this be possible with C? And if so, could I get a hint at where to start? I could obviously input all of the textfile data into an array dat[][], but how should I deshuffle it then?
Thanks!
Amit

Look up qsort; it's part of stdlib (declared in <stdlib.h>).
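A minimal sketch of the qsort suggestion, assuming each line holds an integer index followed by five floating-point values. The struct record layout and the names compare_by_index and deshuffle are made up for illustration; they are not from the question.

```c
#include <stdlib.h>

/* Hypothetical record layout: an index followed by five data values. */
struct record {
    int idx;
    double val[5];
};

/* Comparison callback for qsort: orders records by ascending index. */
static int compare_by_index(const void *a, const void *b)
{
    const struct record *ra = a;
    const struct record *rb = b;
    return (ra->idx > rb->idx) - (ra->idx < rb->idx);
}

/* Sorts n records in place by their index field. */
void deshuffle(struct record *recs, size_t n)
{
    qsort(recs, n, sizeof recs[0], compare_by_index);
}
```

Read the file into an array of these records, call deshuffle(recs, n), then write the array back out. Typical qsort implementations need on the order of n log n comparisons, so 40,500 rows should not be a problem.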

I'd just use sort -n instead of writing a program.

If you know the largest index in the file and there are no holes (i.e. all indices are present), you can create an array of that size, then as you read each line of the file, put the data into the correct location in the array. When you're done reading the file, your array will have all the elements in the correct order.
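A rough sketch of that idea, assuming six columns per row with a 1-based index in column 0 and no missing indices. NCOLS and place_by_index are hypothetical names chosen for this example.

```c
#include <stddef.h>

#define NCOLS 6 /* assumed row width: index plus five values */

/*
 * Copies each row of in to the slot in out given by its 1-based index
 * (column 0). Returns 0 on success, -1 if an index is out of range.
 */
int place_by_index(double in[][NCOLS], double out[][NCOLS], size_t nrow)
{
    for (size_t i = 0; i < nrow; i++) {
        double key = in[i][0];
        /* Written this way so that a NaN key is rejected as well. */
        if (!(key >= 1.0 && key <= (double)nrow))
            return -1;
        size_t slot = (size_t)key - 1;
        for (size_t j = 0; j < NCOLS; j++)
            out[slot][j] = in[i][j];
    }
    return 0;
}
```

Each row is visited exactly once, so the work grows linearly with the number of rows instead of quadratically.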

You can build a vector< pair<int,float[4]> > and just sort it using STL.
Just noticed a c tag.... But C idea is about the same:
Build an array with the data:
struct pair
{
int idx;
float val[4];
};
struct pair *data;
And then just sort it.

For those interested, I ended up using a series of loops to write this program; the code is shown below. The only downside is that if my nrow variable is equal to 40,500, this program can take as long as 30 minutes to run on a 3 GHz dual-core computer. I'm sure there are ways to optimize it, though at least it does what I want...for now.
Here's the code:
#include <stdlib.h>
#include <stdio.h>
// Deshuffle the output files of the bond/swap mode
int main(int argc, char* argv[])
{
if (argc < 4) {
printf("\ndeshuffle usage: [number of chains] [number of atoms] [.dat file] \n");
exit(1);
}
int i,j,k,l;
int nch = atoi( argv[1] );
int ns = atoi( argv[2] );
double **in_dat, **s_dat, dm1;
int nrow = ns*nch;
in_dat = (double**) calloc(nrow, sizeof(double*));
s_dat = (double**) calloc(nrow, sizeof(double*));
for (i=0; i<nrow; i++) {
in_dat[i] = (double*) calloc(6, sizeof(double));
s_dat[i] = (double*) calloc(6, sizeof(double));
for (j=0; j<6; j++) {
in_dat[i][j] = 0.0;
s_dat[i][j] = 0.0;
}
}
// store input data into 2D array in_dat
FILE *inp;
inp = fopen( argv[3], "r" );
for (i=0; i<nrow; i++) {
for (j=0; j<6; j++) {
fscanf(inp, "%lf", &dm1);
in_dat[i][j] = dm1;
}
}
fclose(inp);
// Sort data in s_dat based on comparison with in_dat
k=0;
while (k < (nrow)) {
for (i=0; i<nrow; i++) {
for (l=0; l<nrow; l++) {
if (in_dat[l][0] == (k+1)) {
for (j=0; j<6; j++) {
s_dat[i][j] = in_dat[l][j];
}
k++;
break;
}
}
}
}
// Write sorted data to file
FILE *otp;
otp = fopen("results.out", "w");
for (i=0; i<nrow; i++) {
for (j=0; j<6; j++) {
fprintf(otp, "%lf \t", s_dat[i][j]);
if (j==5)
fprintf(otp, "\n");
}
}
fclose(otp);
printf("\n Done. \n\n");
return 0;
}

Related

Better way to write a csv in C with many variables and digit specifiers?

I currently have something like 13+ variables and 500,000 datapoints I'm outputting to a csv file, and I might add more outputs later since the project is in its early stages. The last fprintf line is huge.
int dig = 4;
for(k = 0; k < cfg_ptr->nprt; k++){
/*normalize output*/
en_norm = en[k]*norm_factor;
en0_norm = en0[k]*norm_factor;
mu = rmu[k]*norm_factor;
mu0 = rmu0[k]*norm_factor;
fprintf(ofp,"%d,"\
"%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,"\
"%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E,%.*E\n",\
otp[k],dig,\
thet[k],dig,zet[k],dig,ptch[k],dig,pol[k],dig,rho[k],dig,mu,dig,en_norm,dig,rhol[k],dig,pphi[k],dig,xflr[k],dig,zflr[k],dig,\
thet0[k],dig,zet0[k],dig,ptch0[k],dig,pol0[k],dig,rho0[k],dig,mu0,dig,en0_norm,dig,rhol0[k],dig,pphi0[k],dig,xflr0[k],dig,zflr0[k]);
}
("dig" is the number of digits I want to output.) I don't know if there's any way I can simplify typing this while staying memory efficient. Specifically, I'd like to avoid repeating "dig," over and over and writing out %.*E a fixed number of times.
The solution I saw when looking this up was here
Writing to a CSV file in C
which was exactly what I already did.
Aggregate the arrays into an array of pointers; then you can print as many arrays as you want.
#include <stdio.h>
int myprint(FILE *fo, double **data, size_t size, size_t index, int dig)
{
int len = 0;
for(size_t i = 0; i < size; i++)
{
len += fprintf(fo, "%.*E%c", dig, data[i][index], i == size - 1 ? '\n' : ',');
}
return len;
}
double thet[100],zet[100],ptch[100],pol[100];
double *holder[] = {thet,zet,ptch,pol};
#define HS (sizeof(holder) / sizeof(holder[0]))
int main(void)
{
for(size_t k = 0; k < 100; k++)
myprint(stdout, holder, HS, k, 7);
}

A function that inserts array into a binary file

My task is to write a function unesi_niz which allows the user to enter an array of real numbers (maximum 100), where the entry ends when the number -1 is entered. The array entered in this way should be written to the binary file niz.bin as values of type double. The file must not contain anything other than the elements of the array (so it must not contain the entered number -1).
Then write a srednja_vrijednost function that calculates the mean value of the numbers in the niz.bin file and returns it. If the file does not exist or is empty, 0 should be returned.
So I started like this:
#include <stdio.h>
#include <stdlib.h>
#define vel 100
int i = 0;
int j = 0;
void unesi_niz() {
double pomocna;
double niz[100];
while (i != 100) {
scanf("%lf", &pomocna);
if (pomocna != -1) {
niz[i] = pomocna;
i++;
} else
break;
}
FILE *ulaz = fopen("niz.bin", "w");
if (!ulaz) {
printf("Greska pri otvaranju.\n"); //opening fault
}
for (j = 0; j < i; j++) {
fwrite(niz, sizeof(double), j, ulaz);
}
fclose(ulaz);
}
double srednja_vrijednost() {
double suma = 0;
if (i == 0)
return 0;
FILE *ulaz = fopen("niz.bin", "r");
if (!ulaz) {
printf("Greska pri otvaranju.\n");//opening fault
return 0;
}
double niz[100];
fread(niz, sizeof(double), i, ulaz);
int j;
for (j = 0; j < i; j++) {
suma += niz[j];
}
fclose(ulaz);
return suma / i;
}
int main() {
unesi_niz();
double n=srednja_vrijednost();
printf("%g\n", n);
return 0;
}
My code has several problems. The first is the wrong return value from srednja_vrijednost: when I enter the values 5 10 15, the result is 1.6667, which is nonsense. There are also many "Profiler errors"; my debug console says "Error in line 56, main.c file: The program accesses a variable that is not initialized", however I don't see any "Forbidden Action".
Hope some of you can see what I have done wrong :)
Your code fails in unesi_niz at these lines:
for (j = 0; j < i; j++) {
fwrite(niz, sizeof(double), j, ulaz);
}
fwrite takes a pointer to the data, the size of individual elements in bytes and the number of such elements. This means your code writes j elements starting from the first element each time. You probably want to write 1 element each time. Or better yet, you want to write all i elements, since fwrite allows you to write more than one element.
fwrite(niz, sizeof(double), i, ulaz);
As an aside, your "srednja_vrijednost" logic will work, but only because you already know the size of your array in the current process (stored in i). I am not entirely sure what you are trying to do but I suspect you want to be able to read the same file back even after your process exits. For that, you would need some logic to find the size of the array. You can do this either by writing the length as well into the file, or (similar to the input) write an ending -1, or just figure out the size by calculating the file size.
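One way to sketch the last option without querying the file size directly is to keep calling fread until it stops returning a full element, counting as you go. The helper names write_doubles and mean_of_file are hypothetical, and the files are opened in binary mode ("wb"/"rb"), which matters on platforms that translate line endings in text mode.

```c
#include <stdio.h>

/* Writes n doubles to path in binary mode; returns 0 on success. */
int write_doubles(const char *path, const double *values, size_t n)
{
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    size_t written = fwrite(values, sizeof values[0], n, f);
    if (fclose(f) != 0 || written != n)
        return -1;
    return 0;
}

/* Returns the mean of all doubles stored in path, or 0 if the file
   is missing or empty. The element count is discovered while reading. */
double mean_of_file(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return 0.0;
    double value, sum = 0.0;
    size_t count = 0;
    while (fread(&value, sizeof value, 1, f) == 1) {
        sum += value;
        count++;
    }
    fclose(f);
    return count ? sum / (double)count : 0.0;
}
```

With this shape, the reading function no longer depends on a global counter, so it also works in a separate run of the program.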

C - Passing a 3D arrays of chars to a function

I'm trying to write a program that analyzes a (3 x 4) matrix of strings provided by the user. Ultimately, it needs to output the longest string present in the matrix, along with that string's length.
My program seems to read the input correctly, as judged by its success in echoing back the input strings, but it does not correctly output the longest word. I'm sure I'm committing some kind of pointer-related error when I pass the value of the longest word, but I do not have any idea how to solve it.
Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define M 4
#define N 5
#define MAX_DIM 20
void findMAX(char matrice[N][M][MAX_DIM]) {
char maxr;
int index;
int i, j, it;
index = 0;
maxr = *(*(*(matrice+0)+0)+MAX_DIM);
for (i = 0; i < N-1; i++) {
for (j = 0; j < M-1; j++) {
if (index < strlen(matrice[i][j])) {
index = strlen(matrice[i][j]);
// I save the longer line's value
it = i;
// I save the maximum's value
maxr = *(*(*(matrice+i)+j)+MAX_DIM);
}
}
}
printf ("The MAX is: -/%s/- and it's long: -/%d/- \n", maxr, index);
printf ("It is content in the: %d line, which is: \n", it);
for (j = 0; j < N-1; j++) {
printf("%s ", matrice[it][j]);
}
}
void leggi(char matrice[N][M][MAX_DIM]) {
int i, j;
for (i = 0; i < N-1; i++) {
for (j = 0; j < M-1; j++) {
printf ("Insert the element matrix [%d][%d]: ", i, j);
scanf ("%s", matrice[i][j]);
fflush(stdin);
}
}
}
void stampa(char matrice[N][M][MAX_DIM]) {
int i, j;
printf("\n(4 x 3) MATRIX\n");
for (i = 0; i < N-1; i++) {
for (j = 0; j < M-1; j++) {
printf("%s ", matrice[i][j]);
}
printf("\n\n");
}
}
int main(int argc, char *argv[]) {
char matrix[N][M][MAX_DIM]; //Matrix of N*M strings, which are long MAX_DIM
printf("************************************************\n");
printf("** FIND THE LINE WITH THE MAXIMUM ELEMENT **\n");
printf("** IN A (4 x 3) MATRIX **\n");
printf("************************************************\n");
printf ("Matrix Reading & Printing\n");
leggi (matrix);
stampa (matrix);
findMAX(matrix);
return 0;
}
First of all to address some misconceptions conveyed by another answer, consider your 3D array declared as
char matrix[N][M][MAX_DIM];
, where N, M, and MAX_DIM are macros expanding to integer constants.
This is an ordinary array (not a variable-length array).
If you want to pass this array to a function, it is perfectly acceptable to declare the corresponding function parameter exactly the same way as you've declared the array, as indeed you do:
void findMAX(char matrice[N][M][MAX_DIM])
But it is true that what is actually passed is not the array itself, but a pointer to its first element (by which all other elements can also be accessed). In C, multidimensional arrays are arrays of arrays, so the first element of a three-dimensional array is a two-dimensional array. In any case, that function declaration is equivalent to both of these:
void findMAX(char (*matrice)[M][MAX_DIM])
void findMAX(char matrice[][M][MAX_DIM])
Note in particular that the first dimension is not conveyed. Of those three equivalent forms, I find the last clearest in most cases.
It is quite odd, though, the way you access array elements in your findMAX() function. Here is the prototypical example of what you do:
maxr = *(*(*(matrice+i)+j)+MAX_DIM);
But what an ugly and confusing expression that is, especially compared to this guaranteed-equivalent one:
maxr = matrice[i][j][MAX_DIM];
Looking at that, however, and at how you are using it, I find that although the assignment is type-correct, you are probably using the wrong type. maxr holds a single char. If you mean it to somehow capture the value of a whole string, then you need to declare it either as an array (into which you will copy strings' contents as needed), or as a pointer that you will set to point to the string of interest. The latter approach is more efficient, and I see nothing to recommend the former for your particular usage.
Thus, I think you want
char *maxr;
... and later ...
maxr = matrice[0][0];
... and ...
maxr = matrice[i][j];
That sort of usage should be familiar to you from, for example, your function stampa(); the primary difference is that now you're assigning the expression to a variable instead of passing it directly to a function.
And it turns out that changing maxr's type that way will correct the real problem here, which @AnttiHaapala already pointed out in comments: this function call ...
printf ("The MAX is: -/%s/- and it's long: -/%d/- \n", maxr, index);
requires the second argument (maxr) to be a pointer to a null-terminated array of char in order to correspond to the %s directive in the format string. Before, you were passing a single char instead, but with this correction you should get mostly the expected result.
You will probably, however, see at least one additional anomaly. Your final loop in that function has the wrong bound. You are iterating with j, which is used as an index for the second dimension of your array. That dimension's extent is M, but the loop runs to N - 1.
Finally, I should observe that it's odd that you allocate space for a 5 x 4 array (of char arrays) and then ignore the last row and column. But that's merely wasteful, not wrong.
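Putting the pointer-based suggestion together, a minimal sketch might look like the following. It assumes a 4 x 3 matrix, matching the loops in the question's code, sized exactly rather than with a spare row and column. ROWS, COLS and find_longest are hypothetical names, not the asker's.

```c
#include <string.h>

#define ROWS 4      /* assumed: the 4 x 3 matrix the question's loops cover */
#define COLS 3
#define MAX_DIM 20

/* Returns a pointer to the longest string in the matrix and stores its
   row in *row_out. Ties go to the first string found. */
const char *find_longest(char matrice[][COLS][MAX_DIM], int *row_out)
{
    const char *longest = matrice[0][0];
    size_t longest_len = strlen(longest);
    *row_out = 0;
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++) {
            size_t len = strlen(matrice[i][j]);
            if (len > longest_len) {
                longest = matrice[i][j]; /* point at the string, no copy */
                longest_len = len;
                *row_out = i;
            }
        }
    }
    return longest;
}
```

The returned pointer can be passed straight to printf with %s, since it points at a null-terminated string inside the matrix.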
Try something like this:
void findMAX(char matrice[N][M][MAX_DIM]){
// char maxr
char maxr[MAX_DIM];
int index;
int i, j, it;
index = 0;
// maxr = *(*(*(matrice+0)+0)+MAX_DIM);
strncpy(maxr, *(*(matrice+0)+0), MAX_DIM);
for (i = 0; i < N-1; i++)
{
for (j = 0; j < M-1; j++)
{
if (index < strlen(matrice[i][j]))
{
index = strlen(matrice[i][j]);
it = i;
// maxr = *(*(*(matrice+i)+j)+MAX_DIM);
strncpy(maxr, *(*(matrice+i)+j), MAX_DIM);
}
}
}
printf ("The MAX is: -/%s/- and it's long: -/%d/- \n", maxr, index);
printf ("It is content in the: %d line, which is: \n", it);
// for (j = 0; j < N-1; j++){
for (j = 0; j < M-1; j++){
printf("%s ", matrice[it][j]);
}
}
It's possible to pass multi-dimensional arrays to C functions if the size of the minor dimensions is known at compile time. However the syntax is unacceptable
void foo( int (*array2d)[6] )
Often array dimensions aren't known at compile time and it is necessary to create a flat array and access via
array2D[y*width+x]
Generally it's easier just to use this method even if array dimensions are known.
To clarify in response to a comment, C99 allows passing of variable size arrays using the more intuitive syntax. However the standard isn't supported by Microsoft's Visual C++ compiler, which means that you can't use it for many practical purposes.
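For illustration, the flat-array approach could be sketched like this. grid_create and grid_at are hypothetical helper names, and the caller is assumed to free the block when done.

```c
#include <stdlib.h>

/* Allocates a zero-initialised height x width grid as one flat block.
   Returns NULL if the allocation fails. */
int *grid_create(size_t width, size_t height)
{
    return calloc(width * height, sizeof(int));
}

/* Returns a pointer to element (x, y) of a flat grid of the given width. */
int *grid_at(int *grid, size_t width, size_t x, size_t y)
{
    return &grid[y * width + x];
}
```

Because the function only receives a plain int pointer plus the width, the same code works for any grid size chosen at run time.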

Performing Operations on an Array

Ok, so I'm new to both programming and posting on this site and my question is kind of complicated, but I'm going to attempt to articulate my situation. I'm trying to write a code to read a file (a list of polar coordinates) of unknown size, store the values in an array, then perform a conversion and get out a new array with the converted data (technically I'm using this new array in a predesigned function to graph the data in C++, but this is not important since it's part of the assignment and certainly not the source of error).
I'm having 2 major issues, the first of which is the variable array size. In the code following, you can see that there's no constraint on my x and y arrays, so the loop goes on to convert values up to x[999] even if there's nothing in r[] beyond r[40]. I can constrain the size of my r and theta arrays because I can code to stop reading values when the fscanf function reaches the end of the file, but I don't know the analogy for my x and y arrays. (Also, I'd greatly appreciate it if someone could explain why my character array doesn't have this issue, even though it's defined to be up to 128 characters. It might give me the insight I need)
Secondly, the conversion doesn't quite match up. So when I print the converted values, the function converts them correctly, but stores the value converted from r[i] and theta[i] to x[i+1] and y [i+1]. I really have no idea what is causing this problem.
I certainly hope my issues make sense. Any help would be appreciated! If anything is unclear, please let me know and I'll do my best to explain. Thank you!
#include <stdio.h>
#include <math.h>
#define ARRAYSIZE 1000
#define FILENAMESIZE 128
float convertx(float r[], float theta[], int numPoints);
float converty(float r[], float theta[], int numPoints);
int cmain() {
int i;
float r[ARRAYSIZE], theta[ARRAYSIZE], x[ARRAYSIZE], y[ARRAYSIZE];
char filename[FILENAMESIZE];
FILE *InputFile;
printf("Type the data file name: ");
scanf("%s", filename);
InputFile = fopen(filename,"r");
if(InputFile == NULL){
printf("Error, could not open file: %s.\n",filename);
return 1;
}
for(i=0; i < ARRAYSIZE; i++){
if(fscanf(InputFile,"%f %f", &r[i], &theta[i]) == EOF) break;
printf("Value %d is %f\n",i+1,r[i]);
}
fclose(InputFile);
for(i=0; i < ARRAYSIZE; i++){
x[i] = convertx(r,theta,i);
printf("Value %d is %f\n",i+1,x[i]);
}
for(i=0; i < ARRAYSIZE; i++){
y[i] = converty(r,theta,i);
}
return 0;
}
float convertx(float r[], float theta[], int numPoints){
int i;
float x;
for(i=0; i < numPoints; i++)
x = r[i]*cos(theta[i]);
return x;
}
float converty(float r[], float theta[], int numPoints){
int i;
float y;
for(i=0; i < numPoints; i++)
y = r[i]*sin(theta[i]);
return y;
}
EDIT: Requested was a snippet of the input file. Here's the first few points (there are no commas in the file, just white space. Unfortunately, my formatting skills on this site are awful):
60, 3.9269875
60, 0.7853975
0, 0
1, 0.25
2, 0.5
For your first issue, the reason that your for-loop for r and theta stops at the end of the file is because you have a break; that exits the for-loop when you detect EOF. You don't have that in your x and y for-loops.
To fix that, instead of using your ARRAYSIZE constant, save the final value of i from the loop that reads r and theta; it equals the number of points actually read. Then use it as the upper bound in your x and y for-loops. Like this:
int final;
for(i=0; i < ARRAYSIZE; i++){
if(fscanf(InputFile,"%f %f", &r[i], &theta[i]) == EOF) break;
printf("Value %d is %f\n",i+1,r[i]);
}
final = i; // number of points read; i keeps its value after the loop
fclose(InputFile);
for(i=0; i < final; i++){
x[i] = convertx(r,theta,i);
printf("Value %d is %f\n",i+1,x[i]);
}
(An earlier version of this answer suggested for (i=0; i < r.size(); i++), but C arrays have no size() member, and r is declared with a fixed length of 1000 anyway, so that won't work.)
For your 2nd issue: I think if you get rid of the for-loop in your convertx and converty functions, that will solve the problem. That's because only the last value of x or y calculated is returned, so your calculations from i=0 to i=(numPoints-2) are being thrown away, and the call convertx(r,theta,i) ends up returning the value for index i-1 rather than i.

Not giving the right output

The program should create an 8*8 2D table which consists of random numbers < 3.
It should print that table.
Another task is to translate this table into another one.
For example:
120
210
111
The number in the center should be changed to the sum of all the numbers around it: 1+2+0+2+0+1+1+1=8
and that should be done for every cell;
then the new table should be printed.
If there are any numbers larger than 9, they should be translated to hexadecimal.
I didn't do the hexadecimal part yet, but it is still not working:
#include <stdio.h>
#include <stdlib.h>
#define cols 8
#define rows 8
void printA(int A[][cols]);
void printC(char C[][cols]);
void SumThemUp(int A[][cols], char C[][cols]);
int main()
{
srand(time(NULL));
int A[rows][cols];
char C[rows][cols];
int i, j;
for(i=0; i<rows; i++)
for(j=0; j<cols; j++)
A[i][j]=rand()%3;
printA(A);
SumThemUp(A,C);
printC(C);
return 0;
}
void printA(int A[][cols])
{ int i, j;
for(i=0;i<rows;i++)
{for(j=0;j<cols; j++)
{printf("%d ", A[i][j]);}
printf("\n");}
return ;
}
void printC(char C[][cols])
{
int i, j;
for(i=0;i<rows;i++)
{for(j=0;j<cols; j++)
{printf("%ch ", C[i][j]);}
printf("\n");}
return ;
}
void SumThemUp(int A[][cols], char C[][cols])
{
int i,j;
for(i=0;i<rows;i++)
{for(j=0;j<cols; j++)
C[i][j]=0;}
for(i=0;i<rows;i++)
{for(j=0;j<cols; j++)
A[i][j]=C[i++][j];
}
for(j=0;j<cols; j++)
{for(i=0;i<rows;i++)
C[i][j]+=A[i][j++];
}return;
}
So - I'm not entirely sure I know what you want the output to be -- but there are several problems with what you have:
0: For your arrays, the names should describe what the array actually holds, A and C are quite ambiguous.
1: Use { } for scoping, and put the { } on their own lines. (Maybe it just pasted poorly in Stack Overflow)
2: You have a set of loops which basically sets everything in C to 0:
for(i=0;i<rows;i++)
{
for(j=0;j<cols; j++)
{
C[i][j]=0;
}
}
Then immediately after that you have:
for(i=0;i<rows;i++)
{
for(j=0;j<cols; j++)
{
A[i][j]=C[i++][j]; // <--- problem here
}
}
So after that, both A and C are full of all 0s. On top of that, you have i++ inline when accessing columns in C. This actually changes the value that your for loop is using, so i is getting incremented for every row and every column. Presumably you want:
A[i][j]=C[i+1][j];
3: You have a similar problem here:
for(j=0;j<cols; j++)
{
for(i=0;i<rows;i++)
{
C[i][j]+=A[i][j++]; // Presumably you want j+1
}
}
4: Why are you using a char array for C? If it's holding the sum of integers it should probably be declared int. If this was your idea of printing the ints as hex (or just plain ints), it would be easier to simply use printf to output the ints as hex:
// use %d to print the integer "normally" (base 10)
// use %x if you want a hex value with lowercase letters
// use %X if you want a hex value with capital letters
printf("125 as hex is: 0x%x", 125); // 0x7d
I hope that points you in the right direction.
-- Dan
Do I understand correctly, that given matrix A, you want to get matrix C in SumThemUp, where each cell in C is a sum of its adjacent cells? In that case, these lines look suspicious as you modify the loop counters
A[i][j]=C[i++][j];
and
C[i][j]+=A[i][j++];
.
Anyway, a simple example, how I would do the summing part.
NB! Note that I use int type for matrix C. Given that you want to convert it to hex, if you happen to have the value 2 in all adjacent cells somewhere, you get a decimal value of 2 * 8 = 16, which requires more than one character to represent. Thus, you should convert to hex during printing. (I understand that char can hold integral values this small too, but int keeps things consistent.)
void SumThemUp(int A[][cols], int C[][cols]) {
int i, j, di, dj, i2, j2;
// iterate through all the rows
for (i=0 ; i<rows ; ++i) {
for (j=0 ; j<cols ; ++j) {
// initialize the cell to zero
C[i][j] = 0;
// iterate over nearby cells
for (di=-1 ; di<=1 ; ++di) {
for (dj=-1 ; dj<=1 ; ++dj) {
// do not count in the center
if (di == 0 && dj == 0) {
continue;
}
// make sure, we do not try to count in cells
// outside the matrix
i2 = i + di;
j2 = j + dj;
if (i2 < 0 || j2 < 0 || i2 >= rows || j2 >= cols) {
continue;
}
// append the score here
C[i][j] += A[i2][j2];
}
}
}
}
}
Also, I did not test this code, so it may contain mistakes, but maybe it helps you finishing your summing part.
NB! And take note of the comments from @Dan.
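For the remaining hexadecimal step, here is a compact sketch that sums the neighbours and prints with %X, as suggested in the other answer. ROWS, COLS, sum_neighbours and print_hex are names invented for this example; cells outside the table are simply skipped, which is one possible reading of the task.

```c
#include <stdio.h>

#define ROWS 8
#define COLS 8

/* Stores in sums[i][j] the total of the up-to-eight neighbours of
   cells[i][j]; neighbours outside the table are skipped. */
void sum_neighbours(int cells[][COLS], int sums[][COLS])
{
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++) {
            int total = 0;
            for (int di = -1; di <= 1; di++) {
                for (int dj = -1; dj <= 1; dj++) {
                    int i2 = i + di, j2 = j + dj;
                    if ((di == 0 && dj == 0) || i2 < 0 || j2 < 0 ||
                        i2 >= ROWS || j2 >= COLS)
                        continue;
                    total += cells[i2][j2];
                }
            }
            sums[i][j] = total;
        }
    }
}

/* Prints the table; %X renders values above 9 as hex digits (10 -> A). */
void print_hex(int sums[][COLS])
{
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++)
            printf("%X ", sums[i][j]);
        printf("\n");
    }
}
```

With values limited to 0..2, the largest possible sum is 16, which %X prints as 10.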
