libjpeg turbo tjCompressFromYUV - c

I'd like to compress a planar 4:2:0 YUV buffer to a jpeg image using libturbojpeg in C, but I'm having trouble using the tjCompressFromYUV() function.
This is my code:
#define PADDING 2
tjhandle tjh;
unsigned long buf_size;
unsigned char *jpegBuf = NULL;
unsigned long jpegSize;
int width = 352;
int height = 288;
int quality = 70;
unsigned char *ucp_frame;
int j;
FILE *fp = NULL;
ucp_frame = malloc(width * height * 3 / 2);
if ( NULL == ucp_frame ) {
printf("malloc error ucp_frame\n");
return 0;
}
fp = fopen("planar_352x288.raw", "rb");
if( NULL == fp ) {
printf("fopen error\n");
return 0;
}
j = fread( ucp_frame, 1, width * height * 3 / 2, fp);
if( j != width * height * 3 / 2 ) {
printf("fread error\n");
return 0;
}
fclose(fp);
tjh = tjInitCompress();
if( NULL == tjh ) {
printf("tjInitCompress error '%s'\n", tjGetErrorStr() );
return 0;
}
buf_size = tjBufSizeYUV2( width, PADDING, height, TJSAMP_420);
jpegBuf = tjAlloc(buf_size);
if( tjCompressFromYUV( tjh, ucp_frame, width, PADDING, height,
TJSAMP_420, &jpegBuf, &jpegSize, quality,
TJFLAG_NOREALLOC ) ) {
printf("tjCompressFromYUV error '%s'\n", tjGetErrorStr() );
}
The error string returned by tjGetErrorStr() is "Bogus input colorspace".
I tried linking libturbojpeg versions 1.4.2 and 1.4.90.
Any help wolud be appreciated,
Thanks

Turbojpeg API tjCompressFromYUV allows you such options for jpegBuf:
#param jpegBuf address of a pointer to an image buffer that will receive the
JPEG image. TurboJPEG has the ability to reallocate the JPEG buffer to
accommodate the size of the JPEG image. Thus, you can choose to:
pre-allocate the JPEG buffer with an arbitrary size using #tjAlloc() and
let TurboJPEG grow the buffer as needed,
set *jpegBuf to NULL to tell TurboJPEG to allocate the buffer
for you, or
pre-allocate the buffer to a "worst case" size determined by calling
tjBufSize(). This should ensure that the buffer never has to be
re-allocated (setting #TJFLAG_NOREALLOC guarantees this.)
If you choose option 1, *jpegSize should be set to the size of your
pre-allocated buffer. In any case, unless you have set #TJFLAG_NOREALLOC,
you should always check *jpegBuf upon return from this function, as
it may have changed.
So by using 2-nd option, there is no need to call tjBufSizeYUV2 and tjAlloc, simply have jpegBuf=NULL before calling tjCompressFromYUV and do tjFree after compressing.

Ok it turned out the problem was in the program containing the code I posted, in a smaller test program the tjBufSizeYUV2 call performs as expected.
On a side note it seems that if jpegBuf is pre-allocated before calling tjBufSizeYUV2, the flag argument passed to tjBufSizeYUV2 must contain TJFLAG_NOREALLOC, or else jpegBuf won't be freed even if tjFree(jpegBuf); is later called.

#define PADDING 4
jpegBuf = tjAlloc(width*height*3/2);

Related

How to read imaginary data from text file, with C

I am unable to read imaginary data from text file.
Here is my .txt file
abc.txt
0.2e-3+0.3*I 0.1+0.1*I
0.3+0.1*I 0.1+0.4*I
I want to read this data into a matrix and print it.
I found the solutions using C++ here and here. I don't know how to do the same in C.
I am able to read decimal and integer data in .txt and print them.
I am also able to print imaginary data initialized at the declaration, using complex.h header. This is the program I have writtern
#include<stdio.h>
#include<stdlib.h>
#include<complex.h>
#include<math.h>
int M,N,i,j,k,l,p,q;
int b[2];
int main(void)
{
FILE* ptr = fopen("abc.txt", "r");
if (ptr == NULL) {
printf("no such file.");
return 0;
}
long double d=0.2e-3+0.3*I;
long double c=0.0000000600415046630252;
double matrixA[2][2];
for(i=0;i<2; i++)
for(j=0;j<2; j++)
fscanf(ptr,"%lf+i%lf\n", creal(&matrixA[i][j]), cimag(&matrixA[i][j]));
//fscanf(ptr, "%lf", &matrixA[i][j]) for reading non-imainary data, It worked.
for(i=0;i<2; i++)
for(j=0;j<2; j++)
printf("%f+i%f\n", creal(matrixA[i][j]), cimag(matrixA[i][j]));
//printf("%lf\n", matrixA[i][j]); for printing non-imainary data, It worked.
printf("%f+i%f\n", creal(d), cimag(d));
printf("%Lg\n",c);
fclose(ptr);
return 0;
}
But I want to read it from the text, because I have an array of larger size, which I can't initialize at declaration, because of it's size.
There are two main issues with your code:
You need to add complex to the variables that hold complex values.
scanf() needs pointers to objects to store scanned values in them. But creal() returns a value, copied from its argument's contents. It is neither a pointer, nor could you get the address of the corresponding part of the complex argument.
Therefore, you need to provide temporary objects to scanf() which receive the scanned values. After successfully scanning, these values are combined to a complex value and assigned to the indexed matrix cell.
Minor issues not contributing to the core problem are:
The given source is "augmented" with unneeded #includes, unused variables, global variables, and experiments with constants. I removed them all to see the real thing.
The specifier "%f" (as many others) lets scanf() skip whitespace like blanks, tabs, newlines, and so on. Providing a "\n" mostly does more harm than one would expect.
I kept the "*I" to check the correct format. However, an error will only be found on the next call of scanf(), when it cannot scan the next number.
You need to check the return value of scanf(), always! It returns the number of conversions that were successful.
It is a common and good habit to let the compiler calculate the number of elements in an array. Divide the total size by an element's size.
Oh, and sizeof is an operator, not a function.
It is also best to return symbolic values to the caller, instead of magic numbers. Fortunately, the standard library defines these EXIT_... macros.
The signs are correctly handled by scanf() already. There is no need to tell it more. But for a nice output with printf(), you use the "+" as a flag to always output a sign.
Since the sign is now placed directly before the number, I moved the multiplication by I (you can change it to lower case, if you want) to the back of the imaginary part. This also matches the input format.
Error output is done via stderr instead of stdout. For example, this enables you to redirect the standard output to a pipe or file, without missing potential errors. You can also redirect errors somewhere else. And it is a well-known and appreciated standard.
This is a possible solution:
#include <stdio.h>
#include <stdlib.h>
#include <complex.h>
int main(void)
{
FILE* ptr = fopen("abc.txt", "r");
if (ptr == NULL) {
perror("\"abc.txt\"");
return EXIT_FAILURE;
}
double complex matrixA[2][2];
for (size_t i = 0; i < sizeof matrixA / sizeof matrixA[0]; i++)
for (size_t j = 0; j < sizeof matrixA[0] / sizeof matrixA[0][0]; j++) {
double real;
double imag;
if (fscanf(ptr, "%lf%lf*I", &real, &imag) != 2) {
fclose(ptr);
fprintf(stderr, "Wrong input format\n");
return EXIT_FAILURE;
}
matrixA[i][j] = real + imag * I;
}
fclose(ptr);
for (size_t i = 0; i < sizeof matrixA / sizeof matrixA[0]; i++)
for (size_t j = 0; j < sizeof matrixA[0] / sizeof matrixA[0][0]; j++)
printf("%+f%+f*I\n", creal(matrixA[i][j]), cimag(matrixA[i][j]));
return EXIT_SUCCESS;
}
Here's a simple solution using scanf() and the format shown in the examples.
It writes the values in the same format that it reads them — the output can be scanned by the program as input.
/* SO 7438-4793 */
#include <stdio.h>
static int read_complex(FILE *fp, double *r, double *i)
{
int offset = 0;
char sign[2];
if (fscanf(fp, "%lg%[-+]%lg*%*[iI]%n", r, sign, i, &offset) != 3 || offset == 0)
return EOF;
if (sign[0] == '-')
*i = -*i;
return 0;
}
int main(void)
{
double r;
double i;
while (read_complex(stdin, &r, &i) == 0)
printf("%g%+g*I\n", r, i);
return 0;
}
Sample input:
0.2e-3+0.3*I 0.1+0.1*I
0.3+0.1*I 0.1+0.4*I
-1.2-3.6*I -6.02214076e23-6.62607015E-34*I
Output from sample input:
0.0002+0.3*I
0.1+0.1*I
0.3+0.1*I
0.1+0.4*I
-1.2-3.6*I
-6.02214e+23-6.62607e-34*I
The numbers at the end with large exponents are Avogadro's Number and the Planck Constant.
The format is about as stringent are you can make it with scanf(), but, although it requires a sign (+ or -) between the real and imaginary parts and requires the * and I to be immediately after the imaginary part (and the conversion will fail if the *I is missing), and accepts either i or I to indicate the imaginary value:
It doesn't stop the imaginary number having a second sign (so it will read a value such as "-6+-4*I").
It doesn't stop there being white space after the mandatory sign (so it will read a value such as "-6+ 24*I".
It doesn't stop the real part being on one line and the imaginary part on the next line.
It won't handle either a pure-real number or a pure-imaginary number properly.
The scanf() functions are very flexible about white space, and it is very hard to prevent them from accepting white space. It would require a custom parser to prevent unwanted spaces. You could do that by reading the numbers and the markers separately, as strings, and then verifying that there's no space and so on. That might be the best way to handle it. You'd use sscanf() to convert the string read after ensuring there's no embedded white space yet the format is correct.
I do not know which IDE you are using for C, so I do not understand this ./testprog <test.data.
I have yet to find an IDE that does not drive me bonkers. I use a Unix shell running in a terminal window. Assuming that your program name is testprog and the data file is test.data, typing ./testprog < test.data runs the program and feeds the contents of test.data as its standard input. On Windows, this would be a command window (and I think PowerShell would work much the same way).
I used fgets to read each line of the text file. Though I know the functionality of sscanf, I do not know how to parse an entire line, which has about 23 elements per line. If the number of elements in a line are few, I know how to parse it. Could you help me about it?
As I noted in a comment, the SO Q&A How to use sscanf() in loops? explains how to use sscanf() to read multiple entries from a line. In this case, you will need to read multiple complex numbers from a single line. Here is some code that shows it at work. It uses the POSIX getline() function to read arbitrarily long lines. If it isn't available to you, you can use fgets() instead, but you'll need to preallocate a big enough line buffer.
#include <stdio.h>
#include <stdlib.h>
#include <complex.h>
#ifndef CMPLX
#define CMPLX(r, i) ((double complex)((double)(r) + I * (double)(i)))
#endif
static size_t scan_multi_complex(const char *string, size_t nvalues,
complex double *v, const char **eoc)
{
size_t nread = 0;
const char *buffer = string;
while (nread < nvalues)
{
int offset = 0;
char sign[2];
double r, i;
if (sscanf(buffer, "%lg%[-+]%lg*%*[iI]%n", &r, sign, &i, &offset) != 3 || offset == 0)
break;
if (sign[0] == '-')
i = -i;
v[nread++] = CMPLX(r, i);
buffer += offset;
}
*eoc = buffer;
return nread;
}
static void dump_complex(size_t nvalues, complex double values[nvalues])
{
for (size_t i = 0; i < nvalues; i++)
printf("%g%+g*I\n", creal(values[i]), cimag(values[i]));
}
enum { NUM_VALUES = 128 };
int main(void)
{
double complex values[NUM_VALUES];
size_t nvalues = 0;
char *buffer = 0;
size_t buflen = 0;
int length;
size_t lineno = 0;
while ((length = getline(&buffer, &buflen, stdin)) > 0 && nvalues < NUM_VALUES)
{
const char *eoc;
printf("Line: %zu [[%.*s]]\n", ++lineno, length - 1, buffer);
size_t nread = scan_multi_complex(buffer, NUM_VALUES - nvalues, &values[nvalues], &eoc);
if (*eoc != '\0' && *eoc != '\n')
printf("EOC: [[%s]]\n", eoc);
if (nread == 0)
break;
dump_complex(nread, &values[nvalues]);
nvalues += nread;
}
free(buffer);
printf("All done:\n");
dump_complex(nvalues, values);
return 0;
}
Here is a data file with 8 lines with 10 complex numbers per line):
-1.95+11.00*I +21.72+64.12*I -95.16-1.81*I +64.23+64.55*I +28.42-29.29*I -49.25+7.87*I +44.98+79.62*I +69.80-1.24*I +61.99+37.01*I +72.43+56.88*I
-9.15+31.41*I +63.84-15.82*I -0.77-76.80*I -85.59+74.86*I +93.00-35.10*I -93.82+52.80*I +85.45+82.42*I +0.67-55.77*I -58.32+72.63*I -27.66-81.15*I
+87.97+9.03*I +7.05-74.91*I +27.60+65.89*I +49.81+25.08*I +44.33+77.00*I +93.27-7.74*I +61.62-5.01*I +99.33-82.80*I +8.83+62.96*I +7.45+73.70*I
+40.99-12.44*I +53.34+21.74*I +75.77-62.56*I +54.16-26.97*I -37.02-31.93*I +78.20-20.91*I +79.64+74.71*I +67.95-40.73*I +58.19+61.25*I +62.29-22.43*I
+47.36-16.19*I +68.48-15.00*I +6.85+61.50*I -6.62+55.18*I +34.95-69.81*I -88.62-81.15*I +75.92-74.65*I +85.17-3.84*I -37.20-96.98*I +74.97+78.88*I
+56.80+63.63*I +92.83-16.18*I -11.47+8.81*I +90.74+42.86*I +19.11-56.70*I -77.93-70.47*I +6.73+86.12*I +2.70-57.93*I +57.87+29.44*I +6.65-63.09*I
-35.35-70.67*I +8.08-21.82*I +86.72-93.82*I -28.96-24.69*I +68.73-15.36*I +52.85+94.65*I +85.07-84.04*I +9.98+29.56*I -78.01-81.23*I -10.67+13.68*I
+83.10-33.86*I +56.87+30.23*I -78.56+3.73*I +31.41+10.30*I +91.98+29.04*I -9.20+24.59*I +70.82-19.41*I +29.21+84.74*I +56.62+92.29*I +70.66-48.35*I
The output of the program is:
Line: 1 [[-1.95+11.00*I +21.72+64.12*I -95.16-1.81*I +64.23+64.55*I +28.42-29.29*I -49.25+7.87*I +44.98+79.62*I +69.80-1.24*I +61.99+37.01*I +72.43+56.88*I]]
-1.95+11*I
21.72+64.12*I
-95.16-1.81*I
64.23+64.55*I
28.42-29.29*I
-49.25+7.87*I
44.98+79.62*I
69.8-1.24*I
61.99+37.01*I
72.43+56.88*I
Line: 2 [[-9.15+31.41*I +63.84-15.82*I -0.77-76.80*I -85.59+74.86*I +93.00-35.10*I -93.82+52.80*I +85.45+82.42*I +0.67-55.77*I -58.32+72.63*I -27.66-81.15*I]]
-9.15+31.41*I
63.84-15.82*I
-0.77-76.8*I
-85.59+74.86*I
93-35.1*I
-93.82+52.8*I
85.45+82.42*I
0.67-55.77*I
-58.32+72.63*I
-27.66-81.15*I
Line: 3 [[+87.97+9.03*I +7.05-74.91*I +27.60+65.89*I +49.81+25.08*I +44.33+77.00*I +93.27-7.74*I +61.62-5.01*I +99.33-82.80*I +8.83+62.96*I +7.45+73.70*I]]
87.97+9.03*I
7.05-74.91*I
27.6+65.89*I
49.81+25.08*I
44.33+77*I
93.27-7.74*I
61.62-5.01*I
99.33-82.8*I
8.83+62.96*I
7.45+73.7*I
Line: 4 [[+40.99-12.44*I +53.34+21.74*I +75.77-62.56*I +54.16-26.97*I -37.02-31.93*I +78.20-20.91*I +79.64+74.71*I +67.95-40.73*I +58.19+61.25*I +62.29-22.43*I]]
40.99-12.44*I
53.34+21.74*I
75.77-62.56*I
54.16-26.97*I
-37.02-31.93*I
78.2-20.91*I
79.64+74.71*I
67.95-40.73*I
58.19+61.25*I
62.29-22.43*I
Line: 5 [[+47.36-16.19*I +68.48-15.00*I +6.85+61.50*I -6.62+55.18*I +34.95-69.81*I -88.62-81.15*I +75.92-74.65*I +85.17-3.84*I -37.20-96.98*I +74.97+78.88*I]]
47.36-16.19*I
68.48-15*I
6.85+61.5*I
-6.62+55.18*I
34.95-69.81*I
-88.62-81.15*I
75.92-74.65*I
85.17-3.84*I
-37.2-96.98*I
74.97+78.88*I
Line: 6 [[+56.80+63.63*I +92.83-16.18*I -11.47+8.81*I +90.74+42.86*I +19.11-56.70*I -77.93-70.47*I +6.73+86.12*I +2.70-57.93*I +57.87+29.44*I +6.65-63.09*I]]
56.8+63.63*I
92.83-16.18*I
-11.47+8.81*I
90.74+42.86*I
19.11-56.7*I
-77.93-70.47*I
6.73+86.12*I
2.7-57.93*I
57.87+29.44*I
6.65-63.09*I
Line: 7 [[-35.35-70.67*I +8.08-21.82*I +86.72-93.82*I -28.96-24.69*I +68.73-15.36*I +52.85+94.65*I +85.07-84.04*I +9.98+29.56*I -78.01-81.23*I -10.67+13.68*I]]
-35.35-70.67*I
8.08-21.82*I
86.72-93.82*I
-28.96-24.69*I
68.73-15.36*I
52.85+94.65*I
85.07-84.04*I
9.98+29.56*I
-78.01-81.23*I
-10.67+13.68*I
Line: 8 [[+83.10-33.86*I +56.87+30.23*I -78.56+3.73*I +31.41+10.30*I +91.98+29.04*I -9.20+24.59*I +70.82-19.41*I +29.21+84.74*I +56.62+92.29*I +70.66-48.35*I]]
83.1-33.86*I
56.87+30.23*I
-78.56+3.73*I
31.41+10.3*I
91.98+29.04*I
-9.2+24.59*I
70.82-19.41*I
29.21+84.74*I
56.62+92.29*I
70.66-48.35*I
All done:
-1.95+11*I
21.72+64.12*I
-95.16-1.81*I
64.23+64.55*I
28.42-29.29*I
-49.25+7.87*I
44.98+79.62*I
69.8-1.24*I
61.99+37.01*I
72.43+56.88*I
-9.15+31.41*I
63.84-15.82*I
-0.77-76.8*I
-85.59+74.86*I
93-35.1*I
-93.82+52.8*I
85.45+82.42*I
0.67-55.77*I
-58.32+72.63*I
-27.66-81.15*I
87.97+9.03*I
7.05-74.91*I
27.6+65.89*I
49.81+25.08*I
44.33+77*I
93.27-7.74*I
61.62-5.01*I
99.33-82.8*I
8.83+62.96*I
7.45+73.7*I
40.99-12.44*I
53.34+21.74*I
75.77-62.56*I
54.16-26.97*I
-37.02-31.93*I
78.2-20.91*I
79.64+74.71*I
67.95-40.73*I
58.19+61.25*I
62.29-22.43*I
47.36-16.19*I
68.48-15*I
6.85+61.5*I
-6.62+55.18*I
34.95-69.81*I
-88.62-81.15*I
75.92-74.65*I
85.17-3.84*I
-37.2-96.98*I
74.97+78.88*I
56.8+63.63*I
92.83-16.18*I
-11.47+8.81*I
90.74+42.86*I
19.11-56.7*I
-77.93-70.47*I
6.73+86.12*I
2.7-57.93*I
57.87+29.44*I
6.65-63.09*I
-35.35-70.67*I
8.08-21.82*I
86.72-93.82*I
-28.96-24.69*I
68.73-15.36*I
52.85+94.65*I
85.07-84.04*I
9.98+29.56*I
-78.01-81.23*I
-10.67+13.68*I
83.1-33.86*I
56.87+30.23*I
-78.56+3.73*I
31.41+10.3*I
91.98+29.04*I
-9.2+24.59*I
70.82-19.41*I
29.21+84.74*I
56.62+92.29*I
70.66-48.35*I
The code would handle lines with any number of entries on a line (up to 128 in total because of the limit on the size of the array of complex numbers — but that can be fixed too.

Two-dimensional char array too large exit code 139

Hey guys I'm attempting to read in workersinfo.txt and store it into a two-dimensional char array. The file is around 4,000,000 lines with around 100 characters per line. I want to store each file line on the array. Unfortunately, I get exit code 139(Not enough memory). I'm aware I have to use malloc() and free() but I've tried a couple of things and I haven't been able to make them work.Eventually I have to sort the array by ID number but I'm stuck on declaring the array.
The file looks something like this:
First Name, Last Name,Age, ID
Carlos,Lopez,,10568
Brad, Patterson,,20586
Zack, Morris,42,05689
This is my code so far:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
FILE *ptr_file;
char workers[4000000][1000];
ptr_file =fopen("workersinfo.txt","r");
if (!ptr_file)
perror("Error");
int i = 0;
while (fgets(workers[i],1000, ptr_file)!=NULL){
i++;
}
int n;
for(n = 0; n < 4000000; n++)
{
printf("%s", workers[n]);
}
fclose(ptr_file);
return 0;
}
The Stack memory is limited. As you pointed out in your question, you MUST use malloc to allocate such a big (need I say HUGE) chunk of memory, as the stack cannot contain it.
you can use ulimit to review the limits of your system (usually including the stack size limit).
On my Mac, the limit is 8Mb. After running ulimit -a I get:
...
stack size (kbytes, -s) 8192
...
Or, test the limit using:
struct rlimit slim;
getrlimit(RLIMIT_STACK, &rlim);
rlim.rlim_cur // the stack limit
I truly recommend you process each database entry separately.
As mentioned in the comments, assigning the memory as static memory would, in most implementations, circumvent the stack.
Still, IMHO, allocating 400MB of memory (or 4GB, depending which part of your question I look at), is bad form unless totally required - especially for a single function.
Follow-up Q1: How to deal with each DB entry separately
I hope I'm not doing your homework or anything... but I doubt your homework would include an assignment to load 400Mb of data to the computer's memory... so... to answer the question in your comment:
The following sketch of single entry processing isn't perfect - it's limited to 1Kb of data per entry (which I thought to be more then enough for such simple data).
Also, I didn't allow for UTF-8 encoding or anything like that (I followed the assumption that English would be used).
As you can see from the code, we read each line separately and perform error checks to check that the data is valid.
To sort the file by ID, you might consider either running two lines at a time (this would be a slow sort) and sorting them, or creating a sorted node tree with the ID data and the position of the line in the file (get the position before reading the line). Once you sorted the binary tree, you can sort the data...
... The binary tree might get a bit big. did you look up sorting algorithms?
#include <stdio.h>
// assuming this is the file structure:
//
// First Name, Last Name,Age, ID
// Carlos,Lopez,,10568
// Brad, Patterson,,20586
// Zack, Morris,42,05689
//
// Then this might be your data structure per line:
struct DBEntry {
char* last_name; // a pointer to the last name
char* age; // a pointer to the name - could probably be an int
char* id; // a pointer to the ID
char first_name[1024]; // the actual buffer...
// I unified the first name and the buffer since the first name is first.
};
// each time you read only a single line, perform an error check for overflow
// and return the parsed data.
//
// return 1 on sucesss or 0 on failure.
int read_db_line(FILE* fp, struct DBEntry* line) {
if (!fgets(line->first_name, 1024, fp))
return 0;
// parse data and review for possible overflow.
// first, zero out data
int pos = 0;
line->age = NULL;
line->id = NULL;
line->last_name = NULL;
// read each byte, looking for the EOL marker and the ',' seperators
while (pos < 1024) {
if (line->first_name[pos] == ',') {
// we encountered a devider. we should handle it.
// if the ID feild's location is already known, we have an excess comma.
if (line->id) {
fprintf(stderr, "Parsing error, invalid data - too many fields.\n");
return 0;
}
// replace the comma with 0 (seperate the strings)
line->first_name[pos] = 0;
if (line->age)
line->id = line->first_name + pos + 1;
else if (line->last_name)
line->age = line->first_name + pos + 1;
else
line->last_name = line->first_name + pos + 1;
} else if (line->first_name[pos] == '\n') {
// we encountered a terminator. we should handle it.
if (line->id) {
// if we have the id string's possition (the start marker), this is a
// valid entry and we should process the data.
line->first_name[pos] = 0;
return 1;
} else {
// we reached an EOL without enough ',' seperators, this is an invalid
// line.
fprintf(stderr, "Parsing error, invalid data - not enough fields.\n");
return 0;
}
}
pos++;
}
// we ran through all the data but there was no EOL marker...
fprintf(stderr,
"Parsing error, invalid data (data overflow or data too large).\n");
return 0;
}
// the main program
int main(int argc, char const* argv[]) {
// open file
FILE* ptr_file;
ptr_file = fopen("workersinfo.txt", "r");
if (!ptr_file)
perror("File Error");
struct DBEntry line;
while (read_db_line(ptr_file, &line)) {
// do what you want with the data... print it?
printf(
"First name:\t%s\n"
"Last name:\t%s\n"
"Age:\t\t%s\n"
"ID:\t\t%s\n"
"--------\n",
line.first_name, line.last_name, line.age, line.id);
}
// close file
fclose(ptr_file);
return 0;
}
Followup Q2: Sorting array for 400MB-4GB of data
IMHO, 400MB is already touching on the issues related to big data. For example, implementing a bubble sort on your database should be agonizing as far as performance goes (unless it's a single time task, where performance might not matter).
Creating an Array of DBEntry objects will eventually get you a larger memory foot-print then the actual data..
This will not be the optimal way to sort large data.
The correct approach will depend on your sorting algorithm. Wikipedia has a decent primer on sorting algorythms.
Since we are handling a large amount of data, there are a few things to consider:
It would make sense to partition the work, so different threads/processes sort a different section of the data.
We will need to minimize IO to the hard drive (as it will slow the sorting significantly and prevent parallel processing on the same machine/disk).
One possible approach is to create a heap for a heap sort, but only storing a priority value and storing the original position in the file.
Another option would probably be to employ a divide and conquer algorithm, such as quicksort, again, only sorting a computed sort value and the entry's position in the original file.
Either way, writing a decent sorting method will be a complicated task, probably involving threading, forking, tempfiles or other techniques.
Here's a simplified demo code... it is far from optimized, but it demonstrates the idea of the binary sort-tree that holds the sorting value and the position of the data in the file.
Be aware that using this code will be both relatively slow (although not that slow) and memory intensive...
On the other hand, it will require about 24 bytes per entry. For 4 million entries, it's 96MB, somewhat better then 400Mb and definitely better then the 4GB.
#include <stdlib.h>
#include <stdio.h>
// assuming this is the file structure:
//
// First Name, Last Name,Age, ID
// Carlos,Lopez,,10568
// Brad, Patterson,,20586
// Zack, Morris,42,05689
//
// Then this might be your data structure per line:
struct DBEntry {
char* last_name; // a pointer to the last name
char* age; // a pointer to the name - could probably be an int
char* id; // a pointer to the ID
char first_name[1024]; // the actual buffer...
// I unified the first name and the buffer since the first name is first.
};
// this might be a sorting node for a sorted bin-tree:
struct SortNode {
struct SortNode* next; // a pointer to the next node
fpos_t position; // the DB entry's position in the file
long value; // The computed sorting value
}* top_sorting_node = NULL;
// this function will free all the memory used by the global Sorting tree
void clear_sort_heap(void) {
struct SortNode* node;
// as long as there is a first node...
while ((node = top_sorting_node)) {
// step forward.
top_sorting_node = top_sorting_node->next;
// free the original first node's memory
free(node);
}
}
// each time you read only a single line, perform an error check for overflow
// and return the parsed data.
//
// return 0 on sucesss or 1 on failure.
int read_db_line(FILE* fp, struct DBEntry* line) {
if (!fgets(line->first_name, 1024, fp))
return -1;
// parse data and review for possible overflow.
// first, zero out data
int pos = 0;
line->age = NULL;
line->id = NULL;
line->last_name = NULL;
// read each byte, looking for the EOL marker and the ',' seperators
while (pos < 1024) {
if (line->first_name[pos] == ',') {
// we encountered a devider. we should handle it.
// if the ID feild's location is already known, we have an excess comma.
if (line->id) {
fprintf(stderr, "Parsing error, invalid data - too many fields.\n");
clear_sort_heap();
exit(2);
}
// replace the comma with 0 (seperate the strings)
line->first_name[pos] = 0;
if (line->age)
line->id = line->first_name + pos + 1;
else if (line->last_name)
line->age = line->first_name + pos + 1;
else
line->last_name = line->first_name + pos + 1;
} else if (line->first_name[pos] == '\n') {
// we encountered a terminator. we should handle it.
if (line->id) {
// if we have the id string's possition (the start marker), this is a
// valid entry and we should process the data.
line->first_name[pos] = 0;
return 0;
} else {
// we reached an EOL without enough ',' seperators, this is an invalid
// line.
fprintf(stderr, "Parsing error, invalid data - not enough fields.\n");
clear_sort_heap();
exit(1);
}
}
pos++;
}
// we ran through all the data but there was no EOL marker...
fprintf(stderr,
"Parsing error, invalid data (data overflow or data too large).\n");
return 0;
}
// read and sort a single line from the database.
// return 0 if there was no data to sort. return 1 if data was read and sorted.
int sort_line(FILE* fp) {
// allocate the memory for the node - use calloc for zero-out data
struct SortNode* node = calloc(sizeof(*node), 1);
// store the position on file
fgetpos(fp, &node->position);
// use a stack allocated DBEntry for processing
struct DBEntry line;
// check that the read succeeded (read_db_line will return -1 on error)
if (read_db_line(fp, &line)) {
// free the node's memory
free(node);
// return no data (0)
return 0;
}
// compute sorting value - I'll assume all IDs are numbers up to long size.
sscanf(line.id, "%ld", &node->value);
// heap sort?
// This is a questionable sort algorythm... or a questionable implementation.
// Also, I'll be using pointers to pointers, so it might be a headache to read
// (it's a headache to write, too...) ;-)
struct SortNode** tmp = &top_sorting_node;
// move up the list until we encounter something we're smaller then us,
// OR untill the list is finished.
while (*tmp && (*tmp)->value <= node->value)
tmp = &((*tmp)->next);
// update the node's `next` value.
node->next = *tmp;
// inject the new node into the tree at the position we found
*tmp = node;
// return 1 (data was read and sorted)
return 1;
}
// writes the next line in the sorting
int write_line(FILE* to, FILE* from) {
struct SortNode* node = top_sorting_node;
if (!node) // are we done? top_sorting_node == NULL ?
return 0; // return 0 - no data to write
// step top_sorting_node forward
top_sorting_node = top_sorting_node->next;
// read data from one file to the other
fsetpos(from, &node->position);
char* buffer = NULL;
ssize_t length;
size_t buff_size = 0;
length = getline(&buffer, &buff_size, from);
if (length <= 0) {
perror("Line Copy Error - Couldn't read data");
return 0;
}
fwrite(buffer, 1, length, to);
free(buffer); // getline allocates memory that we're incharge of freeing.
return 1;
}
// the main program
int main(int argc, char const* argv[]) {
// open file
FILE *fp_read, *fp_write;
fp_read = fopen("workersinfo.txt", "r");
fp_write = fopen("sorted_workersinfo.txt", "w+");
if (!fp_read) {
perror("File Error");
goto cleanup;
}
if (!fp_write) {
perror("File Error");
goto cleanup;
}
printf("\nSorting");
while (sort_line(fp_read))
printf(".");
// write all sorted data to a new file
printf("\n\nWriting sorted data");
while (write_line(fp_write, fp_read))
printf(".");
// clean up - close files and make sure the sorting tree is cleared
cleanup:
printf("\n");
fclose(fp_read);
fclose(fp_write);
clear_sort_heap();
return 0;
}

Display RGBA32-BMP Images on Linux

today I got some code to review.
Since the code is going to work on an headless pc the code saves every frame as a seperate RGBa image.
On my Ubuntu install I cannot view theses images, GIMP complains about a broken header. Imagemagick options convert or display also did not show any images.
Here's the code fragment that generates the image:
if (act.doScreenshot || (act.doVideo && buddhabrot_animate.animating))
{
uchar4* tmpBuffer = new uchar4[env.static_env.save.imageW
* env.static_env.save.imageH];
for (int i = 0; i < env.static_env.save.imageW * env.static_env.save.imageH; i++)
{
const unsigned char tmp = tmpBuffer[i].x;
tmpBuffer[i].x = tmpBuffer[i].z;
tmpBuffer[i].z = tmp;
}
char filename[128];
FILE* fp = fopen(filename, "w+b");
BITMAPFILEHEADER bmpFH;
BITMAPINFOHEADER bmpIH;
memset(&bmpFH, 0, sizeof(bmpFH));
memset(&bmpIH, 0, sizeof(bmpIH));
bmpFH.bfType = 19778; //"BM"
bmpFH.bfSize = sizeof(bmpFH) + sizeof(bmpIH) + env.static_env.save.imageW * env.static_env.save.imageH;
bmpFH.bfOffBits = sizeof(bmpFH) + sizeof(bmpIH);
bmpIH.biSize = sizeof(bmpIH);
bmpIH.biWidth = env.static_env.save.imageW;
bmpIH.biHeight = env.static_env.save.imageH;
bmpIH.biPlanes = 1;
bmpIH.biBitCount = 32;
fwrite(&bmpFH, 1, sizeof(bmpFH), fp);
fwrite(&bmpIH, 1, sizeof(bmpIH), fp);
fwrite
(tmpBuffer,
env.static_env.save.imageW * env.static_env.save.imageH,
sizeof(uchar4),
fp);
fclose(fp);
delete[] tmpBuffer;
Is there any way to look at the image?
Or maybe another way to save the images as JPGs?
You don't calculate bmpFH.bfSize correctly, you need to multiply the number of pixels in the image by the size of the pixels (4). For example:
bmpFH.bfSize = sizeof(bmpFH) + sizeof(bmpIH) + env.static_env.save.imageW * env.static_env.save.imageH * sizeof(uchar4);
You should also initialize bmpIH.biCompression to BI_RGB. It'll work anyway because its value happens to be zero, but it's good to be explicit. You also might want to negate the value you're assigning to bmpIH.biHeight as positive height values indicate a bottom up image.

fread returns no data in my buffer in spite of saying it read 4096 bytes

I'm porting some C code that loads sprites from files containing multiple bitmaps. Basically the code fopens the file, fgetcs some header info, then freads the bitmap data. I can see that the fgetcs are returning proper data, but the outcome of the fread is null. Here's the code - fname does exist, the path is correct, fil is non-zero, num is the number of sprites in the file (encoded into the header, little-endian), pak is an array of sprites, sprite is a typedef of width, height and bits, and new_sprite inits one for you.
FILE *fil;
uint8 *buffu;
uint8 read;
int32 x,num;
int32 w,h,c;
fil = fopen(fname, "rb");
if (!fil) return NULL;
num = fgetc(fil);
num += fgetc(fil)*256;
if (num > max) max = num;
for (x=0;x<max;x++) {
// header
w=fgetc(fil);
w+=fgetc(fil)*256;
h=fgetc(fil);
h+=fgetc(fil)*256;
fgetc(fil); // stuff we don't use
fgetc(fil);
fgetc(fil);
fgetc(fil);
// body
buffu = (uint8*)malloc(w * h);
read=fread(buffu,1,w*h,fil);
pak->spr[x]=new_sprite(w,h);
memcpy(pak->spr[x]->data, buffu, w*h);
// done
free(buffu);
}
I've stepped through this code line by line, and I can see that w and h are getting set up properly, and read=4096, which is the right number of bits. However, buffer is "" after the fread, so of course memcpy does nothing useful and my pak is filled with empty sprites.
My apologies for what is surely a totally noob question, but I normally use Cocoa so this pure-C file handling is new to me. I looked all over for examples of fread, and they all look like the one here - which apparently works fine on Win32.
Since fgetc seems to work, you could try this as a test
int each;
int byte;
//body
buffu = malloc(w * h);
for (each = 0; each < w*h; each++) {
byte = fgetc(fil);
if ( byte == EOF) {
printf("End of file\n");
break;
}
buffu[each] = (uint8)byte;
printf ("byte: %d each: %d\n", byte, each);
}
pak->spr[x]=new_sprite(w,h);
memcpy(pak->spr[x]->data, buffu, w*h);
// done
You say:
However, buffer is "" after the fread, so of course memcpy does nothing useful
But that is not true at all. memcpy() is not a string function, it will copy the requested number of bytes. Every time. If that isn't "useful", then something else is wrong.
Your buffer, when treated as a string (which it is not, it's a bunch of binary data) will look like an empty string if the first byte happens to be 0. The remaining 4095 bytes can be whatever, to C's string printing functions it will look "empty".

fwrite() and file corruption

I'm trying to write a wchar array to a file in C, however there is some sort of corruption and unrelevant data like variables and paths like this
c.:.\.p.r.o.g.r.a.m. .f.i.l.e.s.\.m.i.c.r.o.s.o.f.t. .v.i.s.u.a.l. .s.t.u.d.i.o. 1.0...0.\.v.c.\.i.n.c.l.u.d.e.\.x.s.t.r.i.n.g..l.i.s.t...i.n.s.e.r.t
are written on to the file along with the correct data (example) I have confirmed that the buffer is null-terminated and contains proper data.
Heres my code:
myfile = fopen("logs.txt","ab+");
fseek(myfile,0,SEEK_END);
long int size = ftell(myfile);
fseek(myfile,0,SEEK_SET);
if (size == 0)
{
wchar_t bom_mark = 0xFFFE;
size_t written = fwrite(&bom_mark,sizeof(wchar_t),1,myfile);
}
// in another func
while (true)
{
[..]
unsigned char Temp[512];
iBytesRcvd = recv(sclient_socket,(char*)&Temp,iSize,NULL);
if(iBytesRcvd > 0 )
{
WCHAR* unicode_recv = (WCHAR*)&Temp;
fwrite(unicode_recv,sizeof(WCHAR),wcslen(unicode_recv),myfile);
fflush(myfile);
}
[..]
}
What could be causing this?
recv() will not null-terminate &Temp, so wcslen() runs over the bytes actually written by recv(). You will get correct results if you just use iBytesReceived as byte count for fwrite() instead of using wcslen() and hoping the data received is correctly null-terminated (wide-NULL-terminated, that is):
fwrite(unicode_recv, 1, iBytesReceived, myfile);

Resources