I am having a hard time understanding and parsing the header information in a bitmap image. To understand it better I read the following tutorial, Raster Data.
The code given there is as follows (greyscale, 8-bit color values):
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
/*-------STRUCTURES---------*/
typedef struct {int rows; int cols; unsigned char* data;} sImage;
/*-------PROTOTYPES---------*/
long getImageInfo(FILE*, long, int);
int main(int argc, char* argv[])
{
FILE *bmpInput, *rasterOutput;
sImage originalImage;
unsigned char someChar;
unsigned char* pChar;
int nColors; /* BMP number of colors */
long fileSize; /* BMP file size */
int vectorSize; /* BMP vector size */
int r, c; /* r = rows, c = cols */
/* initialize pointer */
someChar = '0';
pChar = &someChar;
if(argc < 2)
{
printf("Usage: %s bmpInput.bmp\n", argv[0]);
//end the execution
exit(0);
}
printf("Reading filename %s\n", argv[1]);
/*--------READ INPUT FILE------------*/
bmpInput = fopen(argv[1], "rb");
//fseek(bmpInput, 0L, SEEK_END);
/*--------DECLARE OUTPUT TEXT FILE--------*/
rasterOutput = fopen("data.txt", "w");
/*--------GET BMP DATA---------------*/
originalImage.cols = (int)getImageInfo(bmpInput, 18, 4);
originalImage.rows = (int)getImageInfo(bmpInput, 22, 4);
fileSize = getImageInfo(bmpInput, 2, 4);
nColors = getImageInfo(bmpInput, 46, 4);
vectorSize = fileSize - (14 + 40 + 4*nColors);
/*-------PRINT DATA TO SCREEN-------------*/
printf("Width: %d\n", originalImage.cols);
printf("Height: %d\n", originalImage.rows);
printf("File size: %ld\n", fileSize);
printf("# Colors: %d\n", nColors);
printf("Vector size: %d\n", vectorSize);
/*----START AT BEGINNING OF RASTER DATA-----*/
fseek(bmpInput, (54 + 4*nColors), SEEK_SET);
/*----------READ RASTER DATA----------*/
for(r=0; r<=originalImage.rows - 1; r++)
{
for(c=0; c<=originalImage.cols - 1; c++)
{
/*-----read data and print in (row,column) form----*/
fread(pChar, sizeof(char), 1, bmpInput);
fprintf(rasterOutput, "(%d, %d) = %d\n", r, c, *pChar);
}
}
fclose(bmpInput);
fclose(rasterOutput);
}
/*----------GET IMAGE INFO SUBPROGRAM--------------*/
long getImageInfo(FILE* inputFile, long offset, int numberOfChars)
{
unsigned char *ptrC;
long value = 0L;
unsigned char dummy;
int i;
dummy = '0';
ptrC = &dummy;
fseek(inputFile, offset, SEEK_SET);
for(i=1; i<=numberOfChars; i++)
{
fread(ptrC, sizeof(char), 1, inputFile);
/* calculate value based on adding bytes */
value = (long)(value + (*ptrC)*(pow(256, (i-1))));
}
return(value);
} /* end of getImageInfo */
What I am not understanding:-
I am unable to understand the 'GET IMAGE INFO SUBPROGRAM' part, where the code reads image information such as the number of rows and columns. Why is this information stored over 4 bytes, and what is the purpose of the value = (long)(value + (*ptrC)*(pow(256, (i-1)))); instruction?
Why is unsigned char dummy = '0' created and then ptrC = &dummy assigned?
Why can't we get the number of rows in an image by reading just 1 byte of data, as when getting the greyscale value at a particular row and column?
Why are we using unsigned char to store the byte? Isn't there some other data type, such as int or long, that we could use effectively here?
Please help me understand these doubts (confusions?) I am having, and forgive me if they sound noobish.
Thank you.
I would say the tutorial is quite bad in some ways and your problems to understand it are not always due to being a beginner.
I am unable to understand the 'GET IMAGE INFO SUBPROGRAM' part, where the code reads image information such as the number of rows and columns. Why is this information stored over 4 bytes, and what is the purpose of the value = (long)(value + (*ptrC)*(pow(256, (i-1)))); instruction?
The reason to store these values over 4 bytes is to allow large images: a 4-byte field can hold values up to 2^32-1 (BMP treats width and height as signed 32-bit integers, so up to 2^31-1). If we used just one byte, we could only have images sized 0..255, and with 2 bytes 0..65535.
The strange value = (long)(value + (*ptrC)*(pow(256, (i-1)))); is something I've never seen before. It converts the bytes into a long in a way that works regardless of the machine's endianness. The idea is to weight each byte by a power of 256: the first byte is multiplied by 1, the next by 256, the next by 65536, and so on.
A much more readable way would be to use shifts, e.g. value = value + ((long)(*ptrC) << 8*(i-1));. If the bytes arrived from the highest one to the lowest, value = (value << 8) + *ptrC; would be even simpler (the parentheses are needed because + binds tighter than <<). BMP stores the lowest byte first, however, so that form does not apply directly here.
A simple rewrite to be much easier to understand would be
long getImageInfo(FILE* inputFile, long offset, int numberOfChars)
{
unsigned char ptrC;
long value = 0L;
int i;
fseek(inputFile, offset, SEEK_SET);
for(i=0; i<numberOfChars; i++) // Start with zero to make the code simpler
{
fread(&ptrC, 1, 1, inputFile); // sizeof(char) is always 1, no need to use it
value = value + ((long)ptrC << 8*i); // Shifts are a lot simpler to look at and understand what's the meaning
}
return value; // Parentheses would make it look like a function
}
Why is unsigned char dummy = '0' created and then ptrC = &dummy assigned?
This is also pointless. They could've just declared unsigned char ptrC and then used &ptrC instead of ptrC and ptrC instead of *ptrC. This would've also shown that it is just a normal local variable.
Why can't we get the number of rows in an image by reading just 1 byte of data, as when getting the greyscale value at a particular row and column?
What if the image is 3475 rows high? One byte can only hold 0..255, so more bytes are needed. The way the tutorial reads them is just a bit complicated.
Why are we using unsigned char to store the byte? Isn't there some other data type, such as int or long, that we could use effectively here?
Unsigned char is exactly one byte long. Why would we use any other type for storing a byte then?
(4) The data of binary files is made up of bytes, which in C are represented by unsigned char. Because that's a long word to type, it is sometimes typedeffed to byte or uchar. A good standard-compliant way to define bytes is to use uint8_t from <stdint.h>.
(3) I'm not quite sure what you're trying to get at, but the first bytes of a BMP file (usually 54, but there are other BMP header formats) make up the header, which contains information on colour depth, width and height of an image. The pixel data follows the header (and the colour table, if there is one). I haven't tested your code, but there might be an issue with padding, because the data for each row must be padded to a size that is divisible by 4.
(2) There isn't really a point in defining an extra pointer here. You could just as well fread(&dummy, ...) directly.
(1) Ugh. This function reads a multi-byte value from the file at position offset in the file. The file is made up of bytes, but several bytes can form other data types. For example, a 4-byte unsigned word is made up of:
uint8_t raw[4];
uint32_t x;
x = raw[0] + raw[1]*256 + raw[2]*256*256 + raw[3]*256*256*256;
in little-endian byte order, which is what BMP files use and also what a PC uses natively.
That example also shows where the pow(256, i) comes in. Using the pow function here is not a good idea, because it is meant to be used with floating-point numbers. Even the multiplication by 256 is not very idiomatic. Usually, we construct values by bit shifting, where a multiplication by 2 is a left-shift by 1 and hence a multiplication by 256 is a left-shift by 8. Similarly, the additions above add non-overlapping ranges and are usually represented as a bitwise OR, |:
x = raw[0] | (raw[1]<<8) | (raw[2]<<16) | ((uint32_t)raw[3]<<24);
(The cast on the last term avoids shifting a set bit into the sign bit of an int.) The function accesses the file by re-positioning the file pointer (and leaving it at the new position). That's not very efficient. It would be better to read the header into a 54-byte array and access the array directly.
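The idea of reading the header in one go and decoding fields from the array could be sketched roughly as follows. This is only an illustration: the names read_le32, BmpInfo and read_bmp_info are made up for this example, and it assumes the common 54-byte header (14-byte file header plus 40-byte info header) with the same field offsets the tutorial code uses.

```c
#include <stdint.h>
#include <stdio.h>

/* Assemble a little-endian 32-bit value from 4 consecutive bytes. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical container for the fields the tutorial code reads. */
typedef struct {
    uint32_t file_size;  /* header offset 2  */
    int32_t  width;      /* header offset 18 */
    int32_t  height;     /* header offset 22 */
    uint32_t n_colors;   /* header offset 46 */
} BmpInfo;

/* Read the 54-byte header with a single fread and decode it.
   Returns 0 on success, -1 on any failure. */
int read_bmp_info(FILE *f, BmpInfo *out)
{
    unsigned char hdr[54];

    if (fseek(f, 0L, SEEK_SET) != 0) return -1;
    if (fread(hdr, 1, sizeof hdr, f) != sizeof hdr) return -1;
    if (hdr[0] != 'B' || hdr[1] != 'M') return -1;  /* not a BMP file */

    out->file_size = read_le32(hdr + 2);
    out->width     = (int32_t)read_le32(hdr + 18);
    out->height    = (int32_t)read_le32(hdr + 22);
    out->n_colors  = read_le32(hdr + 46);
    return 0;
}
```

After a successful call the file position is just past the header, so the colour table or pixel data can be read next without another seek.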
The code is old and clumsy. Seeing something like:
for(r=0; r<=originalImage.rows - 1; r++)
is already enough for me not to trust it. I'm sure you can find a better example of reading greyscale images from BMP. You could even write your own and start with the Wikipedia article on the BMP format.
The for loop should be replaced with a fread I believe, however, I am very unclear on how fread will work.
How does fread know the value of green pixel # given location and where to save the value. My understanding is that I have a chunk of heap memory, a rectangle has a tbd number of pixels. Each pixel has 3 values. How will fread (or any other method I can use)?
If anyone could just explain how the fread line below would work with my code? This is for an assignment, I am just trying to understand what is going on since it is one we will be building on.
fread(pixelD, sizeof(Pixel), width*height, file);
typedef struct
{
unsigned char green;
unsigned char blue;
unsigned char red;
}pixelD;
typedef struct
{
pixelD * pixel;
} Color;
Image * ReadImage(char *filename)
{
int width, height, maxval;
int imgSize = width * height * sizeof(pixel);
//fscanf line was given by prof
fscanf(f_in, "%s\n%d %d\n%d\n", magicNum, &width, &height, &maxval);
pixel = malloc(imgSize);
for(int i = 0; i <imgSize; i++)
{
pixel.green = pixel[i]; ????
pixel.blue = ;
pixel.red = ;
}
}
From the looks of it you are reading a PPM file.
Read the header doing something like this:
int width, height, max;
my_assert(3==fscanf(f_in, "P6%d%d%d ", &width, &height, &max));
/* TODO: error handling */
The format specifiers tell it to read the expected magic number ("P6"), then the second, third and fourth words as integers (implicitly skipping any whitespace between them), and then consume whitespace ("mostly a newline" according to the PPM description) to set the file read position to where the binary data starts. Note that a trailing space in a scanf format consumes all consecutive whitespace, so if the first pixel bytes happen to be whitespace characters they are swallowed too; reading the single separator byte with fgetc() is safer. You should probably make sure width, height and max are within what your application expects and can cope with.
And then read the rest of the data into memory. fread reads size*count bytes from the current read position, with no formatting:
int channel_width = max < 256 ? 1 : 2; /* PPM channel width can be either 1- or 2-byte */
int rgb = 3;
int imgsize = width*height*rgb*channel_width;
void* texture = malloc(imgsize);
my_assert(imgsize==fread(texture, 1, imgsize, f_in));
/* do something with the texture memory */
At that point you can just cast the texture pointer to whatever struct you like to use, e.g. pixelD* pixs = texture (just be careful if channels are 2 bytes wide, since your posted struct is not, and note that PPM stores each pixel in red, green, blue order, while your struct lists green, blue, red). I find a structure carrying the metadata and a typeless memory block more flexible, since I mostly work with OpenGL. Maybe that is what you meant to do with the Image type.
The code is completely untested. Have fun debugging it.
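To make the "how does fread know where to put things" part concrete, here is a rough sketch. fread knows nothing about pixels: it copies raw bytes from the file into memory, and the struct layout decides which byte ends up in which member. The names Pixel and read_ppm are made up for this example, and it assumes a binary (P6) PPM with 1-byte channels and no comment lines in the header.

```c
#include <stdio.h>
#include <stdlib.h>

/* Member order matches the file: PPM stores red, green, blue per pixel. */
typedef struct {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} Pixel;

/* Read a P6 PPM with maxval < 256. Returns a malloc'd array of
   width*height pixels (caller frees it), or NULL on any error. */
Pixel *read_ppm(FILE *f, int *width, int *height)
{
    int maxval;
    size_t count;
    Pixel *pixels;

    /* The single-fread approach is only valid if Pixel has no padding. */
    if (sizeof(Pixel) != 3) return NULL;

    /* Header: magic number, width, height, maxval. */
    if (fscanf(f, "P6 %d %d %d", width, height, &maxval) != 3) return NULL;
    if (*width <= 0 || *height <= 0 || maxval <= 0 || maxval > 255) return NULL;

    /* Exactly one whitespace byte separates the header from the pixels. */
    if (fgetc(f) == EOF) return NULL;

    count = (size_t)*width * (size_t)*height;
    pixels = malloc(count * sizeof *pixels);
    if (pixels == NULL) return NULL;

    /* One call copies count * 3 raw bytes: byte 0 lands in pixels[0].red,
       byte 1 in pixels[0].green, byte 2 in pixels[0].blue, and so on. */
    if (fread(pixels, sizeof *pixels, count, f) != count) {
        free(pixels);
        return NULL;
    }
    return pixels;
}
```

With this layout, the green value of the pixel at column x and row y is pixels[y * width + x].green; no explicit loop over the file is needed.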
Consider the following code that loads a dataset of records into a buffer and creates a Record object for each record. A record constitutes one or more columns and this information is uncovered at run-time. However, in this particular example, I have set the number of columns to 3.
typedef unsigned int uint;
typedef struct
{
uint *data;
} Record;
Record *createNewRecord (short num_cols);
int main(int argc, char *argv[])
{
time_t start_time, end_time;
int num_cols = 3;
char *relation;
FILE *stream;
int offset;
char *filename = "file.txt";
stream = fopen(filename, "r");
fseek(stream, 0, SEEK_END);
long fsize = ftell(stream);
fseek(stream, 0, SEEK_SET);
if(!(relation = (char*) malloc(sizeof(char) * (fsize + 1))))
printf((char*)"Could not allocate buffer");
fread(relation, sizeof(char), fsize, stream);
relation[fsize] = '\0';
fclose(stream);
char *start_ptr = relation;
char *end_ptr = (relation + fsize);
while (start_ptr < end_ptr)
{
Record *new_record = createNewRecord(num_cols);
for(short i = 0; i < num_cols; i++)
{
sscanf(start_ptr, " %u %n",
&(new_record->data[i]), &offset);
start_ptr += offset;
}
}
Record *createNewRecord (short num_cols)
{
Record *r;
if(!(r = (Record *) malloc(sizeof(Record))) ||
!(r->data = (uint *) malloc(sizeof(uint) * num_cols)))
{
printf("Failed to create a new record\n");
}
return r;
}
This code is highly inefficient. My dataset contains around 31 million records (~1 GB) and this code processes only ~200 records per minute. The reason I load the dataset into a buffer is that I'll later have multiple threads process the records in this buffer, and hence I want to avoid file accesses. Moreover, I have 48 GB of RAM, so holding the dataset in memory should not be a problem. Any ideas on how to speed things up?
SOLUTION: the sscanf function was actually extremely slow and inefficient.. When I switched to strtoul, the job finishes in less than a minute. Malloc-ing ~ 3 million structs of type Record took only few seconds.
I am confident that some lurking non-numeric data exists in the file.
int offset;
...
sscanf(start_ptr, " %u %n", &(new_record->data[i]), &offset);
start_ptr += offset;
Notice that if the file begins with non-numeric input, offset is never set; it is then uninitialized, and if it happens to hold 0, start_ptr += offset; never advances and the loop never ends.
If non-numeric data such as "3x" appears later in the file, offset gets the value 1 after the 3 is read. The next sscanf() then fails at the x without updating offset, so the loop advances by a stale value and nothing is stored in data[i].
Best to check the results of fread(), ftell() and sscanf() for unexpected return values and act accordingly.
Further: long fsize may be too small a type for large files. Look into using fgetpos() and fsetpos().
Note: to save processing time, consider using strtoul(), as it is certainly faster than sscanf(" %u %n"). Again, check for errant results.
BTW: If the code needs to use sscanf(), use sscanf("%u%n"), which is a tad faster and gives the same functionality for your code.
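A minimal sketch of the strtoul() approach, with the checks mentioned above, might look like this. The helper name parse_uints is made up for this example. As background, and only as a likely explanation: on some C library implementations sscanf() examines the whole remaining string on every call, which would make the original loop quadratic on a 1 GB buffer, whereas strtoul() only touches the characters it converts.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse up to max_vals unsigned integers from the NUL-terminated buffer
   buf. Returns the number parsed; if end_out is not NULL, *end_out is set
   to the position where parsing stopped. */
size_t parse_uints(const char *buf, unsigned int *vals, size_t max_vals,
                   const char **end_out)
{
    size_t n = 0;
    const char *p = buf;

    while (n < max_vals) {
        char *end;
        unsigned long v;

        errno = 0;
        v = strtoul(p, &end, 10);   /* skips leading whitespace itself */
        if (end == p)
            break;                  /* no digits: end of data or bad input */
        if (errno == ERANGE || v > UINT_MAX)
            break;                  /* value does not fit in unsigned int */
        vals[n++] = (unsigned int)v;
        p = end;                    /* continue right after the number */
    }
    if (end_out != NULL)
        *end_out = p;
    return n;
}
```

In the record loop this would be called with max_vals set to num_cols and buf set to start_ptr; a return value smaller than num_cols signals either the end of the data or a malformed field at *end_out, so the loop can stop instead of spinning.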
I'm not an optimization professional but I think some tips should help.
First of all, I suggest you define filename and num_cols as macros, since I don't see you changing their values in the code; as literals they tend to be faster.
Second, using a struct to store only one member is generally not recommended, but if you want to use it with functions you should only pass pointers. Since I see you're using malloc once for the struct and again for its only member, I suppose that is the reason why it is too slow. You're using twice the memory you need. This might not be the case with some compilers, however. Practically, using a struct with only one member is pointless. If you want to ensure that the integer you get (in your case) is specifically a record, you can typedef it.
You should also make end_ptr and fsize const for some optimization.
Now, as for functionality, have a look at memory-mapped I/O.
I'm having trouble with fwrite corrupting files.
The idea behind this program is to create a RAW image file consisting of the pixels that I put into an array called Colors[]. Basically it should produce straight lines of the different colors I'm placing into the array. I've tried a whole lot of methods to write it into a file, but if it hasn't been written out in bit mode it just doesn't work in the RAW image display program that I have created, although other RAW images do work in it.
Would there be an easier way of doing exactly what I want to do?
There is a version of this program where I use an array of chars to fill the buffer, but it's a whole lot more code, e.g. //unsigned char Col[] = {'f','f','f','f','f','f'};
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *readFile(char *fileName);
int main()
{
unsigned int i;
//Pixels in 24Bit mode
unsigned int Colors[] = {0x00ff00,0x00ffff,0xffff00,0xffff00};
FILE *fpw;
fpw=fopen("MF.bin", "wb");
/*A Buffer array for the pixels to be stored in before fwrite will be used to
dump the entire array to a file.
640pixels by 480pixels * 3(each pixel has 3 ints)
*/
unsigned int Buff[640*480*3];
int y,z,CP=0;
unsigned long x;
for (y=0;y<=(480);y++)
{
if (x>=640)
{
CP = 0;
}
for(x=0;x<=640;x++)//3840);x++)
{
if ((x>=320))
{
CP = 1;
}
if ((x>=159) && (x<319))
{
CP = 1;
}
else if ((x>=319) && (x<439))
{
CP = 2;
}
else if ((x>=439) && (x<640))
{
CP = 0;
}
else if (x>=640)
{
CP =0;
}
Buff[480*y + x] = Colors[CP];
printf("%u--%u,%u\n",(480*y + x),Buff[480 + x], Colors[CP]);
}
unsigned int xx = fwrite(Buff, 1, sizeof(Buff)/sizeof(Buff[0]), fpw);
printf("--&z--",xx);
}
fclose(fpw);
}
There are a few mistakes...
Buff[480*y + x] = Colors[CP];
should be
Buff[640*y + x] = Colors[CP];
and the y loop should be < 480 not <=, same for x.
You are using unsigned ints, so you don't need to multiply by 3 in your array size. Using fwrite to write out 24-bit data directly won't work like that, as you have an array of 32-bit data, and your calculation of how many bytes to write is incorrect (you need to multiply, not divide, but as stated above that would also be wrong because you have 32-bit data instead of 24-bit).
There's no 24-bit data type, so you should use an array of unsigned char, not unsigned int, and write each colour component individually.
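Put together, the corrections could look something like the sketch below. It is an illustration under assumptions, not the asker's exact layout: the names WIDTH, HEIGHT, CHANNELS, fill_bands and write_raw are made up, the bands are simply of equal width, and the byte order per pixel is assumed to be red, green, blue (a RAW viewer may expect a different order).

```c
#include <stdio.h>
#include <stdlib.h>

enum { WIDTH = 640, HEIGHT = 480, CHANNELS = 3 };

/* Fill buf (WIDTH * HEIGHT * CHANNELS bytes) with equal-width vertical
   bands. colors holds n_colors packed 0xRRGGBB values. */
void fill_bands(unsigned char *buf, const unsigned int *colors, int n_colors)
{
    for (int y = 0; y < HEIGHT; y++) {          /* < HEIGHT, not <= */
        for (int x = 0; x < WIDTH; x++) {       /* < WIDTH, not <=  */
            unsigned int c = colors[x * n_colors / WIDTH];
            /* Row stride is WIDTH pixels, each pixel is 3 bytes. */
            unsigned char *px = buf + ((size_t)y * WIDTH + x) * CHANNELS;
            px[0] = (unsigned char)((c >> 16) & 0xFF);  /* red (assumed order) */
            px[1] = (unsigned char)((c >> 8) & 0xFF);   /* green */
            px[2] = (unsigned char)(c & 0xFF);          /* blue  */
        }
    }
}

/* Write n_bytes from buf to an already opened binary stream.
   Returns 0 on success, -1 if fewer bytes were written. */
int write_raw(FILE *f, const unsigned char *buf, size_t n_bytes)
{
    /* Element size 1, count in bytes: the return value is bytes written. */
    return fwrite(buf, 1, n_bytes, f) == n_bytes ? 0 : -1;
}
```

The buffer is WIDTH * HEIGHT * CHANNELS bytes (921600 here), which is better allocated with malloc than declared as a local array, and the file should be opened with "wb" and written once after the loops finish rather than once per row.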
Here's the situation. I have to read in data from an external binary file and display the data in order and so that it makes sense to the user.
The file has data stored as follows: the first 4 bytes are an integer, then the next 8 bytes are a floating decimal, followed by the next 8 bytes (float), etc. So I need to read in 4 bytes initially, then repeatedly 8 bytes after that... until the file has no data left to read.
I have read the file in such a way that it stores its data into an array i[NUM] (where NUM is the number of elements), and each element contains 4 bytes. By doing this, I have accidentally 'split' the floats in half, the first half being stored in i[1] and the second half in i[2], also a float in i[3] and i[4], etc.
Now I am in the process of trying to 'stitch' the two halves of each float back together again in order to display them, but I am stuck.
Any suggestions are greatly appreciated.
My code so far:
#include "stdafx.h"
#include "stdio.h"
#define NUM 15
int main(void)
{
//initialising
int i[NUM], j, k;
float temp[NUM];
char temp_c[NUM];
int element_size = 4;
int element_number = NUM;
for(k=0; k<NUM; k++)
{
i[k] = 0; //clear all cells in i[]
temp[k] = 0; //clear all cells in temp[]
}
//file reading
FILE *fp;
fp = fopen("C:\\data", "rb+");
fread(&i, element_size, element_number, fp); //reads 'data' to the end and then stores each element into array i[]
fclose(fp); //close the file
//arrange and print data here
printf("Data of File\n\nN = %d",i[0]);
//this is where the rest fell apart
//No idea how to go about it
return 0;
}
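If you're sure that the floating-point values are 8 bytes and the int is 4, then you can do this (probably in a loop with variables instead of the fixed indices I've used). Note that an 8-byte floating-point value is normally a double, so temp should be declared as double temp[NUM] rather than float, or the copy will overrun the element: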
memcpy(&temp[0], &i[1], 8);
I'm assuming that your code for creating the file was a fwrite where you wrote the 4-byte int, then wrote the 8-byte floats.
Then you can output the floats with printf("%f\n", temp[0]); or whatever.
NB. You can avoid your initialization loop by initializing the arrays directly: int i[NUM] = { 0 }; etc. This only works for 0, not for other values.
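An alternative that avoids the split in the first place is to read the file with the types it was written with: one fread for the int, then one fread for the array of 8-byte values. The sketch below assumes the values are doubles written with fwrite on a machine with the same type sizes and byte order; the function name read_int_and_doubles is made up for this example.

```c
#include <stdio.h>

/* Read one int followed by up to max_vals doubles from fp.
   Returns the number of doubles read, or -1 if the leading int
   could not be read. */
long read_int_and_doubles(FILE *fp, int *n, double *vals, size_t max_vals)
{
    if (fread(n, sizeof *n, 1, fp) != 1)
        return -1;
    /* fread returns the number of complete elements it managed to read,
       so a short file simply yields a smaller count. */
    return (long)fread(vals, sizeof *vals, max_vals, fp);
}
```

With the 60 bytes the question reads (15 elements of 4 bytes), this would give the int plus 7 doubles, each of which can be printed with printf("%f\n", vals[k]).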
I was trying to obtain the RGB values from a 24-bit BMP file. The image that I am using is a tiny image, all red, so all pixels BGR configuration should be B:0 G:0 R:255. I do this:
int main(int argc, char **argv)
{
principal();
return 0;
}
typedef struct {
unsigned char blue;
unsigned char green;
unsigned char red;
} rgb;
typedef struct {
int ancho, alto;
rgb *pixeles[MAX_COORD][MAX_COORD];
} tBitmapData;
void principal()
{
FILE *fichero;
tBitmapData *bmpdata = (tBitmapData *) malloc(sizeof(tBitmapData));
rgb *pixel;
int i, j, num_bytes;
unsigned char *buffer_imag;
char nombre[] = "imagen.bmp";
fichero = fopen(nombre, "r");
if (fichero == NULL)
puts("No encontrado\n");
else {
fseek(fichero, 18, SEEK_SET);
fread(&(bmpdata->ancho), sizeof((bmpdata->ancho)), 4, fichero);
printf("Ancho: %d\n", bmpdata->ancho);
fseek(fichero, 22, SEEK_SET);
fread(&(bmpdata->alto), sizeof((bmpdata->alto)), 4, fichero);
printf("Alto: %d\n", bmpdata->alto);
}
num_bytes = (bmpdata->alto * bmpdata->ancho * 3);
fseek(fichero, 54, SEEK_SET);
for (j = 0; j < bmpdata->alto; j++) {
printf("R G B Fila %d\n", j + 1);
for (i = 0; i < bmpdata->ancho; i++) {
pixel =
(rgb *) malloc(sizeof(rgb) * bmpdata->alto *
bmpdata->ancho * 3);
fread(pixel, 1, sizeof(rgb), fichero);
printf("Pixel %d: B: %3d G: %d R: %d \n", i + 1,
pixel->blue, pixel->green, pixel->red);
}
}
fclose(fichero);
}
The problem is that when I print them, the first pixels are fine, B:0 G:0 R:255, but then they start to change to B:0 G:255 R:0, and then to B:255 G:0 R:0. If the width is 10 pixels, then the change happens every 10 pixels.
In the BMP file format, each row of pixel data may be padded in order to round up to a multiple of 4 bytes.
If you have 10 24-bit pixels, that's 30 bytes, which are then followed by 2 bytes of padding. Your code doesn't skip over the padding.
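The padding arithmetic can be sketched as follows. This is a rough illustration for uncompressed 24-bit BMPs only, and the function names bmp_row_stride and read_bmp_row are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Bytes per stored row of a 24-bit BMP: 3 bytes per pixel,
   rounded up to the next multiple of 4. */
size_t bmp_row_stride(size_t width)
{
    return (width * 3 + 3) & ~(size_t)3;
}

/* Read one row of `width` pixels (blue, green, red per pixel) into row,
   which must hold width * 3 bytes, then skip the row's padding.
   Returns 0 on success, -1 on failure. */
int read_bmp_row(FILE *f, unsigned char *row, size_t width)
{
    size_t data_bytes = width * 3;
    size_t padding = bmp_row_stride(width) - data_bytes;

    if (fread(row, 1, data_bytes, f) != data_bytes)
        return -1;
    if (padding != 0 && fseek(f, (long)padding, SEEK_CUR) != 0)
        return -1;
    return 0;
}
```

For a width of 10 the stride is 32, so 2 bytes are skipped after every row, which is exactly the shift seen in the question's output.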
I think your fread(3) calls are wrong:
fread(&(bmpdata->ancho), sizeof((bmpdata->ancho)), 4, fichero);
This asks to read 4*sizeof((bmpdata->ancho)) bytes into an int. I assume sizeof((bmpdata->ancho)) returns 4, so I think you're scribbling over unrelated memory with these two calls. Change the 4 to 1 -- you're only reading one item.
You never use num_bytes; delete it. Unused code makes thinking about the used code that much more difficult. :)
You're allocating three times as much memory as you need to:
pixel =
(rgb *) malloc(sizeof(rgb) * bmpdata->alto *
bmpdata->ancho * 3);
The 3 looks like an attempt to account for each of red, green and blue in your rgb structure, but sizeof(rgb) already knows the correct size of the structure. (That is 3 bytes on common compilers, but the C standard allows padding, so it could in principle be 4 bytes or more for alignment.) Note also that this malloc sits inside the inner loop, so a whole image's worth of memory is allocated, and never freed, for every pixel.
And the final thing I note:
fread(pixel, 1, sizeof(rgb), fichero);
Because the C compiler is allowed to insert holes into structures, you cannot assume that the on-disk format matches your in-memory structure definition. You need to either use the GNU C extension __attribute__((packed)) or read the data from the file using libraries or structures designed for the BMP format. If this is a fun project for you, then definitely try the packed route: if it works, good; if it doesn't, hopefully you can learn why not and rewrite your code to load each element of the structure manually. If you're just trying to get something that can correctly parse bitmaps, then you might want to find a pre-written library that already parses images correctly.
(And yes, it is VERY IMPORTANT to get image parsing correct; CVE has a list of malformed image exploits that allow attackers control over programs, many of them are remotely exploitable.)