What's the Common Lisp equivalent of the C function fread? - c

I'm attempting to port some C code to Common Lisp (details probably irrelevant, but I'm trying to read an rgb image file into a block of memory to bind a texture for use with cl-opengl). The C version of what I'm trying to do is:
void * data;
FILE * file;
int bytecount = width * height * 3;
file = fopen ( rgbfilepath, "rb" );
data = malloc( bytecount );
fread( data, 1, bytecount, file );
fclose( file );
//do stuff with data...
It has been a few years since I did this, but my understanding of this code is that it is reading a bunch of bytes from the file into the malloc-ed memory without paying any attention whatsoever to the content of those bytes.
After some googling, I found http://rosettacode.org/wiki/Read_entire_file#Common_Lisp and http://www.codecodex.com/wiki/Read_a_file_into_a_byte_array#Common_Lisp which are very similar. My version looks like
(with-open-file (stream rgb-file-path)
(let ((data (make-string (file-length stream)))
(read-sequence data stream)
;;do stuff with data...
When I run this thing on an rgb file, I get the following complaint from SBCL:
debugger invoked on a SB-INT:STREAM-DECODING-ERROR in thread
#<THREAD "main thread" RUNNING {10055E6EA3}>:
:UTF-8 stream decoding error on
#<SB-SYS:FD-STREAM
for "file /home/john/code/lisp/commonlisp/opengl-practice/wall.rgb"
{1005A8EB53}>:
the octet sequence #(218 0) cannot be decoded.
My speculative interpretation of this is that the use of make-string expects the bytes to be characters, while the rgb file I am loading just has a bunch of bytes, not necessarily valid ASCII or whatever character set is expected. But I could be way off. Any suggestions about how to duplicate what fread() does?
Thanks in advance!

If you really want to do this raw, you need to specify an :element-type to open (or with-open-file) that is a subtype of integer, most likely '(unsigned-byte 8), so that the file is read as octets instead of characters.
I would use a library to read images, though. I have used opticl in the past, which is quite straightforward and uses two- or three-dimensional simple-arrays to represent image data (two dimensions plus one for the colours).

Many thanks to Svante for leading me in the right direction. Here is what worked:
(setf mystream (open rgb-file-name :direction :input :element-type '(unsigned-byte 8)))
(setf data (make-array (file-length mystream) :element-type '(unsigned-byte 8)))
(read-sequence data mystream)
(close mystream)
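For comparison with the Lisp version above, here is a rough sketch of the C side with the read checked. read_all is a hypothetical helper name, not something from the question, and it assumes a seekable stream opened in binary mode ("rb"). fread takes four arguments (buffer, element size, element count, stream) and returns the number of elements actually read, so an element size of 1 turns the return value into a byte count.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the rest of a seekable binary stream into a
   malloc-ed buffer. Returns NULL on failure; the caller frees the result. */
unsigned char *read_all(FILE *file, size_t *out_len)
{
    assert(file != NULL && out_len != NULL);

    long start = ftell(file);
    if (start < 0 || fseek(file, 0, SEEK_END) != 0)
        return NULL;
    long end = ftell(file);
    if (end < start || fseek(file, start, SEEK_SET) != 0)
        return NULL;

    size_t len = (size_t)(end - start);
    unsigned char *data = malloc(len ? len : 1);   /* avoid malloc(0) */
    if (data == NULL)
        return NULL;

    /* Element size 1, count len: the return value is then a byte count. */
    if (fread(data, 1, len, file) != len) {
        free(data);
        return NULL;
    }
    *out_len = len;
    return data;
}
```

As in the Lisp version, the bytes are never interpreted as characters; the buffer is simply unsigned char values 0 to 255.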

Related

Getting width and height from jpeg image file

I wrote this function which, given a filename (a JPEG file), should print its size in pixels, w and h. According to the tutorial that I'm reading,
//0xFFC0 is the "Start of frame" marker which contains the file size
//The structure of the 0xFFC0 block is quite simple [0xFFC0][ushort length][uchar precision][ushort x][ushort y]
So, I wrote this struct
#pragma pack(1)
struct imagesize {
unsigned short len; /* 2-bytes */
unsigned char c; /* 1-byte */
unsigned short x; /* 2-bytes */
unsigned short y; /* 2-bytes */
}; //sizeof(struct imagesize) == 7
#pragma pack()
and then:
#define SOF 0xC0 /* start of frame */
void jpeg_test(const char *filename)
{
FILE *fh;
unsigned char buf[4];
unsigned char b;
fh = fopen(filename, "rb");
if(fh == NULL) {
fprintf(stderr, "cannot open '%s' file\n", filename);
return;
}
while(!feof(fh)) {
b = fgetc(fh);
if(b == SOF) {
struct imagesize img;
#if 1
ungetc(b, fh);
fread(&img, 1, sizeof(struct imagesize), fh);
#else
fread(buf, 1, sizeof(buf), fh);
int w = (buf[0] << 8) + buf[1];
int h = (buf[2] << 8) + buf[3];
img.x = w;
img.y = h;
#endif
printf("%dx%d\n",
img.x,
img.y);
break;
}
}
fclose(fh);
}
But I'm getting 520x537 instead of 700x537, which is the real size.
Can someone point out and explain where I'm wrong?
A JPEG file consists of a number of sections. Each section starts with 0xff, followed by 1-byte section identifier, followed by number of data bytes in the section (in 2 bytes), followed by the data bytes. The sequence 0xffc0, or any other 0xff-- two-byte sequence, inside the data byte sequence, has no significance and does not mark a start of a section.
As an exception, the very first section does not contain any data or length.
You have to read each section header in turn, parse the length, then skip corresponding number of bytes before starting to read next section. You cannot just search for 0xffc0, let alone just 0xc0, without regard to the section structure.
Source.
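The section-by-section walk described above might look roughly like the sketch below. It works on a buffer already read into memory, and jpeg_dimensions is a hypothetical name. It assumes the standard SOF layout, in which the 2-byte length is followed by a 1-byte precision, then the height (number of lines) and then the width, both big-endian. That is the reverse of the x/y order in the tutorial comment quoted in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: walk the marker segments of an in-memory JPEG and
   report the dimensions stored in the first frame header (SOFn) found at
   the top level. Returns 1 on success, 0 if no frame header was found. */
int jpeg_dimensions(const unsigned char *buf, size_t len,
                    unsigned *width, unsigned *height)
{
    assert(buf != NULL && width != NULL && height != NULL);

    /* The file must start with SOI (FF D8), which carries no length. */
    if (len < 4 || buf[0] != 0xFF || buf[1] != 0xD8)
        return 0;

    size_t pos = 2;
    while (pos + 4 <= len) {
        if (buf[pos] != 0xFF)
            return 0;                       /* lost sync with the structure */
        unsigned char id = buf[pos + 1];
        if (id == 0xFF) { pos++; continue; }            /* fill byte */
        if (id == 0x01 || (id >= 0xD0 && id <= 0xD7)) { /* no length field */
            pos += 2;
            continue;
        }
        if (id == 0xD9 || id == 0xDA)
            return 0;       /* EOI, or SOS: entropy-coded data follows */

        unsigned seg_len = ((unsigned)buf[pos + 2] << 8) | buf[pos + 3];
        if (seg_len < 2)
            return 0;       /* the length counts its own two bytes */

        /* SOF0..SOF15, except C4 (DHT), C8 (JPG) and CC (DAC). */
        if (id >= 0xC0 && id <= 0xCF &&
            id != 0xC4 && id != 0xC8 && id != 0xCC) {
            if (pos + 9 > len)
                return 0;
            *height = ((unsigned)buf[pos + 5] << 8) | buf[pos + 6];
            *width  = ((unsigned)buf[pos + 7] << 8) | buf[pos + 8];
            return 1;
        }
        pos += 2 + (size_t)seg_len;         /* skip marker + segment body */
    }
    return 0;
}
```

Because each segment is skipped by its declared length, an 0xFF 0xC0 pair that happens to sit inside another segment's data is never mistaken for a frame header.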
There are several issues to consider, depending on how "universal" you want your program to be. First, I recommend using libjpeg. A good JPEG parser can be a bit gory, and this library does a lot of the heavy lifting for you.
Next, to clarify n.m.'s statement, you have no guarantee that the first 0xFFC0 pair is the SOF of interest. I've found that modern digital cameras like to load up the JPEG header with a number of APP0 and APP1 blocks, which can mean that the first SOF marker you encounter during a sequential read may actually belong to the image thumbnail. This thumbnail is usually stored in JPEG format (as far as I have observed, anyway) and is thus equipped with its own SOF marker. Some cameras and/or image editing software can include an image preview that is larger than a thumbnail (but smaller than the actual image). This preview image is usually JPEG and again has its own SOF marker. It's not unusual for the image SOF marker to be the last one.
Most (all?) modern digital cameras also encode the image attributes in the EXIF tags. Depending upon your application requirements, this might be the most straightforward, unambiguous way to obtain the image size. The EXIF standard document will tell you all you need to know about writing an EXIF parser. (libExif is available, but it never fit my applications.) Whether you roll your own EXIF parser or rely on a library, there are some good tools for inspecting EXIF data. jhead is a very good tool, and I've also had good luck with ExifTool.
Lastly, pay attention to endianness. SOF and other standard JPEG markers are big-endian, but EXIF markers may vary.
As you mention, the spec states that the marker is 0xFFC0. But it seems that you only ever look for a single byte with the code if (b==SOF)
If you open the file up with a hex editor, and search for 0xFFC0 you'll find the marker. Now as long as the first 0xC0 in the file is the marker, your code will work. If it's not though, you get all sorts of undefined behaviour.
I'd be inclined to read the whole file first. It's a JPEG, right, so how big could it be? (Though this is important on an embedded system.) Then just step through it looking for the first char of my marker. When found, I'd use memcmp to see if the next 3 bytes matched the rest of the sig.

Loading an 8bpp grayscale BMP in C

I can't make sense of the BMP format. I know it's supposed to be simple, but somehow I'm missing something. I thought it was 2 headers followed by the actual bytes defining the image, but the numbers do not add up.
For instance, I'm simply trying to load this BMP file into memory (640x480 8bpp grayscale) and just write it back to a different file. From what I understand, there are two different headers BITMAPFILEHEADER and BITMAPINFOHEADER. The BITMAPFILEHEADER is 14 bytes, and the BITMAPINFOHEADER is 40 bytes (this one depends on the BMP, how can I tell that's another story). Anyhow, the BITMAPFILEHEADER, through its parameter bfOffBits says that the bitmap bits start at offset 1078. This means that there are 1024 ( 1078 - (40+14) ) other bytes, carrying more information. What are those bytes, and how do I read them, this is the problem. Or is there a more correct way to load a BMP and write it to disk ?
For reference here is the code I used ( I'm doing all of this under windows btw.)
#include <windows.h>
#include <iostream>
#include <stdio.h>
HANDLE hfile;
DWORD written;
BITMAPFILEHEADER bfh;
BITMAPINFOHEADER bih;
unsigned char *image;
int main()
hfile = CreateFile("image.bmp",GENERIC_READ,FILE_SHARE_READ,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,NULL);
ReadFile(hfile,&bfh,sizeof(bfh),&written,NULL);
ReadFile(hfile,&bih,sizeof(bih),&written,NULL);
int imagesize = bih.biWidth * bih.biHeight;
image = (unsigned char*) malloc(imagesize);
ReadFile(hfile,image,imagesize*sizeof(char),&written,NULL);
CloseHandle(hfile);
I'm then doing the exact opposite to write to a file,
hfile = CreateFile("imageout.bmp",GENERIC_WRITE,FILE_SHARE_WRITE,NULL,CREATE_ALWAYS,FILE_ATTRIBUTE_NORMAL,NULL);
WriteFile(hfile,&bfh,sizeof(bfh),&written,NULL);
WriteFile(hfile,&bih,sizeof(bih),&written,NULL);
WriteFile(hfile,image,imagesize*sizeof(char),&written,NULL);
CloseHandle(hfile);
Edit --- Solved
Ok so I finally got it right, it wasn't really complicated after all. As Viktor pointed out, these 1024 bytes represent the color palette.
I added the following to my code:
RGBQUAD palette[256];
// [...] previous declarations [...] int main() [...] then read two headers
ReadFile(hfile,palette,sizeof(palette),&written,NULL);
And then when I write back I added the following,
WriteFile(hfile,palette,sizeof(palette),&written,NULL);
"What are those bytes, and how do I read them, this is the problem."
Those bytes are Palette (or ColorTable in .BMP format terms), as Retired Ninja mentioned in the comment. Basically, it is a table which specifies what color to use for each 8bpp value encountered in the bitmap data.
For greyscale the palette is trivial (I'm not talking about color models and RGB -> greyscale conversion):
for(int i = 0 ; i < 256 ; i++)
{
Palette[i].R = i;
Palette[i].G = i;
Palette[i].B = i;
}
However, there's some padding in the ColorTable's entries, so it takes 4 * 256 bytes and not 256 * 3 needed by you. The fourth component in the ColorTable's entry (RGBQUAD Struct) is not the "alpha channel", it is just something "reserved". See the MSDN on RGBQUAD (MSDN, RGBQUAD).
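The arithmetic can be checked with a short sketch that does not depend on windows.h. The struct and function names here are hypothetical stand-ins; the entry layout follows RGBQUAD's blue, green, red, reserved order. The row stride helper reflects the BMP rule that each pixel row is padded to a multiple of 4 bytes, which happens to add nothing for a 640-pixel-wide 8bpp image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    BMP_FILE_HEADER_SIZE = 14,   /* BITMAPFILEHEADER on disk */
    BMP_INFO_HEADER_SIZE = 40,   /* BITMAPINFOHEADER on disk */
    BMP_PALETTE_ENTRY_SIZE = 4   /* one RGBQUAD */
};

/* Hypothetical stand-in for RGBQUAD, in on-disk byte order. */
struct palette_entry {
    uint8_t blue, green, red, reserved;
};

/* Fill a 256-entry greyscale palette: index i maps to grey level i. */
void make_gray_palette(struct palette_entry pal[256])
{
    assert(pal != NULL);
    for (int i = 0; i < 256; i++) {
        pal[i].blue = pal[i].green = pal[i].red = (uint8_t)i;
        pal[i].reserved = 0;
    }
}

/* Offset of the pixel data when the palette directly follows the headers. */
size_t bmp_pixel_offset(size_t palette_entries)
{
    return BMP_FILE_HEADER_SIZE + BMP_INFO_HEADER_SIZE
         + palette_entries * BMP_PALETTE_ENTRY_SIZE;
}

/* Bytes per stored row: rows are padded to a multiple of 4 bytes. */
size_t bmp_row_stride(size_t width_px, unsigned bits_per_pixel)
{
    return ((width_px * bits_per_pixel + 31) / 32) * 4;
}
```

With 256 entries, bmp_pixel_offset gives 14 + 40 + 1024 = 1078, matching the bfOffBits value from the question.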
The detailed format description can be found on the Wikipedia page: Wiki, bmp format
There's also this linked question on SO with RGBQUAD structure: Writing BMP image in pure c/c++ without other libraries
As Viktor says in his answer, those bits are the palette. As for how you should read them, take a look at this header-only bitmap class. In particular, look at the references to ColorTable for how it treats the palette bits depending on the type of BMP it was given.

How do I convert a G.726 ADPCM signal into a PCM signal?

I usually look to SoX or Windows' built-in audio libraries for this stuff, but it appears that neither has a G.726 codec.
So I have a sequence of bytes that I know are encoded as G.726 although the bit-rate and whether it is mu-law or A-law is not known at this time (experimentation will determine those parameters), and I need to decode them into a normal PCM signal.
So I downloaded the reference implementation from the ITU-T (ITU-T Recommendation G.191) but I'm kind of confused on how to use the G726_decode function. According to the documentation inp_buf and out_buf need to have the same length smpno and both buffers are 16-bit buffers. This seems to me like a step is missing; otherwise no compression is accomplished by using G.726. According to the Wikipedia page on G.726 sample size depends on bit rate (from 2 to 5 bits). Am I supposed to do the decompression into samples myself? So if I assume maximum compression (2 bit samples) then each byte will produce 4 samples.
Example:
char b = /* read the code from input */
short inp[4], output[4];
inp[0] = b & 0x0003;
inp[1] = (b & 0x000C) >> 2;
inp[2] = (b & 0x0030) >> 4;
inp[3] = (b & 0x00C0) >> 6;
G726_state state;
memset(&state, 0, sizeof(G726_state));
G726_decode(inp, output, 4, "u", 2, 1, &state);
/* output now contains 4 PCM samples */
Or am I missing something completely?
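A more general version of the unpacking step might look like the sketch below, which spreads packed 2- to 5-bit codes into one short per code, the shape the question describes for the decoder's input buffer. unpack_codes_lsb_first is a hypothetical name, and the bit order (first code in the least significant bits of the first byte) is an assumption. Different G.726 packings exist, so the order has to be confirmed against the actual data. It does not call the ITU-T library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: unpack `bits`-wide codes (2..5) from a packed byte
   stream, taking the first code from the low bits of the first byte.
   Writes at most out_cap codes and returns how many were written. */
size_t unpack_codes_lsb_first(const unsigned char *in, size_t in_len,
                              unsigned bits, short *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    assert(bits >= 2 && bits <= 5);

    unsigned long acc = 0;   /* bit accumulator, newest bytes on top */
    unsigned nbits = 0;      /* number of valid bits in acc */
    size_t n = 0;

    for (size_t i = 0; i < in_len && n < out_cap; i++) {
        acc |= (unsigned long)in[i] << nbits;
        nbits += 8;
        while (nbits >= bits && n < out_cap) {
            out[n++] = (short)(acc & ((1ul << bits) - 1ul));
            acc >>= bits;
            nbits -= bits;
        }
    }
    return n;
}
```

With 2-bit codes, the byte 0xE4 (binary 11 10 01 00) unpacks to the codes 0, 1, 2, 3 under this bit order.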
Looks like ffmpeg actually isn't able to do this, as I thought it surely would be able to... however, while I was googling I did find this post to the ffmpeg mailing list which offers a solution.
Basically, there is a separate program called g72x++ which seems to be able to decode the audio to raw PCM for you.

What could cause a Labwindows/CVI C program to hate the number 2573?

Using Windows
So I'm reading from a binary file a list of unsigned int data values. The file contains a number of datasets listed sequentially. Here's the function to read a single dataset from a char* pointing to the start of it:
void read_dataset(char* stream, t_dataset *dataset){
//...some init, including setting dataset->size;
for(i=0;i<dataset->size;i++){
dataset->samples[i] = *((unsigned int *) stream);
stream += sizeof(unsigned int);
}
//...
}
where read_dataset is called in a context such as this:
//...
char buff[10000];
t_dataset* dataset = malloc( sizeof( *dataset) );
unsigned long offset = 0;
for(i=0;i<number_of_datasets; i++){
fseek(fd_in, offset, SEEK_SET);
if( (n = fread(buff, sizeof(char), sizeof(*dataset), fd_in)) != sizeof(*dataset) ){
break;
}
read_dataset(buff, dataset);
// Do something with dataset here. It's screwed up before this, I checked.
offset += profileSize;
}
//...
Everything goes swimmingly until my loop reads the number 2573. All of a sudden it starts spitting out random and huge numbers.
For example, what should be
...
1831
2229
2406
2637
2609
2573
2523
2247
...
becomes
...
1831
2229
2406
2637
2609
0xDB00000A
0xC7000009
0xB2000008
...
If you think those hex numbers look suspicious, you're right. Turns out the hex values for the values that were changed are really familiar:
2573 -> 0xA0D
2523 -> 0x9DB
2247 -> 0x8C7
So apparently this number 2573 causes my stream pointer to gain a byte. This remains until the next dataset is loaded and parsed, and god forbid it contain a number 2573. I have checked a number of spots where this happens, and each one I've checked began on 2573.
I admit I'm not so talented in the world of C. What could cause this is completely and entirely opaque to me.
You don't specify how you obtained the bytes in memory (pointed to by stream), nor what platform you're running on, but I wouldn't be surprised to find you're on Windows and used the C stdio library call fopen(filename, "r"). Try using fopen(filename, "rb") instead. On Windows (and MS-DOS), a stream opened in text mode translates MS-DOS line endings "\r\n" (hex 0x0D 0x0A) in the file to Unix-style "\n" when reading, unless you append "b" to the file mode to indicate binary. The value 2573 is 0x0A0D, which a little-endian machine stores as the bytes 0x0D 0x0A, so the 0x0D is dropped and everything after it shifts by one byte.
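The effect can be reproduced without any file at all. The sketch below only imitates the CR LF to LF part of text-mode reading on a byte buffer; it is an illustration, not the C runtime's actual implementation. drop_cr_before_lf and load_le32 are hypothetical names, and the data is assumed to be stored as little-endian 32-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: remove every 0x0D that is directly followed by 0x0A,
   in place, and return the new length (imitates text-mode input only). */
size_t drop_cr_before_lf(unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x0D && i + 1 < len && buf[i + 1] == 0x0A)
            continue;                 /* swallow the carriage return */
        buf[out++] = buf[i];
    }
    return out;
}

/* Hypothetical helper: assemble a little-endian 32-bit value byte by byte. */
uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Feeding it the bytes for 2573, 2523, 2247 (0D 0A 00 00, DB 09 00 00, C7 08 00 00) leaves 11 bytes, and the first two values then read back as 0xDB00000A and 0xC7000009, the same garbage shown in the question.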
A couple of irrelevant points.
sizeof(*dataset) doesn't do what you think it does.
There is no need to use seek on every read
I don't understand how you are calling a function that only takes one parameter but you are giving it two (or at least I don't understand why your compiler doesn't object)

Jpeglib code gives garbled output, even the bundled example code?

I'm on Ubuntu Intrepid and I'm using jpeglib62 6b-14. I was working on some code, which only gave a black screen with some garbled output at the top when I tried to run it. After a few hours of debugging I got it down to pretty much the JPEG base, so I took the example code, wrote a little piece of code around it and the output was exactly the same.
I'm convinced jpeglib is used in a lot more places on this system and it's simply the version from the repositories so I'm hesitant to say that this is a bug in jpeglib or the Ubuntu packaging.
I put the example code below (most comments stripped). The input JPEG file is an uncompressed 640x480 file with 3 channels, so it should be 921600 bytes (and it is). The output image is JFIF and around 9000 bytes.
If you could help me with even a hint, I'd be very grateful.
Thanks!
#include <stdio.h>
#include <stdlib.h>
#include "jpeglib.h"
#include <setjmp.h>
int main ()
{
// read data
FILE *input = fopen("input.jpg", "rb");
JSAMPLE *image_buffer = (JSAMPLE*) malloc(sizeof(JSAMPLE) * 640 * 480 * 3);
if(input == NULL || image_buffer == NULL)
exit(1);
fread(image_buffer, 640 * 3, 480, input);
// initialise jpeg library
struct jpeg_compress_struct cinfo;
struct jpeg_error_mgr jerr;
cinfo.err = jpeg_std_error(&jerr);
jpeg_create_compress(&cinfo);
// write to foo.jpg
FILE *outfile = fopen("foo.jpg", "wb");
if (outfile == NULL)
exit(1);
jpeg_stdio_dest(&cinfo, outfile);
// setup library
cinfo.image_width = 640;
cinfo.image_height = 480;
cinfo.input_components = 3; // 3 components (R, G, B)
cinfo.in_color_space = JCS_RGB; // RGB
jpeg_set_defaults(&cinfo); // set defaults
// start compressing
int row_stride = 640 * 3; // number of characters in a row
JSAMPROW row_pointer[1]; // pointer to the current row data
jpeg_start_compress(&cinfo, TRUE); // start compressing to jpeg
while (cinfo.next_scanline < cinfo.image_height) {
row_pointer[0] = & image_buffer[cinfo.next_scanline * row_stride];
(void) jpeg_write_scanlines(&cinfo, row_pointer, 1);
}
jpeg_finish_compress(&cinfo);
// clean up
fclose(outfile);
jpeg_destroy_compress(&cinfo);
}
You're reading a JPEG file into memory (without decompressing it) and writing out that buffer as if it were uncompressed, that's why you're getting garbage. You need to decompress the image first before you can feed it into the JPEG compressor.
In other words, the JPEG compressor assumes that its input is raw pixels.
You can convert your input image into raw RGB using ImageMagick:
convert input.jpg rgb:input.raw
It should be exactly 921600 bytes in size.
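A cheap sanity check before handing a buffer to the compressor could look like this sketch. Both function names are hypothetical. looks_like_jpeg is only a heuristic based on the leading FF D8 FF bytes of a JPEG stream, since raw pixel data could in principle begin with the same three bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expected byte count of tightly packed raw pixels. */
size_t raw_image_size(size_t width, size_t height, size_t components)
{
    return width * height * components;
}

/* Hypothetical helper: heuristic test for an already-compressed JPEG,
   which starts with SOI (FF D8) followed by another marker (FF ..). */
int looks_like_jpeg(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    return len >= 3 && buf[0] == 0xFF && buf[1] == 0xD8 && buf[2] == 0xFF;
}
```

For the 640x480 three-channel image in the question, raw_image_size gives 921600, so a file of that exact size that does not start with FF D8 FF is at least plausibly raw RGB.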
EDIT: Your question is misleading when you state that your input JPEG file is uncompressed. Anyway, I compiled your code and it works fine; it compresses the image correctly. If you can upload the file you're using as input, it might be possible to debug further. If not, I suggest you test your program using an image created from a known JPEG using ImageMagick:
convert some_image_that_is_really_a_jpg.jpg -resize 640x480! rgb:input.jpg
You are reading the compressed input file into memory and then recompressing it before writing it to a file. You need to decompress the image_buffer before compressing it again. Alternatively, instead of reading in a JPEG, read a .raw image.
What exactly do you mean by "The input JPEG file is an uncompressed"? Jpegs are all compressed.
In your code, it seems that in the loop you give one row of pixels to libjpeg and ask it to compress it. It doesn't work that way. libjpeg has to have at least 8 rows to start compression (sometimes even more, depending on parameters). So it's best to leave libjpeg to control the input buffer and don't do its job for it.
I suggest you read how cjpeg.c does its job. The easiest way I think is to put your data in a raw type known by libjpeg (say, BMP), and use libjpeg to read the BMP image into its internal representation and compress from there.

Resources