jpeg file format decoding - c

I'm trying to write a JPEG/JFIF encoder and decoder from scratch in C. As an experiment I wrote a sample JPEG file, but I cannot open it with MS Paint or Firefox. I can, however, decode it with JPEGsnoop (http://www.impulseadventure.com/photo/jpeg-snoop.html?ver=1.5.2) and http://nothings.org/stb_image.c. I think the sample file complies with the JPEG/JFIF standard, so I don't know why applications like MS Paint and Firefox cannot open it.
Here is what the sample JPEG looks like:
SOI
APP0 segment
DQT segment (contains two quantization tables)
COM segment
SOF0 segment
DHT segment (contains four Huffman tables)
SOS segment
huffman encoded data
EOI
The sample JPEG file has three components: Y, Cb and Cr. There is no subsampling for the Cb and Cr components.
The two quantization tables are filled entirely with ones.
The four Huffman tables in the DHT segment are all identical. Each one looks like this:
[0 0 0 0 0 0 0 255 0 0 0 0 0 0 0 0]
[0,1,2, ... , 254]
That means all the codes are 8 bits long, so Huffman encoding does not really compress the data.
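To make the claim about 8-bit codes concrete, here is a minimal sketch of how a decoder assigns canonical codes from the two arrays in a DHT table. The function name build_huffman_codes and its signature are my own, not taken from any library. With 255 codes of length 8, symbol k simply gets code k, and the all-ones code 0xFF stays unused.

```c
#include <stdint.h>
#include <stddef.h>

/* Assign canonical Huffman codes from a DHT-style table.
 * bits[i] is the number of codes of length i+1 (16 entries).
 * codes[k] and sizes[k] receive the code and its bit length for the
 * k-th symbol in the DHT symbol list.
 * Returns the number of codes, or -1 if the counts do not form a valid
 * prefix code or more than max_codes entries would be needed. */
int build_huffman_codes(const uint8_t bits[16],
                        uint16_t *codes, uint8_t *sizes, size_t max_codes)
{
    uint32_t code = 0;
    size_t k = 0;

    for (int len = 1; len <= 16; len++) {
        for (int i = 0; i < bits[len - 1]; i++) {
            /* a code of this length must fit in len bits */
            if (k >= max_codes || code >= (1u << len))
                return -1;
            codes[k] = (uint16_t)code;
            sizes[k] = (uint8_t)len;
            k++;
            code++;
        }
        code <<= 1;   /* move on to the next code length */
    }
    return (int)k;
}
```

Calling it with the counts above (bits[7] = 255, everything else 0) yields 255 codes, each 8 bits long.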
The Huffman-encoded data looks like this:
[0x0000(DC) 0x0000(AC)](Y)
[0x0000(DC) 0x0000(AC)](Cb)
[0x0000(DC) 0x0000(AC)](Cr) for all (i, j) MCUs except (10, 10)
the data in (10, 10) MCU:
[0x0008(DC) 0x0000(DC), 0x0000(AC)](Y)
[0x0000(DC) 0x0000(AC)](Cb)
[0x0000(DC) 0x0000(AC)](Cr)
Can anyone tell me what is wrong with this sample JPEG file? Thanks.
Here is a link to the sample JPEG file (ha.jpg) http://www.guoxiaoyong.net/ha.jpg

I had a similar problem years ago with some PNG code (though I didn't write it from scratch). It turned out my code was more standards compliant than the libraries used by Windows, some browsers, etc. They did fine on typical cases, but choked on unusual and contrived images, even if those were completely in line with the standard. A common way to trip them up was to use an odd pixel width for the image. Almost half of my test suite was not viewable with Windows. (This was many versions ago, like Windows 95. The Windows codecs have improved substantially.)
I ended up building the open source PNG library and using it as my reference implementation. As long as the images that my code produced could be parsed by the reference implementation and vice versa, I called it good. I also checked that my code could display any image that Windows could display. Every time I found a bug, I added the image to my test suite before I fixed it. That was good enough for my project.
You could do the same. I believe there's an open source JPEG library that's widely used as a reference implementation.
If you really want to figure out why Firefox (or whatever) cannot open your image, you could try starting with an image that does open in Firefox. Incrementally make small changes (e.g., with a hex editor) to make it more like the image that fails. That might help you narrow down what aspect of your image is tripping up the application. Admittedly, some of those steps may be hard to try.
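When comparing a working file with a failing one, it can help to list the marker segments of each before reaching for the hex editor. The following is a rough sketch only: list_jpeg_markers is a made-up name, it stops at SOS, and it assumes there are no fill bytes or length-less markers before the scan.

```c
#include <stdint.h>
#include <stddef.h>

/* Walk the marker segments of a JPEG held in memory, from SOI up to and
 * including SOS. Stores the second byte of each marker (e.g. 0xDB for
 * DQT) in markers[] and returns how many were found, or -1 if the data
 * is malformed. */
int list_jpeg_markers(const uint8_t *buf, size_t len,
                      uint8_t *markers, size_t max_markers)
{
    size_t pos = 2;
    size_t n = 0;

    /* the file must start with SOI (FF D8) */
    if (len < 2 || buf[0] != 0xFF || buf[1] != 0xD8)
        return -1;

    while (pos + 4 <= len && n < max_markers) {
        if (buf[pos] != 0xFF)
            return -1;
        uint8_t m = buf[pos + 1];
        /* segment length is big-endian and includes its own two bytes */
        size_t seg_len = ((size_t)buf[pos + 2] << 8) | buf[pos + 3];
        if (seg_len < 2 || pos + 2 + seg_len > len)
            return -1;
        markers[n++] = m;
        if (m == 0xDA)      /* SOS: entropy-coded data follows */
            break;
        pos += 2 + seg_len;
    }
    return (int)n;
}
```

Running it on both files and diffing the two marker lists shows quickly whether the segment order or a segment length is what differs.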

Firefox (and many other apps, AFAIK) uses the open-source JPEG library from the Independent JPEG Group.
You could download the source for this, and then see exactly why and when it doesn't like your file.
Also, this would save you reinventing the wheel :-)

I think your file is very unconventionally coded. I would suggest that you find a reference file and try to mimic that structure. Also, I would use the sample tables from the standard. Your Huffman data is full of zeros, making every DC value zero, followed by an End-of-block.
If you look at the image in JPEGsnoop, it shows two shades, but it should be homogeneous. My guess is that you haven't got enough data to code the image at the resolution you've specified. I believe a lot of decoders would take that to mean your file is corrupt.
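One way to check the "not enough data" guess is to work out how many MCUs the frame header promises and compare that with the size of the scan. This is a small sketch under my own assumptions: no subsampling (one 8x8 block per component per MCU), and every block costing one 8-bit DC code plus one 8-bit End-of-block code. The function names are made up for illustration.

```c
#include <stddef.h>

/* Number of MCUs in a non-subsampled three-component image.
 * Each MCU covers 8x8 pixels; partial blocks at the right and bottom
 * edges still count as whole MCUs. */
size_t mcu_count(unsigned width, unsigned height)
{
    size_t across = (width + 7u) / 8u;
    size_t down = (height + 7u) / 8u;
    return across * down;
}

/* Smallest possible scan size in bytes, ASSUMING each block is coded
 * as one 8-bit DC code with no extra bits plus one 8-bit EOB code:
 * 2 bytes per block, 3 blocks (Y, Cb, Cr) per MCU. */
size_t min_scan_bytes(unsigned width, unsigned height)
{
    return mcu_count(width, height) * 3u * 2u;
}
```

For a hypothetical 160x160 image that is 400 MCUs, so the scan would need at least 2400 bytes. If the file holds fewer bytes between SOS and EOI than this estimate, a strict decoder has good reason to reject it.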

Related

C - drawing in a bitmap

I have to calculate the flight path of a projectile and draw the result in a bitmap file. So far I'm pretty clueless how to do that.
Would it be a good idea to save the values of the flight path in a struct and transfer them to the bitmap file?
Do you have any other suggestions how it could be done in a better way?
The simplest way to produce an image file without much hassle, using only standard C library tools, is most likely writing a BMP file. For a start, check the Wikipedia article on this file format; it gives a fairly complete description of it.
If you don't want to go too deep into that, save for example an empty 640x480 24-bit ("truecolor") .bmp image, and rip out its header for your own use. Depending on the program you use to save the image, you might end up with a varying header size. However, since the data is not compressed, it is fairly easy to isolate the header. For a 640x480 image the data will be exactly 921600 bytes long (640 x 480 x 3); anything preceding it is the header.
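From there, the pixel data is (usually) stored from the bottom row to the top, each row left to right, with each pixel's colors in B, G, R order rather than RGB. Rows are padded to a multiple of 4 bytes; a 640-pixel row is 1920 bytes, so it needs no padding. Experimenting a little should give you the proper results.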
If you only have the standard C libraries to work with, it is unlikely there is anything much simpler to implement. Of course, in this case you will have to fill in a pixel matrix yourself (so there is not much assistance for the actual problem you want to draw), but that is true for any raster image format. The exception would be creating an SVG instead, which is not too hard either, since it is just XML.
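If you would rather build the header than copy it from a saved file, here is a minimal sketch of a 24-bit BMP writer that uses only the standard library. The function names (bmp_row_size, bmp_make_header, bmp_write_points) are my own, and it assumes the common 54-byte header with no compression. It plots a list of points in black on a white background.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Bytes per row of a 24-bit BMP: rows are padded to a multiple of 4. */
uint32_t bmp_row_size(uint32_t width)
{
    return (width * 3u + 3u) & ~3u;
}

/* Fill the 54-byte header (file header followed by the 40-byte info header). */
void bmp_make_header(uint8_t hdr[54], uint32_t width, uint32_t height)
{
    uint32_t data_size = bmp_row_size(width) * height;

    memset(hdr, 0, 54);
    hdr[0] = 'B';
    hdr[1] = 'M';
    put_le32(hdr + 2, 54u + data_size);  /* total file size */
    put_le32(hdr + 10, 54u);             /* offset of the pixel data */
    put_le32(hdr + 14, 40u);             /* size of the info header */
    put_le32(hdr + 18, width);
    put_le32(hdr + 22, height);          /* positive: rows stored bottom-up */
    hdr[26] = 1;                         /* colour planes */
    hdr[28] = 24;                        /* bits per pixel */
    put_le32(hdr + 34, data_size);
}

/* Write a white image with the given (x, y) points drawn in black.
 * y is measured upwards from the bottom row, matching the BMP row order. */
int bmp_write_points(const char *path, uint32_t width, uint32_t height,
                     const uint32_t (*pts)[2], size_t npts)
{
    uint8_t hdr[54];
    uint32_t row = bmp_row_size(width);
    size_t total = (size_t)row * height;
    uint8_t *pix = malloc(total);
    FILE *f;
    int ok;

    if (!pix)
        return -1;
    memset(pix, 0xFF, total);
    for (size_t i = 0; i < npts; i++) {
        if (pts[i][0] < width && pts[i][1] < height) {
            uint8_t *p = pix + (size_t)pts[i][1] * row + (size_t)pts[i][0] * 3;
            p[0] = p[1] = p[2] = 0;      /* stored as B, G, R */
        }
    }
    bmp_make_header(hdr, width, height);
    f = fopen(path, "wb");
    if (!f) {
        free(pix);
        return -1;
    }
    ok = fwrite(hdr, 1, 54, f) == 54 && fwrite(pix, 1, total, f) == total;
    ok = (fclose(f) == 0) && ok;
    free(pix);
    return ok ? 0 : -1;
}
```

For the projectile, you would compute (x, y) for each time step, scale the values to pixel coordinates, collect them in an array, and pass that array to bmp_write_points.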

Using DMA to load an image into Visual Boy Advance (VBA)

I am new to C; I have an image file translated by means of online tools into a .h and .c file. The C file contains an array of 1024 16 bit hexadecimal numbers, used to denote on/off of bits. I want to read this file and draw the image onscreen using DMA...but I'm very much at a loss as to how to do this. Can anybody out there help? Does anyone even know what I'm talking about?
To draw an image onscreen, use DMA[3]. This is channel 3 of DMA for images.
This is how you set up DMA in a .h file:
http://nocash.emubase.de/gbatek.htm#gbadmatransfers
And then to draw an image using DMA:
#include "image.h"
DMA[3].src = (specify your image source here, where you're drawing from)
DMA[3].dst = (where you're drawing pixels to)
In your scenario, I think you indicate the name of the file in your source.
Keep in mind you're using POINTERS to images for src and dst.
DMA[3].cnt = (how many times you want to do it) | flag1 | flag2...
Here are some flags:
DMA_SOURCE_FIXED means you draw from the same pixel over and over again. If this is what you want, then turn this bit on in cnt.
DMA_DESTINATION_FIXED means that you're drawing TO the same pixel over and over again. If this is what you want, then turn on this bit in cnt.
Otherwise, DMA_SOURCE_INCREMENT and DMA_DESTINATION_INCREMENT are on by default (if not, you can turn them on in cnt anyway).
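As a rough sketch of how the cnt value is put together: the bit positions below follow the DMA control register layout described in GBATEK, but the macro names and the dma_control helper are illustrative only, and your header may name them differently. Only the computation of the control word is shown, because the register writes themselves work only on the device or in the emulator.

```c
#include <stdint.h>

/* Bit positions in the 32-bit DMA control word (count in the low 16
 * bits, control flags in the high 16 bits). The names mirror the flags
 * mentioned above and are assumptions, not a particular header. */
#define DMA_DESTINATION_INCREMENT (0u << 21)
#define DMA_DESTINATION_FIXED     (2u << 21)
#define DMA_SOURCE_INCREMENT      (0u << 23)
#define DMA_SOURCE_FIXED          (2u << 23)
#define DMA_16BIT                 (0u << 26)
#define DMA_32BIT                 (1u << 26)
#define DMA_ENABLE                (1u << 31)

/* Build the value to store in DMA[3].cnt: the number of units to
 * transfer, the chosen flags, and the enable bit that starts the copy. */
uint32_t dma_control(uint16_t count, uint32_t flags)
{
    return (uint32_t)count | flags | DMA_ENABLE;
}

/* On the device, assuming a header that maps DMA[3] and an array
 * called image_data (both hypothetical here):
 *
 *   DMA[3].src = image_data;
 *   DMA[3].dst = (void *)0x06000000;   // start of video RAM
 *   DMA[3].cnt = dma_control(1024, DMA_SOURCE_INCREMENT |
 *                                  DMA_DESTINATION_INCREMENT);
 */
```

For an array of 1024 16-bit values the count is 1024, since the default transfer unit is 16 bits.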
This is what I used for VBA, so I'm sorry if this does not answer your question (I'm kind of inexperienced with C as well...).
@Michael Yes, I mean the Visual Boy Advance.

Bitmap Image Output in C

I'm working on a small project in C and at one point, I need to write a picture with the content of an array to a file. This will have to run on an embedded system at some point, so additional libraries are not an option.
The code I have so far works (in a modified version) for RGB, but fails for 8bit Grayscale.
This is a stripped down version of the code so far: http://pastebin.com/U1UYAPuT
As I strongly suspect the header to be broken in some way, my question comes down to: What is a correct header for a BMP file for 8Bit Grayscale?
Your code would be a lot simpler if you ditched BMP and wrote images as PGM files instead. The format is a lot more portable and easy to work with in code. Both formats are uncompressed so data rates would be about the same. The only thing you would lose would be the ability to view the images natively on Windows systems -- whether or not this is a big deal depends on your requirements.
Here are some examples.
EDIT
At the very least, if you write your images in PGM and broken BMP, you can use imagemagick to reliably convert the PGM to a working BMP. Then compare the headers of the working and broken BMP images using a binary diff tool and fix your BMP writer, if required.
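For reference, a binary PGM ("P5") writer is only a few lines. This is a minimal sketch with names of my own choosing (pgm_header, write_pgm), and it assumes 8-bit samples stored top row first.

```c
#include <stdint.h>
#include <stdio.h>

/* Format the PGM text header: magic number, dimensions, maximum gray
 * value. Returns the header length, or -1 if buf is too small. */
int pgm_header(char *buf, size_t cap, unsigned width, unsigned height)
{
    int n = snprintf(buf, cap, "P5\n%u %u\n255\n", width, height);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Write an 8-bit grayscale image as binary PGM.
 * pixels holds width*height bytes, top row first. Returns 0 on success. */
int write_pgm(FILE *out, const uint8_t *pixels, unsigned width, unsigned height)
{
    char hdr[48];
    int hlen = pgm_header(hdr, sizeof hdr, width, height);
    size_t n = (size_t)width * height;

    if (hlen < 0)
        return -1;
    if (fwrite(hdr, 1, (size_t)hlen, out) != (size_t)hlen)
        return -1;
    if (fwrite(pixels, 1, n, out) != n)
        return -1;
    return 0;
}
```

Open the output file in binary mode ("wb") so that the pixel bytes are not altered on platforms that translate line endings.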

What C library allows scaling of ginormous images?

Consider the following file:
-rw-r--r-- 1 user user 470886479 2009-12-15 08:26 the_known_universe.png
How would you scale the image down to a reasonable resolution, using no more than 4GB of RAM?
For example:
$ convert -scale 7666x3833 the_known_universe.png
What C library would handle it?
Thank you!
I believe libpng has a stream interface. I think this can be used to read parts of the image at a time; depending on the image file you might be able to get the lines in order. You could then shrink each line (e.g. for 50% shrinking, shrink the line horizontally and discard every second line) and write to an output file.
Using libpng in C can take a fair amount of code, but the documentation guides you through it pretty well.
http://www.libpng.org/pub/png/libpng-1.2.5-manual.html#section-3.8
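The per-line shrinking step can be sketched independently of the decoder. The helper below is my own illustration, not libpng code: it halves one row by averaging neighbouring pixels and assumes 8-bit samples interleaved per pixel.

```c
#include <stdint.h>
#include <stddef.h>

/* Halve one row of 8-bit samples by averaging neighbouring pixels.
 * channels is the number of interleaved samples per pixel (3 for RGB).
 * Writes width / 2 pixels to dst and returns that pixel count. */
size_t halve_row(const uint8_t *src, size_t width, size_t channels, uint8_t *dst)
{
    size_t out_w = width / 2;

    for (size_t x = 0; x < out_w; x++) {
        for (size_t c = 0; c < channels; c++) {
            unsigned a = src[(2 * x) * channels + c];
            unsigned b = src[(2 * x + 1) * channels + c];
            dst[x * channels + c] = (uint8_t)((a + b + 1) / 2);  /* rounded mean */
        }
    }
    return out_w;
}
```

For a 50% reduction you would call this on every even-numbered row as it is decoded, write the result out, and skip the odd rows. Only one input row and one output row need to be in memory at a time.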
You could try making a 64 bit build of ImageMagick or seeing if there is one. My colleague wrote a blog with a super-simple png decoder (assumes you have zlib or equivalent) so you can kind of see the code you'd need to roll your own.
http://www.atalasoft.com/cs/blogs/stevehawley/archive/2010/02/23/libpng-you-re-doing-it-wrong.aspx
You would need to do the resample as you're reading it in.
I used cximage a few years ago. I think the latest version is at
http://www.xdp.it/cximage.htm
after moving off of CodeProject.
Edit: sorry, it's C++ not C.
You could use an image processing library that is intended to do complex operations on large (and small) images. One example is the IM imaging toolkit. It links well with C (but is implemented at least partly in C++) and has a good binding to Lua. From the Lua binding it should be easy to experiment.
libvips is comfortable with huge images. It's a streaming image processing library, so it can read from the source, process, and write to the destination simultaneously and in parallel. It's typically 3x to 5x faster than imagemagick and needs very little memory.
For example, with the largest PNG I have on my laptop (1.8 GB), I can downsize 10x with:
$ vipsheader huge.png
huge.png: 72000x72000 uchar, 3 bands, srgb, pngload
$ ls -l huge.png
-rw-r--r-- 1 john john 1785845477 Feb 19 09:39 huge.png
$ time vips resize huge.png x.png 0.1
real 1m35.279s
user 1m49.178s
sys 0m1.208s
peak RES 230mb
Not fast, but not too shabby either. PNG is a rather slow format; it would be much quicker with TIFF.
libvips is installable by most package managers (eg. homebrew on macOS, apt on Debian), there's a Windows binary, and it's free (LGPL). As well as the command-line, there are bindings for C, C++, Python, Ruby, Lua, node, PHP, and others.
Have you considered exploring pyramid based images? Imagine a pyramid where the image is divided up in multiple layers, each layer with a different resolution. Each layer is split up into tiles.
This way you can display a zoomed out version of the image, and also a zoomed in partial view of the image, without having to re-scale.
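To give a feel for the layout, here is a small sketch that lists the layers of such a pyramid. It assumes each layer is half the size of the previous one and that the layers stop once the image fits in a single tile. That is one common choice, not necessarily what FlashPix or any other specific format does. The struct and function names are made up.

```c
#include <stddef.h>

/* One layer of the pyramid: its pixel size and its tile grid. */
typedef struct {
    unsigned width, height;
    unsigned tiles_x, tiles_y;
} PyramidLevel;

/* Fill out[] with the layers of a pyramid, starting at full resolution
 * and halving each dimension (rounding up) until the image fits in one
 * tile. Returns the number of layers written. */
size_t pyramid_levels(unsigned width, unsigned height, unsigned tile,
                      PyramidLevel *out, size_t max_levels)
{
    size_t n = 0;

    if (tile == 0)
        return 0;
    while (n < max_levels) {
        out[n].width = width;
        out[n].height = height;
        out[n].tiles_x = (width + tile - 1) / tile;
        out[n].tiles_y = (height + tile - 1) / tile;
        n++;
        if (width <= tile && height <= tile)
            break;
        width = (width + 1) / 2;
        height = (height + 1) / 2;
    }
    return n;
}
```

A viewer picks the layer closest to the requested zoom and loads only the tiles that intersect the visible area.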
See the Wikipedia entry.
One of the original formats was FlashPix, which I wrote a renderer for.
I've also created a new format of a pyramid converter and renderer, which was used for a medical application. An actual scanner would produce 90GB+ scans of a slice of an organ for cancer research.
Getting the converter's algorithm to produce the pyramid images efficiently was actually pretty tricky. Believe it or not, it was Java based, and it performed much better than you'd think. It used multithreading. Benchmarking showed it was unlikely that a C version would do a whole lot better. This was 6ish years ago. The original renderer I did over 10 years ago.
You don't hear anything about pyramid based images anymore these days. But it's really the only efficient way to produce scaled images on demand without having to generate cached scaled versions.
Jpeg2000 may or may not have an optional pyramid feature as well.
I recall that ImageMagick's supported formats and conversions may include FlashPix.
Googling for "image pyramid" reveals some interesting results. Brings back some memories ;-)
If you can move it to a 64-bit OS, you can open it as a memory-mapped file or equivalent and use pretty much any library you want. It won't be fast, and you may need to increase the page/swap file (depending on the OS and what else you want to do with it), but in return you won't be limited to streaming libraries, so you'll be able to do more operations before reducing the resolution or slicing.

Reading tag data for Ogg/Flac files

I'm working on a C library that reads tag information from music files. I've already got ID3v2 taken care of, but I can't figure out how Ogg files are structured.
I opened a .ogg file in a hexeditor and I could find the tag data because that was all human readable. But everything from the beginning of the file to the tag data looked like garbage. How is this data encoded?
I don't need any help with the actual code, I just need help visualizing what an Ogg header looks like and what encoding it uses so that I can read it. I'd like to use a non-hacky approach to reading Ogg files.
I've been looking at the Flac format, which has been helpful.
The Flac file I'm looking at has about 350 bytes between the "fLaC" identifier and the human readable Comments section, and none of it is human readable in my hex editor, so I'm sure there has to be something important in there.
I'm using Linux, and I have no intention of porting to Windows or OS X. So if I need to use a glibc only function to convert the encoding, I'm fine with that.
The Ogg file format is documented here. There is a very nice graphical visualization as you requested with a detailed written description.
You may also want to look at libogg, which is an open source BSD-licensed library for reading and writing Ogg files.
As is described in the link you provided, the following metadata blocks can occur between the "fLaC" marker and the VORBIS_COMMENT metadata block.
STREAMINFO: This block has information about the whole stream, like sample rate, number of channels, total number of samples, etc. It must be present as the first metadata block in the stream. Other metadata blocks may follow, and ones that the decoder doesn't understand, it will skip.
APPLICATION: This block is for use by third-party applications. The only mandatory field is a 32-bit identifier. This ID is granted upon request to an application by the FLAC maintainers. The remainder of the block is defined by the registered application. Visit the registration page if you would like to register an ID for your application with FLAC.
PADDING: This block allows for an arbitrary amount of padding. The contents of a PADDING block have no meaning. This block is useful when it is known that metadata will be edited after encoding; the user can instruct the encoder to reserve a PADDING block of sufficient size so that when metadata is added, it will simply overwrite the padding (which is relatively quick) instead of having to insert it into the right place in the existing file (which would normally require rewriting the entire file).
SEEKTABLE: This is an optional block for storing seek points. It is possible to seek to any given sample in a FLAC stream without a seek table, but the delay can be unpredictable since the bitrate may vary widely within a stream. By adding seek points to a stream, this delay can be significantly reduced. Each seek point takes 18 bytes, so 1% resolution within a stream adds less than 2k. There can be only one SEEKTABLE in a stream, but the table can have any number of seek points. There is also a special 'placeholder' seekpoint which will be ignored by decoders but which can be used to reserve space for future seek point insertion.
Just after the above description, there's also the specification of the format of each of those blocks. The link also says
All numbers used in a FLAC bitstream are integers; there are no floating-point representations. All numbers are big-endian coded. All numbers are unsigned unless otherwise specified.
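Those roughly 350 bytes are therefore just a chain of metadata blocks. Each block starts with a 4-byte header: one byte holding a "last block" flag in the top bit and the block type in the low 7 bits, followed by a 24-bit big-endian length. Here is a minimal sketch of walking that chain in memory to find the VORBIS_COMMENT block (type 4); the function name is mine, and error handling is kept to the bare minimum.

```c
#include <stdint.h>
#include <stddef.h>

enum { FLAC_VORBIS_COMMENT = 4 };   /* metadata block type number */

/* Find a metadata block of the given type in a FLAC file held in memory.
 * On success returns the offset of the block's data and stores its
 * length in *block_len. Returns 0 if the block is missing or the data
 * is malformed. */
size_t flac_find_block(const uint8_t *buf, size_t len,
                       unsigned type, size_t *block_len)
{
    size_t pos = 4;

    if (len < 4 || buf[0] != 'f' || buf[1] != 'L' ||
        buf[2] != 'a' || buf[3] != 'C')
        return 0;

    while (pos + 4 <= len) {
        int last = buf[pos] >> 7;            /* 1 on the final metadata block */
        unsigned t = buf[pos] & 0x7F;        /* block type */
        size_t n = ((size_t)buf[pos + 1] << 16) |
                   ((size_t)buf[pos + 2] << 8) |
                   buf[pos + 3];             /* 24-bit big-endian length */
        pos += 4;
        if (n > len - pos)
            return 0;
        if (t == type) {
            *block_len = n;
            return pos;
        }
        if (last)
            break;
        pos += n;
    }
    return 0;
}
```

Note that the contents of the comment block itself use little-endian lengths, following the Vorbis comment specification, unlike the rest of the FLAC stream.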
So, what are you missing? You say
I'd like a non-hacky approach to reading Ogg files.
Why re-write a library to do that when they already exist?
