Where to find stereo test sequences in yuv - video-processing

Where can I find YUV stereo test sequences? I have to use it for disparity compensation so I'll be needing the seprate left and right views. Thank you!

There are at least 2 digital libraries, which contain uncompressed stereo video in YUV 4:2:0 with separate-left-right-views format:
http://sp.cs.tut.fi/mobile3dtv/stereo-video/
FTP-download only, video resolution for YUVs is ~480x270. These sequences are not copyright-free, so contacting license file is useful.
http://nma.web.nitech.ac.jp/fukushima/multiview/multiview.html
Here you need 'Multiview sequence 1' and 'Multiview sequence 2'
HTTP-download possible. These sequences are actually multiview, it means they contain not only left and right views, but more of them. Any 2 views from there will be fine as stereo video.
RMIT3dv library may also be useful, but it contains uncompressed YUV 4:2:2 video in .MOV format, so you will need to perform some format conversion to get .yuv files. But video quality on this site is way better.
Link to this library will be the first answer to 'RMIT3dv' request to google.

Related

How to filter specific frequencies from an Audio file in C

After searching on various search engines, and also here, there is very little information applicable to my situation.
Basically I want to make a program in C that does the following:
Open an Audio File (flac Mp3 and wav, to represent a bit of variety)
Filter and cut out a specific set of frequencies (for Example 4000-5200hz, the frequencies should be entered upon inquiry)
Save the new file (without the filtered frequencies) in the same format as the input file.
Things that would be of interest to me:
Open-Source examples of software that does the same or a similar thing, preferably in C
ANY literature on audio programming in C
Explanations on how the different formats are structured, any sources appreciated
Ps.: I apologise if some parts of the question can be easily googled, but I tried, and there wasn't anything that described this well in detail.
Thanks a lot!!
Answers:
FFmpeg does a lot of audio slicing and dicing, and it's written in pure C. It's pretty big, though, and might be difficult to digest in one go.
"Audio programming" is a bit vague. But from the rest of your question, it sounds like you want to open an audio file from disk, apply some transformations to the audio, and write the data to a new file. (Other areas under the "audio programming" umbrella would include accessing platform-specific APIs to read from a microphone and write audio to an output device).
Broad topic again, but we'll start simple.
I suggest getting (or generating) a .WAV file to start with. WAV files are probably the simplest audio files to read and write manually. Here is a page that describes what you need to know about the WAV format.
Pulse code modulation (PCM) is the simplest audio format to work with since you don't need to worry about decompressing it first. Here is a page (that I wrote) describing different PCM formats.
As for filtering and cutting different frequencies, I think what you're looking for would be low-pass, high-pass, or band-pass filters.
I hope that helps you get started. Ask more questions here on Stack Overflow as needed.

How to play H264 stream with SilverLight?

I have an H264 stream (IIS - smooth streaming) that I would like to play with SilverLight. Apparently SilverLight can do it, but how?
Note: the VC-1 stream can be played by the SilverLight, but H264 not.Also, I can provide a stream and any additional information required. H264 encoder is the one in Media Foundation (MFT). Same goes for the VC-1 that works (although is it impossible to create equal chunks for smooth streaming because forcing key-frame insertion makes video jerky.EDIT: MPEG2VIDEOINFO values for H264:
Just a guess. Based on your question 18009152. I am guessing you are encoding h.264 using the annexb bitstream format. According to comments, you can not tell the encoder to use AVCC format. Therefore, you must perform this conversion manually (Annex B WILL NOT work in an ISO container). You can do this by looking for start codes in your AVC stream. A start code is 3 or 4 bytes (0x000001, 0x00000001). You get the length of the NALU by locating the next start code, or the end of the stream. Strip the start code (throw it away) and in its place write the size of the NALU in a 32bit integer big endian. Then write this data to the container. Just to be clear, this is performed on the video frames that come out of the encoder. The extra data is a separate step that appears you have mostly figure out (except for the NALUSizeLength). Because we uses a 4 byte integer to write the NALU sizes, you MUST set NALUSizeLength to 4.
Silverlight 3 can play H264 files. Use MediaStreamSource for this.
Here is the interface description: http://msdn.microsoft.com/en-us/library/system.windows.media.mediastreamsource(v=vs.95).aspx
Also, this blog entry is related to H264 playing sing Silverlight 3: http://nonsenseinbasic.blogspot.ru/2011/05/silverlights-mediastreamsource-some.html
It will help you with other issues that may arise.

Reading YUV images in C

How to read any yuv image?
How can the dimensions of an YUV image be passed for reading to a buffer?
Usually, when people talk about YUV they talk about YUV 4:2:0. Your reference to any YUV image is misleading, because there are a number of different formats, and each is handled differently. For example, raw YUV 4:2:0 (by convention, files with an extension .yuv) doesn't contain any dimension data; whereas y4m files typically do. So you really need to know what sort of image you're dealing with before you start reading it. For YUV 4:2:0, you just open the file and read the luminance (Y') and chrominance (CbCr) components in that order, keeping in mind the chrominance components have been decimated (quarter size of the luminance components). After that you typically convert Y'CbCr to RGB so you can display the image.
From what I've mentioned above, you simply can't determine the dimensions of any YUV image. Often, the dimensions are simply not in the file. You have to know them up front -- this is why most sites listing YUV files for download also list their frame dimensions (and for video, the number of frames). Most YUV files are either CIF (352x288) or QCIF (176x144), although there are some files digitized from analog video that are in the slightly different SIF format. Read about the different formats here.

audio to 8-bit text sample conversion

I have an interesting question today.
I need to convert some pokemon audio files to a list of 8-bit samples (0-255 values). I am writing an assembly routine on the MC6800 chipset that will require these sounds to be played. I plan on including an array with the 8-bit samples that the program will loop through when a function is called.
Does anyone know a way to convert audio files (wav/mp3) into a list of comma separated 8-bit text sample values? Or anything of this relative method?
Thank you so much in advance!
You can use the command-line "sox" tool or the Audacity audio editor to convert the file to a raw, unsigned 8-bit mono audio file.
In Audacity 1.3 or higher, open the audio then select Export, choose "Wave, AIFF, and other uncompressed types" as the format, then click Options... - then choose "Other..." for the Format, "RAW" for the Header, and Signed 8-bit PCM as the encoding. (Sorry, unsigned isn't available.)
From the command line, try sox with -c 1 for 1 channel, -t raw for no header, -u for unsigned linear, and -1 for 1 byte per sample.
Then you can use a tool like "hexdump" to dump out the bytes of the file as numbers and paste them into your code.
If sox doesn't have it, you will have to use it to generate raw (headerless) files and convert the raw files to comma-separated yourself.
EDIT: sox has "Raw textual data" as one of its formats, from the web page. You can make it convert your sound files to unsigned 8-bit linear samples in a first pass and then probably get exactly the output you want using this option for output.
For .wav it is a very simple process. You can find the .wav specification easily with a google search. It comprises a header then simply raw samples. You should read the header first, then loop through all the samples. Usually they are 16 bit samples, so you want to normalize them from the range -32768 to 32767 to your 0-255 range. I suggest simple scaling at first. If that's not successful maybe find the actual min and max amongst the samples and adjust your scale accordingly.
Well a lot depends on your audio format. The wave format, for example, consists of uncompressed interleaved PCM data.
ie for an 8-bit stereo file each sample will be arranged as follows.
[Left Sample 1][Right Sample 1][Left Sample 2][Right Sample2]...[Left Sample n][Right sample n].
ie each 8 bit stereo sample is stored in 2 bytes. 1 for the left channel and 1 for the right. This is the data format your sound hardware will most likely require.
A 16 or 24-bit audio file will work in each way but the left and right samples will be 2 or 3 bytes each, respectively.
Obviously a wave file has a load of extyra information in it. It follows the RIFF format. You can find info on it and the "chunks" wave files use at places such as www.wotsit.org.
To decompress an MP3 is more complicated. You are best off getting hold of a decompressor and running it on the MP3 encoded audio. IT will spit out PCM data as above from the other side.

How to use MediaStreamSource to play h264 frames coming from a matroska file?

I'm trying to render frames coming from an mkv h264 file in silverlight 3 by using the MediaStreamSource.
Parsing the mkv file is fine, but I'm struggling with the expected values for CodecPrivateData in SL, which has to be a string, while the PrivateData info from mkv is a binary element.
Also, I'm not sure about in which form the frames should be given to SL (ie, the way they are stored in mkv / mp4, or transcoded as NALU)
Would anyone have any info on this?
After similar problems of my own and much head-scratching, I am able to answer this question.
In ReportOpenMediaCompleted(), when setting up your video stream description, you can ignore the CodecPrivateData attribute string, despite what the documentation says. It's not required. (assuming your stream of NAL units includes SPS and PPS units)
You should send one NAL unit back to the MediaElement for each GetSampleAsync() request.
This includes non-picture NAL units, e.g. SPS / PPS units.
When you send your NAL units, ensure there are 3-byte start codes (0x00 0x00 0x01) at the beginning of each one. (This is similar to 'Annex B' format, but not quite the same thing)
In ReportGetSampleCompleted(), set the value of 'Offset' equal to the beginning of the NAL start code, not the actual data. (in most cases this will be zero, assuming you use a fresh stream per NAL unit)
I have blogged a little about the experience here and hope to blog more.
According to the documentation the Codec private data should be set to 00000001 + sps + 00000001 + pps. However the documentation is wrong the value of CodecPrivateData seems to be completely ignored. Instead you need to pass the SPS and PPS NALS (with an annex b header of course) as the first and second result of GetSampleAsync.
For regular media samples normal 4 byte annex b headers headers work just fine
The CodecPrivateData is the contents of the 'avcC' atom which is a child of the 'stsd' atom in an MP4 file. You have to convert the binary data to a string. It will look something like this: "014D401FFFE10017674D401F925402802DD0800000030080000018478C195001000468EE32C8"
You also have to replace the mkv/mp4 lengths to NALU. I've written a little about this (to get Smooth Streaming to work for H.264 files).
Regards,
See: Smooth Streaming H264

Resources