I'm looking for the most common GOP length used by IPTV systems, or by videos in general.
I did not find research on IPTV systems specifically, but I found an article that gives an idea of GOP length on LTE networks: "GOP Length Effect Analysis on H.264/AVC Video Streaming Transmission Quality over LTE Network", which varies the length from 2 to 35. The optimal length was between 5 and 8.
For me (speaking from personal experience, since I work at a telco operator) the most common in IPTV systems is 12,2, which means 12 frames between I-frames (intra frames) and 2 B-frames (bidirectional frames) between P-frames.
The most common GOP length is 10; the rest depends on your type of streams, video codecs, etc.
I have an H264 stream (IIS smooth streaming) that I would like to play with Silverlight. Apparently Silverlight can do it, but how?
Note: the VC-1 stream can be played by Silverlight, but the H264 one cannot. I can provide a stream and any additional information required. The H264 encoder is the one in Media Foundation (MFT). The same goes for the VC-1 that works (although it is impossible to create equal chunks for smooth streaming, because forcing key-frame insertion makes the video jerky). EDIT: MPEG2VIDEOINFO values for H264:
Just a guess, based on your question 18009152. I am guessing you are encoding H.264 using the Annex B bitstream format. According to the comments, you cannot tell the encoder to use AVCC format, so you must perform this conversion manually (Annex B WILL NOT work in an ISO container). You can do this by looking for start codes in your AVC stream. A start code is 3 or 4 bytes (0x000001, 0x00000001). You get the length of a NALU by locating the next start code, or the end of the stream. Strip the start code (throw it away) and in its place write the size of the NALU as a 32-bit big-endian integer. Then write this data to the container. Just to be clear, this is performed on the video frames that come out of the encoder. The extradata is a separate step that it appears you have mostly figured out (except for the NALUSizeLength). Because we used a 4-byte integer to write the NALU sizes, you MUST set NALUSizeLength to 4.
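To make that concrete, here is a minimal sketch in C (the helper names are mine, not from any library; it assumes the whole access unit is in memory):

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* Find the next 3- or 4-byte start code at or after `pos`.
       Returns its offset (or in_len if none) and its length in *sc_len. */
    static size_t find_start_code(const uint8_t *in, size_t in_len,
                                  size_t pos, size_t *sc_len)
    {
        for (size_t i = pos; i + 3 <= in_len; i++) {
            if (in[i] == 0 && in[i + 1] == 0) {
                if (in[i + 2] == 1) { *sc_len = 3; return i; }
                if (i + 4 <= in_len && in[i + 2] == 0 && in[i + 3] == 1) {
                    *sc_len = 4; return i;
                }
            }
        }
        *sc_len = 0;
        return in_len;
    }

    /* Replace each start code with the NALU size as a 32-bit big-endian
       integer. Returns bytes written, or 0 if `out` is too small (each
       3-byte start code grows by one byte, so size the buffer accordingly). */
    size_t annexb_to_avcc(const uint8_t *in, size_t in_len,
                          uint8_t *out, size_t out_cap)
    {
        size_t sc_len, out_pos = 0;
        size_t pos = find_start_code(in, in_len, 0, &sc_len);

        while (pos < in_len) {
            size_t nalu_start = pos + sc_len;
            size_t next_sc;
            size_t nalu_end = find_start_code(in, in_len, nalu_start, &next_sc);
            size_t nalu_len = nalu_end - nalu_start;

            if (out_pos + 4 + nalu_len > out_cap)
                return 0;
            out[out_pos++] = (uint8_t)(nalu_len >> 24);  /* big endian */
            out[out_pos++] = (uint8_t)(nalu_len >> 16);
            out[out_pos++] = (uint8_t)(nalu_len >> 8);
            out[out_pos++] = (uint8_t)(nalu_len);
            memcpy(out + out_pos, in + nalu_start, nalu_len);
            out_pos += nalu_len;

            pos = nalu_end;
            sc_len = next_sc;
        }
        return out_pos;
    }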
Silverlight 3 can play H264 files. Use MediaStreamSource for this.
Here is the interface description: http://msdn.microsoft.com/en-us/library/system.windows.media.mediastreamsource(v=vs.95).aspx
Also, this blog entry is related to H264 playback using Silverlight 3: http://nonsenseinbasic.blogspot.ru/2011/05/silverlights-mediastreamsource-some.html
It will help you with other issues that may arise.
I have an array of 10345 bytes. I want to compress the array and then decompress it; kindly suggest a compression algorithm that can reduce the size of the array. I am using the C language, and the array is of type unsigned char.
Rephrased: Can someone suggest a general-purpose compression algorithm (or library) for C/C++?
zlib
Lossless Compression Algorithms
This post's a community wiki. I don't want any points for this -- I've already voted to close the question.
The number of bytes to compress has very little to do with the choice of compression algorithm, although it does affect the implementation. For example, when you have fewer than 2^15 bytes to compress, if you are using zlib you will want to specify a windowBits value of less than 15. The windowBits parameter of zlib's deflateInit2() (one of the two tuning parameters, the other being the compression level) controls the depth of the "look-back" dictionary. If your input is shorter than 16 KB, then a 32 KB look-back dictionary will never half-fill; in that case, use one less bit of pointer into the look-back window for a 1/15th edge on the compression compared to setting zlib to "max".
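To illustrate, here is a hedged sketch using zlib's deflateInit2() (the helper name is mine, and the buffer management is simplified):

    #include <string.h>
    #include <zlib.h>

    /* Deflate a short buffer with a reduced look-back window.
       windowBits 14 = a 16 KiB window, plenty for inputs under 2^15 bytes.
       On input *out_len is the capacity of `out`; on success it becomes the
       compressed size. Returns 0 on success, -1 on error. */
    int compress_small(const unsigned char *in, size_t in_len,
                       unsigned char *out, size_t *out_len)
    {
        z_stream zs;
        memset(&zs, 0, sizeof zs);
        if (deflateInit2(&zs, Z_BEST_COMPRESSION, Z_DEFLATED,
                         14 /* windowBits */, 8 /* memLevel */,
                         Z_DEFAULT_STRATEGY) != Z_OK)
            return -1;

        zs.next_in   = (Bytef *)in;
        zs.avail_in  = (uInt)in_len;
        zs.next_out  = out;
        zs.avail_out = (uInt)*out_len;

        int rc = deflate(&zs, Z_FINISH);  /* one-shot compression */
        *out_len = zs.total_out;
        deflateEnd(&zs);
        return rc == Z_STREAM_END ? 0 : -1;
    }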
The content of the data is what matters. If you are sending images with mostly background, then you might want Run Length Encoding (used by Windows .BMP, for example); a toy sketch appears after this list of cases.
If you are sending mostly English text, then you will want something like zlib, which implements Huffman coding plus LZ77-style look-back dictionary compression.
If your data has been encrypted, then attempting to compress it will not succeed: well-encrypted data is statistically indistinguishable from random data.
If your data is a particular type of signal, and you can tolerate some loss of detail, then you may want to transform it into frequency space and send only the principal components. (e.g., JPEG, MP3)
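To make the run-length idea mentioned above concrete, here is a toy encoder in C (a sketch of the general idea only, not the actual BMP RLE8 format):

    #include <stddef.h>

    /* Each run becomes a (count, value) byte pair. Returns bytes written,
       or 0 if `out` is too small (worst case is 2 * in_len). */
    size_t rle_encode(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
    {
        size_t o = 0;
        for (size_t i = 0; i < in_len; ) {
            unsigned char v = in[i];
            size_t run = 1;
            while (i + run < in_len && in[i + run] == v && run < 255)
                run++;
            if (o + 2 > out_cap)
                return 0;
            out[o++] = (unsigned char)run;  /* run length, 1..255 */
            out[o++] = v;                   /* the repeated byte  */
            i += run;
        }
        return o;
    }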
I'm trying to grok this: Apple talks about "packets" in audio files, and there is a fancy function called AudioFileReadPackets which takes a lot of arguments. One of them specifies the "start packet", and another the number of packets you want to read.
So I imagine an audio file to look like this internally: it's made up of a lot of packets. If the audio file has a variable bit rate format, then every packet may have a different size. If the file has a constant bit rate format, then every packet is the same size. So an audio file is like a truck full of boxes, and every box contains some interesting stuff.
Is that correct? Does it apply to any kind of file? Is this what files actually look like?
The question (even with the "especially audio files" qualification) is far too broad; different file formats are, well, different!
So to answer the question you will first have to specify a particular file type; then the answer will invariably be to look at its specification. Proprietary formats may not have a publicly available specification.
Specifications for many files (official and reverse engineered) can be found at the brilliant Wotsit's Format site.
AAC used by Apple iTunes and others is defined by ISO/IEC 13818-7:2006. The document will cost you 252 Swiss Francs (about US$233)! You'd have to be really interested (commercially) to pay that rather than use an existing AAC Codec.
"Packet" is a term commonly used in data transmission, so may be more applicable to audio streaming than audio files, where a "frame" may be more appropriate, or for data files in general a "record", but the terminology is flexible because it means whatever the person that wrote it thought it meant! If enough people misuse a term, it essentially becomes redefined (or multiply defined) to mean that, so I would not get too hung up on that. The author was do doubt using it to define a unit that has a defined format within a file that has multiple such units repeated sequentially.
"Packet" looks to me like Apple-specific terminology. I just did a lot of reading and coding to process WAV and MP3 files and I don't believe I saw the term "packet" once.
Files contain whatever the application that created them chose to place in them. Files are essentially a sequence of bytes; any further organisation is a semantic distinction made by the program that created them. It is a mistake to think that all files share the same structure.
That said, certain data storage problems are similar enough to be solved in similar ways, and patterns start to emerge. Splitting data into records or packets is an example of that.
That's pretty much what audio files look like: a series of chunks of data, or frames. AudioFileReadPacketData and AudioFileReadPackets shield you from the details of, for instance, how big a frame might be in bytes (because you might be reading from a WAV file, which has a different structure to an MP3 file, or your MP3 file uses a variable bit rate).
The concept of frames doesn't apply in general to any file, but then you wouldn't be using the Audio File Services API to access just any old file.
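For instance, here is a minimal sketch of inspecting packet sizes with the (C-based) Audio File Services API; error handling is omitted and the buffer sizes are arbitrary:

    #include <stdio.h>
    #include <AudioToolbox/AudioToolbox.h>

    /* Print the sizes of the first few packets of an audio file. For a
       VBR file the sizes come back in the packet descriptions; for a CBR
       file every packet is the same size. */
    void dump_packet_sizes(CFURLRef url)
    {
        AudioFileID file;
        if (AudioFileOpenURL(url, kAudioFileReadPermission, 0, &file) != noErr)
            return;

        UInt32 numPackets = 8;            /* read 8 packets...   */
        AudioStreamPacketDescription descs[8];
        char buffer[64 * 1024];
        UInt32 numBytes = sizeof buffer;  /* ...into this buffer */

        if (AudioFileReadPacketData(file, false, &numBytes, descs,
                                    0 /* starting packet */, &numPackets,
                                    buffer) == noErr) {
            for (UInt32 i = 0; i < numPackets; i++)
                printf("packet %u: %u bytes\n", (unsigned)i,
                       (unsigned)descs[i].mDataByteSize);
        }
        AudioFileClose(file);
    }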
For MP3 (and MP1, MP2) the file consists of frames. And yes, your understanding is correct: in VBR files the packets have different sizes. In WAV files the packets all have the same length, if memory serves (I wrote a decoder/player 11 years ago).
I'm trying to render frames coming from an MKV H264 file in Silverlight 3 by using the MediaStreamSource.
Parsing the MKV file is fine, but I'm struggling with the expected value for CodecPrivateData in Silverlight, which has to be a string, while the private data info from MKV is a binary element.
Also, I'm not sure in which form the frames should be given to Silverlight (i.e., the way they are stored in MKV/MP4, or converted to start-code-prefixed NALUs).
Would anyone have any info on this?
After similar problems of my own and much head-scratching, I am able to answer this question.
In ReportOpenMediaCompleted(), when setting up your video stream description, you can ignore the CodecPrivateData attribute string, despite what the documentation says. It's not required. (assuming your stream of NAL units includes SPS and PPS units)
You should send one NAL unit back to the MediaElement for each GetSampleAsync() request.
This includes non-picture NAL units, e.g. SPS / PPS units.
When you send your NAL units, ensure there are 3-byte start codes (0x00 0x00 0x01) at the beginning of each one. (This is similar to 'Annex B' format, but not quite the same thing)
In ReportGetSampleCompleted(), set the value of 'Offset' equal to the beginning of the NAL start code, not the actual data. (in most cases this will be zero, assuming you use a fresh stream per NAL unit)
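A minimal sketch of that framing in C (frame_nalu is my own name, not part of any API):

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    /* Copy one NAL unit into a sample buffer with a 3-byte start code
       (0x00 0x00 0x01) in front, so the MediaElement gets one
       start-code-prefixed NALU per GetSampleAsync() request.
       Returns the total sample size, or 0 if the buffer is too small. */
    size_t frame_nalu(const uint8_t *nalu, size_t nalu_len,
                      uint8_t *sample, size_t sample_cap)
    {
        static const uint8_t start_code[3] = { 0x00, 0x00, 0x01 };
        if (nalu_len + 3 > sample_cap)
            return 0;
        memcpy(sample, start_code, 3);       /* start code at offset 0 */
        memcpy(sample + 3, nalu, nalu_len);  /* NALU payload follows   */
        return nalu_len + 3;
    }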
I have blogged a little about the experience here and hope to blog more.
According to the documentation, the codec private data should be set to 00000001 + sps + 00000001 + pps. However, the documentation is wrong: the value of CodecPrivateData seems to be completely ignored. Instead you need to pass the SPS and PPS NALUs (with an Annex B header, of course) as the first and second results of GetSampleAsync.
For regular media samples, normal 4-byte Annex B headers work just fine.
The CodecPrivateData is the contents of the 'avcC' atom which is a child of the 'stsd' atom in an MP4 file. You have to convert the binary data to a string. It will look something like this: "014D401FFFE10017674D401F925402802DD0800000030080000018478C195001000468EE32C8"
You also have to replace the MKV/MP4 length prefixes on the NALUs. I've written a little about this (to get Smooth Streaming to work for H.264 files).
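For the binary-to-string conversion mentioned above, something like this would do (a sketch; the function name is mine):

    #include <stdio.h>
    #include <stddef.h>

    /* Turn the binary 'avcC' blob into the uppercase hex string that
       CodecPrivateData expects. `out` must hold 2 * len + 1 bytes. */
    void avcc_to_hex_string(const unsigned char *avcc, size_t len, char *out)
    {
        for (size_t i = 0; i < len; i++)
            sprintf(out + 2 * i, "%02X", avcc[i]);  /* two hex digits per byte */
        out[2 * len] = '\0';
    }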
Regards,
See: Smooth Streaming H264
EDIT: I want to use libsox to programmatically convert a WAV file's sample rate, audio format, channels, etc.
In the libsox man page there are a bunch of functions I could use, but I'm clueless about what to do. Can anyone give me the rough steps for doing this?
Help?
Can anyone please explain this?
The function sox_write writes len samples from buf using the format handler specified by ft. Data in buf must be 32-bit signed samples and will be converted during the write process. The value of len is specified in total samples. If its value is not evenly divisible by the number of channels, undefined behavior will occur.
I'd recommend a combination of libsndfile and libsamplerate
http://www.mega-nerd.com/SRC
SRC provides a small set of converters to allow quality to be traded off against computation cost. The current best converter provides a signal-to-noise ratio of 145 dB, with a -3 dB passband extending from DC to 96% of the theoretical best bandwidth for a given pair of input and output sample rates.
http://www.mega-nerd.com/libsndfile/
- Ability to read and write a large number of file formats.
- A simple, elegant and easy to use Applications Programming Interface.
- Usable on Unix, Win32, MacOS and others.
- On the fly format conversion, including endian-ness swapping, type conversion and bitwidth scaling.
- Optional normalisation when reading floating point data from files containing integer data.
- Ability to open files in read/write mode.
- The ability to write the file header without closing the file (only on files open for write or read/write).
- Ability to query the library about all supported formats and retrieve text strings describing each format.
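Putting the two together, a hedged sketch of a sample-rate conversion (error handling is minimal, and hard-coding 48 kHz is just for illustration):

    #include <stdlib.h>
    #include <sndfile.h>
    #include <samplerate.h>

    /* Read a file with libsndfile, resample to 48 kHz with libsamplerate's
       one-shot src_simple(), and write the result. */
    int resample_to_48k(const char *in_path, const char *out_path)
    {
        SF_INFO in_info = { 0 };
        SNDFILE *in = sf_open(in_path, SFM_READ, &in_info);
        if (!in)
            return -1;

        long n_items = (long)in_info.frames * in_info.channels;
        float *in_buf = malloc(n_items * sizeof *in_buf);
        sf_read_float(in, in_buf, n_items);

        double ratio = 48000.0 / in_info.samplerate;
        long max_out = (long)(in_info.frames * ratio) + 1;
        float *out_buf = malloc(max_out * in_info.channels * sizeof *out_buf);

        SRC_DATA d = { 0 };
        d.data_in       = in_buf;
        d.input_frames  = in_info.frames;   /* frames, not samples */
        d.data_out      = out_buf;
        d.output_frames = max_out;
        d.src_ratio     = ratio;
        if (src_simple(&d, SRC_SINC_BEST_QUALITY, in_info.channels) != 0)
            return -1;

        SF_INFO out_info = in_info;
        out_info.samplerate = 48000;
        SNDFILE *out = sf_open(out_path, SFM_WRITE, &out_info);
        sf_writef_float(out, out_buf, d.output_frames_gen);

        sf_close(out);
        sf_close(in);
        free(in_buf);
        free(out_buf);
        return 0;
    }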
Well, I guess your question has something to do with that last sentence. If you have an interleaved buffer, the number of samples in the buffer has to be divisible by the number of channels, because len divided by the channel count is the number of per-channel samples (frames) you will write. For example, let's say you have L and R channels; your data will be laid out in the buffer like this:
[0] 1st sample - L
[1] 1st sample - R
[2] 2nd sample - L
[3] 2nd sample - R
...
[n-2] (n/2)-th sample - L
[n-1] (n/2)-th sample - R
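In code, interleaving and writing that buffer might look like this (a sketch; it assumes `ft` is an already-open 2-channel output handle):

    #include <stdlib.h>
    #include <sox.h>

    /* Interleave separate L and R buffers and hand them to sox_write.
       len is in total samples, so it is n_frames * 2 here, which is
       always evenly divisible by the 2 channels. */
    size_t write_stereo(sox_format_t *ft, const sox_sample_t *left,
                        const sox_sample_t *right, size_t n_frames)
    {
        sox_sample_t *buf = malloc(n_frames * 2 * sizeof *buf);
        if (!buf)
            return 0;
        for (size_t i = 0; i < n_frames; i++) {
            buf[2 * i]     = left[i];   /* even index: left channel  */
            buf[2 * i + 1] = right[i];  /* odd index:  right channel */
        }
        size_t written = sox_write(ft, buf, n_frames * 2);
        free(buf);
        return written;
    }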
Hope it helps.