There is a very good chance that I am going down a pointless path on this, so I apologize if this is a waste of time. I have been trying to write uncompressed video to an FLV file, and I am not sure whether it is possible.
According to Wikipedia, a valid video encoding option is 0, which indicates an "RGB" video encoding: https://en.wikipedia.org/wiki/Flash_Video#Packets. However, I don't see any mention of this Codec ID option in Adobe's documentation; neither "Video File Format Specification Version 10" nor "Adobe Flash Video File Format Specification Version 10.1".
I proceeded under the assumption that a 0/RGB Codec ID is allowed. I hard-coded an array of unsigned char in C and used fwrite to write the following Double/Number metadata to a new, binary FLV file (which admittedly, I am assuming I wrote correctly):
duration: 4 (seconds)
width: 16 (pixels)
height: 16 (pixels)
videodatarate: 6 (Kbps)
framerate: 1 (fps)
videocodecid: 0
filesize: 3323 (bytes)
I then added 4 VIDEODATA tags, 1 for each RGB frame I was hoping to write. Their timestamps are 0, 1000, 2000, and 3000 (milliseconds). All four of them have a 769-byte payload: the first byte to specify it is a keyframe with a Codec ID of 0, and the remaining 768 are to represent a 16x16x3 (RGB) image. I wrote 255/0xFF for all values in hopes of seeing a small, white screen appear for 4 seconds.
When that did not play correctly in VLC Media Player, as I feared, I tried using RGBA colors for each frame. I also changed the videodatarate and filesize metadata to Number values 8 (Kbps) and 4347 (bytes) respectively.
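As a sanity check on the numbers above, here is a small Python sketch that recomputes the file sizes and data rates from the FLV layout: a 9-byte file header, a 4-byte PreviousTagSize field after the header and after every tag, and an 11-byte header on each tag. The 174-byte size of the metadata tag is not stated anywhere in the question; it is inferred as the value that makes both reported file sizes come out. The function names are purely illustrative.

```python
FLV_HEADER_BYTES = 9      # "FLV" signature, version, flags, header length
PREV_TAG_SIZE_BYTES = 4   # PreviousTagSize field after the header and each tag
TAG_HEADER_BYTES = 11     # type, data size, timestamp, stream id

# Assumption: total size of the onMetaData tag (header + body + trailing
# PreviousTagSize), inferred from the file sizes reported in the question.
METADATA_TAG_BYTES = 174


def video_tag_bytes(width, height, bytes_per_pixel):
    """Size of one VIDEODATA tag: 1 frame-type/codec byte plus raw pixels."""
    payload = 1 + width * height * bytes_per_pixel
    return TAG_HEADER_BYTES + payload + PREV_TAG_SIZE_BYTES


def flv_file_bytes(frames, width, height, bytes_per_pixel):
    """Total file size for a metadata tag followed by `frames` video tags."""
    return (FLV_HEADER_BYTES + PREV_TAG_SIZE_BYTES + METADATA_TAG_BYTES
            + frames * video_tag_bytes(width, height, bytes_per_pixel))


def video_data_rate_kbps(width, height, bytes_per_pixel, fps):
    """Raw pixel data rate in kilobits per second, truncated to an integer."""
    return int(width * height * bytes_per_pixel * 8 * fps / 1000)


if __name__ == "__main__":
    print(flv_file_bytes(4, 16, 16, 3), video_data_rate_kbps(16, 16, 3, 1))
    print(flv_file_bytes(4, 16, 16, 4), video_data_rate_kbps(16, 16, 4, 1))
```

With these inputs the sketch gives 3323 bytes at 6 Kbps for RGB and 4347 bytes at 8 Kbps for RGBA, which matches the metadata values in the question. So the container arithmetic appears consistent, and the problem is more likely the codec ID itself.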
Unfortunately, this did not play in VLC Media Player either. I was wondering if anyone knew for certain whether uncompressed video in an FLV file is possible? If so, I was curious what format the video data should be in (RGB, RGBA, multiple VIDEODATA tags, just one VIDEODATA tag, etc.)?
My C code is mostly one, giant array of unsigned char, but if anyone would like to see it, I can try adding it. Any advice is greatly appreciated.
Thank you,
Mitchell A
As per SirDarius, "the video encoding types listed in the Wikipedia page do not come from an official source. I would not recommend relying on those." This makes sense given that the FLV Format documentation from Adobe itself makes no mention of an uncompressed, RGB option for video encoding.
I was holding out hope that Wikipedia editors and other people knew of some undocumented easter egg in the FLV format, but I'm now convinced that's not the case.
"...The FLV Format documentation from Adobe itself makes no mention of an uncompressed, RGB option for video encoding."
For RGB (raw bitmap data) you must use the Screen 1 codec (id=3).
Strangely, it's hidden in the SWF Format documentation (not the FLV Format docs).
See Chapter 14 (page 204) which is the Video section...
You want specifically page 208 for the Screen Video codec to be explained.
Check this example code (AS3) of encoding RGB into Screen Video.
Apply the logic, especially function videoData(), which could be adjusted to read pixels uints (via some getPixel type call) or just read from an Array.
Example:
for (var x2:int = 0; x2 < xLimit; x2++)
{
    var px:int = (x1 * blockWidth) + x2;
    var py:int = frameHeight - ((y1 * blockHeight) + y2); // (FLV stores the image from bottom to top)
    var p:uint = YOUR_INPUT_BITMAP.getPixel(px, py); // sample a pixel's RGB (3-byte unsigned int)

    //# IF reading from the pixel's uint value
    block.writeByte( p & 0xff ); // blue
    block.writeByte( p >> 8 & 0xff ); // green
    block.writeByte( p >> 16 ); // red

    //# ELSE IF reading from an Array of R-G-B values (FLV writes in BGR format);
    //# x is the index of this pixel's red value in the array
    block.writeByte( myRGB_Array[x+2] ); // blue
    block.writeByte( myRGB_Array[x+1] ); // green
    block.writeByte( myRGB_Array[x] ); // red
}
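For readers not using AS3, here is a rough Python sketch of the same idea for the simplest case, an image that fits in one block (such as the 16x16 frame from the question). It follows my reading of the Screen Video section of the SWF specification: a frame-type/codec byte, two 16-bit fields that each pack a 4-bit block-size code with a 12-bit image dimension, then one zlib-compressed block of BGR pixels stored bottom row first, prefixed by a 16-bit size. The function name and the single-block restriction are my own simplifications, so treat this as an illustration rather than a complete encoder.

```python
import struct
import zlib


def screen_video_payload(width, height, rgb_rows, block=16):
    """Build one Screen Video keyframe payload (VIDEODATA body).

    rgb_rows is a list of rows (top to bottom), each a list of (r, g, b).
    Simplification: the whole image must fit inside a single block.
    """
    if width > block or height > block or block % 16:
        raise ValueError("this sketch only handles a single block")

    # High nibble: frame type 1 (keyframe). Low nibble: codec id 3.
    out = bytearray([(1 << 4) | 3])

    # Block size is stored as (size / 16) - 1 in the top 4 bits,
    # followed by the image dimension in the low 12 bits.
    block_code = block // 16 - 1
    out += struct.pack(">H", (block_code << 12) | width)
    out += struct.pack(">H", (block_code << 12) | height)

    # Pixels are written bottom row first, in blue-green-red order.
    raw = bytearray()
    for row in reversed(rgb_rows):
        for r, g, b in row:
            raw += bytes((b, g, r))

    packed = zlib.compress(bytes(raw))
    out += struct.pack(">H", len(packed))  # compressed block size
    out += packed
    return bytes(out)


if __name__ == "__main__":
    white = [[(255, 255, 255)] * 16 for _ in range(16)]
    payload = screen_video_payload(16, 16, white)
    print(len(payload), hex(payload[0]))
```

The returned bytes would go where the question's 769-byte raw payload went, with the tag's data size and the videocodecid metadata (3 instead of 0) updated to match.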
I'm using the H.264 library to compress a video frame by frame. It works, I can replay it back locally without any issue.
However, I need to send that video over the LAN and that LAN is rather busy already so I need to limit the size of each frame to a maximum of about 250Kb.
I use the following code to setup the parameters, but changing the bit rate values does not seem to have any effect on what the library does with the input frames:
x264_param_t param = {};
if(x264_param_default_preset(&param, "faster", nullptr) < 0)
{
    return -1;
}
param.i_csp = X264_CSP_I420;
param.i_width = 3840;
param.i_height = 2160;
param.i_keyint_max = static_cast<int>(f_frame_header.f_fps);
param.i_threads = X264_THREADS_AUTO;
param.b_vfr_input = 0;
param.b_repeat_headers = 1;
param.b_annexb = 1;
// the following three parameters are the ones I tried to change with no results
param.rc.i_bitrate = 100000;
param.rc.i_vbv_max_bitrate = 100000;
param.rc.i_vbv_buffer_size = 125000;
if(x264_param_apply_profile(&param, "high") < 0)
{
    return -1;
}
...enter loop reading frames and compressing them...
Changing the i_bitrate, i_vbv_max_bitrate and i_vbv_buffer_size parameters seems to have absolutely no effect on the size of the resulting frames. I still get some frames over 500Kb and, in many cases, rather large frames one after the other, as the following sizes show:
20264
358875
218429
20728
25215
310230
36127
9077
29785
341541
222778
23542
21356
276772
25339
32459
421036
11179
6172
286070
193849
What I need is for the largest frame to be around 250,000 at most. I understand that once in a while it may go over a bit, but not 2×. That's just too much for my currently available bandwidth.
What am I doing wrong in the parameters setup above?
I've seen this command line:
ffmpeg -i input -c:v libx264 -b:v 2M -maxrate 2M -bufsize 1M output.mp4
which would suggest that what I'm doing above should work (I tried all sorts of values, including the ones on that command line). Yet the frame size does not really change between my runs.
I tried applying a blur to each frame to see whether it would help. Yes! It did. The result is a movie which is 2.44 times smaller than the original.
To load each JPEG image from the original, I use ImageMagick++ (in C++), so I just do the following blur on each image:
image.blur(0.0, 5.0);
and that took about 10 hours total (without the blur the same processing took about 40 minutes) but it was worth it since in the end the compressed movie went from 1,293,272,023 bytes to only 529,556,265 bytes (2.44218 times smaller). The blur added about 3.3 seconds of processing per frame and there are a little over 11,000 frames in the original.
Note: I used 5.0 for the blur because I have 4K images and although I can see a sharp difference when I look at one frame, when playing back the resulting movie, I don't notice the final blur. If you have smaller images, you probably want to use a smaller number. It looks like many people use a blur of just 0.05 and already have good results in compression ratios.
In C, use the BlurImage() function:
Image *BlurImage(const Image *image,const double radius,
const double sigma,ExceptionInfo *exception)
Here are some references about using a blur to further compress JPEG images, as it helps eliminate sharp edges, which do not compress well in the JPEG format (sharp edges are not as natural):
Recommendation for compressing JPG files with ImageMagick
How do I reduce the file size of an image? (search on "blur" to find the section)
Could I blur an image to dramatically reduce the file size?
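On the rate-control parameters from the question: x264 expresses rc.i_bitrate and rc.i_vbv_max_bitrate in kbit/s and rc.i_vbv_buffer_size in kbit, and rc.i_bitrate is only used when rc.i_rc_method is X264_RC_ABR (the presets leave it at CRF). Read that way, a 125000 kbit buffer is far larger than any single frame, so it would not be expected to cap anything. One commonly suggested approach for limiting per-frame size is to make the VBV buffer hold roughly one frame's worth of bits. The Python sketch below only does that arithmetic. The 30 fps figure is hypothetical (the question takes it from f_frame_header.f_fps), the frame sizes are assumed to be bytes, and VBV is a constraint the encoder tries to honour rather than a hard guarantee.

```python
def kbit_to_bytes(kbit):
    """Convert kilobits (1000 bits) to bytes."""
    return kbit * 1000 // 8


def single_frame_vbv(max_frame_bytes, fps):
    """Return (vbv_max_bitrate in kbit/s, vbv_buffer_size in kbit).

    The buffer is sized to hold about one frame of max_frame_bytes, and
    the max rate is that buffer refilled once per frame interval.
    """
    buffer_kbit = max_frame_bytes * 8 // 1000
    maxrate_kbit = buffer_kbit * fps
    return maxrate_kbit, buffer_kbit


if __name__ == "__main__":
    # The buffer from the question, expressed in bytes.
    print(kbit_to_bytes(125000))
    # Hypothetical target: 250,000-byte frames at an assumed 30 fps.
    print(single_frame_vbv(250_000, 30))
```

Under those assumptions the candidate values would be i_vbv_max_bitrate = 60000 and i_vbv_buffer_size = 2000, to be tried and measured rather than taken as known-good settings.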
What would be the best way, on Linux, from GNU C and not C++, to display a GIF87a file on screen and redisplay it in the same location on the screen, so the user can observe changes that are made on the fly to the dataset? This is not an animated GIF.
I have some old code (Fortran 77) with a C wrapper that takes an image that was displayed on the screen and writes it to a GIF file. A comment in it cites X Window Applications Programming, Ed. 2, Johnson & Reichard, as the reference used to write the C code that displays image data on the screen and writes a GIF87a file. The code was written around 1995. The on-screen display of the image no longer works (just a black window), but the creation of the GIF file still works. What I would like to do, from the existing C code, on SLES version 11.4 with the libraries that are available, is open the GIF file and display it on screen. The image, a contour plot, has a color bar for which the user sets the min/max values to display the image to their liking, and it should be as easy and efficient as possible for the user to adjust those min/max values and then redraw the image (rewrite the GIF, then redisplay it on screen in the same location). There are also a handful of other knobs that the user can turn, such as windowing of the data (Hamming or Hann), and it would be best if the user could quickly and easily run through 5+ ways of looking at the image before settling on what is considered correct, then use the final GIF that was created in PowerPoint, Excel, etc.
Writing an X11 application is non-trivial. You can display a GIF (or any one of around 200 image formats) using ImageMagick which is included in most Linux distros and is available for macOS. Windows doesn't count.
So, you can create images and manipulate images from the command line, or in C if you want. So, let's create a GIF that is 1024x768 and full of random colours:
convert -size 1024x768 xc:blue +noise random -pointsize 72 -gravity center -annotate 0 "10" image.gif
Now we can display it, using ImageMagick's display program:
display image.gif &
Now we can get its X11 "window-id" with:
xprop -root
...
_NET_ACTIVE_WINDOW(WINDOW): window id # 0x600011
...
...
Now you can change the image, however you like with filters and blurs and morphology and thresholds and convolutions:
convert image.gif -threshold 80% -morphology erode diamond -blur 0x3 -convolve "3x3: -1,0,1, -2,0,2, -1,0,1" ... image.gif
And then tell the display program to redraw the window with:
display -window 0x600011 image.gif
Here is a little script that generates images with a new number in the middle of each frame and updates the screen:
for ((t=0;t<100;t++)) ; do
    convert -size 640x480 xc:blue +noise random -pointsize 72 -fill white -gravity center -annotate 0 "$t" image.gif
    display -window 0x600011 image.gif
done
Now all you need to do is find a little Python or Tcl/Tk library that draws some knobs and dials, reads their positions and changes the image accordingly and tells the screen to redraw.
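As a starting point for that glue code, here is a minimal Python sketch that rebuilds the image with a new threshold and asks the existing display window to redraw. It only uses the convert and display invocations already shown above. The function names, the separate source and destination files, and the idea of driving it from a threshold percentage are illustrative assumptions, and it needs ImageMagick installed and a live window id to actually run.

```python
import subprocess


def redraw_commands(window_id, src, dst, threshold_percent):
    """Build the two ImageMagick command lines for one redraw.

    src is the untouched original, dst is the file shown on screen, so the
    threshold is always applied to the original data.
    """
    convert_cmd = ["convert", src, "-threshold", f"{threshold_percent}%", dst]
    display_cmd = ["display", "-window", window_id, dst]
    return convert_cmd, display_cmd


def redraw(window_id, src, dst, threshold_percent):
    """Run both commands, raising CalledProcessError if either one fails."""
    for cmd in redraw_commands(window_id, src, dst, threshold_percent):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # The window id is the one reported by xprop in the example above.
    for cmd in redraw_commands("0x600011", "original.gif", "image.gif", 80):
        print(" ".join(cmd))
```

A knob or slider callback from whatever widget library you choose would then just call redraw() with the new value.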
As a result of the lack of enthusiasm for my other answer, I thought I'd have another attempt. I had a quick look at Processing, which is a very simple language, very similar to C but much easier to program.
Here is a screen shot of it loading a GIF and displaying a couple of twiddly knobs - one of which I attached to do a threshold on the image.
Here's the code - it is not the prettiest in the world because it is my first ever code in Processing but you should be able to see what it is doing and adapt to your needs:
import controlP5.*;

ControlP5 cp5;
int myColorBackground = color(0,0,0);
int knobValue = 100;
float threshold=128;
Knob myKnobA;
Knob myKnobB;
PImage src,dst; // Declare a variable of type PImage

void setup() {
  size(800,900);
  // Make a new instance of a PImage by loading an image file
  src = loadImage("image.gif");
  // The destination image is created as a blank image the same size as the source.
  dst = createImage(src.width, src.height, RGB);
  smooth();
  noStroke();
  cp5 = new ControlP5(this);
  myKnobA = cp5.addKnob("some knob")
               .setRange(0,255)
               .setValue(50)
               .setPosition(130,650)
               .setRadius(100)
               .setDragDirection(Knob.VERTICAL)
               ;
  myKnobB = cp5.addKnob("threshold")
               .setRange(0,255)
               .setValue(220)
               .setPosition(460,650)
               .setRadius(100)
               .setNumberOfTickMarks(10)
               .setTickMarkLength(4)
               .snapToTickMarks(true)
               .setColorForeground(color(255))
               .setColorBackground(color(0, 160, 100))
               .setColorActive(color(255,255,0))
               .setDragDirection(Knob.HORIZONTAL)
               ;
}

void draw() {
  background(0);
  src.loadPixels();
  dst.loadPixels();
  for (int x = 0; x < src.width; x++) {
    for (int y = 0; y < src.height; y++ ) {
      int loc = x + y*src.width;
      // Test the brightness against the threshold
      if (brightness(src.pixels[loc]) > threshold) {
        dst.pixels[loc] = color(255); // White
      } else {
        dst.pixels[loc] = color(0); // Black
      }
    }
  }
  // We changed the pixels in destination
  dst.updatePixels();
  // Display the destination
  image(dst,100,80);
}

void knob(int theValue) {
  threshold = color(theValue);
  println("a knob event. setting background to "+theValue);
}

void keyPressed() {
  switch(key) {
    case('1'):myKnobA.setValue(180);break;
    case('2'):myKnobB.setConstrained(false).hideTickMarks().snapToTickMarks(false);break;
    case('3'):myKnobA.shuffle();myKnobB.shuffle();break;
  }
}
Here are some links I used - image processing, P5 library of widgets and knobs.
I'm working on a "GS Wrapper" (using the 9.20 SDK) for use by an external application. In it I scale down, for example, an A0 sheet to A1, A2 and A3, and it works fine (PDF to PS, then print).
Problem: When I scale down any input format to A4, the printer cuts off the borders of the content (these are technical drawings with a black border 5mm from each sheet edge).
Is there a way to scale the A4 output down again (A4 to A4) to about 95% and center the image? (This should result in a smaller image, so that the black borders end up about 10mm away from the sheet edge.)
I use the following parameter for scaling:
GhostArg[0] = "-dNOPAUSE";
GhostArg[1] = "-dBATCH";
GhostArg[2] = "-dSAFER";
GhostArg[3] = "-dNOPAUSE";
GhostArg[4] = "-g2480x3508";
GhostArg[5] = "-dPDFFitPage";
GhostArg[6] = "-r300x300";
GhostArg[7] = "-sDEVICE=ps2write";
GhostArg[8] = Output;
GhostArg[9] = Input;
Solution Update:
I managed to fix this problem by inserting these three lines between GhostArg[8] and GhostArg[9] (the Input argument then moves to GhostArg[12]):
GhostArg[9] = "-c";
GhostArg[10] = "<< /BeginPage { 0.99 0.99 scale 10 10 translate } >> setpagedevice";
GhostArg[11] = "-f";
Thanks to KenS for the /BeginPage hint.
It sounds like your printer has a non-printable area. This is not uncommon: the paper handling needs to hold the paper while it's being printed, and this can lead to some areas of the media not being printable.
If your content reaches to the edge of the media, it's possible that the printer simply cannot print there, resulting in the content being cropped.
It is entirely possible to have ps2write drop the media content to a smaller size, but you can't have it (automatically) scale down and also shift the content location, because the content is fitted to the media size.
However, the FitPage mechanism doesn't look at the content, just the media size requests. So if the input requests A3 and the selected media is A4 (and fixed) then a scale factor is applied to scale the content to the required media size (and the media request for A3 is ignored).
So what you could do is leave the code you have as it is at present, but add a BeginPage or Install procedure which uses the scale operator to further reduce the size of marks on the page, and the translate operator to move the origin slightly so that the final content is centered.
Something like (example only, untested):
<<
/BeginPage {
0.95 0.95 scale
16 20 translate
}
>> setpagedevice
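To pick the translate values rather than guess them, the offset that re-centers the page can be worked out from the scale factor. The Python sketch below assumes an A4 page of 595 x 842 points and assumes translate is issued after scale, as in the snippet above, so the offsets are in the already scaled coordinate system. The function name is illustrative.

```python
def centering_translate(width_pt, height_pt, scale):
    """Return (tx, ty) for `scale scale scale  tx ty translate`.

    Scaling shrinks the page towards the origin and leaves a margin of
    size * (1 - scale) on the far side. Half of that margin, divided by
    the scale because translate now works in scaled units, centers it.
    """
    tx = width_pt * (1 - scale) / (2 * scale)
    ty = height_pt * (1 - scale) / (2 * scale)
    return tx, ty


if __name__ == "__main__":
    tx, ty = centering_translate(595, 842, 0.95)
    print(f"0.95 0.95 scale {tx:.1f} {ty:.1f} translate")
```

For a 0.95 scale on A4 this comes out at roughly 15.7 and 22.2 points, in the same neighbourhood as the untested 16 20 above.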
By the way, you do realise Ghostscript is licensed under the AGPL?
Also, I'd very strongly recommend that you do not use the -g and -r switches, but instead simply use -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS to alter the media size.
The -g switch works in pixels, but high level output devices (eg pdfwrite and ps2write) don't emit pixels, they write high level vector objects. However, due to differences in the PostScript and PDF graphics models, some elements do need to be rendered to images and enclosed in that fashion in the PostScript output. By setting the resolution to 300 you are fixing the resolution at which those elements (eg pages containing transparency) are rendered. I'd suggest that you don't do so, unless you are working in a very tightly controlled workflow and know the resolution of the final output.
By using the DEVICEHEIGHTPOINTS and DEVICEWIDTHPOINTS switches you can control the media size without reference to the resolution. Note that in PostScript (and PDF) 1 point = 1/72 inch.
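As a quick illustration of that conversion, this Python sketch turns the -g2480x3508 at -r300x300 request from the question into the equivalent point-based switches. The helper names are made up for the example, and rounding to whole points is my assumption.

```python
POINTS_PER_INCH = 72  # PostScript and PDF: 1 point = 1/72 inch


def pixels_to_points(pixels, dpi):
    """Convert a pixel dimension at a given resolution to points."""
    return pixels * POINTS_PER_INCH / dpi


def media_size_args(width_px, height_px, dpi):
    """Build the two Ghostscript switches, rounded to whole points."""
    width_pt = round(pixels_to_points(width_px, dpi))
    height_pt = round(pixels_to_points(height_px, dpi))
    return [f"-dDEVICEWIDTHPOINTS={width_pt}",
            f"-dDEVICEHEIGHTPOINTS={height_pt}"]


if __name__ == "__main__":
    print(media_size_args(2480, 3508, 300))
```

2480 x 3508 pixels at 300 dpi is 595.2 x 841.92 points, so the replacement switches would be -dDEVICEWIDTHPOINTS=595 and -dDEVICEHEIGHTPOINTS=842, which is A4.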
I am writing software that processes audio files. I am using the libsndfile library to read WAV file data, and I have a question that its documentation did not answer: what is the difference between the functions that read items and the functions that read frames? In other words, do I get the same results if I interchange sf_read_short and sf_readf_short?
I have read in some questions that an audio frame equals a single sample, so I thought that what libsndfile calls items might be the same thing. During my tests they seemed to be the same.
I was concerned too and found the answer.
Q12 : I'm looking at sf_read*. What are items? What are frames?
An item is a single sample of the data type you are reading; ie a
single short value for sf_read_short or a single float for
sf_read_float. For a sound file with only one channel, a frame is the
same as an item (ie a single sample) while for multi channel sound
files, a single frame contains a single item for each channel.
Here are two simple, correct examples, both of which are assumed to be
working on a stereo file, first using items:
    #define CHANNELS 2
    short data [CHANNELS * 100] ;
    sf_count_t items_read = sf_read_short (file, data, 200) ;
    assert (items_read == 200) ;
and now reading the exact same amount of data using frames:
    #define CHANNELS 2
    short data [CHANNELS * 100] ;
    sf_count_t frames_read = sf_readf_short (file, data, 100) ;
    assert (frames_read == 100) ;
This is a copy&paste from:
libsndfile FAQ, question 12.
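To make the relationship concrete without needing the library, here is a small Python sketch of the item/frame arithmetic and of how interleaved items split back into channels. It does not call libsndfile, and the helper names are invented for the illustration.

```python
def frames_to_items(frames, channels):
    """Number of items (individual samples) in a given number of frames."""
    return frames * channels


def items_to_frames(items, channels):
    """Number of whole frames in a given number of items."""
    if items % channels:
        raise ValueError("item count must be a multiple of the channel count")
    return items // channels


def deinterleave(items, channels):
    """Split an interleaved buffer [L0, R0, L1, R1, ...] into per-channel lists."""
    return [items[c::channels] for c in range(channels)]


if __name__ == "__main__":
    print(frames_to_items(100, 2))         # stereo: 100 frames is 200 items
    print(deinterleave([1, 2, 3, 4], 2))   # two frames of stereo data
```

So for a mono file the two function families read the same amount of data for the same count, which is presumably why they looked identical in your tests. For a stereo file, sf_readf_short with a count of 100 fills 200 shorts.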
On some audio files the value of MediaElement.NaturalDuration is less than the actual duration of the audio. When I open the file in Windows Media Player the duration is correct (also when I look at the properties of the file). Although the value of the NaturalDuration property is incorrect, the audio is played fully, but at some point the value of the Position property becomes greater than the value of the NaturalDuration property, which, as I understand, should never happen.
I have created a simple application to reproduce the problem: https://skydrive.live.com/redir?resid=ACF8BFD4384116CE!2908&authkey=!AG-wF6Ae-7EAYk8
The duration of the audio file used in the application is 00:02:54, but the value of the NaturalDuration property is 00:01:59.
Does anyone know why and if there is a workaround for this?
Thanks in advance for any help.
Ok, this is not an answer but some results of a short investigation that give some clues why it behaves like that and where those numbers come from (2:58 and 1:59). First look at this thread: Calculating the length of MP3 Frames in milliseconds
Two things that we will use from there:
1) Frame length (in ms) = (samples per frame / sample rate (in hz)) * 1000, and
Duration in sec = Frame length (in ms) * number of frames / 1000
2) There are some standards regarding number of samples for different MPEG versions:
Samples per frame:
MPEG Version 1
384, // Layer1
1152, // Layer2
1152 // Layer3
MPEG Version 2 & 2.5
384, // Layer1
1152, // Layer2
576 // Layer3
Now let's check what Winamp says about the file's format info:
MPEG-2.5 layer 3
16 kbps, 2482 frames
Now if you take frames = 2482 and samples per frame = 576 (MPEG-2.5 layer 3), you get a duration of 2:58. But it looks like, for some reason, Silverlight and iTunes use samples per frame = 384, which gives 1:59. The next step could be to check the real values in the file's headers: if they are correct and it is possible to calculate the correct duration, then you could cook up some hack to get durations separately (from the server, for example). But I'm pretty sure that the file has some defects (inconsistent headers and content), and some players can handle it while others cannot.
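The arithmetic can be checked with a few lines of Python. Note that the sample rate is not given above; 8000 Hz is an assumption, chosen because it is a valid MPEG-2.5 rate and it reproduces both durations. The function names are just for this sketch.

```python
def mp3_duration_seconds(frames, samples_per_frame, sample_rate_hz):
    """Duration = samples per frame / sample rate * number of frames."""
    return frames * samples_per_frame / sample_rate_hz


def as_min_sec(seconds):
    """Format a duration as m:ss, dropping the fractional second."""
    whole = int(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


if __name__ == "__main__":
    SAMPLE_RATE_HZ = 8000  # assumption, not read from the file
    FRAMES = 2482          # frame count reported by Winamp
    for samples_per_frame in (576, 384):
        seconds = mp3_duration_seconds(FRAMES, samples_per_frame, SAMPLE_RATE_HZ)
        print(samples_per_frame, as_min_sec(seconds))
```

At 8000 Hz, 576 samples per frame gives about 178.7 s (2:58) and 384 gives about 119.1 s (1:59), consistent with the two figures discussed here.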