I am trying to create a video file using ffmpeg. I have the RGB pixel data for every frame and, following this blog post, I have code that sends the data frame by frame through a pipe. It mostly works. However, if any pixel has a value of 10 in any of the three channels (e.g. #00000A, #0AFFFF, etc.), ffmpeg produces these errors:
[rawvideo @ 0000020c3787f040] Packet corrupt (stream = 0, dts = 170)
pipe:: corrupt input packet in stream 0
[rawvideo @ 0000020c3789f100] Invalid buffer size, packet size 32768 < expected frame_size 49152
Error while decoding stream #0:0: Invalid argument
And the output video is garbled.
I suspect that, because 10 is the ASCII code for the newline character, this is confusing the pipe somehow.
What exactly is happening here, and how do I fix it so that I can use RGB values like #00000a?
Below is C code that reproduces the problem:
#include <stdio.h>
unsigned char frame[128][128][3];
int main() {
int x, y, i;
FILE *pipeout = popen("ffmpeg -y -f rawvideo -vcodec rawvideo -pix_fmt rgb24 -s 128x128 -r 24 -i - -f mp4 -q:v 1 -an -vcodec mpeg4 output.mp4", "w");
for (i = 0; i < 128; i++) {
for (x = 0; x < 128; ++x) {
for (y = 0; y < 128; ++y) {
frame[y][x][0] = 0;
frame[y][x][1] = 0;
frame[y][x][2] = 10;
}
}
fwrite(frame, 1, 128*128*3, pipeout);
}
fflush(pipeout);
pclose(pipeout);
return 0;
}
EDIT: for clarity, I am using Windows
I've just tried your code on Linux and it worked for me. I think @Craig Estey's suggestion is probably the answer.
If it doesn't work, you could try writing the data using write instead of fwrite, if available. (I've had issues writing binary data to pipes with the fread/fwrite family of functions in the past.)
So you could try changing this line:
fwrite(frame, 1, 128*128*3, pipeout);
To something like:
int fd = fileno(pipeout);
write(fd, frame, sizeof(frame));
And also remove the following line:
fflush(pipeout);
EDIT: There are some troubleshooting tips in the comments section of the blog post you linked, especially regarding the Windows version of this program.
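For what it's worth, the usual explanation for this symptom on Windows is that popen (an alias for _popen there) opens the pipe in text mode unless told otherwise, and in text mode the C runtime expands every 0x0A byte into 0x0D 0x0A. That shifts all the pixel data that follows, which would be consistent with the "Invalid buffer size" errors. A minimal sketch of opening the pipe in binary mode is below; open_binary_pipe and close_pipe are hypothetical helper names that are not part of the original program.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L  /* expose popen/pclose under strict -std=c99 */
#endif
#include <stdio.h>

/* Hypothetical helper: open a write pipe that passes bytes through untouched. */
static FILE *open_binary_pipe(const char *command)
{
#ifdef _WIN32
    /* "wb" requests binary mode, so 0x0A is not expanded to 0x0D 0x0A. */
    return _popen(command, "wb");
#else
    /* POSIX pipes have no text mode; "w" is already binary-safe. */
    return popen(command, "w");
#endif
}

/* Hypothetical helper: close a pipe opened with open_binary_pipe. */
static int close_pipe(FILE *pipe)
{
#ifdef _WIN32
    return _pclose(pipe);
#else
    return pclose(pipe);
#endif
}
```

In the question's program this would stand in for the popen and pclose calls, with the ffmpeg command line left unchanged.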
Even though questions of this nature sound very similar, I am having problems converting a jpg image to yuv in C (without using opencv).
This is how I currently understand the problem should be solved:
1) Identify the structure of the jpg and yuv file formats, i.e. what each byte in the file actually contains. This is what I think the jpg format looks like.
2) With the above structure, I tried to read a jpg file and decipher its 18th and 19th bytes. I cast them to both char and int, but I don't get any meaningful values for the width and height of the image.
3) Once I have read these values, I should be able to convert them from jpg to yuv. I was looking at this resource.
4) Construct the yuv image appropriately and write it to a (.yuv) file.
Kindly help me by pointing me to appropriate resources. I will keep updating my progress on this post. Thanks in advance.
Usually the image is already stored in YUV (or, to be more precise: YCbCr).
When reading the file, the jpeg reader usually converts YUV to RGB. Converting back will reduce quality somewhat.
In libjpeg-turbo (http://libjpeg-turbo.virtualgl.org/) you can read the jpeg without color conversion. Check https://github.com/libjpeg-turbo/libjpeg-turbo/blob/master/turbojpeg.h: it has the tjDecompressToYUV function, which gives you the three planes (Y, U and V) instead of RGB.
I'm not sure what you have against opencv, but maybe ImageMagick is acceptable to you? It is installed on most Linux distros and is available for OSX and Windows. It has C bindings, and also a command-line version, which I am showing here. You can create a test image like this:
# Create test image
convert -size 100x100 \
\( xc:red xc:lime xc:blue +append \) \
\( xc:cyan xc:magenta xc:yellow +append \) \
-append image.jpg
Now convert to YUV and write to 3 separate files:
convert image.jpg -colorspace yuv -separate bands.jpg
bands-0.jpg (Y)
bands-1.jpg (U)
bands-2.jpg (V)
Or, closer to what you ask, write all three bands YUV into a binary file:
convert image.jpg -colorspace yuv rgb:yuv.bin
Based on https://en.wikipedia.org/wiki/YUV#Y.27UV444_to_RGB888_conversion
Decoding a JPEG, well in pure C without libraries ... the following code is somewhat straightforward ...
https://bitbucket.org/Halicery/firerainbow-progressive-jpeg-decoder/src
Assuming you have the jpeg decoded to rgb using the above or a library (using a library is likely easier), the conversion looks like this:
int width = (width of the image);
int height = (height of the image);
byte *mydata = (pointer to rgb pixels);
byte *cursor;
size_t byte_count = (length of the pixels .... i.e. width x height x 3);
int n;
for (cursor = mydata, n = 0; n < byte_count; cursor += 3, n += 3)
{
int red = cursor[0], green = cursor[1], blue = cursor[2];
int y = 0.299 * red + 0.587 * green + 0.114 * blue;
int u = -0.147 * red + -0.289 * green + 0.436 * blue;
int v = 0.615 * red + -0.515 * green + -0.100 * blue;
cursor[0] = y, cursor[1] = u, cursor[2] = v;
}
// At this point, the entire image has been converted to yuv ...
And write that to file ...
FILE* fout = fopen ("myfile.yuv", "wb");
if (fout) {
fwrite (mydata, 1, byte_count, fout);
fclose (fout);
}
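One caveat with the loop above: U and V can come out negative, and storing them straight into a byte wraps around. A common variation biases both by 128 and clamps to 0..255. The sketch below does that with the same coefficients; rgb24_to_yuv444 and clamp_u8 are hypothetical names, and the 128 bias is an assumption about what the consumer of the .yuv file expects.

```c
#include <stddef.h>

/* Clamp to the 0..255 range and round to the nearest integer. */
static unsigned char clamp_u8(double v)
{
    if (v < 0.0) return 0;
    if (v > 255.0) return 255;
    return (unsigned char)(v + 0.5);
}

/* Hypothetical helper: convert packed RGB24 to packed YUV444 in place.
 * byte_count is width * height * 3. U and V are biased by 128 so that
 * their signed range fits in an unsigned byte (an assumption about the
 * layout the consumer expects); values that still fall outside 0..255
 * are clamped. */
static void rgb24_to_yuv444(unsigned char *pixels, size_t byte_count)
{
    size_t n;
    for (n = 0; n + 2 < byte_count; n += 3) {
        double r = pixels[n], g = pixels[n + 1], b = pixels[n + 2];
        pixels[n]     = clamp_u8( 0.299 * r + 0.587 * g + 0.114 * b);
        pixels[n + 1] = clamp_u8(-0.147 * r - 0.289 * g + 0.436 * b + 128.0);
        pixels[n + 2] = clamp_u8( 0.615 * r - 0.515 * g - 0.100 * b + 128.0);
    }
}
```

With this version, grey pixels (equal red, green and blue) map to U = V = 128, which is a quick sanity check on the output file.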
I am trying to use libsndfile to write a multichannel wav that can be read by MATLAB 2010+.
The following code writes a 4-channel interleaved wav. All samples on channel 1 should be 0.1, on channel 2 they are 0.2, on channel 3 ... etc.
Each channel is 44100 samples in length.
I drag the wave file onto the MATLAB workspace and unfortunately MATLAB keeps returning "File contains uninterpretable data".
It may also be worth noting that when all samples are set to 0.0, MATLAB successfully reads the file, although very slowly.
I have successfully used libsndfile to read multichannel data written by MATLAB's wavwrite.m, so I believe the library is set up correctly.
Audacity can read the resulting file from the code below.
VS 2012 64 bit compiler,
Win7 64bit, MATLAB 2015a
ref: the code has been adapted from http://www.labbookpages.co.uk/audio/wavFiles.html
Any suggestions? I presume I'm making a simple error here.
Thanks
#include <sndfile.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
// Create interleaved audio data
int numFrames_out = 44100;
int channels = 4;
float *int_y;
int_y = (float*)malloc(channels*numFrames_out*sizeof(float));
long q=0;
for (long i = 0; i<numFrames_out; i++)
{
for (int j = 0; j<channels; j++)
{
int_y[q+j] = ((float)(j+1))/10.0;
}
q+=channels;
}
// Set multichannel file settings
SF_INFO info;
info.format = SF_FORMAT_WAV | SF_FORMAT_PCM_32;
info.channels = channels;
info.samplerate = 44100;
// Open sound file for writing
char out_filename[] = "out_audio.wav";
SNDFILE *sndFile = sf_open(out_filename, SFM_WRITE, &info);
if (sndFile == NULL)
{
fprintf(stderr, "Error opening sound file '%s': %s\n", out_filename, sf_strerror(sndFile));
return -1;
}
// Write frames
long writtenFrames = sf_writef_float(sndFile, int_y, numFrames_out);
// Check correct number of frames saved
if (writtenFrames != numFrames_out) {
fprintf(stderr, "Did not write enough frames for source\n");
sf_close(sndFile);
free(int_y);
return -1;
}
sf_close (sndFile);
}
It looks like you are only closing the output file (using sf_close()) in the error case. The output file will not be a well formed WAV file unless you call sf_close() at the end of your program.
When I call
frame_size = x264_encoder_encode(encoder, &nals, &i_nals, &pic_in, &pic_out);
and subsequently write each NAL to a file like this:
if (frame_size >= 0)
{
int i;
int j;
for (i = 0; i < i_nals; i++)
{
printf("******************* NAL %d (%d bytes) *******************\n", i, nals[i].i_payload);
fwrite(&(nals[i].p_payload[0]), 1, nals[i].i_payload, fid);
}
}
then I get this
My questions are:
1) Is it normal that there are readable parameters at the beginning of the file?
2) How do I configure the X264 encoder so that the encoder returns frames that I can send via UDP without the packet getting fragmented (size must be below 1390 or somewhere around that).
3) With the x264.exe I pass in these options:
"--threads 1 --profile baseline --level 3.2 --preset ultrafast --bframes 0 --force-cfr --no-mbtree --sync-lookahead 0 --rc-lookahead 0 --keyint 1000 --intra-refresh"
How do I map those to the settings in the X264 parameters structure ? (x264_param_t)
4) I have been told that the x264 static library doesn't support bitmap input to the encoder and that I have to use libswscale for conversion of the 24bit RGB input bitmap to YUV2. The encoder, supposedly, only takes YUV2 as input? Is this true? If so, how do I build libswscale for the x264 static library?
1) Yes. x264 includes them automatically. It's an SEI NAL unit, and you can throw it away if you want.
2) Set i_slice_max_size = 1390.
3) Take a look at x264_param_t in x264.h. The settings are fairly self-explanatory. To set the preset and profile, call int x264_param_default_preset( x264_param_t *, const char *preset, const char *tune ) and then int x264_param_apply_profile( x264_param_t *, const char *profile ).
4) Yes, it is true; I wasn't lying when I said that. Look online or on Stack Overflow: there are a million resources on compiling ffmpeg. In fact, if you compiled x264 with avcodec support, you already have it on your system.
5) Yes! You should be a good Stack Overflow citizen and upvote and accept answers from people who donate their free time and knowledge (which takes years to acquire) to help you.
I've been working on some streaming software that takes live feeds
from various kinds of cameras and streams over the network using
H.264. To accomplish this, I'm using the x264 encoder directly (with
the "zerolatency" preset) and feeding NALs as they are available to
libavformat to pack into RTP (ultimately RTSP). Ideally, this
application should be as real-time as possible. For the most part,
this has been working well.
Unfortunately, however, there is some sort of synchronization issue:
any video playback on clients seems to show a few smooth frames,
followed by a short pause, then more frames; repeat. Additionally,
there appears to be approximately a 4-second delay. This happens with
every video player I've tried: Totem, VLC, and basic gstreamer pipes.
I've boiled it all down to a somewhat small test case:
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <x264.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#define WIDTH 640
#define HEIGHT 480
#define FPS 30
#define BITRATE 400000
#define RTP_ADDRESS "127.0.0.1"
#define RTP_PORT 49990
struct AVFormatContext* avctx;
struct x264_t* encoder;
struct SwsContext* imgctx;
uint8_t test = 0x80;
void create_sample_picture(x264_picture_t* picture)
{
// create a frame to store in
x264_picture_alloc(picture, X264_CSP_I420, WIDTH, HEIGHT);
// fake image generation
// disregard how wrong this is; just writing a quick test
int strides = WIDTH / 8;
uint8_t* data = malloc(WIDTH * HEIGHT * 3);
memset(data, test, WIDTH * HEIGHT * 3);
test = (test << 1) | (test >> (8 - 1));
// scale the image
sws_scale(imgctx, (const uint8_t* const*) &data, &strides, 0, HEIGHT,
picture->img.plane, picture->img.i_stride);
}
int encode_frame(x264_picture_t* picture, x264_nal_t** nals)
{
// encode a frame
x264_picture_t pic_out;
int num_nals;
int frame_size = x264_encoder_encode(encoder, nals, &num_nals, picture, &pic_out);
// ignore bad frames
if (frame_size < 0)
{
return frame_size;
}
return num_nals;
}
void stream_frame(uint8_t* payload, int size)
{
// initalize a packet
AVPacket p;
av_init_packet(&p);
p.data = payload;
p.size = size;
p.stream_index = 0;
p.flags = AV_PKT_FLAG_KEY;
p.pts = AV_NOPTS_VALUE;
p.dts = AV_NOPTS_VALUE;
// send it out
av_interleaved_write_frame(avctx, &p);
}
int main(int argc, char* argv[])
{
// initalize ffmpeg
av_register_all();
// set up image scaler
// (in-width, in-height, in-format, out-width, out-height, out-format, scaling-method, 0, 0, 0)
imgctx = sws_getContext(WIDTH, HEIGHT, PIX_FMT_MONOWHITE,
WIDTH, HEIGHT, PIX_FMT_YUV420P,
SWS_FAST_BILINEAR, NULL, NULL, NULL);
// set up encoder presets
x264_param_t param;
x264_param_default_preset(&param, "ultrafast", "zerolatency");
param.i_threads = 3;
param.i_width = WIDTH;
param.i_height = HEIGHT;
param.i_fps_num = FPS;
param.i_fps_den = 1;
param.i_keyint_max = FPS;
param.b_intra_refresh = 0;
param.rc.i_bitrate = BITRATE;
param.b_repeat_headers = 1; // whether to repeat headers or write just once
param.b_annexb = 1; // place start codes (1) or sizes (0)
// initalize
x264_param_apply_profile(&param, "high");
encoder = x264_encoder_open(&param);
// at this point, x264_encoder_headers can be used, but it has had no effect
// set up streaming context. a lot of error handling has been ommitted
// for brevity, but this should be pretty standard.
avctx = avformat_alloc_context();
struct AVOutputFormat* fmt = av_guess_format("rtp", NULL, NULL);
avctx->oformat = fmt;
snprintf(avctx->filename, sizeof(avctx->filename), "rtp://%s:%d", RTP_ADDRESS, RTP_PORT);
if (url_fopen(&avctx->pb, avctx->filename, URL_WRONLY) < 0)
{
perror("url_fopen failed");
return 1;
}
struct AVStream* stream = av_new_stream(avctx, 1);
// initalize codec
AVCodecContext* c = stream->codec;
c->codec_id = CODEC_ID_H264;
c->codec_type = AVMEDIA_TYPE_VIDEO;
c->flags = CODEC_FLAG_GLOBAL_HEADER;
c->width = WIDTH;
c->height = HEIGHT;
c->time_base.den = FPS;
c->time_base.num = 1;
c->gop_size = FPS;
c->bit_rate = BITRATE;
avctx->flags = AVFMT_FLAG_RTP_HINT;
// write the header
av_write_header(avctx);
// make some frames
for (int frame = 0; frame < 10000; frame++)
{
// create a sample moving frame
x264_picture_t* pic = (x264_picture_t*) malloc(sizeof(x264_picture_t));
create_sample_picture(pic);
// encode the frame
x264_nal_t* nals;
int num_nals = encode_frame(pic, &nals);
if (num_nals < 0)
printf("invalid frame size: %d\n", num_nals);
// send out NALs
for (int i = 0; i < num_nals; i++)
{
stream_frame(nals[i].p_payload, nals[i].i_payload);
}
// free up resources
x264_picture_clean(pic);
free(pic);
// stream at approx 30 fps
printf("frame %d\n", frame);
usleep(33333);
}
return 0;
}
This test shows black lines on a white background that
should move smoothly to the left. It has been written for ffmpeg 0.6.5
but the problem can be reproduced on 0.8 and 0.10 (from what I've tested so far). I've taken some shortcuts in error handling to make this example as short as
possible while still showing the problem, so please excuse some of the
nasty code. I should also note that while an SDP is not used here, I
have tried using that already with similar results. The test can be
compiled with:
gcc -g -std=gnu99 streamtest.c -lswscale -lavformat -lx264 -lm -lpthread -o streamtest
It can be played with gstreamer directly:
gst-launch udpsrc port=49990 ! application/x-rtp,payload=96,clock-rate=90000 ! rtph264depay ! decodebin ! xvimagesink
You should immediately notice the stuttering. One common "fix" I've
seen all over the Internet is to add sync=false to the pipeline:
gst-launch udpsrc port=49990 ! application/x-rtp,payload=96,clock-rate=90000 ! rtph264depay ! decodebin ! xvimagesink sync=false
This causes playback to be smooth (and near-realtime), but is a
non-solution and only works with gstreamer. I'd like to fix the
problem at the source. I've been able to stream with near-identical
parameters using raw ffmpeg and haven't had any issues:
ffmpeg -re -i sample.mp4 -vcodec libx264 -vpre ultrafast -vpre baseline -b 400000 -an -f rtp rtp://127.0.0.1:49990 -an
So clearly I'm doing something wrong. But what is it?
1) You didn't set a PTS for the frames you send to libx264 (you should probably be seeing "non-strictly-monotonic PTS" warnings).
2) You didn't set PTS/DTS for the packets you send to libavformat's RTP muxer. (I'm not 100% sure they need to be set, but I guess it would be better. From the source code, it looks like the RTP muxer uses PTS.)
3) IMHO usleep(33333) is bad. It stalls the encoder for that time as well (increasing latency), whereas you could be encoding the next frame during that time even if you don't need to send it over RTP yet.
P.S. By the way, you didn't set param.rc.i_rc_method to X264_RC_ABR, so libx264 will use CRF 23 instead and ignore your "param.rc.i_bitrate = BITRATE". It can also be a good idea to use VBV when encoding for network transmission.
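Points 1 and 2 both come down to giving every frame a strictly increasing timestamp. Here is a minimal sketch of the arithmetic, assuming a constant frame rate and the 90 kHz clock that the gstreamer caps in the question advertise (clock-rate=90000). frame_pts_ticks is a hypothetical helper, not an x264 or libavformat function; where the result is stored (for example the picture's i_pts or the packet's pts/dts) depends on the timebase each library is configured with.

```c
#include <stdint.h>

/* Hypothetical helper: timestamp of frame `index` in ticks of `clock_rate`,
 * for a constant frame rate of fps_num / fps_den frames per second.
 * Computing from the frame index (rather than adding a rounded step each
 * frame) avoids accumulating rounding error. */
static int64_t frame_pts_ticks(int64_t index, int fps_num, int fps_den,
                               int clock_rate)
{
    return index * (int64_t)clock_rate * fps_den / fps_num;
}
```

At 30 fps on a 90 kHz clock, consecutive frames are 3000 ticks apart, so frame 30 lands exactly on 90000.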
I have been building a simple sample rate converter in C using libsndfile and libsamplerate. I just can't seem to get the src_simple function of libsamplerate to work, whatever I try. I have stripped my code back to be as simple as possible, and it now just outputs a silent audio file with an identical sampling rate:
#include <stdio.h>
#include <sndfile.h>
#include <samplerate.h>
#define BUFFER_LEN 1024
#define MAX_CHANNELS 6
int main ()
{
static double datain [BUFFER_LEN];
static double dataout [BUFFER_LEN];
SNDFILE *infile, *outfile;
SF_INFO sfinfo, sfinfo2 ;
int readcount ;
const char *infilename = "C:/Users/Oli/Desktop/MARTYTHM.wav" ;
const char *outfilename = "C:/Users/Oli/Desktop/Done.wav" ;
SRC_DATA src_data;
infile = sf_open (infilename, SFM_READ, &sfinfo);
outfile = sf_open (outfilename, SFM_WRITE, &sfinfo);
src_data.data_in = datain;
src_data.input_frames = BUFFER_LEN;
src_data.data_out = dataout;
src_data.output_frames = BUFFER_LEN;
src_data.src_ratio = 0.5;
src_simple (&src_data, SRC_SINC_BEST_QUALITY, 1);
while ((readcount = sf_read_double (infile, datain, BUFFER_LEN)))
{
src_simple (&src_data, SRC_SINC_BEST_QUALITY, 1);
sf_write_double (outfile, dataout, readcount) ;
};
sf_close (infile);
sf_close (outfile);
sf_open ("C:/Users/Oli/Desktop/Done.wav", SFM_READ, &sfinfo2);
printf("%d", sfinfo2.samplerate);
return 0;
}
It's really starting to stress me out. The program is a uni project and is due very soon, it is making me very anxious as whatever I try seems to result in failure. Can anyone please help me?
I'm not an expert on this particular library, but just from looking at the online documentation I see a few problems with your code:
src_simple apparently works with floats, yet your buffers are doubles. I think you need to change the buffers to float and use sf_read_float/sf_write_float for I/O.
src_simple is the "simple" interface and is intended to be applied to an entire waveform in one call, not in chunks as you are doing (see http://www.mega-nerd.com/SRC/faq.html#Q004). You should first get the input file size, then allocate sufficient memory, read in the whole file, convert it in one go, and then write the converted output data to your output file.
When changing the sample rate you will get a different number of samples in the output file than in the input file (around half as many in your case), yet you're writing the same number of samples that you read (readcount). You should probably be using src_data.output_frames_gen as the number of frames to write, not readcount.
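To make the "allocate sufficient memory" step concrete, here is a small sketch of how the output buffer could be sized before a single src_simple call. It assumes src_ratio means output rate divided by input rate; src_output_capacity is a hypothetical helper, not part of libsamplerate, and the one extra frame is just headroom for rounding.

```c
/* Hypothetical helper: number of output frames to allocate when converting
 * input_frames at the given ratio (output rate / input rate). One extra
 * frame is added as headroom for rounding. */
static long src_output_capacity(long input_frames, double ratio)
{
    return (long)((double)input_frames * ratio) + 1;
}
```

The result would go into src_data.output_frames (with a float buffer of that many frames times the channel count), and after the conversion src_data.output_frames_gen says how many frames to actually write.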