Trying to use openH264 as an alternative to libX264 in FFMPEG C project - c

I have an application that transcodes a video frame by frame using FFMPEG and x264 encoder. I am looking to release this application but the licensing of x264 made me switch to using openh264 instead.
I managed to compile everything smoothly (openh264, then FFMPEG with --enable-libopenh264). I am now trying to correct the encoder setup in my C code, as what worked for libx264 doesn't work anymore. Unfortunately, I found very few C/C++ examples of FFMPEG with openh264; I would appreciate any link or hint.
I am using the following code (dec_ctx is the AVCodecContext of the video I am decoding):
enc_ctx->height = dec_ctx->height;
enc_ctx->width = dec_ctx->width;
enc_ctx->sample_aspect_ratio = dec_ctx->sample_aspect_ratio;
/* take first format from list of supported formats */
enc_ctx->pix_fmt = encoder->pix_fmts[0];
/* video time_base can be set to whatever is handy and supported by encoder */
enc_ctx->time_base = dec_ctx->time_base;
enc_ctx->gop_size = 120; /* emit one intra frame every 120 frames at most */
enc_ctx->max_b_frames = 16;
enc_ctx->scenechange_threshold = 0;
enc_ctx->rc_buffer_size = 0;
enc_ctx->me_method = ME_ZERO;
enc_ctx->ticks_per_frame = dec_ctx->ticks_per_frame * ifmt_ctx->streams[i]->time_base.den * ifmt_ctx->streams[i]->r_frame_rate.num/ifmt_ctx->streams[i]->r_frame_rate.den;
// Set Ultrafast profile. internal name for this preset is baseline
av_opt_set(enc_ctx->priv_data, "preset", "placebo", AV_OPT_SEARCH_CHILDREN);
I get the following errors in the output with the [OpenH264] tag:
[OpenH264] this = 0x0000000019C126C0, Warning:bEnableFrameSkip = 0,bitrate can't be controlled for RC_QUALITY_MODE,RC_BITRATE_MODE and RC_TIMESTAMP_MODE without enabling skip frame.
Output #0, mp4, to 'C:\Dev\temp\geoVid.mp4':
Stream #0:0: Video: h264 (libopenh264), yuv420p, 720x480, q=2-31, 200 kb/s, 90k tbn, 180k tbc
Stream #0:1: Audio: aac, 48000 Hz, stereo, fltp, 96 kb/s
[OpenH264] this = 0x0000000019C126C0, Warning:Actual input framerate fAverageFrameRate = 0.000000 is quite different from framerate in setting 60.000000, please check setting or timestamp unit (ms), start_Ts = 0
[OpenH264] this = 0x0000000019C126C0, Warning:Actual input framerate fAverageFrameRate = 0.000000 is quite different from framerate in setting 60.000000, please check setting or timestamp unit (ms), start_Ts = 0
The output video file just plays black frames. Any hint or link to some documentation would be appreciated. I have been trying to understand these errors, but I am not sure how to enable "skip frame" or why it is complaining about my input framerate (this is the same input that I encode successfully with libx264).

The warnings suggest that you have to enable a frame-drop (frame skip) mode before the bitrate can be controlled. Because frame skipping is disabled, the encoder appears to be setting the bitrate to 0.

Related

How does ffmpeg use concat of filter in C?

Using the command line like this works for me:
ffmpeg -i test1.mp4 -i test2.mp4 -filter_complex "movie='test1.mp4',scale=640:360[v1];movie='test2.mp4',scale=640:360[v2];[v1][v2]concat" testout.mp4
This is my configuration code:
AVFilterInOut* inputs = avfilter_inout_alloc();
AVFilterInOut* outputs = avfilter_inout_alloc();
...
avfilter_graph_parse_ptr(filter->filterGraph,
    "movie='test1.mp4',scale=640:360[v1];movie='test2.mp4',scale=640:360[v2];[v1][v2]concat",
    &inputs, &outputs, NULL);
avfilter_graph_config(filter->filterGraph, NULL);
Reported error:
[h264 @ 0000026dfecef780] Application has requested 17 threads. Using a thread count greater than 16 is not recommended.
[h264 @ 0000026dffae9d00] Application has requested 17 threads. Using a thread count greater than 16 is not recommended.
Output pad "default" with type video of the filter instance "in" of buffer not connected to any destination
How can I configure the filter correctly?

FFMPEG Api conversion from YUV420P to RGB produces strange output

I'm using the FFMPEG Api in Rust to get RGB images from video files.
Some videos work correctly and I get the frames back as expected, but others do not. Or at least the result is not what I expected.
The code I use in Rust:
ffmpeg::init().unwrap();
let in_ctx = input(&Path::new(source)).unwrap();
let input = in_ctx
    .streams()
    .best(Type::Video)
    .ok_or(ffmpeg::Error::StreamNotFound)?;
let decoder = input.codec().decoder().video()?;
let scaler = Context::get(
    decoder.format(),
    decoder.width(),
    decoder.height(),
    Pixel::RGB24,
    decoder.width(),
    decoder.height(),
    Flags::FULL_CHR_H_INT | Flags::ACCURATE_RND,
)?; // <--- Is basically sws_getContext
// later to get the actual frame
let mut decoded = Video::empty();
if self.decoder.receive_frame(&mut decoded).is_ok() {
    let mut rgb_frame = Video::empty();
    self.scaler.run(&decoded, &mut rgb_frame)?; // <--- Does sws_scale
    println!("Converted Pixel Format: {}", rgb_frame.format() as i32);
    Ok(Some(rgb_frame))
}
Which should roughly translate to C like so:
// Get the context and video stream
struct SwsContext *ctx = sws_getContext(imgWidth, imgHeight,
    imgFormat, imgWidth, imgHeight,
    AV_PIX_FMT_RGB24, 0, 0, 0, 0);
sws_scale(ctx, decoded.data, decoded.linesize, 0, decoded.height, rgb_frame.data, rgb_frame.linesize);
And like I said earlier, sometimes it works fine and I get the expected frame back. But sometimes I get something like this:
[Image: weird result]
I saved the images as .ppm files for quick visual comparison. I used this method, which basically writes the bytes to a file with a simple .ppm header:
fn save_file(frame: &Video, index: usize) -> std::result::Result<(), std::io::Error>
{
    let mut file = File::create(format!("frame{}.ppm", index))?;
    file.write_all(format!("P6\n{} {}\n255\n", frame.width(), frame.height()).as_bytes())?;
    file.write_all(frame.data(0))?;
    Ok(())
}
Here you can see that on the left side there is a good image result vs. on the right side there is a bad image result.
[Image: comparison of the .ppm files]
Now to the question:
Why is this happening? I tested everything on my side and the only thing left is the ffmpeg conversion. FFMPEG seems to convert these two test files differently even though it reports YUV420P as the format for both. I cannot figure out what the difference may be...
Here is the info for the two video files I used:
Good video file:
General
Complete name : /mnt/smb/Snapchat-174933781.mp4
Format : MPEG-4
Format profile : Base Media / Version 2
Codec ID : mp42 (isom/mp42)
File size : 1.90 MiB
Duration : 9 s 612 ms
Overall bit rate : 1 661 kb/s
Encoded date : UTC 2021-07-28 22:09:36
Tagged date : UTC 2021-07-28 22:09:36
eng : -180.00
Video
ID : 512
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L3.1
Format settings : CABAC / 1 Ref Frames
Format settings, CABAC : Yes
Format settings, Reference frames : 1 frame
Format settings, GOP : M=1, N=30
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 9 s 598 ms
Bit rate : 1 597 kb/s
Width : 480 pixels
Height : 944 pixels
Display aspect ratio : 0.508
Frame rate mode : Variable
Frame rate : 29.797 FPS
Minimum frame rate : 15.000 FPS
Maximum frame rate : 30.000 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.118
Stream size : 1.83 MiB (96%)
Title : Snap Video
Language : English
Encoded date : UTC 2021-07-28 22:09:36
Tagged date : UTC 2021-07-28 22:09:36
Color range : Full
colour_range_Original : Limited
Color primaries : BT.709
Transfer characteristics : BT.601
transfer_characteristics_Original : BT.709
Matrix coefficients : BT.709
Codec configuration box : avcC
Audio
ID : 256
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 9 s 612 ms
Bit rate mode : Constant
Bit rate : 62.0 kb/s
Channel(s) : 1 channel
Channel layout : C
Sampling rate : 44.1 kHz
Frame rate : 43.066 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 73.3 KiB (4%)
Title : Snap Audio
Language : English
Encoded date : UTC 2021-07-28 22:09:36
Tagged date : UTC 2021-07-28 22:09:36
Bad video file:
General
Complete name : /mnt/smb/Snapchat-1989594918.mp4
Format : MPEG-4
Format profile : Base Media / Version 2
Codec ID : mp42 (isom/mp42)
File size : 2.97 MiB
Duration : 6 s 313 ms
Overall bit rate : 3 948 kb/s
Encoded date : UTC 2019-07-11 06:43:04
Tagged date : UTC 2019-07-11 06:43:04
com.android.version : 9
Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Baseline@L3.1
Format settings : 1 Ref Frames
Format settings, CABAC : No
Format settings, Reference frames : 1 frame
Format settings, GOP : M=1, N=30
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 6 s 313 ms
Bit rate : 3 945 kb/s
Width : 496 pixels
Height : 960 pixels
Display aspect ratio : 0.517
Frame rate mode : Variable
Frame rate : 29.306 FPS
Minimum frame rate : 19.767 FPS
Maximum frame rate : 39.508 FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.283
Stream size : 2.97 MiB (100%)
Title : VideoHandle
Language : English
Encoded date : UTC 2019-07-11 06:43:04
Tagged date : UTC 2019-07-11 06:43:04
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709
Codec configuration box : avcC
Or as a diff image: [image: diff]
The problem is that I am not that familiar with ffmpeg yet, so I don't know all the quirks it has.
I hope someone can point me in the right direction.
Thanks to the suggestions of @SuRGeoNix and @Jmb I played around with the linesize and width of the input.
After a bit I learned that ffmpeg wants its data aligned to 32 to perform optimally. So I adjusted the scaler to scale to a width that is a multiple of 32, and the output is fine now.
ffmpeg::init().unwrap();
let in_ctx = input(&Path::new(source)).unwrap();
let input = in_ctx
    .streams()
    .best(Type::Video)
    .ok_or(ffmpeg::Error::StreamNotFound)?;
let decoder = input.codec().decoder().video()?;
// Round up to the next width that is divisible by 32
let width = if decoder.width() % 32 != 0 {
    decoder.width() + 32 - (decoder.width() % 32)
} else {
    decoder.width()
};
let scaler = Context::get(
    decoder.format(),
    decoder.width(),
    decoder.height(),
    Pixel::RGB24,
    width, // Use the calculated width here
    decoder.height(),
    Flags::FULL_CHR_H_INT | Flags::ACCURATE_RND,
)?;
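The rounding used in the answer above, together with an alternative that keeps the original width, can be sketched in Python. This is only an illustration and it rests on an assumption: that the distorted image comes from per-row padding, i.e. the RGB24 linesize being larger than width × 3 bytes, so that writing frame.data(0) verbatim also writes the padding. The helper names align_up and strip_row_padding are hypothetical and not part of any FFmpeg binding. Note that 480 × 3 = 1440 is a multiple of 32 while 496 × 3 = 1488 is not, which would be consistent with only the 496-pixel-wide file being affected.

```python
def align_up(value, alignment=32):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def strip_row_padding(data, width, height, linesize, bytes_per_pixel=3):
    """Copy only the visible pixels of each row out of a padded buffer.

    Assumption: rows are stored back to back, each occupying `linesize`
    bytes, of which only the first width * bytes_per_pixel are pixels.
    """
    row_bytes = width * bytes_per_pixel
    if linesize < row_bytes:
        raise ValueError("linesize is smaller than one row of pixels")
    out = bytearray()
    for y in range(height):
        start = y * linesize
        out += data[start:start + row_bytes]
    return bytes(out)
```

For the 496-pixel-wide file, align_up(496) gives 512, the same value the Rust code computes. With strip_row_padding the image could instead be written at its original width, provided the padding assumption holds.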

How do I record a JANUS signal as a wav file?

I am testing interoperability between modems. One of my modems supports JANUS, and I believe phy[3] of the UnetStack-based Subnero modem also supports JANUS. How can I send and record a JANUS signal that I can use for preliminary testing with the other modem? Can someone please provide a basic snippet?
UnetStack indeed has an implementation of JANUS that is, by default, configured on phy[3].
You can check this on your modem (the sample outputs here are from unet audio SDOAM, and so your modem parameters might vary somewhat):
> phy[3]
« PHY »
[org.arl.unet.phy.PhysicalChannelParam]
fec = 7
fecList ⤇ [LDPC1, LDPC2, LDPC3, LDPC4, LDPC5, LDPC6, ICONV2]
frameDuration ⤇ 1.1
frameLength = 8
janus = true
[org.arl.yoda.FhbfskParam]
chiplen = 1
fmin = 9520.0
fstep = 160.0
hops = 13
scrambler = 0
sync = true
tukey = true
[org.arl.yoda.ModemChannelParam]
modulation = fhbfsk
preamble = (2400 samples)
threshold = 0.0
(I have dropped a few parameters that are not relevant to the discussion here to keep the output concise)
The key parameters to take note of:
modulation = fhbfsk and janus = true set up the modulation for JANUS
fmin = 9520.0, fstep = 160.0 and hops = 13 are the modulation parameters to set up fhbfsk as required by JANUS
fec = 7 chooses ICONV2 from the fecList, as required by JANUS
threshold = 0.0 indicates that reception of JANUS frames is disabled
NOTE: If your modem is a Subnero M25 series, the standard JANUS band is out of the modem's ~20-30 kHz operating band. In that case, the JANUS scheme is auto-configured to a higher frequency (which you will see as fmin in your modem). Do note that this frequency is important to match for interop with any other modem that might support JANUS at a higher frequency band.
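To see which band a given fmin implies, the tone frequencies can be tabulated with a small sketch. It rests on assumptions that are my reading of the parameters rather than something stated above: that fmin is the lowest tone, that tones are spaced fstep apart, and that each of the hops slots uses a pair of tones (2 × hops tones in total). The function names are hypothetical.

```python
def fhbfsk_tones(fmin, fstep, hops, tones_per_hop=2):
    """List tone frequencies in Hz.

    Assumption: tones start at fmin and are fstep apart, with
    tones_per_hop tones for each of the `hops` slots.
    """
    count = hops * tones_per_hop
    return [fmin + k * fstep for k in range(count)]


def band_summary(tones):
    """Return (lowest tone, highest tone, midpoint) in Hz."""
    lowest, highest = tones[0], tones[-1]
    return lowest, highest, (lowest + highest) / 2
```

Under these assumptions, fmin = 9520.0, fstep = 160.0 and hops = 13 give 26 tones from 9520 Hz to 13520 Hz with a midpoint of 11520 Hz, while fmin = 12000.0 shifts the same pattern to 12000 Hz through 16000 Hz with a midpoint of 14000 Hz. Both modems would need to agree on these values for interop.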
To enable JANUS reception, you need to:
phy[3].threshold = 0.3
To avoid any other detections from CONTROL and DATA packets, we might want to disable those:
phy[1].threshold = 0
phy[2].threshold = 0
At this point, you could make a transmission by typing phy << new TxJanusFrameReq() and put a hydrophone next to the modem to record the transmitted signal as a wav file.
However, I'm assuming you would prefer to record on the modem itself, rather than with an external hydrophone. To do that, you can enable the loopback mode on the modem, and set up the modem to record the received signal:
phy.loopback = true // enable loopback
phy.fullduplex = true // enable full duplex so we can record while transmitting
phy[3].basebandRx = true // enable capture of received baseband signal
subscribe phy // show notifications from phy on shell
Now if you do a transmission, you should see a RxBasebandSignalNtf with the captured signal:
> phy << new TxJanusFrameReq()
AGREE
phy >> RxFrameStartNtf:INFORM[type:#3 rxTime:492455709 rxDuration:1100000 detector:0.96]
phy >> TxFrameNtf:INFORM[type:#3 txTime:492456016]
phy >> RxJanusFrameNtf:INFORM[type:#3 classUserID:0 appType:0 appData:0 mobility:false canForward:true txRxFlag:true rxTime:492455708 rssi:-44.2 cfo:0.0]
phy >> RxBasebandSignalNtf:INFORM[adc:1 rxTime:492455708 rssi:-44.2 preamble:3 fc:12000.0 fs:12000.0 (13200 baseband samples)]
That notification has your signal in baseband complex format. You can save it to a file:
save 'x.txt', ntf.signal, 2
To convert to a wav file, you'll need to load this signal and convert to passband. Here's some example Python code to do this:
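import numpy as np
import scipy.io.wavfile as wav
import arlpy.signal as asig
# load the saved baseband signal (column 0 is used as the real part, column 1 as the imaginary part)
x = np.genfromtxt('x.txt', delimiter=',')
x = x[:,0] + 1j * x[:,1]
# convert from baseband (fd = 12000, fc = 12000) to passband at 96 kSa/s
x = asig.bb2pb(x, 12000, 12000, 96000)
# write the real passband signal as a wav file
wav.write('x.wav', 96000, x)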
NOTE: You will need to replace the fd and fc values (both 12000 here) with the fs and fc fields, respectively, from your modem's RxBasebandSignalNtf. For Unet audio, it is 12000 for both, but for Subnero M25 series modems it is probably 24000.
Now you have your wav file at 96 kSa/s!
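If the tool that will consume the file expects 16-bit PCM rather than floating-point samples (an assumption about your other modem's software, not something UnetStack requires), the real passband samples can be scaled and written with the Python standard library alone. The helper names to_pcm16 and write_wav16 are hypothetical; this is a minimal sketch, not a replacement for the scipy call above.

```python
import struct
import wave


def to_pcm16(samples):
    """Scale real-valued samples so that the peak maps to full-scale int16."""
    peak = max((abs(s) for s in samples), default=0.0)
    scale = 32767.0 / peak if peak > 0 else 0.0
    return [int(round(s * scale)) for s in samples]


def write_wav16(path, samples, fs):
    """Write mono 16-bit PCM at sampling rate fs (in Sa/s)."""
    pcm = to_pcm16(samples)
    with wave.open(path, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 2 bytes per sample = 16 bits
        w.setframerate(fs)
        w.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
```

With the passband array from the code above, this could be called as write_wav16('x16.wav', x.tolist(), 96000). Note that peak normalization discards the absolute signal level, so use it only when the level does not matter for your test.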
You could also plot a spectrogram to check if you wanted to:
import arlpy.plot as plt
plt.specgram(x, fs=96000)
I have an issue while recording the signal: the modem refuses to send the JANUS frame. It looks like something is not set correctly on my end, specifically fmin = 12000.0, fstep = 160.0 and hops = 13. The actual modem won't let me set fmin to 9520.0 and automatically configures the lowest value, fmin = 12000. How can I calculate the corresponding parameters for fmin = 12000?
Your suggestion does work on Unet audio, though.
Here are my modem logs:
> phy[3]
« PHY »
[org.arl.unet.DatagramParam]
MTU ⤇ 0
RTU ⤇ 0
[org.arl.unet.phy.PhysicalChannelParam]
dataRate ⤇ 64.0
errorDetection ⤇ true
fec = 7
fecList ⤇ [LDPC1, LDPC2, LDPC3, LDPC4, LDPC5, LDPC6, ICONV2]
frameDuration ⤇ 1.0
frameLength = 8
janus = true
llr = false
maxFrameLength ⤇ 56
powerLevel = -10.0
[org.arl.yoda.FhbfskParam]
chiplen = 1
fmin = 12000.0
fstep = 160.0
hops = 13
scrambler = 0
sync = true
tukey = true
[org.arl.yoda.ModemChannelParam]
basebandExtra = 0
basebandRx = true
modulation = fhbfsk
preamble = (2400 samples)
test = false
threshold = 0.3
valid ⤇ false
> phy << new TxJanusFrameReq()
REFUSE: Frame type not setup correctly
phy >> FAILURE: Timed out

FFmpeg (C/libav) VPX to mpeg2video stream cannot be reproduced in VLC

I am currently trying to transcode a VPX (VP8/VP9) video to mpeg2video and stream it over UDP with mpegts.
I have initialized all of the contexts and the streams, and as long as I stream to ffplay it works. If I send the stream to VLC or another player, the receiver only displays the first frame and does nothing else. If I do the same thing through the command line, it works flawlessly: ffmpeg -re -i video.webm -an -f mpegts udp://127.0.0.1:8080
My output context:
this->output_codec_ctx_->codec_type = AVMEDIA_TYPE_VIDEO; // Set media type
this->output_codec_ctx_->pix_fmt = AV_PIX_FMT_YUV420P; // Set stream pixel format
this->output_codec_ctx_->time_base.den = ceil(av_q2d(input_stream->r_frame_rate)); // Add the real video framerate. Eg.: 29.9
this->output_codec_ctx_->time_base.num = 1; // Numerator of the framerate. Eg.: num/29.9
this->output_codec_ctx_->width = input_stream->codecpar->width; // Video width
this->output_codec_ctx_->height = input_stream->codecpar->height; // Video height
this->output_codec_ctx_->bit_rate = 400000; // Video quality
this->output_codec_ctx_->gop_size = 12;
this->output_codec_ctx_->max_b_frames = 2;
this->output_codec_ctx_->framerate = this->input_codec_ctx_->framerate;
this->output_codec_ctx_->sample_aspect_ratio = this->input_codec_ctx_->sample_aspect_ratio;
My av_dump:
Output #0, mpegts, to 'udp://127.0.0.1:20010':
Metadata:
encoder : Lavf57.72.101
Stream #0:0: Video: mpeg2video (Main), 1 reference frame, yuv420p, 480x640 (0x0), q=2-31, 400 kb/s, SAR 1:1 DAR 3:4, 24 fps, 24 tbr, 90k tbn
FFMPEG av_dump:
Output #0, mpegts, to 'udp://127.0.0.1:20010':
Metadata:
title : Tears of Steel
encoder : Lavf57.72.101
Stream #0:0: Video: mpeg2video (Main), yuv420p, 480x640 [SAR 1:1 DAR 3:4], q=2-31, 200 kb/s, 24 fps, 90k tbn, 24 tbc (default)
Metadata:
encoder : Lavc57.96.101 mpeg2video
Side data:
cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: -1
Any idea what I may be doing wrong?

Converting mp4 to mpeg ts, video and audio plays too fast

I am converting an MP4 file to MPEG TS format, and though my code has started to produce video files, the video and audio are running at superspeed. Running avconv -i (same as ffmpeg -i) on the output file, I get the following (180 fps!):
Input #0, mpegts, from 'mpegtest_result.ts':
Duration: 00:01:56.05, start: 0.011111, bitrate: 6356 kb/s
Program 1
Metadata:
service_name : Service01
service_provider: Libav
Stream #0.0[0x100]: Video: h264 (Main), yuv420p, 1280x720 [PAR 1:1 DAR 16:9], 180 fps, 90k tbn, 47.95 tbc
Stream #0.1[0x101]: Audio: aac, 48000 Hz, stereo, fltp, 126 kb/s
Currently, in my code, I do not alter the PTS or DTS value of the packet, and I am pretty sure that is what is messing up my video. The only thing I alter is the time_base through this piece of code (the variables should speak for themselves):
if (av_q2d(input_codec_context->time_base) * input_codec_context->ticks_per_frame > av_q2d(input_stream->time_base) && av_q2d(input_stream->time_base) < 1.0/1000) {
    output_codec_context->time_base = input_codec_context->time_base;
    output_codec_context->time_base.num *= input_codec_context->ticks_per_frame;
}
else {
    output_codec_context->time_base = input_stream->time_base;
}
I am aware that I should probably be calling packet.pts = av_rescale_q(...), but I am unsure which time_bases / values I should rescale between.
The full code can be seen here http://pastebin.com/CHvrvc3G.
For my input/output (code line 189+190) I get the following output:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'testvideo.mp4':
Metadata:
major_brand : M4V
minor_version : 1
compatible_brands: isomiso2avc1mp41M4A M4V mp42
encoder : Lavf54.63.100
Duration: 00:07:15.41, start: 0.000000, bitrate: 1546 kb/s
Stream #0.0(eng): Video: h264 (Main), yuv420p, 1280x720 [PAR 1:1 DAR 16:9], 1416 kb/s, 23.98 fps, 11988 tbn, 47.95 tbc
Stream #0.1(und): Audio: aac, 48000 Hz, stereo, fltp, 127 kb/s
Metadata:
creation_time : 2013-05-09 14:37:22
Output #0, mpegts, to 'mpegtest':
Stream #0.0: Video: libx264, yuv420p, 1280x720, q=2-31, 1416 kb/s, 90k tbn, 23.98 tbc
Stream #0.1: Audio: libfaac, 48000 Hz, stereo, 127 kb/s
If you're not doing any rescaling, then it's no wonder the timestamps are messed up.
Timestamps in the packets you send to the muxer must be in the stream timebase (AVStream.time_base). The API semantics right now is such that you set the codec timebase (AVStream.codec.time_base) before writing the header and then the muxer chooses the stream timebase. It may or may not use the codec timebase you set.
Timestamps in the packets you get from the demuxer are also in the stream timebase, so you should call av_rescale_q(pts/dts/duration, input_stream->time_base, output_stream->time_base).
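The arithmetic behind that rescaling can be sketched in Python. This is not libav code: rescale_q here is a hypothetical stand-in that mimics what I understand av_rescale_q to compute (value × source time base ÷ destination time base, rounded to the nearest integer with halves away from zero), using exact fractions.

```python
from fractions import Fraction


def rescale_q(value, src_tb, dst_tb):
    """Convert a timestamp from src_tb units to dst_tb units.

    Rounds to the nearest integer, with halves away from zero
    (assumed to match av_rescale_q's default rounding).
    """
    scaled = Fraction(value) * src_tb / dst_tb
    sign = -1 if scaled < 0 else 1
    return sign * int(abs(scaled) + Fraction(1, 2))


# Time bases taken from the dumps in the question: 11988 tbn in, 90k tbn out.
INPUT_TB = Fraction(1, 11988)
OUTPUT_TB = Fraction(1, 90000)
```

With these time bases, a duration of 500 input ticks (about one frame at 23.98 fps) becomes rescale_q(500, INPUT_TB, OUTPUT_TB) = 3754 output ticks. Left unscaled, timestamps would advance 90000 / 11988 ≈ 7.5 times too fast, which would be consistent with the 180 fps reported in the question (23.98 × 7.5 ≈ 180).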

Resources