Good day,
I have a problem. I need to get the correct frame rate from the ffmpeg libs.
I tried to use
pFormatCtx->streams[videoStream]->avg_frame_rate.num
The value returned is 2997. But when I dumped the meta info, I got:
Input #0, avi, from '/test.avi':
Metadata:
encoder : MEncoder SVN-r33883(20110719-gcc4.5.2)
Duration: 00:49:47.70, start: 0.000000, bitrate: 1294 kb/s
Stream #0:0: Video: mpeg4 (Advanced Simple Profile) (XVID / 0x44495658), yuv420p, 856x480 [SAR 1:1 DAR 107:60], 1090 kb/s, SAR 491520:492521 DAR 8192:4603, 23.98 fps, 23.98 tbr, 23.98 tbn, 23.98 tbc
Stream #0:1: Audio: mp3 (U[0][0][0] / 0x0055), 48000 Hz, stereo, s16p, 192 kb/s
2015-09-20 15:47:02.377 TV3[21607:769601] ready to start audio
Here the frame rate is reported as 23.98 fps. Which value is correct, and why are they different?
So, what's in pFormatCtx->streams[videoStream]->avg_frame_rate.den?
I bet it's 125 then. AVStream::avg_frame_rate is of type AVRational, a structure holding a rational number as a fraction. To get a decimal value, you have to divide num by den.
-> 2997 / 125 = 23.976
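As a quick check of the arithmetic above, here is a minimal sketch in Python. It is not the libav API itself: rational_to_fps is a hypothetical helper, and treating a zero denominator as "unknown" is an assumption of this sketch. In C, libavutil's av_q2d() does the same num/den division on an AVRational.

```python
from fractions import Fraction


def rational_to_fps(num: int, den: int) -> float:
    """Convert a num/den pair (as held in an AVRational) to a decimal rate."""
    if den == 0:
        # Assumption for this sketch: a zero denominator means the rate is unknown.
        return 0.0
    return float(Fraction(num, den))


fps = rational_to_fps(2997, 125)
print(f"{fps:.3f}")  # 23.976
```

Reading only num (2997) without den is what produced the confusing value in the question.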
I have a long text file that I am importing with numpy's genfromtxt:
00:00:01 W 348 18.2 55.9 049 1008.8 0.000
00:00:02 W 012 12.5 55.9 049 1008.8 0.000
00:00:03 W 012 12.5 55.9 049 1008.8 0.000
00:00:04 W 357 18.2 55.9 049 1008.8 0.000
00:00:05 W 357 18.2 55.9 049 1008.8 0.000
00:00:06 W 339 17.6 55.9 049 1008.8 0.000
testdata = np.genfromtxt(itertools.islice(f_in, 0, None, 60),\
names=('time','ew','d12','s12','t12','p12'.....)
time = (testdata['time'])
This organizes all the data into an array. The first column of data in the file is a timestamp for each row. In the text file it is formatted as 00:00:00, i.e. %H:%M:%S. However, in the array that is actually generated, it becomes 1900-01-01 00:00:00. When plotting my data against time, I cannot get it to drop the Y-m-d part.
I have tried time = time.strftime('%H:%M:%S')
and
dt.datetime.strptime(time.decode('ascii'), '%H:%M:%S')
Neither has any effect. How can I make my whole array of times keep the original %H:%M:%S format without the %Y-%m-%d being added?
EDIT: based on data provided, you can import your file like this:
str2date = lambda x: datetime.strptime(x.decode("utf-8"), '%H:%M:%S').time()
data = np.genfromtxt(itertools.islice(f_in, 0, None, 60), dtype=None,names=('time','ew','d12','s12','t12','p12'.....), delimiter=' ', converters = {0: str2date})
print(data['time'])
output:
00:00:01
Note that you need to .decode("utf-8") the input inside str2date, since the converter receives bytes. You can set the dtype in np.genfromtxt() according to your specific file content.
You can also use this if your data is already a str in the right format:
dt.datetime.strptime(time,"%H:%M:%S").time()
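The idea behind the converter can be shown without numpy. The following is a minimal standard-library sketch, assuming whitespace-separated rows like the ones in the question; parse_times is a hypothetical helper, not part of numpy. Calling .time() on the parsed value is what drops the 1900-01-01 default date.

```python
from datetime import datetime

# Three rows copied from the question's sample data.
sample = """\
00:00:01 W 348 18.2 55.9 049 1008.8 0.000
00:00:02 W 012 12.5 55.9 049 1008.8 0.000
00:00:03 W 012 12.5 55.9 049 1008.8 0.000
"""


def parse_times(text):
    """Return the first column of each row as datetime.time objects."""
    times = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        first_field = line.split()[0]
        # strptime fills in 1900-01-01 for the missing date; .time() discards it.
        times.append(datetime.strptime(first_field, "%H:%M:%S").time())
    return times


times = parse_times(sample)
labels = [t.strftime("%H:%M:%S") for t in times]
print(labels)  # ['00:00:01', '00:00:02', '00:00:03']
```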
I used Wireshark to capture a video call between two of my IP call clients, then generated video streams with the videosnarf tool by giving it the pcap as input. But when I try to play them with ffplay, they do not play.
ffplay H264-media-4.264
ffplay version 4.1.3-0york1~16.04 Copyright (c) 2003-2019 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.11) 20160609
configuration: --prefix=/usr --extra-version='0york1~16.04' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-nonfree --enable-libfdk-aac --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
libavutil 56. 22.100 / 56. 22.100
libavcodec 58. 35.100 / 58. 35.100
libavformat 58. 20.100 / 58. 20.100
libavdevice 58. 5.100 / 58. 5.100
libavfilter 7. 40.101 / 7. 40.101
libavresample 4. 0. 0 / 4. 0. 0
libswscale 5. 3.100 / 5. 3.100
libswresample 3. 3.100 / 3. 3.100
libpostproc 55. 3.100 / 55. 3.100
[h264 @ 0x7f3158000940] Format h264 detected only with low score of 1, misdetection possible!
[h264 @ 0x7f31580020c0] missing picture in access unit with size 1853721
[AVBSFContext @ 0x7f3158009500] Invalid NAL unit 0, skipping.
Last message repeated 648 times
[AVBSFContext @ 0x7f3158009500] Invalid NAL unit 0, skipping.=0/0
Last message repeated 1924 times
[h264 @ 0x7f31580020c0] Invalid NAL unit 0, skipping.
Last message repeated 130 times
[h264 @ 0x7f31580020c0] Invalid NAL unit 0, skipping. 0B f=0/0
Last message repeated 2442 times
[h264 @ 0x7f31580020c0] no frame!
[h264 @ 0x7f3158000940] decoding for stream 0 failed
[h264 @ 0x7f3158000940] Could not find codec parameters for stream 0 (Video: h264, none): unspecified size
Consider increasing the value for the 'analyzeduration' and 'probesize' options
Input #0, h264, from 'H264-media-4.264':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: h264, none, 25 tbr, 1200k tbn, 50 tbc
[h264 @ 0x7f3158003700] Invalid NAL unit 0, skipping.
Last message repeated 2573 times
[h264 @ 0x7f3158003700] no frame!
The file H264-media-4.264 does not play in VLC on Windows either.
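Since the log keeps reporting "Invalid NAL unit 0", one way to see what the file actually contains is to count the NAL unit types that follow each start code. The sketch below assumes the file is meant to be an H.264 Annex B byte stream (start code 00 00 01, then a one-byte NAL header whose low five bits are the unit type); nal_unit_type_counts is a hypothetical helper, and it only diagnoses the file, it does not repair it.

```python
from collections import Counter

START_CODE = b"\x00\x00\x01"


def nal_unit_type_counts(data: bytes) -> Counter:
    """Count nal_unit_type values found after each Annex B start code."""
    counts = Counter()
    pos = 0
    while True:
        pos = data.find(START_CODE, pos)
        if pos < 0 or pos + 3 >= len(data):
            break
        header = data[pos + 3]
        counts[header & 0x1F] += 1  # low 5 bits of the NAL header
        pos += 3
    return counts


# Synthetic demo bytes: SPS (7), PPS (8), IDR slice (5), then a type-0 unit.
demo = (
    b"\x00\x00\x00\x01\x67\xaa"
    b"\x00\x00\x01\x68\xbb"
    b"\x00\x00\x01\x65\xcc"
    b"\x00\x00\x01\x00\xdd"
)
print(dict(nal_unit_type_counts(demo)))
```

To inspect the real file, read it with open("H264-media-4.264", "rb") and pass the bytes to the same function. A playable stream would normally contain types 7 (SPS) and 8 (PPS) plus slice types 1 and 5; if type 0 dominates, the reconstructed file is probably not a valid byte stream.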
I'm reading some (apparently) large grib files using xarray. I say 'apparently' because they're ~100MB each, which doesn't seem too big to me. However, running
import xarray as xr
ds = xr.open_dataset("gribfile.grib", engine="cfgrib")
takes a good 5-10 minutes. Worse, reading one of these takes up almost 4GB of RAM, which surprises me given the lazy loading that xarray is supposed to do, not least because this is 40-odd times the size of the original file!
This reading time and RAM usage seem excessive and won't scale to the 24 files I have to read.
I've tried using dask and xr.open_mfdataset, but this doesn't seem to help when the individual files are so large. Any suggestions?
Addendum:
dataset looks like this once opened:
<xarray.Dataset>
Dimensions: (latitude: 10, longitude: 10, number: 50, step: 53, time: 45)
Coordinates:
* number (number) int64 1 2 3 4 5 6 7 8 9 ... 42 43 44 45 46 47 48 49 50
* time (time) datetime64[ns] 2011-01-02 2011-01-04 ... 2011-03-31
* step (step) timedelta64[ns] 0 days 00:00:00 ... 7 days 00:00:00
surface int64 0
* latitude (latitude) float64 56.0 55.0 54.0 53.0 ... 50.0 49.0 48.0 47.0
* longitude (longitude) float64 6.0 7.0 8.0 9.0 10.0 ... 12.0 13.0 14.0 15.0
valid_time (time, step) datetime64[ns] 2011-01-02 ... 2011-04-07
Data variables:
u100 (number, time, step, latitude, longitude) float32 6.389208 ... 1.9880934
v100 (number, time, step, latitude, longitude) float32 -13.548858 ... -3.5112982
Attributes:
GRIB_edition: 1
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: 0
history: GRIB to CDM+CF via cfgrib-0.9.4.2/ecCodes-2.9.2 ...
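A back-of-the-envelope check using the dimensions printed above: the two float32 variables together hold roughly 95 MB of raw values, which is about the size of the file on disk. This sketch ignores coordinates and any container overhead, so it is only an estimate, but it suggests the 4GB is not accounted for by the data values alone.

```python
from math import prod

# Dimension sizes copied from the dataset summary above.
dims = {"number": 50, "time": 45, "step": 53, "latitude": 10, "longitude": 10}
n_variables = 2       # u100 and v100
bytes_per_value = 4   # float32

values_per_variable = prod(dims.values())
total_bytes = values_per_variable * bytes_per_value * n_variables

print(values_per_variable)          # 11925000
print(round(total_bytes / 1e6, 1))  # size in MB (decimal)
```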
I've temporarily got around the issue by reading in the grib files, one-by-one, and writing them to disk as netcdf. xarray then handles the netcdf files as expected. Obviously it would be nice to not have to do this because it takes ages - I've only done this for 4 so far.
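The workaround described above might look like the following sketch. It assumes xarray and cfgrib are installed; netcdf_path_for and convert_grib_to_netcdf are hypothetical helper names, and writing the .nc file next to the original is just one possible layout. Skipping files that already exist means the slow conversion only has to be paid once per file.

```python
from pathlib import Path


def netcdf_path_for(grib_path) -> Path:
    """Target path: same location and name, with a .nc suffix."""
    return Path(grib_path).with_suffix(".nc")


def convert_grib_to_netcdf(grib_paths):
    """Convert each GRIB file to netCDF once; return the paths written."""
    import xarray as xr  # imported here so the path helper works without xarray

    written = []
    for grib_path in grib_paths:
        target = netcdf_path_for(grib_path)
        if target.exists():
            continue  # already converted on an earlier run
        ds = xr.open_dataset(grib_path, engine="cfgrib")
        try:
            ds.to_netcdf(target)
        finally:
            ds.close()
        written.append(target)
    return written


print(netcdf_path_for("gribfile.grib"))  # gribfile.nc
```

The converted files can then be opened together with xr.open_mfdataset, as mentioned earlier in the question.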
I am a new PocketSphinx user. I just followed the official getting started guide.
However, I ran into difficulties at this step:
"To test the installation, run pocketsphinx_continuous -inmic yes and check that it recognizes words you speak into your microphone."
I've attached the terminal output I get when I run this command, which ultimately ends in 'Segmentation fault: 11'.
Any help would be greatly appreciated.
Thanks,
Nakul
nakul : ~ 101 $ pocketsphinx_continuous -inmic yes
INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/en-us/en-us/feat.params
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2.000000e+00
-allphone
-allphone_ci no no
-alpha 0.97 9.700000e-01
-ascale 20.0 2.000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-ceplen 13 13
-cmn live batch
-cmninit 40,3,-1 41.00,-5.29,-0.12,5.09,2.48,-4.07,-1.37,-1.78,-5.08,-2.05,-6.45,-1.42,1.17
-compallsen no no
-debug 0
-dict /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /usr/local/share/pocketsphinx/model/en-us/en-us
-input_endian little little
-jsgf
-keyphrase
-kws
-kws_delay 10 10
-kws_plp 1e-1 1.000000e-01
-kws_threshold 1 1.000000e+00
-latsize 5000 5000
-lda
-ldadim 0 0
-lifter 0 22
-lm /usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin
-lmctl
-lmname
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.300000e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 6.500000e+00
-maxhmmpf 30000 30000
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-nwpen 1.0 1.000000e+00
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-10 1.000000e-10
-pl_pip 1.0 1.000000e+00
-pl_weight 3.0 3.000000e+00
-pl_window 5 5
-rawlogdir
-remove_dc no no
-remove_noise yes yes
-remove_silence yes yes
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec 0-12/13-25/26-38
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6.800000e+03
-uw 1.0 1.000000e+00
-vad_postspeech 50 50
-vad_prespeech 20 20
-vad_startspeech 10 10
-vad_threshold 2.0 2.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.562500e-02
INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(162): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /usr/local/share/pocketsphinx/model/en-us/en-us/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/en-us/en-us/mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(149): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/en-us/en-us/transition_matrices
INFO: acmod.c(113): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us/means
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us/variances
INFO: ms_gauden.c(242): 42 codebook, 3 feature, size:
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(244): 128x13
INFO: ms_gauden.c(304): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /usr/local/share/pocketsphinx/model/en-us/en-us/sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(838): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 138824 * 32 bytes (4338 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
INFO: dict.c(213): Dictionary size 134723, allocated 1016 KiB for strings, 1679 KiB for phones
INFO: dict.c(336): 134723 words read
INFO: dict.c(358): Reading filler dictionary: /usr/local/share/pocketsphinx/model/en-us/en-us/noisedict
INFO: dict.c(213): Dictionary size 134728, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 42672 bytes (41 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 42672 bytes (41 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
INFO: ngram_search_fwdtree.c(74): Initializing search tree
INFO: ngram_search_fwdtree.c(101): 791 unique initial diphones
INFO: ngram_search_fwdtree.c(186): Creating search channels
INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 152609
INFO: ngram_search_fwdtree.c(333): Created 723 root, 152481 non-root channels, 53 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(307): pocketsphinx_continuous COMPILED ON: Sep 6 2018, AT: 19:28:29
INFO: continuous.c(252): Ready....
Segmentation fault: 11
nakul : ~ 102 $ defaults write com.apple.finder AppleShowAllFiles YES.
I'm working with the TDA19988 HDMI framer and having trouble understanding how to translate the EDID info into a configuration for the framer output.
For example, from the EDID I can see the following parsed info:
1280x720 0x41 74.2MHZ
H : 1280 start 1390 end 1430 total 1650 clock 45.0KHZ
V : 720 start 725 end 730 total 750 clock 60.0HZ
Now, the HDMI framer allows the following to be configured:
refpix (preset pixel) = ?
refline (preset line) = ?
npix (number of input pixels) = ?
nline (number of input lines) = ?
vs_line_start_1 (vertical synchronization line start) = ?
vs_pix_start_1 (vertical synchronization pixel start) = ?
vs_line_end_1 (vertical synchronization line end) = ?
vs_pix_end_1 (vertical synchronization pixel end) = ?
hs_pix_start (horizontal synchronization pixel number) = ?
vwin_start_1 (vertical window start) = ?
vwin_end_1 (vertical window end) = ?
de_start (data enable start) = ?
de_end (data enable end) = ?
I haven't been able to understand how the EDID info is translated to configure the HDMI framer output. Can someone give me some help?
Thanks in advance!
I don't know too much about EDID, but since there is no answer yet, I'll explain what I know.
The TV signal comes one pixel at a time, from left to right and from top to bottom. The pixel frequency is 74.2 MHz, that is, there are 74.2 million pixels per second.
Each line is composed of 1650 pixels, that makes 74.2M / 1650 = 45K lines in a second. That's the 45.0KHz.
Then, each frame is made of 750 lines. That is 45K / 750 = 60 frames per second. That's the 60.0Hz.
From each line of 1650 pixels, only the first 1280 pixels are used for actual pixels in the image. From pixel 1390 to 1430 there is the horizontal synchronization signal. From 1280 to 1390 and from 1430 to 1650 there are unused pixels (HBlank).
And from each frame of 750 lines, only the first 720 are used for actual pixels. From 725 to 730 there is the vertical synchronization signal. Ranges 720-725 and 730-750 are also unused (VBlank).
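The arithmetic above can be written out as a small sketch. The numbers are the ones from the parsed EDID in the question; the AxisTiming class and the names front porch, sync width and back porch are the usual video-timing terms, not something taken from the TDA19988 documentation, and the sketch does not attempt to map anything onto the framer's registers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisTiming:
    """Timing along one axis, in pixels (horizontal) or lines (vertical)."""
    active: int
    sync_start: int
    sync_end: int
    total: int

    @property
    def front_porch(self) -> int:
        return self.sync_start - self.active

    @property
    def sync_width(self) -> int:
        return self.sync_end - self.sync_start

    @property
    def back_porch(self) -> int:
        return self.total - self.sync_end

    @property
    def blanking(self) -> int:
        return self.total - self.active


PIXEL_CLOCK_HZ = 74.2e6  # as printed in the EDID dump

h = AxisTiming(active=1280, sync_start=1390, sync_end=1430, total=1650)
v = AxisTiming(active=720, sync_start=725, sync_end=730, total=750)

line_rate_hz = PIXEL_CLOCK_HZ / h.total   # lines per second
frame_rate_hz = line_rate_hz / v.total    # frames per second

print(f"line rate  {line_rate_hz / 1000:.1f} kHz")
print(f"frame rate {frame_rate_hz:.1f} Hz")
print("H porches/sync:", h.front_porch, h.sync_width, h.back_porch)
print("V porches/sync:", v.front_porch, v.sync_width, v.back_porch)
```

With these inputs the horizontal blanking splits into 110 + 40 + 220 pixels and the vertical blanking into 5 + 5 + 20 lines, matching the HBlank and VBlank ranges described above.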
About your parameters, the *start* and *end* parameters should be quite obvious. The other ones... well, I don't know.