TensorFlow learn.Estimator : is it naive to call fit() many times? Because I get ResourceExhaustedError - loops

I am learning machine learning using TensorFlow. I have been through a couple of tutorials but I still have a hard time trying to find what are the good ways of training a model. Recently I implemented a CNN model I found in the litterature. The model must take a crop of a certain size centered on a given pixel and predict the label of this pixel. It does that for each pixel of the image. I used:
classifier = tf.learn.Estimator(model_fn=cnn_model_fn, model_dir="./cnn")
with cnn_model_fn beeing a function I implemented.
For each training image, we take 3000 crops randomly, so I can't load all theses images and their crops to memory. The way I found is by loading one image at a time, extract the 3000 crops and then call classifier.fit() to train on the 3000 crops. Then loop for each image in my dataset.
for i in range(len(filenames)):
...
image = misc.imread(filenames[i])
labels = misc.imread(groundTruth[i]) #labels for each pixels
input_classifier = preprocess(image,...) #crops 3000 images in image and do other things
input_labels = preprocess_labels(labels, ...) #take the corresponding 3000 labels
classifier.fit(x = input_classifier,
y = input_labels,
batch_size = 30
steps = 100)
It worked fine for 100 images, but if I try on the whole dataset (2000 images), it always stops and give an error of ResourceExhausted.
...
[everything goes well]
...
iteration :227/2000
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating
TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus
id: 0000:01:00.0)
INFO:tensorflow:Create CheckpointSaverHook.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating
TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus
id: 0000:01:00.0)
Traceback (most recent call last):
File "train-cnn.py", line 78, in <module>
classifier.fit(x= input_classifier, y=input_labels,batch_size=30, steps=100)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 280, in new_func
...
...
...
tensorflow.python.framework.errors_impl.ResourceExhaustedError: cnn/graph.pbtxt.tmp32bcc6311c164c29b91177d17d05d669
I don't see why it gets OOM... I have suspicions that it is because of the way I call fit() in loop. After each fit(), a ckpt is saved and it must be restored right after to train on the next image. So is it a bad way to train a model?

running estimator.fit in a loop with smaller steps is not a good idea. I would put all input logic into an input_fn. then run estimator.fit only once with more steps.
An example of reading data from different files can be found here: tf.contrib.learn.read_batch_examples

Related

What do I have to do to get the H.264 library to compress frames to 250Kb or under?

I'm using the H.264 library to compress a video frame by frame. It works, I can replay it back locally without any issue.
However, I need to send that video over the LAN and that LAN is rather busy already so I need to limit the size of each frame to a maximum of about 250Kb.
I use the following code to setup the parameters, but changing the bit rate values does not seem to have any effect on what the library does with the input frames:
x264_param_t param = {};
if(x264_param_default_preset(&param, "faster", nullptr) < 0)
{
return -1;
}
param.i_csp = X264_CSP_I420;
param.i_width = 3840;
param.i_height = 2160;
param.i_keyint_max = static_cast<int>(f_frame_header.f_fps);
param.i_threads = X264_THREADS_AUTO;
param.b_vfr_input = 0;
param.b_repeat_headers = 1;
param.b_annexb = 1;
// the following three parameters are the ones I tried to change with no results
param.rc.i_bitrate = 100000;
param.rc.i_vbv_max_bitrate = 100000;
param.rc.i_vbv_buffer_size = 125000;
if(x264_param_apply_profile(&param, "high") < 0)
{
return -1;
}
...enter loop reading frames and compressing them...
Changing the i_bitrate, i_vbv_max_bitrate and i_vbv_buffer_size parameters seems to have absolutely no effect on the size of the resulting frames. I still get some frames over 500Kb and in many even, rather large frames one after the other as the following sizes show:
20264
358875
218429
20728
25215
310230
36127
9077
29785
341541
222778
23542
21356
276772
25339
32459
421036
11179
6172
286070
193849
What I would need is the largest frame to be around 250,000 at its maximum. Now I understand that once in a while it go over a bit, but not 2×. That's just too much for my current available bandwidth.
What am I doing wrong in the parameters setup above?
I've seen this command line:
ffmpeg -i input -c:v libx264 -b:v 2M -maxrate 2M -bufsize 1M output.mp4
which would suggest that what I'm doing above should work (I tried all sorts of values including the ones one that command line). Yet the frame size does not really change between my runs.
I tried with a blur applied to each frame to see whether it work help. Yes! It did. The result is a movie which is 2.44 times smaller than the original.
To load each JPEG image from the original, I use ImageMagick++ (in C++), so I just do the following blur on each image:
image.blur(0.0, 5.0);
and that took about 10 hours total (without the blur the same processing took about 40 minutes) but it was worth it since in the end the compressed movie went from 1,293,272,023 bytes to only 529,556,265 bytes (2.44218 times smaller). The blur added about 3.3 seconds of processing per frame and there are a little over 11,000 frames in the original.
Note: I used 5.0 for the blur because I have 4K images and although I can see a sharp difference when I look at one frame, when playing back the resulting movie, I don't notice the final blur. If you have smaller images, you probably want to use a smaller number. It looks like many people use a blur of just 0.05 and already have good results in compression ratios.
In C, use the BlurImage() function:
Image *BlurImage(const Image *image,const double radius,
const double sigma,ExceptionInfo *exception)
Here are some references about using a blur to further compress JPEG images as it helps eliminates sharp edges which do not compress well in the JPEG format (as sharp edge are not as natural):
Recommendation for compressing JPG files with ImageMagick
How do I reduce the file size of an image? (search on "blur" to find the section)
Could I blur an image to dramatically reduce the file size?

Data output from the device

I am working on a project,in which I need to extract data from the device: InertialUnit.
I get a single value in real time, but I need data for the first 10 s and in 1 ms increments, or all the data for the entire cycle of the device. Please help me implement this if possible.
Webots controllers are like any other programs, so you can easily get the values of the inertial unit and save them in a file at each step.
Here is a very simple example in Python:
from controller import Robot
robot = Robot()
inertial_unit = robot.getInertialUnit('inertial unit')
inertial_unit.enable(10)
while robot.step(10) != -1:
values = inertial_unit.getValues()
with open('values.txt','a') as f:
f.write('\n'.join(values))

Increment by 5.005, sometimes 207, sometimes 196?

This problem deals with HLS playlists, and it may help to understand how HLS works before diving in.
HLS (HTTP Live Streaming) is a playlist, similar to that of an iTunes .m3u playlist. HLS takes a video file (such as an .mpg), and splits it into multiple, equal-length segment files (.ts — stands for Transport Stream). For simplicity, you can think of these segment files as chunks of the original .mpg file, which are to be played consecutively.
There are many ways to name these segment files. Sometimes you have…
file_segment_0.ts
file_segment_1.ts
file_segment_2.ts
But sometimes you have something like,
22/51/04.ts
22/51/09.ts
22/51/14.ts
(H/M/S)
The client (such as VLC) knows how to handle these files. It’s up to the producer to decide how they want to name their files.
An HLS playlist can also be “VOD” (Video On-Demand) or “Live”. If the playlist is “Live”, the client will jump to the current live time. Inside the playlist, a header will define the program’s (in terms of the streamed event) start datetime, like so:
#EXT-X-PROGRAM-DATE-TIME:2016-09-16T21:59:09+00:00
The playlist will also tell the client how far apart the segmented files are, in terms of seconds.
My issue falls with H/M/S format. You can find an example playlist here: http://pastebin.com/raw/rS84YJwN
The segments are 5.005s apart, as defined by #EXTINF:5.005.
At first glance, it doesn’t look so bad. Start at the EXT-X-PROGRAM-DATE-TIME, increment by 5.005, round accordingly, and format the date as H/M/S.ts
But there’s a bigger question: Why are there sometimes 207 segments between EXT-X-PROGRAM-DATE-TIME + 5.005, and why are there sometimes 196 segments between EXT-X-PROGRAM-DATE-TIME + 5.005?
My math tells me that I will increment by 6 (instead of 5) every 200 segments, which I can calculate as true with some quick and dirty ruby code[0], which produces this output:
139 22/14/40.ts
339 22/31/21.ts
539 22/48/02.ts
746 23/05/18.ts
946 23/21/59.ts
1146 23/38/40.ts
1346 23/55/21.ts
1402 00/00/01.ts
1542 00/11/42.ts
1742 00/28/23.ts
1942 00/45/04.ts
2142 01/01/45.ts
2342 01/18/26.ts
2542 01/35/07.ts
2742 01/51/48.ts
2942 02/08/29.ts
3142 02/25/10.ts
3342 02/41/51.ts
3542 02/58/32.ts
3749 03/15/48.ts
3949 03/32/29.ts
4149 03/49/10.ts
Where 139 is a line number, and 22/14/40.ts is the segment file.
My question is this: What’s going on here, and how can I reproduce it accurately? I obviously don’t/won’t have access to the actual input video file, and I need to rebuild these playlist files.
[0]
require 'date'
file = `curl 'http://pastebin.com/raw/rS84YJwN'`
start_date = DateTime.parse(file.scan(%r{#EXT-X-PROGRAM-DATE-TIME:(.*)$}).to_a.first.first)
lines = file.split("\n").select { |line| !line.index('.ts').nil? }
date_lines = []
lines.each_with_index do |line, i|
str = line.gsub("/", ':').split(".ts")[0]
date = "#{start_date.to_date} #{str}"
date_obj = DateTime.parse(date)
date_lines << date_obj
next unless i > 0
diff = date_obj.to_time - date_lines[i - 1].to_time
if diff != 5.0
puts "#{i} #{line}"
end
end

How to handle large files while reading it with python xlrd in GAE without giving DeadlineExceededError

I want to read a file having size 4 MB using python xlrd in GAE.
i am getting the file from Blobstore. Code used is given below.
book = xlrd.open_workbook(file_contents=temp_file)
sh = book.sheet_by_index(0)
for col_no in range(sh.ncols):
its gives me DeadlineExceededError.
book = xlrd.open_workbook(file_contents=file_data)
File "/base/data/home/apps/s~appid/app-version.369475363369053908/xlrd/__init__.py", line 416, in open_workbook
ragged_rows=ragged_rows,
File "/base/data/home/apps/s~appid/app-version.369475363369053908/xlrd/xlsx.py", line 756, in open_workbook_2007_xml
x12sheet.process_stream(zflo, heading)
File "/base/data/home/apps/s~appid/app-version.369475363369053908/xlrd/xlsx.py", line 520, in own_process_stream
for event, elem in ET.iterparse(stream):
DeadlineExceededError
But i am able to read files with smaller size.
Actually i need to get only first few rows(30 to 50) of the file. Is there any other method, other than adding it as a task and getting the details using channel API to get the details with out causing deadline error ?
What i can do to handle this....?
I read a file about 1000 rows excel and it works okay the library.
I leave a link that might be useful https://github.com/cjhendrix/HXLator-SpaceAppsVersion/blob/master/gae/main.py
the code I see that this crossing of columns and rows must be at lists for each row
example:
wb = xlrd.open_workbook(file_contents=inputfile.read())
sh = wb.sheet_by_index(0)
for rownum in range(sh.nrows):
val_row = sh.row_values(rownum)
#here print element of list
self.response.write(val_row[1]) #depending for number for columns
regards!!!

MediaElement.NaturalDuration is less than the actual duration of the audio

On some audio files the value of MediaElement.NaturalDuration is less than the actual duration of the audio. When I open the file in Windows Media Player the duration is correct (also when I look at the properties of the file). Although the value of the NaturalDuration property is incorrect, the audio is played fully, but at some point the value of the Position property becomes greater than the value of the NaturalDuration property, which, as I understand, should never happen.
I have created a simple application to reproduce the problem: https://skydrive.live.com/redir?resid=ACF8BFD4384116CE!2908&authkey=!AG-wF6Ae-7EAYk8
The duration of the audio file used in the application is 00:02:54, but the value of the NaturalDuration property is 00:01:59.
Does anyone know why and if there is a workaround for this?
Thanks in advance for any help.
Ok, this is not an answer but some results of a short investigation that give some clues why it behaves like that and where those numbers come from (2:58 and 1:59). First look at this thread: Calculating the length of MP3 Frames in milliseconds
Two things that we will use from there:
1) Frame length (in ms) = (samples per frame / sample rate (in hz)) * 1000, and
Duration in sec = Frame length (in ms) * number of frames / 1000
2) There are some standards regarding number of samples for different MPEG versions:
Samples per frame:
MPEG Version 1
384, // Layer1
1152, // Layer2
1152 // Layer3
MPEG Version 2 & 2.5
384, // Layer1
1152, // Layer2
576 // Layer3
Now lets check in winamp what it says about files format info:
MPEG-2.5 layer 3
16 kbps, 2482 frames
Now if you take frames = 2482 and samples per frame = 576 (MPEG-2.5 layer 3) you'll get duration 2:58. But it looks like for some reason silverlight and iTunes uses samples per frame = 384 which gives us 1:59. Next step could be to check the real values of file's headers and if they are correct and it is possible to calculate correct duration - well than you could cook up some hack to get durations separately (from the server for example). But I'm pretty sure - that file has some defects (inconsistent headers and content) and some players can handle it, others - not.

Resources