Lowest Memory Cache Method for PHPExcel - benchmarking

I tried the different caching methods from section 4.2.1 of the PHPExcel manual. I ran a benchmark with 100k rows and here are the results:

gzip:       time = 50, memory used = 177,734,904
serialized: time = 34, memory used = 291,654,272
phpTemp:    time = 41, memory used = 325,973,456
discISAM:   time = 39, memory used = 325,972,824

The manual says that the phpTemp and discISAM methods use disk instead of memory, so they should use the least memory, but the results show the opposite.
Here is the code I used to test:
// Pick one caching method; gzip was the one active for the run above
$cacheMethod   = PHPExcel_CachedObjectStorageFactory::cache_in_memory_gzip;
// $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_in_memory_serialized;
// $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
// $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_discISAM;
$cacheSettings = array('memoryCacheSize' => '8MB');   // only used by phpTemp
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

$xlsReader = PHPExcel_IOFactory::createReader($fileType);
$xlsReader->setReadDataOnly(true);
Can anyone shed light on this mystery?

This depends on many factors, including the PHP version, the content of the cells (numeric, string, rich text, etc.) and the extensions enabled for PHP; so nobody other than yourself can really answer it, because it is unique to your situation.
However, all methods retain some information about each cell in memory, with the exception of SQLite, so using an SQLite database is the most memory efficient option.
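For example, switching the question's setup code over to the SQLite3-backed cache might look like this (an untested sketch; it assumes PHP's sqlite3 extension is available):

$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_sqlite3;
if (!PHPExcel_Settings::setCacheStorageMethod($cacheMethod)) {
    die('SQLite3 cell caching is not available' . PHP_EOL);
}
$xlsReader = PHPExcel_IOFactory::createReader($fileType);
$xlsReader->setReadDataOnly(true);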
EDIT
I ran some tests with different caching methods against different versions of PHP some months ago, and the following summarises the results
These are still fairly arbitrary results; disk speed and other factors will affect performance for some caching methods such as discISAM and phpTemp, and any configuration settings for options like phpTemp will also have some effect. Still, it should give a relative guideline for working out which options are better for memory and which are better for speed of execution.

Related

Mean of multiple images: dask.delayed vs. dask.array

Background
I have a list with the paths of thousands of image stacks (3D numpy arrays), preprocessed and saved as .npy binaries.
Case Study
I would like to calculate the mean of all the images, and in order to speed up the analysis I thought to parallelise the processing.
Approach using dask.delayed
from dask import delayed
from dask.distributed import Client

client = Client()   # distributed client assumed to exist

# List with the file names (paths to the .npy stacks)
flist_img_to_filter

# The list of paths is chunked into sublists; the number of chunks corresponds
# to the number of cores used for the analysis
chunked_list

# Scatter the sublists of paths so they can be processed in parallel
futures = client.scatter(chunked_list)

# Create the dask processing graph
output = []
for future in futures:
    ImgMean = delayed(partial_image_mean)(future)
    output.append(ImgMean)
ImgMean_all = delayed(sum)(output)
ImgMean_all = ImgMean_all / len(futures)

# Compute the graph
ImgMean = ImgMean_all.compute()
Approach using dask.arrays
modified from Matthew Rocklin's blog

import numpy as np
import dask.array as da
from dask import delayed

imread = delayed(np.load, pure=True)   # lazy version of np.load

# Lazily evaluate imread on each path
lazy_values = [imread(img_path) for img_path in flist_img_to_filter]

# Wrap each lazy load in a dask array of known dtype and shape
arrays = [da.from_delayed(lazy_value, dtype=np.uint16, shape=shape)
          for lazy_value in lazy_values]

# Stack all the small dask arrays into one
stack = da.stack(arrays, axis=0)
ImgMean = stack.mean(axis=0).compute()
Questions
1. In the dask.delayed approach is it necessary to pre-chunk the list? If I scatter the original list I obtain a future for each element. Is there a way to tell a worker to process the futures it has access to?
2. The dask.arrays approach is significantly slower and with higher memory usage. Is this a 'bad way' to use dask.arrays?
3. Is there a better way to approach the issue?
Thanks!
In the dask.delayed approach is it necessary to pre-chunk the list? If I scatter the original list I obtain a future for each element. Is there a way to tell a worker to process the futures it has access to?
The simple answer is no: as of Dask version 0.15.4 there is no very robust way to submit a computation on "all of the tasks of a certain type currently present on this worker".
However, you can easily ask the scheduler which keys are present on which workers using the who_has or has_what client methods.
import dask
from dask.distributed import wait

futures = dask.persist(futures)
wait(futures)
client.who_has(futures)
The dask.arrays approach is significantly slower and with higher memory usage. Is this a 'bad way' to use dask.arrays?
You might want to play with the split_every= keyword of the mean function, or else rechunk your array to group images together (probably similar to what you do above) before calling mean, to play with parallelism/memory tradeoffs.
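For example (a rough sketch reusing the stack array from the question; the values 8 and 10 are arbitrary):

# Combine fewer inputs per step of the tree reduction...
ImgMean = stack.mean(axis=0, split_every=8).compute()

# ...or group several images into one chunk before reducing
stack = stack.rechunk({0: 10})   # 10 images per chunk along axis 0
ImgMean = stack.mean(axis=0).compute()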
Is there a better way to approach the issue?
You might also try as_completed and compute running means as data completes. You would have to switch from delayed to futures for this.
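A rough sketch of that pattern, assuming the file list from the question and a hypothetical load_stack helper:

import numpy as np
from dask.distributed import Client, as_completed

client = Client()

def load_stack(path):
    # hypothetical helper: load one preprocessed .npy stack
    return np.load(path)

futures = client.map(load_stack, flist_img_to_filter)

running_sum, count = None, 0
for future in as_completed(futures):
    img = future.result()
    running_sum = img.astype(np.float64) if running_sum is None else running_sum + img
    count += 1

ImgMean = running_sum / count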

Face detection code in Open CV

In my program I am detecting a person's face. The code works well, but it worries me: for eye detection, cascade.detectMultiScale() takes many parameters, whereas for face detection I am using only these few. How does it detect the face when we have not set the size of the object to detect in cascade.detectMultiScale()?
cascade.detectMultiScale(gray, faces, 1.2, 2);
for (int i = 0; i < faces.size(); i++)
{
    Rect r = faces[i];
    rectangle(src, Point(r.x, r.y), Point(r.x + r.width, r.y + r.height), CV_RGB(0, 0, 255));
}
You should probably read the documentation for the built-in OpenCV functions (OpenCV cascade classifier). The last two parameters are "minSize" and "maxSize", which set the minimum and maximum size of a detected object. For my project I am detecting a narrator's face on a 1080p HDTV channel, so my configuration looks like this:
face_cascade.detectMultiScale( frame_gray, faces, 1.1, 2, 0|CV_HAAR_SCALE_IMAGE, Size(500, 500) );
...which means I have a scale factor of 1.1 with only one possible face detected.
CV_HAAR_SCALE_IMAGE means that the algorithm is in charge of scaling the image, not the detector (which is, in general, slower). You can also use something like 0|CV_HAAR_FIND_BIGGEST_OBJECT if you want to extract only the biggest object among all the candidates. In my case, I also forced the detector to search only for objects not smaller than 500x500 px, which also speeds up my real-time processing and prevents the detector from making false detections.
You should also keep in mind that the built-in detector was trained with certain predefined parameters (especially in the training phase). If you are really interested in building a better detector (with better detection accuracy and/or performance) for your application, you should consider a custom-made classifier. But be aware: although modified parameters (number of training iterations, training mode, object properties, object alignment, etc.) COULD make things better, a good understanding of each of them (and of how they interact and affect the final result), as well as fine-tuning, is needed in order to make any reasonable improvement.
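For reference, a minimal end-to-end sketch with explicit size limits (the cascade path, image path and the 80x80/400x400 bounds are placeholders, not values from the question):

#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <vector>

using namespace cv;

int main()
{
    // Assumes the standard cascade file shipped with OpenCV is on disk
    CascadeClassifier face_cascade;
    if (!face_cascade.load("haarcascade_frontalface_alt.xml"))
        return -1;

    Mat src = imread("input.jpg");
    Mat gray;
    cvtColor(src, gray, CV_BGR2GRAY);
    equalizeHist(gray, gray);

    std::vector<Rect> faces;
    // scaleFactor = 1.1, minNeighbors = 2, scale the image rather than the detector,
    // and ignore candidates smaller than 80x80 or larger than 400x400 pixels
    face_cascade.detectMultiScale(gray, faces, 1.1, 2, 0 | CV_HAAR_SCALE_IMAGE,
                                  Size(80, 80), Size(400, 400));

    for (size_t i = 0; i < faces.size(); i++)
    {
        Rect r = faces[i];
        rectangle(src, Point(r.x, r.y), Point(r.x + r.width, r.y + r.height), CV_RGB(0, 0, 255));
    }

    imwrite("output.jpg", src);
    return 0;
}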

What does the renderingEmSize parameter in GlyphRun specify?

I'm trying to make a GlyphRun instance for use in a GlyphRunDrawing, but the documentation is just so bad that it's almost comical. For example, the parameter renderingEmSize is described like this:
renderingEmSize
Type: System.Double
A value of type Double.
Just... wow.
I know what an "em" is in a font (width of the em dash), but I don't know what the grid units are. Device pixels? Device independent pixels?
It turns out the answer is in the source code. Thanks to MS for making this available, even if the docs make your eyes bleed.
Interestingly, all the information we need is contained in the XML doc comments in GlyphRun.cs. The renderingEmSize parameter, for example, is documented as follows:
<param name="renderingEmSize">Font rendering size in drawing surface units (96ths of an inch).</param>
The rest of the file is similarly well-commented, including this seemingly out-of-place but gripping read:
/*
The default branch prediction rules for modern processors specify that forward branches
are not to be taken. If the branch is in fact taken, all of the speculatively executed code
must be discarded, the processor pipeline flushed, and then reloaded. This results in a
processor stall of at least 42 cycles for the P4 Northwood for each mis-predicted branch.
The deeper the processor pipeline the higher the cost, i.e. Prescott processors.
Checking for multiple incorrect parameters in a method with high call count like this one can
easily add significant overhead for no reason. Note that the C# compiler should be able to make
reasonable assumptions about branches that throw exceptions, but the current whidbey
implemenation is weak in this regard. Also the current IBC tools are unable to add branch
prediction hints to improve behavior based on run time information. Also note that adding
branch prediction hints increases code size by a byte per branch and doing this in every
method that is coded without default branch prediction behavior in mind would add an
unacceptable amount of working set.
*/
The whole file can be found here: GlyphRun.cs at webtropy

What is a MsgPack 'zone'

I have seen references to 'zone' in the MsgPack C headers, but can find no documentation on what it is or what it's for. What is it? Furthermore, where's the function-by-function documentation for the C API?
msgpack_zone is an internal structure used for memory management and object lifecycle at unpacking time. I would say you will never have to interact with it if you use the standard, high-level interface for unpacking, or the alternative streaming version.
To my knowledge, there is no detailed documentation: instead you should refer to the test suite that provides convenient code samples to achieve the common tasks, e.g. see pack_unpack_c.cc and streaming_c.cc.
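For illustration only (not taken from those files), here is a minimal, untested sketch of the classic 0.5.x-era high-level C++ interface, where the zone never appears in user code because msgpack::unpacked owns it internally:

#include <msgpack.hpp>
#include <iostream>
#include <string>

int main()
{
    msgpack::sbuffer buf;
    msgpack::pack(&buf, std::string("hello"));        // serialize into a buffer

    msgpack::unpacked result;                          // holds the zone internally
    msgpack::unpack(&result, buf.data(), buf.size());

    msgpack::object obj = result.get();                // valid while `result` lives
    std::cout << obj << std::endl;
    return 0;
}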
From what I could gather, it is a move-only type that stores the actual data of a msgpack::object. It may well be intended as an implementation detail, but it actually leaks into users' code sometimes. For example, any time you want to capture a msgpack::object in a lambda, you have to capture the msgpack::zone object as well. Sometimes you can't use move capture (e.g. asio handlers in some cases only accept copyable handlers, or your compiler doesn't support the feature). To work around this, you can do something like:
msgpack::unpacked r;
while (pac_.next(&r)) {
    auto msg = r.get();
    // Move the zone into a shared_ptr so the (copyable) lambda keeps it alive
    io_->post([this, msg, z = std::shared_ptr<msgpack::zone>(r.zone().release())]() {
        // msg is valid here as long as z is alive
    });
}

How do I detect whether the sample supplied by VideoSink.OnSample() is right-side up?

We're currently using the Silverlight VideoSink to capture video from users' local webcams, kinda like so:
protected override void OnSample(long sampleTime, long frameDuration, byte[] sampleData)
{
    if (FrameShouldBeSubmitted())
    {
        byte[] resampledData = ResizeFrame(sampleData);
        mediaController.SetVideoFrame(resampledData);
    }
}
Now, on most of the machines that we've tested, the video sample provided in the byte[] sampleData parameter is upside-down, i.e., if you try to take the RGBA data and turn it into, say, a WriteableBitmap, the bitmap will be upside-down. That's odd, but fairly easy to correct, of course -- you just have to reverse the array as you encode it.
The problem is that at least on some machines (e.g., the single Macintosh in our test environment), the video sample provided is no longer upside-down, but right-side up, and hence, flipping the image actually results in an image that's received upside-down on the far side.
I reported this to MS as a bug, but their (terse) response was that it was "As Designed". Further attempts at clarification have so far been ignored.
Now, I'll grant that it's kinda entertaining to imagine the discussions behind this design decision: "OK, just to make it interesting, let's play the video rightside up on a Mac, but let's turn it upside down for Windows!" "Great idea!" "Yeah, that'll keep those developers guessing!" But beyond that, I can't find this, umm, "feature" documented anywhere, nor can I find any documentation on how one is supposed to be able to tell that a given video sample is upside down or rightside up. Any thoughts on how to tell this?
EDIT 3/29/10 4:50 pm - I got a response from MS which said that the appropriate way to tell was through the Stride property on the VideoFormat object, i.e., if the stride value is negative, the image will be upside-down. However, my own testing indicates that unless I'm doing something wrong, this isn't the case. At least on my own machine, whether the stride value is zero or negative (the only options I see), the sampled image is still upside-down.
I was going to suggest looking at the VideoFormat.Stride provided in VideoSink.OnFormatChange, but then I noticed your edit. I went ahead and tested it on my dev machine: the image is upside down and the stride is negative, as expected. Have you checked again recently?
Even though stride made perfect sense for native applications (where it is used in pointer arithmetic), I agree that the current behavior is not what you would expect from a modern API. Performance-wise, however, it is better not to modify the data received from the native API.
Yet at this point, while we are talking about performance, why not provide samples in formats other than PixelFormatType.Format32bppArgb so that we can avoid color space conversion? By the way, there is a VideoCaptureDevice.DesiredFormat property, but it only works for resolution, as there is no alternative to PixelFormatType.Format32bppArgb.
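For completeness, here is a rough sketch of how the stride check discussed above could be wired into a VideoSink subclass (the flipping helper and the class name are mine, not part of the Silverlight API):

using System;
using System.Windows.Media;

public class FlippableVideoSink : VideoSink
{
    private int _stride, _width, _height;

    protected override void OnFormatChange(VideoFormat videoFormat)
    {
        _stride = videoFormat.Stride;      // negative => rows stored bottom-up
        _width = videoFormat.PixelWidth;
        _height = videoFormat.PixelHeight;
    }

    protected override void OnSample(long sampleTime, long frameDuration, byte[] sampleData)
    {
        byte[] frame = _stride < 0 ? FlipVertically(sampleData) : sampleData;
        // ... hand `frame` to the rest of the pipeline (resize, submit, ...) ...
    }

    protected override void OnCaptureStarted() { }
    protected override void OnCaptureStopped() { }

    // Reverse the row order of a 32bpp ARGB buffer.
    private byte[] FlipVertically(byte[] data)
    {
        int rowLength = _width * 4;
        byte[] flipped = new byte[data.Length];
        for (int row = 0; row < _height; row++)
        {
            Array.Copy(data, row * rowLength,
                       flipped, (_height - 1 - row) * rowLength, rowLength);
        }
        return flipped;
    }
}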
