DataGridView excessive memory usage - WinForms

I have an unbound DataGridView with 175 columns and 50,000 rows, populated primarily with doubles. By my calculation that works out to 175 * 50000 * 8 bytes = 70 MB of data. However, Task Manager says the grid is using about 1.2 GB of memory - a 17x overhead! Can anyone explain why it's consuming so much memory?
From the MSDN article on scaling the DataGridView ( http://msdn.microsoft.com/en-us/library/ha5xt0d9.aspx ) I don't think I'm doing anything flagrantly wrong. I'm not setting styles or ContextMenuStrips for individual cells, and I make no modifications other than populating the cell values and setting format strings at the column level.
I understand that virtual mode or shared rows might decrease memory consumption, but given my calculations above, I don't think that should be necessary. A 17x overhead doesn't sound right to me.

Keep in mind that each cell of your DataGridView holds a DataGridViewCell instance, which has about 33 properties. That's far more overhead than a bare double value.

Your calculation assumes that a System.Double takes 8 bytes. The value of each cell may indeed occupy 8 bytes in an underlying System.Data.DataTable, but that does not mean the same data costs only 8 bytes once it is in the DataGridView.
Each and every cell has multiple properties - height, width, border style, border color, and so on. Even if these are all left at their default values, the objects that hold them still consume memory.
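To put rough numbers on it (the per-cell figure here is an assumption for illustration, not a documented value): 175 columns x 50,000 rows = 8,750,000 cells. If each unshared DataGridViewCell object, its boxed value, and the associated bookkeeping cost somewhere around 140 bytes on the managed heap, that alone is 8,750,000 * 140 bytes ≈ 1.2 GB - in the same ballpark as what Task Manager reports. That is why the scaling article steers grids of this size toward shared rows or virtual mode.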

Related

Compressing a sparse bit array

I have arrays of 1024 bytes (8192 bits) which are mostly zero.
Between 0.01% and 10% of bits will be set (random, no pattern).
How could these be compressed, given the lack of structure and the relatively small size?
(My first thought was to store the distances between set bits. I need 13 bits for each distance, but at the worst-case 10% occupancy that needs 13 * 816 / 8 = 1326 bytes, which is not an improvement.)
This is for ultra-low bandwidth comms, so every byte matters.
I've worked extensively on a similar problem, but my sets are much bigger (30 million possible values with between 1 and 30 million elements in each set), so they gain much more from compression, and the compression metadata is insignificant compared to the size of the data. I have never gone down to squeezing things into units smaller than uint16_t, so what I write below might not apply if you start chopping 13-bit values into pieces. It feels like it should work, but caveat emptor.
What I've found works is to employ several strategies that depend on the particular data at hand. The good news is that the count of elements in each set is a very good indicator of which compression strategy will work best for that set, so the only metadata you need is the element count. In my data format the first and only metadata value (I'll be unspecific and just call it a "value"; you can squeeze things into bytes, 16-bit values, or 13-bit values however you like) is the count of elements in the set; the rest is just the encoding of the set elements.
The strategies are:
If very few elements are in the set, you can't do better than an array that says "1, 4711, 8140", so in this case the data is encoded as: [3, 1, 4711, 8140]
If almost all elements are in the set, you can just keep track of elements that aren't. For example [8190, 17, 42].
If around half of the elements are in the set, you pretty much can't do better than a bitmap, so you get [4000, {bitmap}]. This is the only case where your data ends up being longer than strictly uncompressed.
If more than "a few" but many fewer than "around half" of the elements are set, I found another strategy. Divide the bits of your possible values in half. Let's say we have 2^16 possible values (it's easier to describe, and it should work similarly for 2^13). The values are divided into 256 ranges, each covering 256 possible values. We then have an array of 256 bytes, where each byte says how many values fall in its range (byte 0 tells us how many elements are in [0,255], byte 1 covers [256,511], etc.); immediately after follow arrays with the values in each range mod 256. The trick is that while every element encoded as an array (strategy 1) costs 2 bytes, in this scheme each element costs only 1 byte plus a static 256 bytes for the per-range counts. This means that as soon as we have more than 256 elements in the set, switching from strategy 1 to strategy 4 saves space.
Strategy 4 can be refined (probably meaningless if your data is random, as you mention, but my data sometimes had more patterns, so it worked for me). Since we still need 8 bits for each element in the previous encoding, as soon as one of the per-range sub-arrays grows past 32 elements (32 bytes, the size of a 256-bit bitmap for that range), we can store that range as a bitmap instead. This is also a good breakpoint for switching between strategies 4/5 and 3: if all of the ranges in this strategy end up as bitmaps, we should just use strategy 3 (it's more complicated than that, but the breakpoints between strategies can be precomputed accurately enough that you'll almost always pick the most efficient strategy).
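To make the count-driven selection concrete, here is a rough Python sketch for the 8192-value case in the question. The function name, the thresholds, and the in-memory return shapes are illustrative assumptions, not the break-even points I actually use:

# Rough sketch: pick an encoding for a set drawn from 8192 possible values
# based purely on its element count. Thresholds are illustrative guesses.
UNIVERSE = 8192

def encode(elements):
    n = len(elements)
    if n <= 64:                                   # strategy 1: array of positions
        return ("array", [n] + sorted(elements))
    if n >= UNIVERSE - 64:                        # strategy 2: store the complement
        absent = sorted(set(range(UNIVERSE)) - set(elements))
        return ("complement", [n] + absent)
    if n >= UNIVERSE // 4:                        # strategy 3: plain bitmap
        bitmap = bytearray(UNIVERSE // 8)
        for e in elements:
            bitmap[e >> 3] |= 1 << (e & 7)
        return ("bitmap", n, bytes(bitmap))
    # strategy 4: 32 ranges of 256 values; per-range counts, then each
    # element's low byte (value mod 256), grouped range by range
    counts = [0] * (UNIVERSE // 256)
    buckets = [[] for _ in counts]
    for e in sorted(elements):
        counts[e >> 8] += 1
        buckets[e >> 8].append(e & 0xFF)
    return ("bucketed", [n] + counts, buckets)

A real encoder would serialize these into whatever byte- or 13-bit-packed layout you choose; the point is only that the element count alone tells you which branch to take.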
I have only briefly tried saving deltas between the numbers in the set. Quick experiments showed that they weren't much more efficient than the strategies above and had unpredictable degenerate cases, but most importantly, the application I work with really prefers not to have to deserialise its data at all and to use it raw straight from disk (mmap).

ExtJS 4 and grid and speed

I have a very simple grid (30 rows and 20 columns) with some numbers, and I have to perform a lot of operations on this data at runtime. I mean: the user enters data into a cell, and the program checks rows, columns, and some individual cells, then writes the results into other cells in this grid.
I found that reading is quite fast, but writing is horribly slow, even in such a small grid.
I also found that when I wrap a block of grid operations in the pair suspendLayouts() and resumeLayouts(true), the speed is much better. But in this grid I use the cell editing plugin, and the speed problem is the same.
Could you suggest some rules for writing this kind of code so that it runs as fast as possible?

Extent allocation in sequential data sets

I am new to the mainframe world and am trying to learn it, but I cannot understand one thing: how are extents allocated in data sets?
Can someone please explain it with an example, or answer this question:
Suppose there is a sequential data set where primary and secondary are both allocated 1 track.
How many times can this data set request an extent?
Is an extent allotted for both the primary and the secondary, or only the secondary?
And one last question:
How does setting or not setting the guaranteed space attribute in the storage class affect the number of extents that can be requested?
Thank You
Sequential Data Allocation
A sequential data set with a primary and secondary of 1 track each will be able to have 16 extents (the primary plus up to 15 secondary extents) if it is allocated on one volume.
//stepname EXEC PGM=IEFBR14
//ddname DD DSN=dataset,
// DISP=(NEW,CATLG),
// UNIT=SYSALLDA,SPACE=(TRK,(1,1))
/*
The above will allocate a dataset that can grow to 16 tracks (the 1-track primary plus up to 15 secondary extents of 1 track each) if it is extended by being written to.
If you replace SYSALLDA with (SYSALLDA,2), it will be able to use 2 volumes, so it can be 32 tracks in size across the 2 volumes.
The number of volumes can also be overridden by the data class, which can be assigned to an SMS-managed dataset.
Guaranteed space
Guaranteed space allows you to specify the actual volumes that a dataset will be allocated on when the allocation is SMS-controlled; normally SMS will pick the volumes based on the ACS routines.
The JCL below will allocate the dataset on volume VOL001 if the storage class has the DCGSPAC (guaranteed space) attribute.
//stepname EXEC PGM=IEFBR14
//ddname DD DSN=dataset,
// DISP=(NEW,CATLG),VOL=SER=VOL001,
// STORCLAS=GSPACE,
// UNIT=SYSALLDA,SPACE=(TRK,(1,1))
/*
Normally the SMS ACS routines are coded so that only specific users or jobs are allowed to use storage classes with guaranteed space.
Explanation of Storage Class

How do I make my spatial index use a Level greater than HIGH?

My spatial geography index in SQL Server has the following level definitions.
HIGH LOW LOW LOW
The problem is that all of my points are in a city, and thus all of my points fall in a single cell at level 1. As a result the primary filter is looking at all points, which means my index efficiency is 0%. I realize that the HIGH grid means there are 256 cells. How do I instead use 512 or 1024 cells? 256 just isn't enough for me.
Take a look at this page for the different levels.
Does anyone know how to get a higher value than HIGH?
You need to use a bounding box (see: http://technet.microsoft.com/en-us/library/bb934196(v=sql.105).aspx for information about bounding boxes).
Without a bounding box: the issue is that SQL Server uses a sub-gridding methodology, and the 256 level-1 cells together must span the entire space. This means that your HLLL setting is restricting the number of usable cells. Think about it this way: the LLL portion creates 16 x 16 x 16 = 4,096 cells beneath each of the initial cells, and the 256 level-1 cells must all be the same size. That means your top-level cells are each splitting up too large an area.
Instead, if you put in a bounding box, the total area covered is reduced, so all of those cells become smaller, and splitting the space into 256 level-1 cells can then be sufficient.
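For a rough sense of the numbers (assuming the standard grid densities, LOW = 4 x 4 = 16 cells and HIGH = 16 x 16 = 256 cells per level): HIGH LOW LOW LOW yields 256 * 16 * 16 * 16 ≈ 1.05 million cells in total, but without a bounding box those cells are stretched over the whole indexable space rather than just your city, which is why all of your points land in the same level-1 cell.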

Loading tiles for a 2D game

I'm trying to make a 2D online game (with Z positions), and currently I'm working on loading a map from a txt file. I have three different map files: one contains an int for each tile saying what kind of floor there is, one says what kind of decoration there is, and one says what might be covering the tile. The problem is that the current map (20, 20, 30) takes 200 ms to load, and I want it to be much, much bigger. I have tried to find a good solution for this and have so far come up with some ideas.
Recently I've thought about storing all tiles in separate files, one file per tile. I'm not sure if this is a good idea (it feels wrong somehow), but it would mean that I wouldn't have to store any unnecessary tiles as "-1" in a text file, and I would be able to just pick the right tile from the folder easily at run time (read the file named mapXYZ). If the tile is empty, I would just catch the FileNotFoundException. Could anyone tell me a reason why this is a bad solution? Other solutions I've thought about would be to split the map into smaller parts or to read the map during startup in a BackgroundWorker.
Try making a much larger map in the same format as your current one first - it may be that the 200 ms is mostly just the overhead of opening and initially processing the file.
If I'm understanding your proposed solution (opening one file per X,Y or X,Y,Z coordinate of a single map), this is a bad idea for two reasons:
There will be significant overhead to opening so many files.
Catching a FileNotFoundException and swallowing it will be significantly slower - there is a lot of overhead in throwing and catching exceptions, so you shouldn't rely on them to perform application logic.
Are you loading the file from a remote server? If so, that's why it's taking so long; you should embed the file in the game instead. I say this because you probably use 2-3 bytes per tile, so with 20 x 20 x 30 = 12,000 tiles the file is roughly 30 KB, and 200 ms sounds like a reasonable download time for a file of that size (including overhead etc., and depending on your internet connection).
Regarding how to lower the file size - there are two easy techniques I can think of that will decrease it a bit:
1) If you have mostly empty squares and only some significant ones, your map is what is often referred to as 'sparse'. When storing a sparse array of data you can use a simple compression technique known as 'run-length encoding', where each time you come across empty squares, you specify how many of them there are. So for example, instead of {0,0,0,0,0,0,0,0,0,0,1,1,2,3,0,0,0,0,0,0,0,0,0,0,0,0,1} you could store {10 0's, 1, 1, 2, 3, 12 0's, 1}. (A sketch that combines this idea with the binary format follows the format description below.)
2) To save space, I recommend that you store everything as binary data. The exact layout of the file mainly depends on how many possible tile types there are, but this is a better solution than storing the ASCII characters corresponding to the base-10 representation of the numbers, separated by delimiters.
Example Binary Format
The file is organized into segments which are 3 or 4 bytes long, as explained below.
The first segment indicates the version of the game for which the map was created. It is 3 bytes long.
Segments 2, 3, and 4 indicate the dimensions of the map (x, y, z) and are 3 bytes long each.
Each remaining segment indicates a tile number and is 3 bytes long with an MSB of 0, with the following exception.
If a segment represents empty tiles, it is 4 bytes long with an MSB of 1, and it holds the number of empty tiles (including that one) that follow.
The reason I suggest the MSB flag is so that you can distinguish between segments that hold a tile number and segments that hold a count of empty tiles. For the latter I increase the length to 4 bytes (you might want to make it 5) so that you can store larger runs of empty tiles per segment.
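As a rough illustration of that format, here is a Python sketch of a writer (not the poster's code; the EMPTY marker, the tiles[x][y][z] indexing, and the function name are assumptions made for the example):

# Sketch: write a tile map in the segment format described above.
# Tile numbers are assumed non-negative and below 2^23 so the MSB stays 0.
EMPTY = -1

def write_map(path, version, tiles):
    xs, ys, zs = len(tiles), len(tiles[0]), len(tiles[0][0])
    flat = [tiles[x][y][z] for x in range(xs) for y in range(ys) for z in range(zs)]
    out = bytearray()
    for header in (version, xs, ys, zs):          # four 3-byte header segments
        out += header.to_bytes(3, "big")
    i = 0
    while i < len(flat):
        if flat[i] == EMPTY:                      # run of empty tiles:
            run = 1                               # one 4-byte segment, MSB set
            while i + run < len(flat) and flat[i + run] == EMPTY:
                run += 1
            out += (run | 0x80000000).to_bytes(4, "big")
            i += run
        else:                                     # single tile: 3 bytes, MSB clear
            out += flat[i].to_bytes(3, "big")
            i += 1
    with open(path, "wb") as f:
        f.write(out)

A matching reader can peek at the top bit of the next byte to decide whether the segment it is about to consume is 3 or 4 bytes long.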
