I am developing an app that presents the user with a potentially very large user-generated image gallery, ten or so images at a time.
The app is to be implemented in C using libSDL and 2D textures for accelerated rendering.
The overall gist of it in pseudocode is:
cycle = 0
while cycle < MAX_CYCLES
    i = 0
    while i < MAX_STEPS
        show a gallery of 10 image thumbnails
        while (poll events)
            if event == user has pushed next
                break
        i++
    scramble image galleries using a genetic algorithm
    cycle++
I could load every image from disk at initialization time, creating all the required textures, so image presentation is fast. But of course this would be slow and potentially allocate a huge array of textures.
I will scale down the images for presentation, which should mitigate the problem, but the total size of the collection depends on user preference. I could cap the maximum collection size, but that cap cannot be small.
I was thinking about unloading every unused image at every step of every cycle, using SDL_FreeSurface and SDL_DestroyTexture. This would mean reloading the data from disk, recreating the surface and recreating the texture each time. Is this a viable approach?
Also I understand that SDL textures are stored in GPU memory, so the amount of available memory on the card should be my main concern. Am I right?
In summary, is there a recommended method to deal with this type of situation?
I would always keep 3 slides in memory:
Prev - Current - Next
While presenting the current slide, preload the next slide and unload slide (Current - 2).
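A minimal sketch of that sliding window in C, assuming SDL2 with SDL_image (IMG_LoadTexture); the file list, the helper names and the window size of 3 are illustrative, not something the question specifies:

#include <SDL.h>
#include <SDL_image.h>
#include <string.h>

#define WINDOW 3          /* prev, current, next */

typedef struct {
    SDL_Texture *tex[WINDOW];
    int          first;   /* slide index held in tex[0] */
} SlideCache;

/* Load one slide as a texture, or return NULL if the index is out of range. */
static SDL_Texture *load_slide(SDL_Renderer *r, char **files, int count, int idx)
{
    if (idx < 0 || idx >= count) return NULL;
    return IMG_LoadTexture(r, files[idx]);   /* creates and frees the surface internally */
}

/* Ensure slides [current-1 .. current+1] are resident; unload everything else. */
static void slide_cache_update(SlideCache *c, SDL_Renderer *r,
                               char **files, int count, int current)
{
    SDL_Texture *next[WINDOW] = { NULL, NULL, NULL };
    int new_first = current - 1;

    for (int i = 0; i < WINDOW; ++i) {
        int idx = new_first + i;
        int old = idx - c->first;            /* position in the previous window, if any */
        if (old >= 0 && old < WINDOW && c->tex[old]) {
            next[i] = c->tex[old];           /* already loaded: reuse the texture */
            c->tex[old] = NULL;
        } else {
            next[i] = load_slide(r, files, count, idx);
        }
    }
    for (int i = 0; i < WINDOW; ++i)         /* whatever is left fell out of the window */
        if (c->tex[i]) SDL_DestroyTexture(c->tex[i]);

    memcpy(c->tex, next, sizeof next);
    c->first = new_first;
}

Calling slide_cache_update once per "next" press keeps at most three slide textures alive, so GPU memory stays bounded regardless of how large the collection grows.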
Also I understand that SDL textures are stored in GPU memory, so the amount of available memory on the card should be my main concern. Am I right?
Not quite: if the GPU (driver) deems it necessary, it will swap unused texture data out to system RAM.
For example, if you're presenting 10 images and thus have 30 images resident in memory, then at 1920 x 1080 with alpha (1920 x 1080 x 4 bytes, roughly 8.3 MB per image) you will need approx. 250 MB.
As long as you don't run on an embedded system (or very old, outdated system), this shouldn't be a big concern.
Related
I am working on a Cocoa/iOS project.
I have a common Swift class which manages a SceneKit scene.
I want to draw a big terrain (about 5000 x 5000 points).
I have 2 triangles per 4 points. I have created an SCNGeometry object for the whole terrain (is that a good idea?).
I decided to store those points in a 6-Float structure (x, y, z and r, g, b). I tried to create an empty array or to allocate a big array at the beginning: I got the same issue either way.
I use the Int datatype for the indices array.
The project works fine on Cocoa but I get memory errors on iOS. I think this is because of the need for a big, contiguous array for the vertices.
I tried to create several chunks of geometry objects, but SceneKit does not like it if we erase a previous buffer.
What is the best practice in this case?
Is there a way to store vertices in mass storage instead of in-memory arrays/buffers?
Thanks
So...twice as many terrain points as there are pixels on a shiny new 5K display? That's a huge amount of memory to be using at once on iOS. And you won't be able to see that resolution on an iOS device.
So how about:
Break your 25-million-point terrain into smaller tiles, each in its own SCNNode. Loop through the tiles: create one SCNNode, throw away the 6-Float array for that tile, and move on to the next.
Use SCNLevelOfDetail to produce much simpler versions of those nodes, for display when they're very far away.
Do the construction work on OS X. Archive your scene (NSSecureCoding). Bundle that scene into the iOS app.
Consider using reference nodes in your main SCNScene, and archive each tile as a separate SCNScene file.
Hopefully you're already using triangle strips, not triangles, to build your geometry.
I've been using 24-bit .png files with alpha, from Photoshop, and just tried a .psd, which worked fine with OpenGL ES, but Metal didn't see the alpha channel.
What's the absolutely most performant texture format for particles within SceneKit?
Here's a sheet to test on, if need be.
It looks white... right-click and save as in the blank space. It's an alpha-heavy set of rings. You can probably barely make them out if you squint at the screen:
exaggerated example use case:
https://www.dropbox.com/s/vu4dvfl0aj3f50o/circless.mov?dl=0
// Additional points for anyone who can guess the difference between the left and right rings in the video.
Use a grayscale/alpha PNG, not an RGBA one. Since it uses 16 bits per pixel (8+8) instead of 32 (8+8+8+8), the initial texture load will be faster and it may (depending on the GPU) use less memory as well. At render time, though, you’re not going to see much of a speed difference, since whatever the texture format is it’s still being drawn to a full RGB(A) render buffer.
There’s also PVRTC, which can get you down as low as 2–4 bits per pixel, but I tried Imagination's tool out on your image and even the highest quality settings caused a bunch of artifacts like the ones below:
Long story short: go with a grayscale+alpha PNG, which you can easily export from Photoshop. If your particle system is hurting your frame rate, reduce the number and/or size of the particles—in this case you might be able to get away with layering a couple of your particle images on top of each other in the source texture atlas, which may not be too noticeable if you pick ones that differ in size enough.
I can't quite understand what's the difference.
I know a TMU is a texture mapping unit on the GPU, and in OpenGL we can have many texture units. I used to think they were the same thing: that if I had n TMUs, then I could use n GL_TEXTURE units. But I found that this may not be true.
Recently, I was working on an Android game, targeting a platform with the Mali 400MP GPU. According to the documentation, it has only one TMU, so I thought I could use only one texture at a time. But surprisingly, I can use at least 4 textures without trouble. Why is this?
Is the hardware or driver level doing something like swap different textures in/out automatically for me? If so, is it supposed to cause a lot of cache miss?
I'm not the ultimate hardware architecture expert, particularly not for Mali. But I'll give it a shot anyway, based on my understanding.
The TMU is a hardware unit for texture sampling. It does not get assigned to an OpenGL texture unit on a permanent basis. Any time a shader executes a texture sampling operation, I expect this specific operation to be assigned to one of the TMUs. The TMU then does the requested sampling, delivers the result back to the shader, and is available for the next sampling operation.
So there is no relationship between the number of TMUs and the number of supported OpenGL texture units. The number of OpenGL texture units that can be supported is determined by the state tracking part of the hardware.
The number of TMUs has an effect on performance. The more TMUs are available, the more texture sampling operations can be executed within a given time. So if you use a lot of texture sampling in your shaders, your code will profit from having more TMUs. It doesn't matter if you sample many times from the same texture, or from many different textures.
Texture Mapping Units (TMUs) are functional units on the hardware, once upon a time they were directly related to the number of pixel pipelines. As hardware is much more abstract/general purpose now, it is not a good measure of how many textures can be applied in a single pass anymore. It may give an indication of overall multi-texture performance, but by itself does not impose any limits.
OpenGL's GL_TEXTURE0+n actually represents Texture Image Units (TIUs), which are locations where you bind a texture. The number of textures you can apply simultaneously (in a single execution of a shader) varies per-shader stage. In Desktop GL, which has 5 stages as of GL 4.4, implementations must support 16 unique textures per-stage. This is why the number of Texture Image Units is 80 (16x5). GL 3.3 only has 3 stages, and its minimum TIU count is thus only 48. This gives you enough binding locations to provide a set of 16 unique textures for every stage in your GLSL program.
GL ES, particularly 2.0, is a completely different story. It mandates support for at least 8 simultaneous textures in the fragment shader stage and 0 (optional) in the vertex shader.
const mediump int gl_MaxVertexTextureImageUnits = 0; // Vertex Shader Limit
const mediump int gl_MaxTextureImageUnits = 8; // Fragment Shader Limit
const mediump int gl_MaxCombinedTextureImageUnits = 8; // Total Limit for Entire Program
There is also a limit on the number of textures you can apply across all of the shaders in a single execution of your program (gl_MaxCombinedTextureImageUnits), and this limit is usually just the sum total of the limits for each individual stage.
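For reference, all of these limits can be queried at run time rather than assumed; a minimal sketch in C, assuming a current GL or GL ES 2.0 context (only standard enums are used; desktop builds may also need glext.h or a loader such as GLEW):

#include <stdio.h>
#include <GL/gl.h>   /* or <GLES2/gl2.h> on GL ES 2.0 */

/* Print how many texture image units this implementation actually exposes. */
static void print_texture_unit_limits(void)
{
    GLint frag_units = 0, vert_units = 0, combined_units = 0;

    glGetIntegerv(GL_MAX_TEXTURE_IMAGE_UNITS, &frag_units);            /* fragment stage */
    glGetIntegerv(GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS, &vert_units);     /* vertex stage; may be 0 on ES 2.0 */
    glGetIntegerv(GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS, &combined_units);

    printf("fragment: %d, vertex: %d, combined: %d\n",
           frag_units, vert_units, combined_units);
}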
QUERY
I am creating a program that has a lot of tiles in the environment that are all movie clips. The player can move around in this environment. How can I hide the tiles that are off the screen to decrease lag?
The tiles are all in a 2D array that is 20 horizontal units by 10 vertical units.
Let me know if you have any suggestions!
MORE INFO
I have a Tile class for the tile, so I can add functions for removal within this. I'm just unsure how to go about it.
-Olin
Sounds like you're looking for more fine-grained memory management by reclaiming memory from tiles that are no longer on screen. In Flash, or any other language that compiles to bytecode run in a virtual machine that handles low-level memory management and garbage collection, your control over reclaiming memory is limited. Your best bet in these instances is to use an object pool: allocate the number of objects you need up front, retain them, and simply mark them as unused when they're ready to be recycled.
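The pooling pattern itself is language-agnostic; here is a minimal sketch in C rather than AS3, with a made-up Tile struct and pool size, just to show the shape of acquire/release without any allocation or garbage collection:

#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 256            /* illustrative: a 20 x 10 grid plus headroom */

typedef struct {
    int  x, y;
    bool in_use;
} Tile;

static Tile pool[POOL_SIZE];     /* allocated once up front, never freed during play */

/* Hand out an unused tile instead of allocating a new one. */
static Tile *tile_acquire(void)
{
    for (size_t i = 0; i < POOL_SIZE; ++i) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            return &pool[i];
        }
    }
    return NULL;                 /* pool exhausted: grow it, or recycle the oldest tile */
}

/* "Removing" a tile just marks it reusable; nothing is handed to the garbage collector. */
static void tile_release(Tile *t)
{
    t->in_use = false;
}

In AS3 the equivalent is a Vector of Tile instances you fill once and flag entries in, so a tile going off screen never triggers allocation or collection.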
Read up on object pooling in AS3 here:
http://help.adobe.com/en_US/as3/mobile/WS948100b6829bd5a6-19cd3c2412513c24bce-8000.html
or on working with the garbage collector here:
http://help.adobe.com/en_US/as3/mobile/WS4bebcd66a74275c3-576ba64d124318d7189-7ffc.html
All part of the top level (conserving memory):
http://help.adobe.com/en_US/as3/mobile/WS4bebcd66a74275c333637c44124318c9bf9-8000.html
Even though this is all in the mobile directory, I'm willing to bet the information is just as pertinent on the desktop.
Basically, I have an array of data (fluid simulation data) which is generated per frame in real time from user input (it starts in system RAM). I want to write the density of the fluid to a texture as an alpha value - I interpolate the array values to result in an array the size of the screen (the grid is relatively small) and map it to a 0 - 255 range. What is the most efficient way (ogl function) to write these values into a texture for use?
Things that have been suggested elsewhere, which I don't think I want to use (please, let me know if I've got it wrong):
glDrawPixels() - I'm under the impression that this will cause an interrupt each time I call it, which would make it slow, particularly at high resolutions.
Use a shader - I don't think that a shader can accept and process the volume of data in the array each frame (It was mentioned elsewhere that the cap on the amount of data they may accept is too low)
If I understand your problem correctly, both solutions are over-complicating the issue. Am I correct in thinking you've already generated an array of size x*y, where x and y are your screen resolution, filled with unsigned bytes?
If so, if you want an OpenGL texture that uses this data as its alpha channel, why not just create a texture, bind it to GL_TEXTURE_2D and call glTexImage2D with your data, using GL_ALPHA as the format and internal format, GL_UNSIGNED_BYTE as the type and (x, y) as the size?
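A minimal sketch of that upload in C, assuming legacy/compatibility GL where GL_ALPHA is still a valid texture format (in core GL 3.1+ you would use GL_RED and swizzle in the shader); density, x and y stand in for the interpolated array and the screen size:

GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);          /* rows are tightly packed single bytes */

/* Initial upload: one unsigned byte per pixel, interpreted as the alpha channel. */
glTexImage2D(GL_TEXTURE_2D, 0, GL_ALPHA, x, y, 0,
             GL_ALPHA, GL_UNSIGNED_BYTE, density);

/* Per-frame updates should reuse the existing storage instead of reallocating it. */
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, x, y,
                GL_ALPHA, GL_UNSIGNED_BYTE, density);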
What makes you think a shader would perform badly? The whole idea of shaders is about processing huge amounts of data very, very fast. Please use Google on the search phrase "General Purpose GPU computing" or "GPGPU".
Shaders can only gather data from buffers, not scatter. But what they can do is change values in the buffers. This allows for a (fragment) shader to write the locations of GL_POINTs, which are then in turn placed on the target pixels of the texture. Shader Model 3 and later GPUs can also access texture samplers from the geometry and vertex shader stages, so the fragment shader part gets really simple then.
If you just have a linear stream of positions and values, just send those to OpenGL through a Vertex Array, drawing GL_POINTs, with your target texture being a color attachment for a framebuffer object.
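A rough sketch of that point-scatter approach in C, assuming legacy (fixed-function) GL with framebuffer object support; fbo is assumed to already have the target texture attached as GL_COLOR_ATTACHMENT0, and PointSample/pts/count are made-up names for the position-and-value stream:

typedef struct { GLfloat x, y; GLubyte rgba[4]; } PointSample;   /* position + value */

/* Draw one point per sample into the texture attached to fbo. */
static void scatter_points(GLuint fbo, int tex_w, int tex_h,
                           const PointSample *pts, GLsizei count)
{
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glViewport(0, 0, tex_w, tex_h);

    /* Map texel coordinates straight onto the attachment. */
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(0, tex_w, 0, tex_h, -1, 1);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);
    glVertexPointer(2, GL_FLOAT, sizeof(PointSample), &pts[0].x);
    glColorPointer(4, GL_UNSIGNED_BYTE, sizeof(PointSample), pts[0].rgba);
    glDrawArrays(GL_POINTS, 0, count);

    glDisableClientState(GL_COLOR_ARRAY);
    glDisableClientState(GL_VERTEX_ARRAY);
    glBindFramebuffer(GL_FRAMEBUFFER, 0);
}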
What is the most efficient way (ogl function) to write these values into a texture for use?
A good way would be to try to avoid any unnecessary extra copies. So you could use Pixel Buffer Objects, map them into your address space, and generate your data directly into the mapped memory.
Since you want to update this data per frame, you also want to look for efficient buffer object streaming, so that you don't force implicit synchronizations between the CPU and GPU. An easy way to do that in your scenario would be using a ring buffer of 3 PBOs, which you advance every frame.
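A condensed sketch of that ring of three PBOs in C (GL 2.1 or later with pixel buffer objects assumed; the alpha texture format and the x*y size mirror the earlier example, and the variable names are illustrative):

#define NUM_PBOS 3

static GLuint pbos[NUM_PBOS];
static int    frame = 0;

/* One-time setup: three PBOs, each big enough for one x*y frame of bytes. */
static void pbo_init(int x, int y)
{
    glGenBuffers(NUM_PBOS, pbos);
    for (int i = 0; i < NUM_PBOS; ++i) {
        glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbos[i]);
        glBufferData(GL_PIXEL_UNPACK_BUFFER, (GLsizeiptr)x * y, NULL, GL_STREAM_DRAW);
    }
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}

/* Per frame: fill one PBO while the GPU may still be reading an older one. */
static void pbo_upload(GLuint tex, int x, int y, const unsigned char *density)
{
    GLuint pbo = pbos[frame % NUM_PBOS];
    ++frame;

    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    /* Orphan the previous storage so the driver need not stall, then map and fill it. */
    glBufferData(GL_PIXEL_UNPACK_BUFFER, (GLsizeiptr)x * y, NULL, GL_STREAM_DRAW);
    void *dst = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
    if (dst) {
        memcpy(dst, density, (size_t)x * y);   /* or generate the data in place here */
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
    }

    glBindTexture(GL_TEXTURE_2D, tex);
    /* With a PBO bound, the data pointer is an offset into it, so the copy stays driver-side. */
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, x, y,
                    GL_ALPHA, GL_UNSIGNED_BYTE, (const void *)0);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
}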
Things that have been suggested elsewhere, which I don't think I want to use (please, let me know if I've got it wrong):
glDrawPixels() - I'm under the impression that this will cause an interrupt each time I call it, which would make it slow, particularly at high resolutions.
Well, what the driver does is totally implementation-specific. I don't think that "cause an interrupt each time" is a useful mental image here. You seem to completely underestimate the work the GL implementation is doing behind your back. A GL call does not simply correspond to some command which is sent directly to the GPU.
But not using glDrawPixels is still a good choice. It is not very efficient, and it has been deprecated and removed from modern GL.
Use a shader - I don't think that a shader can accept and process the volume of data in the array each frame (It was mentioned elsewhere that the cap on the amount of data they may accept is too low)
You got this totally wrong. There is no way to not use a shader. If you're not writing one yourself (e.g. by using the old "fixed-function pipeline" of GL), the GPU driver will provide the shader for you. The hardware implementation of these earlier fixed-function stages has been completely superseded by programmable units - so if you can't do it with shaders, you can't do it with the GPU. And I would strongly recommend writing your own shader (it is the only option in modern GL, anyway).