OpenGL Hooking -- Rendering to an arbitrarily-sized FBO

So I'm trying to make an OpenGL application render at a higher resolution than it normally would. I've already created a shared library that hooks most of the relevant GLX/OpenGL functions. Here's my current approach (at a high level):
When my hooked SwapBuffers() is called:
- Unbind my FBO
- Call the (original/unhooked) SwapBuffers()
- Bind my FBO
- Set the viewport to (0, 0, HIGH_RES_X, HIGH_RES_Y)
- Set the scissor region to (0, 0, HIGH_RES_X, HIGH_RES_Y)
- return
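In code, the hook might look roughly like the sketch below. This is a minimal sketch, assuming the FBO, its size (HIGH_RES_X/HIGH_RES_Y), and a pointer to the original glXSwapBuffers were resolved elsewhere in the hooking library (e.g. via dlsym), and that the GL 3.0 framebuffer entry points are already loaded; the downscaling blit is my addition so the high-resolution image actually reaches the window:

    #include <GL/gl.h>
    #include <GL/glx.h>

    /* Set up elsewhere in the hooking library (names are illustrative). */
    extern GLuint  g_fbo;                 /* the high-res FBO             */
    extern GLsizei g_w, g_h;              /* HIGH_RES_X, HIGH_RES_Y       */
    extern GLsizei g_win_w, g_win_h;      /* real window size             */
    extern void  (*real_glXSwapBuffers)(Display *, GLXDrawable);

    void glXSwapBuffers(Display *dpy, GLXDrawable drawable)
    {
        /* Unbind our FBO and blit (downscale) its contents to the default
         * framebuffer so there is something to present. */
        glBindFramebuffer(GL_READ_FRAMEBUFFER, g_fbo);
        glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
        glBlitFramebuffer(0, 0, g_w, g_h,
                          0, 0, g_win_w, g_win_h,
                          GL_COLOR_BUFFER_BIT, GL_LINEAR);

        /* Call the original/unhooked SwapBuffers. */
        real_glXSwapBuffers(dpy, drawable);

        /* Re-bind the FBO and restore the oversized viewport/scissor for
         * the next frame. */
        glBindFramebuffer(GL_FRAMEBUFFER, g_fbo);
        glViewport(0, 0, g_w, g_h);
        glScissor(0, 0, g_w, g_h);
    }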
This approach doesn't seem to work for (most) applications. I suspect that is because some applications perform texture lookups (for screen-space operations) by dividing gl_FragCoord.xy by a uniform that represents their screen resolution (to convert from screen space to texture coordinates).
If resizing the output isn't possible, I wonder if it is possible to obtain the contents drawn onto the default framebuffer (i.e. both the color and the depth buffer) without using glReadPixels. Ideally there would be a way to access this data in the form of a texture (so it's already on the GPU). I've heard things about Pixel Buffer Objects -- would using one of these prevent a pipeline stall?

The technique I proposed actually works for a handful of applications, but most of the time it just doesn't work.
If you just want to extract the color/depth buffer, you can either use a pool of PBOs with glReadPixels or use glBlitFramebuffer. The former is not an option if latency is a concern; the latter works fairly well (see this example from the ReShade project).
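For the PBO route, a small ring of pack buffers lets the readback run asynchronously: you read back the previous frame's buffer while the current one is still in flight. A minimal sketch, assuming an RGBA8 default framebuffer; all identifiers and the two-buffer ring are illustrative:

    #define PBO_COUNT 2
    static GLuint pbos[PBO_COUNT];
    static int    frame = 0;

    void init_pbos(int width, int height)
    {
        int i;
        glGenBuffers(PBO_COUNT, pbos);
        for (i = 0; i < PBO_COUNT; ++i) {
            glBindBuffer(GL_PIXEL_PACK_BUFFER, pbos[i]);
            glBufferData(GL_PIXEL_PACK_BUFFER, (GLsizeiptr)width * height * 4,
                         NULL, GL_STREAM_READ);
        }
        glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    }

    void read_frame(int width, int height)
    {
        int cur  = frame % PBO_COUNT;        /* start this frame's readback */
        int prev = (frame + 1) % PBO_COUNT;  /* map last frame's result     */
        void *pixels;

        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbos[cur]);
        glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0);

        /* Note: on the very first frame this buffer has no data yet. */
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbos[prev]);
        pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
        if (pixels) {
            /* ... consume last frame's pixels here ... */
            glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
        }
        glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
        ++frame;
    }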

Related

Frame by frame animation using OpenGL and SDL

I am working on a game project that features a large amount of assets. The character animations are very detailed and require a lot of frames.
At first, I created large spritesheets containing all the animations for a specific character. It was working well on my PC, but when I tested it on an Android tablet, I noticed it exceeded the maximum texture dimension of its GPU. My solution was to break down the big spritesheet into individual frames (the worst case is 180 frames) and upload them individually to the GPU. Things now seem to be working everywhere I need them to.
Right now, the largest animation I have been working with is a character with 180 frames of 407x725 pixels each. However, as I couldn't find any guidance on the web regarding how to properly render 2D animations using OpenGL, I would like to ask if there is a problem with this approach. Is there a maximum number of textures that can be uploaded to the GPU? Can I exceed the amount of RAM on the GPU?
The most efficient method for the GPU is to pass the entire sprite sheet to OpenGL as a single texture, and select which frame you want by adjusting the texture coordinates when you draw. You should also pack the sprites into, ideally, a square texture. Reducing the overall amount of memory used by the GPU is very good for performance, especially on phones and tablets.
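Selecting a frame by texture coordinates can be as simple as computing the UV rectangle of that frame's cell in the atlas. A minimal sketch, assuming the sheet is a regular cols x rows grid (all names are illustrative):

    typedef struct { float u0, v0, u1, v1; } FrameUV;

    /* UV rectangle of one frame in a cols x rows sprite sheet. */
    FrameUV frame_uv(int frame, int cols, int rows)
    {
        FrameUV uv;
        int col = frame % cols;
        int row = frame / cols;
        uv.u0 = (float)col / cols;
        uv.v0 = (float)row / rows;
        uv.u1 = (float)(col + 1) / cols;
        uv.v1 = (float)(row + 1) / rows;
        return uv;
    }

The bound texture never changes between animation frames; only the quad's texture coordinates do.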
If possible, you want to avoid frequently changing which texture is bound. Ideally you want to bind a single texture and then render bits and pieces of it to the screen until you don't need it anymore, then bind a different texture and continue.
The reason for this is that the GPU will try hard to optimize the operation of the pipeline it creates to handle the geometry you feed it and the shaders you select. But when you make big changes to the configuration, like changing which texture or shader is bound, that is necessarily going to be somewhat opaque to optimization. Feeding it more vertices and texture coordinates at a time is better because they can all be processed in one batch without unloading and reloading resources.
However, depending on what cards you are targeting, you should keep in mind that there may be a maximum texture size of 8192 x 8192 or thereabouts. So depending on what assets you have, you may be forced to split them up across several textures.
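You can query the actual limit at runtime; a quick check like this fragment (a sketch, assuming a current GL context) tells you whether a sheet has to be split:

    GLint max_size = 0;
    glGetIntegerv(GL_MAX_TEXTURE_SIZE, &max_size);
    /* Any sheet wider or taller than max_size must be split across textures. */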

Cairo glyph caching

I'm using Cairo for text rendering on an embedded device. I've evaluated the 'toy' text API (i.e. cairo_show_text) and it works very well and is efficient. Unfortunately, it only supports the most basic operations and always discards the shape immediately.
What I need to do is draw simple text with fill and stroke. When I do this using the slightly more complicated API (cairo_text_path) it works but performance drops to unacceptable levels.
It's a bit difficult to find documentation but I did find this hint:
Be aware cairo_show_text() caches glyphs so is much more efficient if you work with a lot of text.
Where can I read about this glyph caching, and how can I get it for cairo_text_path as well? Ideally, is there a code example of this being done? I only need to support this simple use case.
cairo_text_path converts the text, with all its glyphs, to a path and adds it to the context. Rendering this path is expensive because of the many segments: dozens of moves, lines, and curves for every single glyph.
Glyph caching by cairo_show_text means that repeated glyphs/characters get rendered once and saved in a much cheaper format (like scanlines, triangles, or a bitmap) for later occurrences. Because the font doesn't change in between, this recycling isn't a problem.
You could do this caching yourself, rendering glyphs onto image surfaces and using them as patterns, or simply use bitmap fonts from the beginning.
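A minimal sketch of that manual caching, assuming each character is drawn once (fill + stroke via cairo_text_path) onto its own small image surface and then painted as a source surface on later occurrences; the cache layout, font, and sizes are illustrative:

    #include <cairo.h>

    #define GLYPH_W 32
    #define GLYPH_H 32

    static cairo_surface_t *glyph_cache[256];   /* indexed by unsigned char */

    static cairo_surface_t *get_cached_glyph(unsigned char c)
    {
        if (!glyph_cache[c]) {
            cairo_surface_t *s =
                cairo_image_surface_create(CAIRO_FORMAT_ARGB32, GLYPH_W, GLYPH_H);
            cairo_t *cr = cairo_create(s);
            char str[2];
            str[0] = (char)c;
            str[1] = '\0';

            cairo_select_font_face(cr, "Sans", CAIRO_FONT_SLANT_NORMAL,
                                   CAIRO_FONT_WEIGHT_NORMAL);
            cairo_set_font_size(cr, 24);
            cairo_move_to(cr, 2, GLYPH_H - 8);

            /* The expensive path construction happens only once per glyph. */
            cairo_text_path(cr, str);
            cairo_set_source_rgb(cr, 1, 1, 1);
            cairo_fill_preserve(cr);
            cairo_set_source_rgb(cr, 0, 0, 0);
            cairo_set_line_width(cr, 1.0);
            cairo_stroke(cr);

            cairo_destroy(cr);
            glyph_cache[c] = s;
        }
        return glyph_cache[c];
    }

    /* Later occurrences are just cheap surface paints. */
    void draw_cached_glyph(cairo_t *cr, unsigned char c, double x, double y)
    {
        cairo_set_source_surface(cr, get_cached_glyph(c), x, y);
        cairo_paint(cr);
    }

Per-glyph advance widths (via cairo_text_extents) would still be needed to position consecutive characters; this sketch only shows the caching itself.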

Displaying pixel data with GDI+

I am writing a simple 3D rendering engine.
The end result of my 3D processing is pixel data. Next I need to display it on the screen with GDI+.
I am using WinForms and Visual Basic. I am drawing directly on the form's ClientRectangle.
I have some questions.
After I process a pixel, should I be writing pixel data to a buffer first, instead of sending each pixel to GDI+ individually?
- If so, how much of a screen should I buffer at one time? Full screen, half, quarter, eighth? I think there may be RAM usage / performance trade-offs here.
- What is the best data structure for the pixel buffer?
- Which GDI+ command do I use to render the pixel buffer (or the individual pixel)? Is it possible to avoid creating the bitmap as an intermediate step and send pixel data directly to screen?
Maximum screen size I anticipate is 1600x1200. RAM could be as low as 1GB.
Thanks.
Hope you can find some of those answers here
Write the data into a buffer of RGBA structs first. This will make it easy if, for example, you want to render multiple "layers" and then composite them. It will also make it easy if you want to perform any deferred processing at some point. Once a full (tile?) render is complete, you can flush it to the output bitmap/file.
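For illustration only (the question targets VB.NET/GDI+, but the layout is language-agnostic), the buffer could be as simple as a flat, row-major array of RGBA structs; all names here are hypothetical:

    typedef struct { unsigned char r, g, b, a; } Rgba;

    typedef struct {
        int   width, height;
        Rgba *pixels;                 /* width * height entries, row-major */
    } PixelBuffer;

    /* Write one pixel; the buffer is flushed to the bitmap once per tile/frame. */
    static void put_pixel(PixelBuffer *pb, int x, int y, Rgba c)
    {
        if (x >= 0 && y >= 0 && x < pb->width && y < pb->height)
            pb->pixels[y * pb->width + x] = c;
    }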
This depends on what resolutions you allow the user to render to. If you want to render gigapixel images, you will need to tile it at some reasonable size. I would recommend that the tile size be configurable and then you can set it at a reasonable default after testing.
I would recommend starting out with a simple RGBA buffer if you're not looking to perform any deferred shading.
If you are not performing tiled rendering (i.e. the images fit in memory), you can simply use Bitmap.LockBits and write the data that way. If you are using tiled rendering, you will need to either find a library that allows you to render a scanline at a time (and make that a "tile"), or fix the file format you want to write (TGA, PNG, etc.) and seek/write directly to the file. Dumping the image as a RAW file and then using a command-line tool to convert it would be another option.
Hope this helps!

VBO for tilemap (draw order and slanted aerial 2D)

I want to draw a tilemap in an (ANSI C; C99 cannot be used due to Windows compatibility) game that uses GL for accelerated graphics, although the game uses a top-down 2D perspective with textured quads.
The popular approach for handling a tilemap seems to be a GL vertex buffer object, which I am about to write. However, I realized I want some tiles to extend a little beyond their vertical bounds, faking a slanted aerial view. That will make whatever is directly above a block be partially covered by the tile.
If I use a VBO here, I will need to draw the entire tilemap in one pass, meaning that any object I draw afterwards will be directly on top of the tilemap.
What would be the sanest approach to this problem? Should I draw the tilemap first, then the entities (players/enemies), then the excess vertical space so it covers the entities, and finally the effects that display over both (such as shots, explosions, etcetera)? But this would give me the issue of shots not being covered by terrain, and if I change the order, terrain covering large explosions awkwardly.
Alternatively I can sort all visual objects and draw them in a top-down fashion, but that would mean I need to change textures often, as sorting by texture wouldn't help too much in this specific case.
As well, I want to be able to modify the colors of every individual vertex in the grid in a dynamic way, so that entities can cast colors onto the map. From what I understand, the way to achieve this would be with a vertex shader. Is this correct?
EDIT: One last thing. If I draw a VBO like that tilemap that is larger than the screen, by translating, does GL automatically cull out-of-view faces, or do I need to rebuild the VBO every time I move the "camera"?
A VBO is just a piece of abstract memory reserved in graphics memory. You can place data in it in any layout and arrangement you like, and you can use a single VBO to store several independent meshes. The gl{Vertex,Normal,TexCoord,Color,Attrib}Pointer functions are used to set the offset into memory, which means either an address in the process address space or an offset into the bound VBO.
Furthermore, one can easily draw only subsets of the bound data with either glDrawArrays or glDrawElements by choosing an appropriate first element or appropriate indices in the index buffer.
So, no, you don't have to draw entire VBOs.
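A minimal sketch of drawing two subsets of one tilemap VBO (ground tiles first, then the overhanging "top" tiles after the entities), assuming interleaved x,y + u,v floats and legacy client-side arrays; the variable names and vertex layout are illustrative:

    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);

    glBindBuffer(GL_ARRAY_BUFFER, tilemap_vbo);
    glVertexPointer(2, GL_FLOAT, 4 * sizeof(float), (void *)0);
    glTexCoordPointer(2, GL_FLOAT, 4 * sizeof(float), (void *)(2 * sizeof(float)));

    /* Pass 1: ground tiles occupy the first ground_count vertices. */
    glDrawArrays(GL_QUADS, 0, ground_count);

    /* ... draw entities (players/enemies) here ... */

    /* Pass 2: overhanging tiles start at top_first in the same VBO. */
    glDrawArrays(GL_QUADS, top_first, top_count);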
I actually answered my own question. I needed to separate the map in two: blocks that have empty space directly on top, and the rest. Effects will be drawn in two passes, a "regular" layer and a "top" layer.
I feel pretty bad about having a useless question lying around though, so if some admin needs to purge it, please go ahead.

How do I use OpenGL 3.x VBOs to render a dynamic world?

Although there seem to be very few up-to-date references for OpenGL 3.x itself, the actual low-level manipulation of OpenGL is relatively straightforward. However, I am having serious trouble trying to even conceptualise how one would manipulate VBOs in order to render a dynamic world.
Obviously the immediate-mode ways of old no longer apply, but where do I go from there? Do I write some kind of scene structure and then convert that to a set of vertices and stream it to the VBO? How would I store translation data? If so, how would that look code-wise?
Basically, I am really unsure how to continue.
If your entire world is truly dynamic, you can use the GL_STREAM_DRAW_ARB usage flag and reset the data on each frame. Don't bother manipulating it, just try to stream as efficiently as possible.
However, I assume that you have a scene that consists of multiple rigid objects that move relative to each other. In this case, use one VBO for each object and specify the GL_STATIC_DRAW_ARB usage flag. You can then set the modelview transform for each instance of an object and render them using one draw call per instance.
A rule of thumb (on the PC) is to issue not more than one draw call per MHz of your CPU. This is a crude estimate, but there's some truth to it. Don't worry about putting multiple independent objects into a single VBO or other performance tricks if you stay below this limit.
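As a sketch of the second approach (one static VBO per rigid object, one draw call per instance), here in a GL 3.x style where the per-instance transform is a uniform; the object_t struct, the shader, and model_loc are assumptions for illustration:

    typedef struct {
        GLuint  vao, vbo;      /* created once, filled with GL_STATIC_DRAW data */
        GLsizei vertex_count;
        float   model[16];     /* per-instance model matrix                     */
    } object_t;

    void draw_object(const object_t *obj, GLint model_loc)
    {
        /* Upload this instance's transform, then issue one draw call. */
        glUniformMatrix4fv(model_loc, 1, GL_FALSE, obj->model);
        glBindVertexArray(obj->vao);
        glDrawArrays(GL_TRIANGLES, 0, obj->vertex_count);
    }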
Short answer:
Use glMapBufferRange and only update the subrange that needs modification.
Long answer:
The trick is to map the already existing buffer with glMapBufferRange, mapping only the range you need. Given these assumptions:
- Your geometry uses per-vertex animation morphing
- The vertex count for models is constant during animation.
Then you can use glMapBufferRange to update only the changing parts and leave the rest of the data alone. Full uploads using glBufferData are slow as a turtle, because they delete the old memory store and allocate a new one, in addition to uploading the new data. glMapBufferRange only lets you read/write existing data; it does no allocation or deallocation.
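A minimal sketch of such a partial update, assuming the buffer was created earlier with glBufferData and only the byte range [offset, offset + size) changed this frame; vbo, offset, size, and new_vertex_data are illustrative:

    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    {
        void *ptr = glMapBufferRange(GL_ARRAY_BUFFER, offset, size,
                                     GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_RANGE_BIT);
        if (ptr) {
            memcpy(ptr, new_vertex_data, size);  /* only the animated vertices */
            glUnmapBuffer(GL_ARRAY_BUFFER);
        }
    }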
However, if you use skeletal animation, it is better to pass the vertex transformations as 4x4 matrices to the vertex shader and do the calculations there. Per-vertex data is, of course, specified with glVertexAttribPointer.
Also, remember that you can read texture data in the vertex shader, and that OpenGL 3.1 introduced new instanced draw calls: glDrawArraysInstanced and glDrawElementsInstanced. Combined, these can be used for instance-specific lookups, i.e. you can issue instanced draw calls with the same geometry data bound, but fetch positions or whatever per-instance data you need from textures or texture arrays. This can save you from mixing and matching different vertex array data sets.
Imagine if you want to render 100 instances of the same model, but with different positions or color schemes. Or even texture maps.
Using VBOs doesn't mean you have to render your entire scene with only a single draw call. You can still issue multiple draw calls and set up different transformation matrices along the way.
For example, if you're using a scenegraph, each model in the scenegraph can correspond to a single draw call. In such a case, the easiest way to use VBOs is creating a separate VBO for each model.
As an optimization, you might be able to combine several models into a single VBO, then pass in non-zero offsets when making your draw calls; this plucks out the correct model from the VBO. It's also desirable to combine multiple draw calls into a single draw call, but that's not possible if they need to have independent transforms. (Actually it is possible in certain situations if you use instancing or vertex blending, but I suggest getting the basics out of the way first.)
