I have a mesh consisting of several entries.
Every entry contains its own list of faces, vertices, normals, colors and texture coordinates.
Can I loop through all of my entries and use glVertexAttribPointer to accumulate the data of an attribute in a single buffer object, like this:
glBindBuffer(GL_ARRAY_BUFFER, vbo);
for(Entry* e : entries) {
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, e->vertices);
    ...
}
In other words, will repeated calls to glVertexAttribPointer for attribute 0 of buffer vbo overwrite the data pointed to before, or not?
If so, is there any efficient solution other than copying all vertices into one contiguous memory block before calling glVertexAttribPointer just once for the whole buffer?
glVertexAttribPointer only stores (for each attribute) the last information you supplied to it, so appending buffers is not possible with this method.
You have two options in a situation like yours:
Issue a separate draw call for each buffer
Copy the data of all buffers into a single buffer and issue one draw call for it. Note that in this case the indices might have to be adjusted to point to the correct positions in the combined buffer (see the sketch below).
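A rough sketch of the second option, assuming each hypothetical Entry holds std::vector<float> vertices (3 floats per vertex) and std::vector<unsigned> indices (these member names are made up for illustration):

// Sketch: merge all entries into one vertex array and one index array.
std::vector<float> allVertices;
std::vector<unsigned> allIndices;
for (Entry* e : entries) {
    // Shift each index by the number of vertices already in the combined buffer.
    unsigned base = (unsigned)(allVertices.size() / 3);
    allVertices.insert(allVertices.end(), e->vertices.begin(), e->vertices.end());
    for (unsigned i : e->indices)
        allIndices.push_back(i + base);
}
// Upload once, set the attribute pointer once, draw once
// (the adjusted index array is uploaded to GL_ELEMENT_ARRAY_BUFFER the same way).
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, allVertices.size() * sizeof(float), allVertices.data(), GL_STATIC_DRAW);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, nullptr);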
glVertexAttribPointer() does not copy anything. It only sets state that specifies where the vertex data will be fetched from. If you call it repeatedly for the same attribute, each call will replace the previous state, and the last one wins.
Starting with OpenGL 3.1, there is a glCopyBufferSubData() call (man page) that allows you to copy data from one buffer to another. Using this, you could allocate a buffer with enough space for all vertices, and then copy the smaller buffers into it.
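A hedged sketch of that (srcVbos, srcSizes and totalSize describe the existing buffers and are assumptions for illustration):

// Sketch: append several VBOs into one buffer via glCopyBufferSubData (GL 3.1+).
GLuint combined;
glGenBuffers(1, &combined);
glBindBuffer(GL_COPY_WRITE_BUFFER, combined);
glBufferData(GL_COPY_WRITE_BUFFER, totalSize, nullptr, GL_STATIC_DRAW);
GLintptr writeOffset = 0;
for (size_t i = 0; i < srcVbos.size(); ++i) {
    glBindBuffer(GL_COPY_READ_BUFFER, srcVbos[i]);
    glCopyBufferSubData(GL_COPY_READ_BUFFER, GL_COPY_WRITE_BUFFER,
                        0, writeOffset, srcSizes[i]);
    writeOffset += srcSizes[i];
}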
That being said, it does not sound like a great idea to use it this way. If you want all vertices in the same buffer, it's much easier and more efficient to store them in that buffer right from the start.
You definitely should not copy around the vertex data on each draw call. While reducing the number of draw calls is desirable, copying around vertex data is much more expensive.
I am trying to learn to use libav. I have followed the very first tutorial on dranger.com, but I got a little confused at one point.
// Write pixel data
for(y=0; y<height; y++)
fwrite(pFrame->data[0]+y*pFrame->linesize[0], 1, width*3, pFile);
This code clearly works, but I don't quite understand why. In particular, I don't understand how the frame data in pFrame->data is stored, whether or not it depends on the format/codec in use, why pFrame->data and pFrame->linesize are always referenced at index 0, and why we are adding y to pFrame->data[0].
In the tutorial it says
We're going to be kind of sketchy on the PPM format itself; trust us, it works.
I am not sure if writing to the PPM format is what makes this process seem so strange to me. Any clarification on why this code is the way it is and how libav stores frame data would be very helpful. I am not very familiar with media encoding/decoding in general, which is why I am trying to learn.
particularly I don't understand how the frame data in pFrame->data is stored, whether or not it depends on the format/codec in use
Yes, it depends on the pix_fmt value. Some formats are planar and others are not.
why pFrame->data and pFrame->linesize is always referenced at index 0,
If you look at the struct, you will see that data is an array of pointers (a pointer to a pointer). So pFrame->data[0] is a pointer to the data in the first "plane". Some formats, like RGB, have a single plane, where all data is stored in one buffer. Other formats, like YUV, use a separate buffer for each plane, e.g. Y = pFrame->data[0], U = pFrame->data[1], V = pFrame->data[2]. Audio may use one plane per channel, etc.
and why we are adding y to pFrame->data[0].
Because the example is looping over the image line by line, top to bottom.
To get the pointer to the first pixel of any line, you multiply the linesize by the line number and add it to the plane's pointer.
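For example, with a planar format like AV_PIX_FMT_YUV420P (a sketch; frame is a decoded AVFrame and y is the line number):

// Each plane has its own data pointer and its own linesize.
uint8_t* lumaRow = frame->data[0] + y * frame->linesize[0];
uint8_t* uRow    = frame->data[1] + (y / 2) * frame->linesize[1]; // chroma planes are half height
uint8_t* vRow    = frame->data[2] + (y / 2) * frame->linesize[2];

// A packed format like AV_PIX_FMT_RGB24 uses only plane 0, which is why
// the tutorial touches nothing but data[0] and linesize[0].
uint8_t* rgbRow  = frame->data[0] + y * frame->linesize[0];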
I have a painting app which at any given time interactively shows content from an array of 200 or so CALayers via a UIImageView. I get reasonable performance, but I'm wondering if there could be any performance benefit to using CAMetalLayers instead. In particular, I'm curious whether I could benefit from blitting textures directly to each CAMetalLayer, and whether there would be any hardware considerations with stacking/displaying so many CAMetalLayers at once.
Are there any gotchas I should consider before implementing, and should I continue using a UIImageView (or something else) to host these newly Metal-backed sublayers? Any thoughts would be appreciated.
That’s not going to work. You should be keeping track of your strokes' data. For example, an array of points would be a single stroke, and then you should have an array of those strokes. It could be only points (x, y) or, more likely, also contain color, size and other variables. You should know what you need to describe your stroke.
Then use that data to draw (stamp at those locations). When you want to undo, just redraw all the strokes in the array from the beginning up to n-1, n-2, etc., as in the sketch below.
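A minimal sketch of that idea (the type and member names are made up for illustration):

// Sketch: store strokes as replayable data instead of flattened pixels.
struct StrokePoint { float x, y; };
struct Stroke {
    std::vector<StrokePoint> points; // the stamp locations
    float size;                      // plus color, brush id, whatever describes a stroke
};
std::vector<Stroke> strokes;         // the whole drawing

// Undo: drop the last stroke, then replay strokes 0..n-1 onto a cleared canvas.
void undo() {
    if (!strokes.empty())
        strokes.pop_back();
    // redrawCanvas(strokes); // hypothetical helper that re-stamps every stroke
}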
I want to pass my touch points to GPUImage (iOS)
The points can be translated to a float array, but the length of that array varies.
However, I must declare the length of the array in the shader.
Disclaimer: not a GLSL expert
AFAIK you can't have variable-length arrays like what you want. This is a GLSL limitation, not a GPUImage one, so it's not a quick fix; the work you'll be doing will be with textures or GLSL, not GPUImage.
Here's another stack overflow post about glsl: GLSL indexing into uniform array with variable length
There are two solutions that could work:
1) Limit the number of points. It's reasonable to limit touches, but in practice it may be hard to narrow them down if there are too many. You could pass these points in a fixed-length array or as individual constants (one for each point). If you really care about scalability with the number of points, this isn't a great method, because in your shader you'll have to check each of these points and perform the relevant computation, which could be expensive when performed for the entire image (again, depending on your use case). If for each pixel you're checking a distance to a point, this could be too expensive.
2) Input your points in a texture. You can either have two 1D textures with the x and y coordinates and then treat them like an array (then go to option 1), or you can create a 2D texture, all 0, and set parts to 1 where there are touches. The 2D texture can have a lower resolution than the actual screen. This method could be a lot less work for the shader if you're doing something simple like turning finger touches black.
Your choice depends largely on what you're doing with the points in the shader.
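As a host-side sketch of option 1 (the uniform names and the cap of 32 points are arbitrary choices, not anything GPUImage-specific; the shader would declare uniform vec2 touchPoints[32]; and uniform int touchCount;):

// Sketch: pack up to MAX_POINTS touch points into a fixed-size uniform array.
const int MAX_POINTS = 32;
GLfloat points[MAX_POINTS * 2] = {0};
int count = std::min((int)touches.size(), MAX_POINTS);
for (int i = 0; i < count; ++i) {
    points[i * 2 + 0] = touches[i].x;
    points[i * 2 + 1] = touches[i].y;
}
glUniform2fv(glGetUniformLocation(program, "touchPoints"), MAX_POINTS, points);
glUniform1i(glGetUniformLocation(program, "touchCount"), count);
// In the shader, loop to MAX_POINTS and ignore entries at index >= touchCount.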
My VBOs are only being sent to the GPU when they are used for the first time, which causes small freezes the first time an object/group of objects is drawn.
I tried loading the data this way:
glBufferData(GL_ARRAY_BUFFER, size, NULL, GL_STATIC_DRAW);
glBufferSubData(GL_ARRAY_BUFFER, 0, size, data);
and this way
glBufferData(GL_ARRAY_BUFFER, size, data, GL_STATIC_DRAW);
But the result is the same.
If I then draw a triangle after glBufferData:
glDrawElements(GL_TRIANGLES, 3, GL_UNSIGNED_BYTE, NULL);
then the problem is solved, but I find this solution rather hackish.
Is there a better solution?
(I have a bunch of small VBOs containing 256 vertices each)
Well, this is how buffer objects are supposed to work: they add somewhat asynchronous operation. The idea is that you can upload a large batch of buffer objects and continue OpenGL operations afterwards, with the pipeline stalling only if data is accessed whose upload has not been completed yet. glBufferData and glBufferSubData either mark the pages of the pointer passed to them copy-on-write or make an interim copy; either way, you can safely discard the data in your process after the call returns, and the OpenGL client side will still have the data around for the (ongoing) upload.
Calling glFinish() will block until the operation pipeline has completely finished (hence the name).
Try calling glFlush() after your glBufferData call.
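For example, as a sketch (vbos, sizes and datas are assumed to describe your buffers; note that whether the driver really performs the transfer at the flush is implementation-defined, so the dummy draw may remain the only reliable warm-up on some drivers):

// Sketch: upload everything at load time, then flush the command queue.
for (size_t i = 0; i < vbos.size(); ++i) {
    glBindBuffer(GL_ARRAY_BUFFER, vbos[i]);
    glBufferData(GL_ARRAY_BUFFER, sizes[i], datas[i], GL_STATIC_DRAW);
}
glFlush(); // or glFinish() to block until the driver has actually finished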
So, in the course of writing a model loader for a 3D scene I'm working on, I've decided to pack the vertex, texture and normal data like so:
VVVVTTTNNN
for each vertex, where V = vertex coordinate, T = UV coordinate, and N = normal coordinate. When I pass this data on to the vertex shader for my scene, I make three glVertexAttribPointer calls, like so:
glVertexAttribPointer(ATTRIB_VERTEX, 4, GL_FLOAT, GL_FALSE, 10 * sizeof(GLfloat), group->vertices.data);
glEnableVertexAttribArray(ATTRIB_VERTEX);
glVertexAttribPointer(ATTRIB_NORMAL, 3, GL_FLOAT, GL_FALSE, 10 * sizeof(GLfloat), group->normals.data);
glEnableVertexAttribArray(ATTRIB_NORMAL);
glVertexAttribPointer(ATTRIB_UV_COORDINATES, 3, GL_FLOAT, GL_FALSE, 10 * sizeof(GLfloat), group->uvcoordinates.data);
glEnableVertexAttribArray(ATTRIB_UV_COORDINATES);
Each of the group pointers being passed refer to the beginning position in the shared vertex data block where that vertex type starts:
group->vertices.data == data
group->uvcoordinates.data == &data[4]
group->normals.data == &data[7]
Part of the reason for interleaving this data was to program for cache friendliness and to minimize the data being sent to the card. (Note: this is not about a realistic performance bottleneck. I'm investigating the optimization because I want to learn more about programming to address these sorts of concerns.) However, for the life of me, I can't imagine how GL would be able to infer that the three different pointers refer to offset positions within the same larger data block, and thereby make the necessary optimization to avoid copying the data once it has already been copied. Furthermore, since I'm only ensuring data locality in system memory (and don't really have any guarantees on how that data is going to be organized on the GPU), I'm only really optimizing for the case where I access any of these vertices outside of GL. Is that right? Are these optimizations mostly useless, or will providing data in this manner help minimize the data transfer to the GPU / prevent cache misses when iterating over vertex data in the vertex shader?
OpenGL is just an API; the intelligence lies in the driver. Anyway, the problem is actually rather simple to solve: for every vertex attribute you have a starting memory address, and when glDrawArrays or glDrawElements is called, the driver looks for the largest index used. That defines the upper bound of the range.
Then you sort the vertex attribute starting addresses and, for each address, check whether its range overlaps with any other vertex attribute's range. You find the contiguous regions and copy those, as sketched below.
In the case of vertex buffer objects it's even simpler, since you have already copied the data to OpenGL, ready for processing.
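A sketch of that bookkeeping (plain pointer arithmetic, nothing OpenGL-specific; Range is a made-up helper type, one entry per enabled attribute array):

#include <algorithm>
#include <vector>

struct Range { const char* begin; const char* end; };

// Merge overlapping attribute ranges into contiguous regions.
std::vector<Range> mergeRanges(std::vector<Range> ranges) {
    std::sort(ranges.begin(), ranges.end(),
              [](const Range& a, const Range& b) { return a.begin < b.begin; });
    std::vector<Range> merged;
    for (const Range& r : ranges) {
        if (!merged.empty() && r.begin <= merged.back().end)
            merged.back().end = std::max(merged.back().end, r.end); // overlap: extend
        else
            merged.push_back(r); // disjoint: start a new region
    }
    return merged; // each entry is one block that could be copied in one go
}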