DirectX 9: Use Geometry Instancing for a Mesh with multiple materials

I am trying to write flexible geometry instancing code that can handle meshes with multiple materials. For a mesh with one material everything is fine: I manage to render as many instances as I want with a single draw call.
Things get a bit more complicated with multiple materials. My mesh comes from an .x file. It has one vertex buffer and one index buffer, but several materials. The indices to render for each subset (material) are stored in an attribute array.
Here is the code I use:
d3ddev->SetVertexDeclaration( m_vertexDeclaration );
d3ddev->SetIndices( m_indexBuffer );

// Stream 0: geometry, drawn m_numInstancesToDraw times
d3ddev->SetStreamSourceFreq( 0, D3DSTREAMSOURCE_INDEXEDDATA | m_numInstancesToDraw );
d3ddev->SetStreamSource( 0, m_vertexBuffer, 0, D3DXGetDeclVertexSize( m_geometryElements, 0 ) );

// Stream 1: per-instance data, advanced once per instance
d3ddev->SetStreamSourceFreq( 1, D3DSTREAMSOURCE_INSTANCEDATA | 1ul );
d3ddev->SetStreamSource( 1, m_instanceBuffer, 0, D3DXGetDeclVertexSize( m_instanceElements, 1 ) );

m_effect->Begin( NULL, NULL ); // begin using the effect
m_effect->BeginPass( 0 );      // begin the pass

for( DWORD i = 0; i < m_numMaterials; ++i ) // loop through each subset
{
    d3ddev->SetMaterial( &m_materials[i] ); // set the material for the subset
    if( m_textures[i] != NULL )
    {
        d3ddev->SetTexture( 0, m_textures[i] );
    }

    d3ddev->DrawIndexedPrimitive(
        D3DPT_TRIANGLELIST,            // Type
        0,                             // BaseVertexIndex
        m_attributes[i].VertexStart,   // MinIndex
        m_attributes[i].VertexCount,   // NumVertices
        m_attributes[i].FaceStart * 3, // StartIndex
        m_attributes[i].FaceCount      // PrimitiveCount
    );
}

m_effect->EndPass();
m_effect->End();

// Reset stream frequencies
d3ddev->SetStreamSourceFreq( 0, 1 );
d3ddev->SetStreamSourceFreq( 1, 1 );
This code works for the first material only. By "first" I mean the one at index 0, because if I start my loop with the second material, it is not rendered. However, by debugging the vertex buffer in PIX, I can see all my materials being processed properly. So something happens after the vertex shader.
Another weird issue: all my materials are rendered if I set the stream source containing the instance data with a stride of zero.
So instead of this:
d3ddev->SetStreamSource(1, m_instanceBuffer, 0, D3DXGetDeclVertexSize( m_instanceElements, 1 ) );
I replace it by:
d3ddev->SetStreamSource(1, m_instanceBuffer, 0, 0 );
But of course, with this code, all my instances are rendered at the same position, since I reuse the same instance data over and over again.
And one last point: everything works fine if I create my device with D3DCREATE_SOFTWARE_VERTEXPROCESSING. Only hardware vertex processing has the issue, and unfortunately the DirectX debug runtime does not report any problem.

See the Shader Model 3 docs:
If you are implementing shaders in hardware, you may not use vs_3_0 or ps_3_0 with any other shader versions, and you may not use either shader type with the fixed function pipeline. These changes make it possible to simplify drivers and the runtime. The only exception is that software-only vs_3_0 shaders may be used with any pixel shader version.
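If that restriction is what's biting here, one way to surface it at runtime (my suggestion, not something from either post) is to ask D3DX to validate the technique on the device; a vs_3_0 vertex shader paired with the fixed-function pixel pipeline may then fail validation on a hardware-vertex-processing device:

// Hypothetical validation check, using the effect from the question's code.
D3DXHANDLE technique = m_effect->GetCurrentTechnique();
if( FAILED( m_effect->ValidateTechnique( technique ) ) )
{
    OutputDebugStringA( "Technique not valid for this device / shader-version combination.\n" );
}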

I had the same problem, and in my case it was the memory pool of the instancing mesh. I originally had that mesh in D3DPOOL_SYSTEMMEM, but the instanced mesh in D3DPOOL_DEFAULT. When I changed the instancing mesh to sit in the default pool, everything worked as desired.
Hope it helps.
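For what it's worth, a minimal sketch of creating the instance-data buffer in the default pool; the InstanceData struct and the helper function are placeholders, not code from either post:

#include <d3d9.h>

// Hypothetical per-instance record; the real layout must match m_instanceElements.
struct InstanceData { float world[4][4]; };

IDirect3DVertexBuffer9* CreateInstanceBuffer( IDirect3DDevice9* d3ddev, UINT numInstances )
{
    IDirect3DVertexBuffer9* instanceBuffer = NULL;
    HRESULT hr = d3ddev->CreateVertexBuffer(
        numInstances * sizeof( InstanceData ),
        D3DUSAGE_WRITEONLY,   // filled from the CPU, read by the GPU
        0,                    // no FVF; a vertex declaration is used instead
        D3DPOOL_DEFAULT,      // the important part: default pool, not SYSTEMMEM
        &instanceBuffer,
        NULL );
    return SUCCEEDED( hr ) ? instanceBuffer : NULL;
}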

Related

What is SceneKit doing between calls to didApplyConstraints and willRenderScene?

The SceneKit rendering loop is well documented here https://developer.apple.com/documentation/scenekit/scnscenerendererdelegate and here https://www.raywenderlich.com/1257-scene-kit-tutorial-with-swift-part-4-render-loop. However neither of these documents explains what SceneKit does between calls to didApplyConstraints and willRenderScene.
I've modified my SCNSceneRendererDelegate to measure the time between each call and I can see that around 5ms elapses between those two calls. It isn't running my code in that time, but presumably some aspect of the way I've set up my scene is creating work which has to be done there. Any insight into what SceneKit is doing would be very helpful.
I am calling SceneKit myself from an MTKView's draw call (rather than using an SCNView) so that I can render the scene twice. The first render is normal, the second uses the depth buffer from the first but draws just a subset of the scene that I want to "glow" onto a separate colour buffer. That colour buffer is then scaled down, gaussian blurred, scaled back up and then blended over the top of the first scene (all with custom Metal shaders).
The 5ms spent between didApplyConstraints and willRenderScene started happening when I introduced this extra rendering pass. To control which nodes are in each scene I switch the opacity of a small number of parent nodes between 0 and 1. If I remove the code which switches opacity but keep everything else (so there are two rendering passes but they both draw everything) the extra 5ms disappears and the overall frame rate is actually faster even though much more rendering is happening.
I'm writing Swift targeting macOS on a 2018 MacBook Pro.
UPDATE: mnuages has explained that changing the opacity causes SceneKit to rebuild the scene graph, and that explains part of the lost time. However, I've now discovered that my use of a custom SCNProgram for the nodes in one rendering pass also triggers a 5ms pause between didApplyConstraints and willRenderScene. Does anyone know why this might be?
Here is my code for setting up the SCNProgram and the SCNMaterial, both done once:
let device = MTLCreateSystemDefaultDevice()!
let library = device.makeDefaultLibrary()
glowProgram = SCNProgram()
glowProgram.library = library
glowProgram.vertexFunctionName = "emissionGlowVertex"
glowProgram.fragmentFunctionName = "emissionGlowFragment"
...
let glowMaterial = SCNMaterial()
glowMaterial.program = glowProgram
let emissionImageProperty = SCNMaterialProperty(contents: emissionImage)
glowMaterial.setValue(emissionImageProperty, forKey: "tex")
Here's where I apply the material to the nodes:
let nodeWithGeometryClone = nodeWithGeometry.clone()
nodeWithGeometryClone.categoryBitMask = 2
let geometry = nodeWithGeometryClone.geometry!
nodeWithGeometryClone.geometry = SCNGeometry(sources: geometry.sources, elements: geometry.elements)
glowNode.addChildNode(nodeWithGeometryClone)
nodeWithGeometryClone.geometry!.firstMaterial = glowMaterial
The glow nodes are a deep clone of the regular nodes, but with an alternative SCNProgram. Here's the Metal code:
#include <metal_stdlib>
using namespace metal;
#include <SceneKit/scn_metal>
struct NodeConstants {
    float4x4 modelTransform;
    float4x4 modelViewProjectionTransform;
};

struct EmissionGlowVertexIn {
    float3 pos [[attribute(SCNVertexSemanticPosition)]];
    float2 uv  [[attribute(SCNVertexSemanticTexcoord0)]];
};

struct EmissionGlowVertexOut {
    float4 pos [[position]];
    float2 uv;
};

vertex EmissionGlowVertexOut emissionGlowVertex(EmissionGlowVertexIn in [[stage_in]],
                                                constant NodeConstants &scn_node [[buffer(1)]]) {
    EmissionGlowVertexOut out;
    out.pos = scn_node.modelViewProjectionTransform * float4(in.pos, 1) + float4(0, 0, -0.01, 0);
    out.uv = in.uv;
    return out;
}

constexpr sampler linSamp = sampler(coord::normalized, address::clamp_to_zero, filter::linear);

fragment half4 emissionGlowFragment(EmissionGlowVertexOut in [[stage_in]],
                                    texture2d<half, access::sample> tex [[texture(0)]]) {
    return tex.sample(linSamp, in.uv);
}
By changing the opacity of nodes you're invalidating parts of the scene graph, which can result in additional work for the renderer.
It would be interesting to see if setting the camera's categoryBitMask is more performant (it doesn't modify the scene graph).

How do I use SDL_LockTexture() to update dirty rectangles?

I'm migrating an application from SDL 1.2 to 2.0, and it keeps an array of dirty rectangles to determine which parts of its SDL_Surface to draw to the screen. I'm trying to find the best way to integrate this with SDL 2's SDL_Texture.
Here's how the SDL 1.2 driver is working: https://gist.github.com/nikolas/1bb8c675209d2296a23cc1a395a32a0d
And here's how I'm getting changes from the surface to the texture in SDL 2:
void *pixels;
int pitch;

SDL_LockTexture(_sdl_texture, NULL, &pixels, &pitch);
memcpy(pixels, _sdl_surface->pixels, pitch * _sdl_surface->h);
SDL_UnlockTexture(_sdl_texture);

for (int i = 0; i < num_dirty_rects; i++) {
    SDL_RenderCopy(_sdl_renderer, _sdl_texture, &_dirty_rects[i], &_dirty_rects[i]);
}
SDL_RenderPresent(_sdl_renderer);
I'm just updating the entire surface, but then taking advantage of the dirty rectangles in the SDL_RenderCopy() calls. Is there a better way to do things here, updating only the dirty rectangles? Will I run into problems calling SDL_LockTexture() and SDL_UnlockTexture() up to a hundred times every frame, or is that how they're meant to be used?
SDL_LockTexture() accepts an SDL_Rect parameter which I could use here, but then it's unclear to me how to get the corresponding pixels out of _sdl_surface->pixels. How would I copy out just a small rect from the pixel data of the entire screen?
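For the last part, a rough sketch of a per-rect copy, assuming the texture and surface share the same 32-bit pixel format; the helper name and that assumption are mine:

#include <SDL.h>
#include <string.h>

// Hypothetical helper: copy one dirty rect from an SDL_Surface into an SDL_Texture.
// Assumes both use the same 32-bit pixel format (e.g. SDL_PIXELFORMAT_ARGB8888).
static void UpdateDirtyRect(SDL_Texture *texture, SDL_Surface *surface, const SDL_Rect *rect)
{
    void *dst;
    int dst_pitch;
    const int bpp = surface->format->BytesPerPixel;

    // Lock only the dirty region; 'dst' points at the first pixel of that region.
    if (SDL_LockTexture(texture, rect, &dst, &dst_pitch) != 0)
        return;

    // The source data starts at (rect->x, rect->y) inside the full-screen surface.
    const Uint8 *src = (const Uint8 *)surface->pixels
                     + rect->y * surface->pitch
                     + rect->x * bpp;

    for (int row = 0; row < rect->h; ++row)
        memcpy((Uint8 *)dst + row * dst_pitch, src + row * surface->pitch, rect->w * bpp);

    SDL_UnlockTexture(texture);
}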

GLSL: Count of fragment shader 'out's

I am trying to write a class that handles GLSL and automatically gathers the number of:
ins (to the vertex shader) / attributes
uniforms
outs (from the fragment shader)
I know how to get the count of the first two using OpenGL's API, but I cannot find a method for the third. If there is a way using OpenGL, I would prefer to use that. Otherwise I'll use a grep-like method to scan the fragment program and return the data.
I think you want glGetProgramInterfaceiv(). Something like this:
GLint numActiveOutputs = 0;
glGetProgramInterfaceiv(prog, GL_PROGRAM_OUTPUT, GL_ACTIVE_RESOURCES, &numActiveOutputs);
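If it helps, a slightly fuller sketch along the same lines, assuming an OpenGL 4.3 context (or ARB_program_interface_query), a loader that exposes those entry points, and a successfully linked program object prog; it also pulls each output's name, type and location:

GLint numOutputs = 0;
glGetProgramInterfaceiv(prog, GL_PROGRAM_OUTPUT, GL_ACTIVE_RESOURCES, &numOutputs);

for (GLint i = 0; i < numOutputs; ++i) {
    // Name of the i-th active fragment shader output
    char name[256];
    glGetProgramResourceName(prog, GL_PROGRAM_OUTPUT, (GLuint)i, sizeof(name), NULL, name);

    // Its GLSL type and assigned location
    const GLenum props[] = { GL_TYPE, GL_LOCATION };
    GLint values[2] = { 0, 0 };
    glGetProgramResourceiv(prog, GL_PROGRAM_OUTPUT, (GLuint)i, 2, props, 2, NULL, values);

    printf("out %d: %s (type 0x%X, location %d)\n", i, name, values[0], values[1]);
}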

Using PyMEL to set the "Alpha to Use" attribute in an object of class psdFileTex

I am using Maya to do some procedural work, and I have a lot of textures that I need to load into Maya, and they all have transparencies (alpha channels). I would very much like to be able to automate this process. Using PyMEL, I can create my textures and hook them up to a shader, but the alpha doesn't set properly by default. There is an attribute in the psdFileTex node called "Alpha to Use", and it must be set to "Transparency" in order for my alpha channel to work. My question is this - how do I use PyMEL scripting to set the "Alpha to Use" attribute properly?
Here is the code I am using to set up my textures:
import pymel.core as pm
pm.shadingNode('lambert', asShader=True, name='myShader1')
pm.sets(renderable=True, noSurfaceShader=True, empty=True, name='myShader1SG')
pm.connectAttr('myShader1.outColor', 'myShader1SG.surfaceShader', f=True)
pm.shadingNode('psdFileTex', asTexture=True, name='myShader1PSD')
pm.connectAttr('myShader1PSD.outColor', 'myShader1.color')
pm.connectAttr('myShader1PSD.outTransparency', 'myShader1.transparency')
pm.setAttr('myShader1ColorPSD.fileTextureName', '<pathway>/myShader1_texture.psd', type='string')
If anyone can help me, I would really appreciate it.
Thanks
With any node, you can use listAttr() to get the available editable attributes. Run listAttr('myShader1PSD') and note that its output includes two attributes called 'alpha' and 'alphaList'. alpha returns the currently selected alpha channel; alphaList returns all of the alpha channels in your PSD.
Example
pm.PyNode('myShader1PSD').alphaList.get()
# Result: [u'Alpha 1', u'Alpha 2'] #
If you know you'll only ever be using the one alpha, or the first alpha channel, you can simply do this:
psdShader = pm.PyNode('myShader1PSD')
alphaList = psdShader.alphaList.get()
if len(alphaList) > 0:
    psdShader.alpha.set(alphaList[0])
else:
    # no alpha channel
    pass
Remember that list indices start at 0, so the first alpha channel is at position 0.
Additionally, and unrelated: while you're still using the maya.cmds-derived commands that PyMEL wraps, there are a few things you can do to make your code read more nicely.
pm.setAttr('myShader1ColorPSD.fileTextureName', '<pathway>/myShader1_texture.psd', type='string')
We can convert this to pymel like so:
pm.PyNode('myShader1ColorPSD').fileTextureName.set('<pathway>/myShader1_texture.psd')
And:
pm.connectAttr('myShader1PSD.outColor', 'myShader1.color')
Can be converted to:
pm.PyNode('myShader1PSD').outColor.connect(pm.PyNode('myShader1').color)
While they may only be small changes, they read just a little bit nicer, and it's native PyMEL.
Anyway, I hope I have helped you!

glXSwapBuffers appears not to have swapped (?)

My situation is like this. I wrote code that checks a group of windows to see whether their contents are eligible to be swapped (that is, whether all redrawing has been successfully performed on the window and all of its children after a resize event). If the conditions are met, I call glXSwapBuffers for that window and all its children. My aim is a flicker-free resizing system. The child windows are arranged in a tiled fashion and do not overlap; between them, the function appears to work. My issue, however, arises with the parent: sometimes during resizing its content flickers. So far, this is what I have implemented.
All events such as ConfigureNotify or Expose are already compressed as needed.
The window's background_pixmap is set to None.
Since window background content can be lost whenever an Expose event is generated, I always keep a copy of the finished redraw in my own allocated buffer (neither a pixmap nor an FBO, but it suffices for now).
My logic for each call to glXSwapBuffers() is this:
void window_swap( Window *win ) {
    Window *child;

    if ( win ) {
        for ( child = win->child; child; child = child->next )
            window_swap( child );

        if ( isValidForSwap( win ) ) {
            glXMakeCurrent( dpy, win->drawable, win->ctx );
            glDrawBuffer( GL_BACK );
            RedrawWindowFromBuffer( win, win->backing_store );
            glXSwapBuffers( dpy, win->drawable );
        }
    }
}
This should ensure that the content is always restored before each swap. Sadly, that did not appear to be the case in practice. For debugging purposes, I adjusted the code above to dump what should be in each buffer, as follows:
void window_swap( Window *win ) {
    if ( win ) {
        if ( isValidForSwap( win ) ) {
            glXMakeCurrent( dpy, win->drawable, win->ctx );

            glDrawBuffer( GL_BACK );
            OutputWindowBuffer( "back.jpg", GL_BACK );
            RedrawWindowFromBuffer( win, win->backing_store );
            glXSwapBuffers( dpy, win->drawable );

            glDrawBuffer( GL_BACK );
            glClearColor( 1.0, 1.0, 1.0, 1.0 );
            glClear( GL_COLOR_BUFFER_BIT );

            OutputWindowBuffer( "front_after.jpg", GL_FRONT );
            OutputWindowBuffer( "back_after.jpg", GL_BACK );
        }
    }
}
The function OutputWindowBuffer() uses a standard glReadPixels() call to read the buffer content and then writes it out as an image; which buffer is read is determined by the parameter passed to the function. What I found in the output images is this:
The back-buffer image taken after RedrawWindowFromBuffer() is what was expected.
The back-buffer image taken after the swap is filled with the clear colour, as expected. So this is not a case of glReadPixels lagging when called on the front buffer, as a once-reported Intel bug seemed to suggest.
The front-buffer image taken after the swap shows mostly black artifacts (my window is always cleared to another colour before each drawing).
Are there other plausible explanations as to why swapping the buffers does not appear to actually swap them? Are there other routes I should look into to implement flicker-free resizing? I have read an article suggesting the use of WinGravity, but I'm afraid I don't quite understand it yet.
If your windows have a background pixmap set, then at every resizing step they get filled with that before the actual OpenGL redraw commences. This is one source of flicker. The other problem is glXSwapBuffers not being synchronized to the vertical retrace; you can control that with the swap-interval extensions (glXSwapIntervalEXT and friends).
So the two things to do for flicker-free resizing: set a None background pixmap and have glXSwapBuffers synchronized to the vertical retrace (swap interval 1).
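A minimal sketch of both steps, assuming plain Xlib/GLX (the raw Xlib Window type, not the question's wrapper struct) and a driver that exposes GLX_EXT_swap_control; the helper names are mine:

#include <X11/Xlib.h>
#include <GL/glx.h>

/* 1. Give the window no background pixmap, so X never pre-fills it during a resize. */
void disable_background_fill( Display *dpy, Window win )
{
    XSetWindowAttributes attrs;
    attrs.background_pixmap = None;
    XChangeWindowAttributes( dpy, win, CWBackPixmap, &attrs );
}

/* 2. Synchronize glXSwapBuffers with the vertical retrace (swap interval 1),
   assuming GLX_EXT_swap_control is available. */
void enable_vsync( Display *dpy, GLXDrawable drawable )
{
    PFNGLXSWAPINTERVALEXTPROC swapInterval =
        (PFNGLXSWAPINTERVALEXTPROC) glXGetProcAddress( (const GLubyte *) "glXSwapIntervalEXT" );
    if ( swapInterval )
        swapInterval( dpy, drawable, 1 );
}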
