Making OpenGL polygonOffset and DirectX9 depthBias behave the same

For our multiplatform engine that supports both OpenGL and DirectX9 I am adding support for decals. In OpenGL I can set glPolygonOffset(-1.0f, -1.0f) to fix z-fighting between the wall and the decals. I want the DirectX version to behave exactly the same, so I call this:
float offsetFloat = -1.0f;
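// reinterpret the float's bit pattern as a DWORD; SetRenderState only takes DWORDs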
DWORD offsetDWord = *((DWORD*)&offsetFloat);
device->SetRenderState(D3DRS_DEPTHBIAS, offsetDWord);
device->SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, offsetDWord);
However, this gives me an extremely large depth bias. It seems DirectX9 expects much smaller values, but I can't find how much smaller.
I noticed that in the OGRE engine's source they divide by 250000, but despite the comment I don't quite see where that number comes from. Also, for some reason they only divide the constant bias:
// D3D also expresses the constant bias as an absolute value, rather than
// relative to minimum depth unit, so scale to fit
constantBias = -constantBias / 250000.0f;
__SetRenderState(D3DRS_DEPTHBIAS, FLOAT2DWORD(constantBias));
slopeScaleBias = -slopeScaleBias;
__SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, FLOAT2DWORD(slopeScaleBias));
So my question: what do I need to pass to DirectX9 to get the exact same result as glPolygonOffset?

I haven't found an exact number anywhere, but by experimenting I have figured out that to get roughly the same effect in OpenGL and DirectX, I need to divide by 3500000 instead of the 250000 mentioned above.
If anyone knows the exact number or why it's this, I'd love to hear that, but for practical purposes I think this conclusion will do for me.
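For anyone who wants it in code form, here is a minimal sketch of that workaround (the helper names are mine, and the 3500000 divisor is just the empirical value from above):

#include <cstring>
#include <d3d9.h>

// SetRenderState takes a DWORD, so copy the float's bits into one
inline DWORD FloatAsDWord(float f)
{
    DWORD d;
    std::memcpy(&d, &f, sizeof(d));
    return d;
}

// Apply glPolygonOffset-style (factor, units) values to a D3D9 device
void SetDecalDepthBias(IDirect3DDevice9* device, float factor, float units)
{
    // D3D9 expresses the constant bias in absolute depth units,
    // hence the empirical rescaling of the 'units' term
    device->SetRenderState(D3DRS_DEPTHBIAS, FloatAsDWord(units / 3500000.0f));
    device->SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, FloatAsDWord(factor));
}

Usage then mirrors the GL call: SetDecalDepthBias(device, -1.0f, -1.0f).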

If you look at the equations for OpenGL polygon offset and the equivalent Direct3D 9 render states, you'll find that they're identical, except that OpenGL has a term for an implementation-dependent constant. If we assume that value is 1, the equations become identical.
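For reference, the two formulas side by side (with m the polygon's maximum depth slope, and r OpenGL's implementation-dependent "smallest resolvable difference in window depth"):

OpenGL:     offset = m * factor + r * units
Direct3D 9: offset = m * SLOPESCALEDEPTHBIAS + DEPTHBIAS

With r = 1, units and DEPTHBIAS play exactly the same role.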
The obvious problem here is: what happens when the value is not 1?
Unfortunately OpenGL doesn't seem to provide a way of querying it, so the only thing you can do is twiddle parameters until you find something that works, then twiddle them some more as edge cases arise, and never assume that what works for you will work for anyone else.
Direct3D's assumption that the value is always 1 may not always hold either.
Bottom line is that polygon offset is not a 100% robust method for fixing z-fighting under any circumstances. Have you tried other methods, such as pushing your decals out an epsilon along the surface normal?

Related

Changing HM reference software to display some information about the bitstream

I am very new to the HM HEVC (and the JEM) reference software, and I am currently trying to understand the source code. I want to add some lines to display, for each component: the name of the algorithm (i.e. inter/intra algorithms), the length of the bitstream, and the position in the output bin file.
The goal is to know which component costs more bits to code and how the codec is working. I want to do the same thing for the JEM after that.
My first problem is that I am unable to understand a lot of the functions there, and the comments are not sufficient. Are there any references for understanding the code? (I already read the manual; it doesn't help.)
Second, I don't know where and how exactly to add these lines: is it in TEncGOP, TEncSlice or TEncCU? PS: I don't think it's in TEncGOP.compressGOP, so maybe in the two other classes.
(I'm putting the answer to the comment that @Mourad posted four hours ago here, because it will be long.)
I assume that you managed to find where the actual encoding after the RDO loop is implemented. As you correctly mentioned, xEncodeCU is the function to look at to make sure you are not still in the RDO.
Now you need to find the exact function in xEncodeCU that is responsible for your target codec tool.
For instance, if you want to count the number of bits for coefficient coding, you should be looking at m_pcEntropyCoder->encodeCoeff() (it's a JEM function and may have a different name in the HM). Once you find this line in xEncodeCU, you may do the following to get the number of bits written inside the encodeCoeff() function:
UInt b_before = m_pcEntropyCoder->getNumberOfWrittenBits(); // bits written so far
m_pcEntropyCoder->encodeCoeff( ... );                       // the call you want to measure
UInt b_after = m_pcEntropyCoder->getNumberOfWrittenBits();  // bits written afterwards
UInt writtenBitsCoeff = b_after - b_before;                 // bits spent inside encodeCoeff()
One important point: as you can see, the function getNumberOfWrittenBits() gives you integer rates, obtained by rounding the sum of the fractional rates of all syntax elements coded inside encodeCoeff. This error might or might not be acceptable, depending on your problem. For example, if instead of the coefficient coding rate you wanted to know the rate of the CBF, this error would not be acceptable at all, because the CBF rate is usually less than one bit. If this is your case, you would need to calculate the fractional bits one by one. That is totally different and rather more complicated than this.
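If you do end up needing fractional rates, the underlying idea is the information content of each coded bin: a bin coded with probability p costs -log2(p) bits. A minimal sketch (plain C++, not an HM API; the function name is mine):

#include <cmath>

// Fractional cost in bits of one CABAC bin, given the context's
// probability of the symbol that was actually coded
double binFractionalBits(double probOfCodedSymbol)
{
    return -std::log2(probOfCodedSymbol); // e.g. p = 0.95 -> ~0.074 bits
}

You would have to accumulate this over every bin of the syntax element, using the context state at the moment each bin is coded.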
Point 1: there is one rule of thumb: logging coding decisions (e.g. pred mode, MV, IPM, block size) is much easier at the decoder side than at the encoder. This is because you have a super complicated RDO process at the encoder side that can easily make you get lost in the loops, while at the decoder side everything appears only once. However, if you insist on doing it at the encoder side, you may find some tips here: Get some information from HEVC reference software
Point 2: unlike coding decisions, logging the rate (i.e. the number of written bits for different syntax elements) is more complicated at the decoder side than at the encoder. This is particularly true for the fractional bits associated with anything that is encoded in non-EP mode (i.e. with CABAC contexts). So you may do this part at the encoder side, but I am afraid it is not easy.
Point 3: I think the best way to understand the code is to read it line by line. It's very time-consuming, but if you know the standard(s) theoretically, you will probably be able to distinguish the important parts and ignore the rest.
PS: I think there are too many questions, mostly too general, in your post. That makes it a bit difficult for me to answer them all together. So I'll wait for you to take your next step and ask more precise questions.

OpenGL -- GL_LINE_LOOP

I am using GL_LINE_LOOP to draw a circle in C and OpenGL. Is it possible for me to fill the circle with colors?
If needed, this is the code I'm using:
const int circle_points = 100;
const float cx = 50.0f, cy = 50.0f, r = 50.0f; /* circle centre and radius */
const float pi = 3.14159f;
int i;
glColor3f(1, 1, 1);
glBegin(GL_LINE_LOOP);
for (i = 0; i < circle_points; i++)
{
    const float theta = (2 * pi * i) / circle_points;
    glVertex2f(cx + r * cosf(theta), cy + r * sinf(theta));
}
glEnd();
Look up polygon triangulation!
I hope something here is somehow useful to someone, even though this question was asked in February. There are many possible answers, even though a lot of people would give none. I could witter on forever, but I'll try to finish before then.
Some would even say, "You never would," or, "That's not appropriate for OpenGL." I'd like to say more than they do about why. Converting polygons into the triangles that OpenGL likes so much is outside of OpenGL's job spec, and is probably better done on the processor side anyway. Calculate that stage in advance, as few times as possible, rather than repeatedly sending such a chunky problem on every draw call.
Perhaps the original questioner drifted away from OpenGL since February, or perhaps they've become an expert. Perhaps I'll re-inspire them to look at it again, to hack away at some original 'imposters'. Or maybe they'll say it's not the tool for them after all, but that would be disappointing. Whatever graphics code you're writing, you know that OpenGL can speed it up!
Triangles for convex polygons are easy
Do you just want a circle? Make a triangle fan with the shared point at the circle's origin. GL_POLYGON was, for better or worse, deprecated in OpenGL 3.0 and then removed from the core profile entirely; it will not work in modern core contexts.
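As a minimal sketch, reusing the names from the question's snippet (circle_points, cx, cy, r, pi), the fan version looks like this:

glColor3f(1, 1, 1);
glBegin(GL_TRIANGLE_FAN);
glVertex2f(cx, cy);                   /* shared centre vertex */
for (i = 0; i <= circle_points; i++)  /* <= revisits the first rim vertex to close the fan */
{
    const float theta = (2 * pi * i) / circle_points;
    glVertex2f(cx + r * cosf(theta), cy + r * sinf(theta));
}
glEnd();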
Triangles for concave polygons are hard
You'll want more general polygons later? Well, there are some tricks you could play with, for all manner of convex polygons, but concave ones will soon get difficult. It would be easy to start five different solutions without finishing a single one. Then it would be difficult, on finishing one, to make it quick, and nearly impossible to be sure that it's the quickest.
To achieve it in a future-proofed way you really want to stick with triangles -- so "polygon triangulation" is the subject you want to search for. OpenGL will always be great for drawing triangles. Triangle strips are popular because they reuse many vertices, and a whole mesh can be covered with only triangle strips, (perhaps including the odd lone triangle or pair of triangles). Drawing with only one primitive usually means the entire mesh can be rendered with a single draw call, which could improve performance. (Number of draw calls is one performance consideration, but not always the most important.)
Polygon triangulation gets more complex when you allow concave polygons or polygons with holes. (Finding algorithms for triangulating a general polygon, robustly yet quickly, is actually an area of ongoing research. Nonetheless, you can find some pretty good solutions out there that are probably fit for purpose.)
But is this what you want?
Is a filled polygon crucial to your final goals in OpenGL? Or did you simply choose what felt like it would be a simple early lesson?
Frustratingly, although drawing a filled polygon seems like a simple thing to do -- and indeed is one of the simplest things to do in many languages -- the solution in OpenGL is likely to be quite complicated. Of course, it can be done if we're clever enough -- but that could be a lot of effort, without being the best route to take towards your later goals.
Even in languages that implement filled polygons in a way that is simple to program with, you don't always know how much strain it puts on the CPU or GPU. If you send a sequence of vertices, to be linked and filled, once every animation frame, will it be slow? If a polygon doesn't change shape, perhaps you should do the difficult part of the calculation just once? You will be doing just that, if you triangulate a polygon once using the CPU, then repeatedly send those triangles to OpenGL for rendering.
OpenGL is very, very good at doing certain things, very quickly, taking advantage of hardware acceleration. It is worth appreciating what it is and is not optimal for, to decide your best route towards your final goals with OpenGL.
If you're looking for a simple early lesson, rotating brightly coloured tetrahedrons is ideal, and happens early in most tutorials.
If on the other hand, you're planning a project that you currently envision using filled polygons a great deal -- say, a stylized cartoon rendering engine for instance -- I still advise going to the tutorials, and even more so! Find a good one; stick with it to the end; you can then think better about OpenGL functions that are and aren't available to you. What can you take advantage of? What do you need or want to redo in software? And is it worth writing your own code for apparently simple things -- like drawing filled polygons -- that are 'missing from' (or at least inappropriate to) OpenGL?
Is there a higher-level graphics library, free to use -- perhaps relying on OpenGL for rasterisation -- that can already do what you want? If so, how much freedom does it give you to mess with the nuts and bolts of OpenGL itself?
OpenGL is very good at drawing points, lines, and triangles, and hardware accelerating certain common operations such as clipping, face culling, perspective divides, perspective texture accesses (very useful for lighting) and so on. It offers you a chance to write special programs called shaders, which operate at various stages of the rendering pipeline, maximising your chance to insert your own unique cleverness while still taking advantage of hardware acceleration.
A good tutorial is one that explains the rendering pipeline and puts you in a much better position to assess what the tool of OpenGL is best used for.
Here is one such tutorial that I found recently: Learning Modern 3D Graphics Programming
by Jason L. McKesson. It doesn't appear to be complete, but if you get far enough for that to annoy you, you'll be well placed to search for the rest.
Using imposters to fill polygons
Everything in computer graphics is an imposter, but the term often has a specialised meaning. Imposters display very different geometry from what they actually have -- only more so than usual! Of course, a 3D world is very different from the pixels representing it, but with imposters, the deception goes deeper than usual.
For instance, a rectangle that OpenGL actually constructs out of two triangles can appear to be a sphere if, in its fragment shader, you write a customised depth value to the depth coordinate, calculate your own normals for lighting and so on, and discard those fragments of the square that would fall outside the outline of the sphere. (Calculating the depth on those fragments would involve a square root of a negative number, which can be used to discard the fragment.) Imposters like that are sometimes called flat cards or billboards.
(The tutorial above includes a chapter on imposters, and examples doing just what I've described here. In fact, the rectangle itself is constructed only part way through the pipeline, from a single point. I warn that the scaling of their rectangle, to account for the way that perspective distorts a sphere into an ellipse in a wide FOV, is a non-robust fudge. The correct and robust answer is tricky to work out, using mathematics that would be slightly beyond the scope of the book. I'd say it is beyond the author's algebra skills to work it out, but I could be wrong; he'd certainly understand a worked example. However, when you have the correct solution, it is computationally inexpensive: it involves only linear operations plus two square roots, to find the four limits of a horizontally- or vertically-translated sphere. To generalise that technique for other displacements requires one more square root, for a vector normalisation to find the correct rotation, and one application of that rotation matrix when you render the rectangle.)
So just to suggest an original solution that others aren't likely to provide, you could use an inequality (like x * x + y * y <= 1 for a circle or x * x - y * y <= 1 for a hyperbola) or a system of inequalities (like three straight line forms to bound a triangle) to decide how to discard a fragment. Note that if inequalities have more than linear order, they can encode perfect curves, and render them just as smoothly as your pixelated screen will allow -- with no limitation on the 'geometric detail' of the curve. You can also combine straight and curved edges in a single polygon, in this way.
For instance, a fragment shader (which would be written in GLSL) for a semi-circle might have something like this:
// Assume x and y hold the fragment's position on the card, in [-1, 1]
if (y < 0.0) discard;
float rSq = x * x + y * y;
if (rSq > 1.0) discard;
// We're inside the semi-circle; put further shader computations here
However, the polygons that are easy to draw, in this way, are very different from the ones that you're used to being easy. Converting a sequence of connected nodes into inequalities means yet more code to write, and deciding on the Boolean logic, to deal with combining those inequalities, could then get quite complex -- especially for concave polygons. Performing inequalities in a sensible order, so that some can be culled based on the results of others, is another ill-posed headache of a problem, if it needs to be general, even though it is easy to hard-code an optimal solution for a single case like a square.
I suggest using imposters mainly for its contrast with the triangulation method. Something like either one could be a route to pursue, depending on what you're hoping to achieve in the end, and the nature of your polygons.
Have fun...
P.S. Here's a related topic... Polygon triangulation into triangle strips for OpenGL ES
As long as the link lasts, it's a more detailed explanation of 'polygon triangulation' than mine. Those are the two words to search for if the link ever dies.
A line loop is just an outline.
To fill the middle as well, you want to use GL_POLYGON.
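For what it's worth, with the question's snippet that is a one-line change (keeping in mind that GL_POLYGON only exists in legacy, compatibility-profile OpenGL):

glBegin(GL_POLYGON); /* instead of GL_LINE_LOOP; the vertex loop stays the same */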

Advice for Object Detection on Embedded System with no non-standard libraries

I am looking for some advice on a good way to detect either square or circular objects in an image. I currently have a Canny edge algorithm running on the original greyscale image, and I can produce this output:
http://imgur.com/FAwowr1
Now I can see that there is a CubeSat in this picture, but what is a good, computationally efficient way for the program to see that as well? I have looked at the Hough transform, but that seems to be very computation-heavy. I have also looked at Harris corner detection, but I feel I would get too many false positives, for I am essentially looking to isolate pictures that contain said cube satellite.
Does anyone have any thoughts on some good algorithms to pursue? I am very limited on space, so I cannot use any large external libraries like OpenCV. (This is all in C, btw.)
Many Thanks!
I would look into what is called mathematical morphology.
Basically you operate on binary images, so you must first find a clever way to threshold them. Then you do operations such as erosion and dilation, with some well-selected structuring element, to extract areas of interest in your image.
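As a minimal sketch of erosion (plain C-style code, no libraries; the function name and the binary-image layout are my own assumptions):

/* Binary erosion with a 3x3 square structuring element.
   src/dst are row-major w*h images with values 0 (background) or 1 (foreground).
   Out-of-bounds neighbours are treated as background. */
void erode3x3(const unsigned char *src, unsigned char *dst, int w, int h)
{
    int x, y, dx, dy;
    for (y = 0; y < h; y++) {
        for (x = 0; x < w; x++) {
            unsigned char keep = 1;
            for (dy = -1; dy <= 1 && keep; dy++) {
                for (dx = -1; dx <= 1 && keep; dx++) {
                    int nx = x + dx, ny = y + dy;
                    /* any covered background pixel erodes this one away */
                    if (nx < 0 || ny < 0 || nx >= w || ny >= h || !src[ny * w + nx])
                        keep = 0;
                }
            }
            dst[y * w + x] = keep;
        }
    }
}

Dilation is the same loop with the test flipped: set the output pixel if any neighbour under the structuring element is foreground. An opening (erosion then dilation) removes small noise; a closing (dilation then erosion) fills small gaps, which can help isolate a blocky shape like a CubeSat.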

Imprecise rendering of huge WPF visuals - any solutions?

When rendering huge visuals in WPF, the visual gets more and more distorted as the coordinates increase. I assume it has something to do with the floating-point data types used in the render pipeline, but I'm not completely sure. Either way, I'm searching for a practical solution to the problem.
To demonstrate what I'm talking about, I created a sample application which just contains a custom control embedded in a ScrollViewer that draws a sine curve.
You can see here that the drawing is alright for double values <= 2^24 (in this case the horizontal coordinate value), but from that point on it gets distorted.
The distortion gets worse at 2^25, and it continues to increase with every additional bit until the control just draws some vertical lines.
For performance reasons I'm just drawing the visible part of the graph, but for layouting reasons I cannot "virtualize" the control which would make this problem obsolete. The only solution I could come up with is to draw the visible part of the graph to a bitmap, and then render the bitmap at the appropriate point - but there I have again the precision problem with big values, as I cannot accurately place the bitmap at the position where I need it.
Does anybody have an idea how to solve this?
It is not WPF's fault.
Floating-point numbers get less and less precise the farther they are from zero - that is the cost of stuffing an enormous range (-Inf, +Inf) into 32 (float) / 64 (double) bits of data space. Floats actually become less precise than a 32-bit integer at around 2^24, which is where their spacing grows beyond 1.
64-bit integers have constant spacing (1), but a limited range of −9,223,372,036,854,775,808 to +9,223,372,036,854,775,807.
You may also consider using the Decimal type (which, however, also has a limited value range).
(Update: oh, I didn't see how old this post was... I guess I clicked the wrong filter button on Stack Overflow...)
The relative precision is what matters here, so just saying "2^24 is fine and 2^25 is not" is not enough information. You said it is a sine, so I guess the y-axis max and min never change between those pictures; the y-axis doesn't matter then. Furthermore, the x step size stays the same, I guess? But you did not tell us the sine's period length or the x step size you chose, and that is relevant here. The relative precision of the x steps becomes worse as you go to higher x values, because the step size becomes too small relative to the x value itself.
Precision of the C# floating-point types:
https://learn.microsoft.com/de-de/dotnet/csharp/language-reference/builtin-types/floating-point-numeric-types
An example, with an x step size of 1:
x = 1: no problem
x = 1000: no problem
x > 2^24: 32-bit float starts to have problems with step size 1; 64-bit has no problems yet
x > 2^53: 64-bit double starts to have problems with step size 1
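A quick way to see those thresholds for yourself (a standalone sketch, unrelated to the question's code):

#include <cstdio>

int main()
{
    float  f = 16777216.0f;        // 2^24: spacing between adjacent floats is now 2
    double d = 9007199254740992.0; // 2^53: spacing between adjacent doubles is now 2
    printf("%.1f\n", f + 1.0f);    // prints 16777216.0 - the +1 step is lost
    printf("%.1f\n", d + 1.0);     // prints 9007199254740992.0 - same effect for double
    return 0;
}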

How do I detect whether the sample supplied by VideoSink.OnSample() is right-side up?

We're currently using the Silverlight VideoSink to capture video from users' local webcams, kinda like so:
protected override void OnSample(long sampleTime, long frameDuration, byte[] sampleData)
{
    if (FrameShouldBeSubmitted())
    {
        byte[] resampledData = ResizeFrame(sampleData);
        mediaController.SetVideoFrame(resampledData);
    }
}
Now, on most of the machines that we've tested, the video sample provided in the byte[] sampleData parameter is upside-down, i.e., if you try to take the RGBA data and turn it into, say, a WriteableBitmap, the bitmap will be upside-down. That's odd, but fairly easy to correct, of course -- you just have to reverse the array as you encode it.
The problem is that at least on some machines (e.g., the single Macintosh in our test environment), the video sample provided is no longer upside-down, but right-side up, and hence, flipping the image actually results in an image that's received upside-down on the far side.
I reported this to MS as a bug, but their (terse) response was that it was "As Designed". Further attempts at clarification have so far been ignored.
Now, I'll grant that it's kinda entertaining to imagine the discussions behind this design decision: "OK, just to make it interesting, let's play the video rightside up on a Mac, but let's turn it upside down for Windows!" "Great idea!" "Yeah, that'll keep those developers guessing!" But beyond that, I can't find this, umm, "feature" documented anywhere, nor can I find any documentation on how one is supposed to be able to tell that a given video sample is upside down or rightside up. Any thoughts on how to tell this?
EDIT 3/29/10 4:50 pm - I got a response from MS which said that the appropriate way to tell was through the Stride property on the VideoFormat object, i.e., if the stride value is negative, the image will be upside-down. However, my own testing indicates that unless I'm doing something wrong, this isn't the case. At least on my own machine, whether the stride value is zero or negative (the only options I see), the sampled image is still upside-down.
I was going to suggest looking at VideoFormat.Stride, provided at VideoSink.OnFormatChange, but then I noticed your edit. I went ahead and tested it on my dev machine: the image is upside down and the stride is negative, as expected. Have you checked again recently?
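If the stride check does work for you, here is a hedged sketch of the correction itself (generic C-style code over the raw buffer, not a Silverlight API; the function name is mine):

#include <cstring>
#include <vector>

// Flip a 32bpp frame vertically in place - the correction you'd apply
// when the stride reported at OnFormatChange is negative
void FlipFrameVertically(unsigned char* data, int width, int height)
{
    const int rowBytes = width * 4; // 4 bytes per pixel for Format32bppArgb
    std::vector<unsigned char> tmp(rowBytes);
    for (int y = 0; y < height / 2; ++y) {
        unsigned char* top = data + y * rowBytes;
        unsigned char* bottom = data + (height - 1 - y) * rowBytes;
        std::memcpy(tmp.data(), top, rowBytes);
        std::memcpy(top, bottom, rowBytes);
        std::memcpy(bottom, tmp.data(), rowBytes);
    }
}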
Even though stride made perfect sense for native applications (using the stride in pointer operations), I agree that the current behavior is not what you expect from a modern API. However, performance-wise it is better not to make changes to data received from the native API.
Yet at this point, while we are talking about performance: why not provide samples in formats other than PixelFormatType.Format32bppArgb, so that we can avoid the color-space conversion? BTW, there is a VideoCaptureDevice.DesiredFormat property, but it only works for resolution, as there is no alternative to PixelFormatType.Format32bppArgb.
