The documentation states:
GameplayKit also works well for 3D games built with the SceneKit framework
However, there seems to be no mention of using GameplayKit's pathfinding features, such as GKGraph, with SCNNodes that exist in 3D space.
Are GameplayKit's pathfinding features unsuitable for SceneKit games, or is there extra documentation somewhere that illustrates how to combine the two?
Depends on the scenario really. My current side project is a SceneKit based boat game; boats move on a 2D plane which means GameplayKit's 2D pathfinding works well.
It's not without complications though... SpriteKit gives you some useful functions such as obstaclesFromSpriteTextures:accuracy:, to help with the generation of your pathfinding graph. There is no corresponding function in SceneKit. I've adopted the approach of rendering my scene 'top down' to an offscreen buffer, and using edge detection to trace around the 2D projection of my islands.
For full 3D pathfinding I can't see GameplayKit being much help, at least not without some hacks (e.g. breaking 3D pathfinding down into several 2D planes).
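To make the "flatten to a 2D plane" idea concrete, here is a minimal language-neutral sketch in Python. It does not use GameplayKit; `to_plane` and `shortest_path` are hypothetical helper names, the obstacle grid stands in for whatever your top-down render produces, and breadth-first search is just the simplest stand-in for a real graph search.

```python
from collections import deque

def to_plane(position_3d):
    """Drop the vertical axis: (x, y, z) in a y-up scene becomes 2D (x, z)."""
    x, _y, z = position_3d
    return (x, z)

def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected grid; grid[r][c] == 1 is blocked."""
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # goal is unreachable
```

The resulting 2D waypoints would then be mapped back to 3D by reinserting a fixed height (the water level, in the boat example).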
I am working on an AR-based solution in which I render some 3D models using SceneKit and ARKit. I have also integrated Core ML to identify objects and render the corresponding 3D objects in the scene.
But right now I just render the model in the center of the screen as soon as I detect the object (only for the list of objects that I have). Is it possible to get the position of the real-world object so that I can show an overlay above it?
That is, if I have scanned a water bottle, I should be able to get the position of the water bottle. It could be anywhere on the water bottle, but it shouldn't fall outside of it. Is this possible using SceneKit?
All parts of what you ask are theoretically possible, but a) for several parts, there’s no integrated API to do things for you, and b) you’re probably signing yourself up for a more difficult problem than you think.
What you presumably have with your Core ML integration is an image classifier, as that’s what most of the easy to find ML models do. Image classification answers one question: “what is this a picture of?”
What you’re looking for involves at least two additional questions:
“Given that this image has been classified as containing (some specific object), where in the 2D image is that object?”
“Given the position of a detected object in the 2D video image, where is it in the 3D space tracked by ARKit?”
Question 1 is pretty reasonable. There are models that do both classification and detection (location/bounds within an image) in the ML community. Probably the best known one is YOLO — here’s a blog post about using it with Core ML.
Question 2 is the “research team and five years” part. You’ll notice in the YOLO papers that it gives you only coarse bounding boxes for detected objects — that is, it’s working in 2D image space, not doing 3D scene reconstruction.
To really know the shape, or even the 3D bounding box of an object means integrating object detection with scene reconstruction. For example, if an object has some height in the 2D image, are you looking at a 3D object that’s tall with a small footprint, or one that’s long and low, receding into the distance? Such integration would require taking apart the inner workings of ARKit, which nobody outside Apple can do, or recreating an ARKit-alike from scratch.
There might be some assumptions you can make to get very rough estimates of 3D shape from a 2D bounding box, though. For example, if you do AR hit tests on the lower corners of a box and find that they’re on a horizontal plane, you can guess that the 2D height of the box is proportional to the 3D height of the object, and that its footprint on the plane is proportional to the box’s width. You’d have to do some research and testing to see if assumptions like that hold up, especially in whatever use cases your app covers.
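As a rough illustration of that proportionality guess, here is a minimal sketch in Python. It assumes you already have the two 3D points returned by hit tests on the box's lower corners, and that the object sits upright at roughly constant distance from the camera; `estimate_size` is a hypothetical helper, not an ARKit or Core ML call.

```python
import math

def estimate_size(corner_left, corner_right, box_width_px, box_height_px):
    """Guess an object's 3D width and height from a 2D bounding box.

    corner_left / corner_right: 3D points (x, y, z) from hit tests on the
    lower corners of the 2D box. Assumes both lie on the same horizontal
    plane and that pixels scale uniformly across the box.
    """
    width_3d = math.dist(corner_left, corner_right)
    # Assumption: 2D aspect ratio carries over to the 3D width/height ratio.
    height_3d = width_3d * (box_height_px / box_width_px)
    return width_3d, height_3d
```

For a box 100 px wide and 300 px tall whose lower corners hit-test 0.1 m apart, this guesses an object about 0.3 m tall. Whether that holds in practice depends on the viewing angle, as noted above.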
So, to start off, I'm not very good at computer graphics. I'm trying to implement a GUI toolkit where one of the features is being able to apply 3D transformations to 2D "layers". (a layer only has one Z coordinate, as pre-transform, it's a two dimensional axis aligned rectangle)
Now, this is pretty straightforward, until you come to 3D transformations that would push the layer back, requiring splitting the layer into several polygons in order to render it correctly, as illustrated here. And because we can have transparency, layers may not get completely occluded, while still requiring getting split.
So here is an illustration depicting the issue and the desired outcome. In this scenario, the blue layer (call it B) is on top of the red layer (R), while having the same Z position (but B was added after R). In this scenario, if we rotate B, its top two points will get a Z index lower than 0 while the bottom points will get an index higher than 0 (with the anchor point being the only point/line left as 0).
Can somebody suggest a good way of doing this on the CPU? I've struggled to find a suitable algorithm implementation (in C++ or C) that would be appropriate to this scenario.
Edit: To clarify, at this stage in the pipeline, there is no rendering yet. We just need to produce a set of polygons for each layer that represents the layer's transformed and occluded geometry. Rendering (either software or hardware) is then done only if required, which is not always the case (for example, when doing hit testing).
Edit 2: I looked at binary space partitioning as an option of achieving this but I have only been able to find one implementation (in GL2PS), which I'm not sure how to use. I do have a vague understanding of how BSPs work, but I'm not sure how they can be used for occlusion culling.
Edit 3: I'm not trying to do colour and transparency blending at this stage. Just pure geometry. Transparency can be handled by the renderer, and overdraw is okay. In this case, the blue polygon can just be drawn under the red one, but with more complicated cases, depth sorting or even splitting up the polygons may be required (example of a scary case like that below). Although the viewport is fixed, because all layers can be transformed in 3D, creating a shape shown below is possible.
So what I'm really looking for is an algorithm that would geometrically split layer B into two blue shapes, one of which would be drawn "above" R and one of which would be drawn "below" it. The part "below" would get overdraw, yes, but that's not a major issue. So B just needs to be split into two polygons so that it appears to cut through R when those polygons are drawn in order. No need to worry about blending.
Edit 4: For the purpose of this, we cannot render anything at all. This all has to be done purely geometrically (producing 2D polygons). This is what I was originally getting at.
Edit 5: I should note that the overall number of quads per subscene is around 30 (average). Definitely won't go above 100. Unless the layers are 3D transformed (which is where this problem arises), they are just radix sorted by Z positions before being drawn. Layers with the same Z position are drawn in order in which they were added (first in, first out).
Sorry if I didn't make it clear in the original question.
If you "aren't good with computer graphics", doing it on the CPU (software rendering) will be extremely difficult for you if polygons can be transparent.
The easiest way to do it is to use GPU rendering (OpenGL/Direct3D) with the depth peeling technique.
CPU solutions:
Solution #1 (extremely difficult):
(I forgot the name of this algorithm).
You need to split polygon B into two, for example by using the plane of the other polygon as the clip plane, then render the result using the painter's algorithm.
To do that you'll need to change your rendering routines so they no longer use quads but textured polygons, plus you'll have to write and debug clipping routines that split the triangles present in the scene in such a way that they no longer break the painter's algorithm.
Big Problem: If you have many polygons, this solution can potentially split scene into infinite number of triangles. Also, writing texture rendering code yourself isn't much fun, so it is advised to use OpenGL/Direct3D.
This can be extremely difficult to get right. I think this method was discussed in "Computer Graphics Using OpenGL 2nd edition" by "Francis S. Hill", somewhere in one of the exercises.
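The splitting step in Solution #1 can be sketched as follows. This is a minimal Python illustration for a single convex polygon and a single plane, not a complete solution (it does not handle textures, concave polygons, or the re-sorting afterwards); `split_polygon` is a hypothetical name, and the plane would be taken from the other layer (a point on it plus its normal).

```python
def split_polygon(polygon, plane_point, plane_normal, eps=1e-9):
    """Split a convex 3D polygon by a plane; return (front, back) vertex lists.

    "Front" is the side the normal points to. Vertices lying on the plane
    are added to both halves.
    """
    def signed_distance(p):
        return sum((p[i] - plane_point[i]) * plane_normal[i] for i in range(3))

    front, back = [], []
    count = len(polygon)
    for i in range(count):
        a, b = polygon[i], polygon[(i + 1) % count]
        da, db = signed_distance(a), signed_distance(b)
        if da >= -eps:
            front.append(a)
        if da <= eps:
            back.append(a)
        # The edge a->b crosses the plane: add the crossing point to both halves.
        if (da > eps and db < -eps) or (da < -eps and db > eps):
            t = da / (da - db)
            hit = tuple(a[k] + t * (b[k] - a[k]) for k in range(3))
            front.append(hit)
            back.append(hit)
    return front, back
```

For a tilted quad with two vertices behind the plane z = 0 and two in front, this returns two quads that share the edge where the layer crosses the plane; drawing the back half, then the other layer, then the front half gives the "cuts through" appearance.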
Also check wikipedia article on Hidden Surface Removal.
Solution #2 (simpler):
You need to implement multi-layered z-buffer that stores up to N transparent pixels and their depth.
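A minimal sketch of what such a per-pixel structure could look like, in Python. The names, the choice to discard the farthest fragments once N is exceeded, and the convention that a smaller depth is nearer are all assumptions made for illustration.

```python
def insert_fragment(pixel, depth, color, alpha, max_layers=4):
    """Add a fragment to one pixel's layer list (kept sorted nearest-first).

    pixel: list of (depth, (r, g, b), alpha) tuples; smaller depth = nearer.
    Only the max_layers nearest fragments are kept.
    """
    pixel.append((depth, color, alpha))
    pixel.sort(key=lambda fragment: fragment[0])
    del pixel[max_layers:]  # assumption: drop the farthest when over budget

def resolve(pixel, background=(0.0, 0.0, 0.0)):
    """Blend the stored fragments back to front over the background."""
    r, g, b = background
    for _depth, (fr, fg, fb), a in reversed(pixel):
        r = fr * a + r * (1 - a)
        g = fg * a + g * (1 - a)
        b = fb * a + b * (1 - a)
    return (r, g, b)
```

Fragments can be inserted in any order, which is the point: the per-pixel sort replaces the per-polygon sort that the painter's algorithm needs.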
Solution #3 (computationally expensive):
Just use ray-tracing. You'll get perfect rendering result (no limitations of depth peeling and cpu solution #2), but it'll be computationally expensive, so you'll need to optimize rendering routines a lot.
Bottom line:
If you're performing software rendering, use Solution #2 or #3. If you're rendering on hardware, use a technique similar to depth peeling, or implement ray tracing on hardware.
--edit-1--
The required knowledge for implementing #1 and #3 is "line-plane intersection". If you understand how to split a line (in 3D space) in two using a plane, you can implement ray tracing or clipping easily.
The required knowledge for #2 is "textured 3D triangle rendering" (the algorithm). It is a fairly complex topic.
In order to implement the GPU solution, you need to find a few OpenGL tutorials that deal with shaders.
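For reference, the line-plane intersection mentioned above is only a few lines. This is a minimal Python sketch for a line segment; the function name and the plane representation (a point on the plane plus a normal) are illustrative choices.

```python
def segment_plane_intersection(p0, p1, plane_point, plane_normal, eps=1e-12):
    """Return the point where segment p0-p1 crosses the plane, or None."""
    def signed_distance(p):
        return sum((p[i] - plane_point[i]) * plane_normal[i] for i in range(3))

    d0, d1 = signed_distance(p0), signed_distance(p1)
    denom = d0 - d1
    if abs(denom) < eps:
        return None  # segment is parallel to the plane (or lies in it)
    t = d0 / denom   # fraction of the way from p0 to p1
    if t < 0.0 or t > 1.0:
        return None  # the infinite line crosses, but outside the segment
    return tuple(p0[i] + t * (p1[i] - p0[i]) for i in range(3))
```

Clipping applies this to each polygon edge; ray tracing uses the same formula with a ray (only `t >= 0` is required) instead of a segment.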
--edit-2--
Transparency is relevant, because in order to get transparency right, you need to draw polygons from back to front (from farthest to closest) using the painter's algorithm. Sorting polygons properly is impossible in certain situations, so they must be split, or you should use one of the listed techniques; otherwise there will be artifacts or incorrectly rendered images in those situations.
If there's no transparency, you can implement standard zbuffer or draw using hardware OpenGL, which is a very trivial task.
--edit-3--
"I should note that the overall number of quads per subscene is around 30 (average). Definitely won't go above 100."
If you split polygons, it can easily go way above 100.
It might be possible to position polygons in such a way that each polygon splits all other polygons.
Now, 2^29 is 536870912; however, it is not possible to split one surface with a plane in such a way that the number of polygons doubles with each split. If one polygon is split 29 times, you'll get 30 polygons in the best-case scenario, and probably several thousand in the worst case if the splitting planes aren't parallel.
Here's rough algorithm outline that should work:
1. Prepare a list of all triangles in the scene.
2. Remove back-facing triangles.
3. Find all triangles that intersect each other in 3D space, and split them using the line of intersection.
4. Compute screen-space coordinates for all vertices of all triangles.
5. Sort by depth for the painter's algorithm.
6. Prepare an extra list for new primitives.
7. Find triangles that overlap in 2D (post-projection) screen space.
8. For all overlapping triangles, check their rendering order. Basically, a triangle that is going to be rendered "below" another triangle should have no part that is above that other triangle.
8.1. To do that, use the camera origin point and the triangle edges to split the original triangles into several sub-regions, then check whether the regions conform to the established sort order (prepared for the painter's algorithm). Regions are created by splitting the existing pair of triangles using 6 clip planes created from the camera origin point and the triangle edges.
8.2. If all regions conform to the rendering order, leave the triangles be. If they don't, remove the triangles from the list, and add the split pieces to the "new primitives" list.
9. If there are any primitives in the new primitives list, merge that list with the triangle list, and go to #5.
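The two easy steps of that outline, back-face removal and the depth sort, can be sketched like this in Python. It assumes counter-clockwise winding for front faces and uses the centroid distance as the sort key, which is a simplification; the splitting and overlap checks (the hard part) are not shown.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def is_back_facing(tri, camera):
    """True if the triangle's normal points away from the camera position."""
    a, b, c = tri
    normal = cross(sub(b, a), sub(c, a))  # assumes counter-clockwise winding
    return dot(normal, sub(a, camera)) >= 0

def painter_order(triangles, camera):
    """Drop back-facing triangles, then sort the rest farthest-first."""
    def squared_distance(tri):
        centroid = tuple(sum(v[i] for v in tri) / 3 for i in range(3))
        offset = sub(centroid, camera)
        return dot(offset, offset)

    visible = [t for t in triangles if not is_back_facing(t, camera)]
    return sorted(visible, key=squared_distance, reverse=True)
```

Sorting by centroid is exactly the kind of ordering that fails for intersecting or cyclically overlapping triangles, which is why the outline needs the splitting steps.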
By looking at that algorithm, you can easily understand why everybody uses Z-buffer nowadays.
Come to think about it, that's a good training exercise for universities that specialize in CG. The kind of exercise that might make your students hate you.
I am going to come out and give the simpler solution, which may not fit your problem. Why not just change your artwork to prevent this problem from occurring?
In problem 1, just divide the polys in Maya or whatever beforehand. For the 3 lines problem, again, divide your polys at the intersections to prevent fighting. Pre-computed solutions will always run faster than on-the-fly ones, especially on limited hardware. From professional experience, I can say that it also scales, or at least scales OK. It just requires some tweaking from the art side and technical reviews to make sure nothing is created "illegally." You may end up with more polys doing it this way than rendering on the fly, but at least you won't have to do a ton of math on CPUs that may not be up to the task.
If you do not have control over the artwork pipeline, this won't work as writing some sort of a converter would take longer than getting a BSP sub-division routine up and running. Still, KISS is often the best solution.
Can someone tell me the difference between Point Sprites and Billboards in OpenGL? I read a lot about both of them and I'm getting confused more and more about when to use which of them and whether there is actually a difference?
Wikipedia only knows about Sprites (Billboard redirects there):
In computer graphics, a sprite is a two-dimensional bitmap that is integrated into a larger scene, most often in a 2D video game. Originally, the term sprite referred to fixed-sized objects composited together, by hardware, with a background. Use of the term has since become more general.
One source states:
Sprite
A sprite is the traditional term given to a 2D image displayed in a game
Billboard
... you need to re-orient each particle so that it's facing the viewer.
This technique of re-orienting the sprites is called billboarding.
Another source:
Billboarding is a popular technique used in 3D graphics programming.
Billboarding allows an object (usually a quad) to always face a given
camera. Here are some common uses of billboarding:
– particles
– halo surrounding an object
– trees rendering
For the particular case of particles, the billboarding is a GPU
built-in feature when point-sprites are used (a single point is
transformed to a billboarded quad).
Yet another states that both face the camera, but billboards only rotate about their vertical axis (think trees).
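To make the distinction in these quotes concrete, here is a minimal Python sketch of how a billboarded quad can be built on the CPU. The function names are hypothetical; with point sprites, per the quote above, the GPU does the equivalent expansion from a single point for you.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def billboard_corners(center, right, up, width, height):
    """Quad corners from a centre point and two unit axes.

    Pass the camera's own right/up axes for a fully camera-facing quad.
    """
    hw, hh = width / 2, height / 2
    return [tuple(center[i] + sx * hw * right[i] + sy * hh * up[i]
                  for i in range(3))
            for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1))]

def cylindrical_axes(center, camera_pos, world_up=(0.0, 1.0, 0.0)):
    """Axes for a billboard that only rotates about the vertical axis (trees).

    Undefined when the camera is directly above or below the centre.
    """
    to_camera = tuple(camera_pos[i] - center[i] for i in range(3))
    right = normalize(cross(world_up, to_camera))
    return right, world_up
```

With `cylindrical_axes` the quad stays upright no matter how high the camera is; with the camera's own axes it tilts to face the viewer fully.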
Some references specifically for OpenGL:
https://learnopengl.com/In-Practice/2D-Game/Rendering-Sprites
https://learnopengl.com/In-Practice/2D-Game/Particles (billboarding)
https://www.opengl-tutorial.org/intermediate-tutorials/billboards-particles/billboards/
Live Examples by Three.js/WebGL (though I can't tell the difference):
https://threejs.org/examples/?q=billboard#webgl_points_billboards
https://threejs.org/examples/?q=sprite#webgl_points_sprites
I'm trying to write a CAD-like application in WPF (.NET 4.0) that needs to be able to display a lot of 2D points/lines. It will be used to display CAD plans of entire cities with zoom, pan, rotate and point snapping on mouseover.
Right now I purely use WPF. I read the objects from the CAD file, draw them into a StreamGeometry, use it as the stroke of a new Path, and add it to a Canvas with several transforms.
My problem is that this solution doesn't scale well enough. It works fine with small CAD files, but when I want to display something like half a city (with houses and land boundaries) it becomes very slow.
I also tried to convert my CAD-file to an image, but
- a resolution of 32000x32000 is sometimes not enough
- when zooming out the lines are too thin.
In the end I need to be able to place this on a Canvas(2D/3D) as background.
What are my best options here?
Thanks,
Niklas
WPF is not good for large 3D models; I'm afraid it is too slow. Your best bet is Direct3D or OpenGL.
However, even with the speed of Direct3D or OpenGL, you will still need to work out how to cull as many polygons/vertices as possible before rendering the scene if you are trying to show an entire city.
There is a large amount of information on this (generally under game development).
There are a few techniques, including frustum culling and near and far plane culling.
Also, since you probably have a static scene, you may be able to use binary space partitioning.
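For a 2D CAD view, the analogue of frustum culling is a rectangle overlap test against the visible area. A minimal Python sketch, assuming each shape carries an axis-aligned bounding box (the names and the tuple layout are illustrative):

```python
def visible_items(items, view):
    """Return the ids of items whose bounding box overlaps the view rectangle.

    items: list of (item_id, (min_x, min_y, max_x, max_y)).
    view:  (min_x, min_y, max_x, max_y) of the visible area in world units.
    """
    vx0, vy0, vx1, vy1 = view
    return [item_id
            for item_id, (x0, y0, x1, y1) in items
            if x1 >= vx0 and x0 <= vx1 and y1 >= vy0 and y0 <= vy1]
```

This linear scan is only the test itself; for a whole city you would put the boxes in a spatial structure (a grid, quadtree, or the BSP mentioned above) so that only nearby items are tested at all.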
As I understand it, the subject is a 2D CAD system within WPF.
Great! I use it...
OpenGL and DirectX applications typically run their OnDraw loop continuously, so the CPU works all the time.
The WPF/Silverlight 2D model is smarter in that respect.
Yes, the total number of elements (for example, primitives inherited from Shape) must not be too large. But how many is too many?
I tested my own app (Silverlight). I expect WPF to be a bit faster.
Here are my 2D CAD results. Performance is still great. Each beam consists of multiple primitives.
Use a VirtualCanvas like this one from Chris Lovett.
I am looking for some algorithms to add a convex mirror effect and a concave mirror effect to an image. I would also like to know how to do this efficiently: by applying the algorithm to the image data, or by overlaying a transparent image that contains the effect. But I don't think the second choice is applicable in this case.
If you are doing it manually instead of using hardware primitives, then the Bresenham interpolation algorithm (usually used for line drawing) is the way to go: error propagation is far more efficient than other, more complex, methods.
What Bresenham does is just interpolation. Don't miss the opportunity to use its efficient design elsewhere (slope calculation for line drawing is just one of the many applications of interpolation: you can interpolate in another dimension: 2D, 3D, transparency, reflection, colors, etc.).
25 years ago, I remember having used it to resize bitmaps and even do texture mapping in a real-time 3D engine! That was at a time when graphics-accelerated video boards cost a fortune...
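As an illustration of Bresenham-style error propagation used for something other than lines, here is a minimal Python sketch that resizes one row of pixels using only integer additions and comparisons. It is a nearest-neighbour resize written for this example, not code from any particular engine.

```python
def resize_row(src, dst_len):
    """Nearest-neighbour resize of a 1D row via integer error accumulation."""
    src_len = len(src)
    out = []
    error = 0
    j = 0  # current index into src
    for _ in range(dst_len):
        out.append(src[j])
        # Advance the source position by src_len/dst_len without dividing.
        error += src_len
        while error >= dst_len:
            error -= dst_len
            j += 1
    return out
```

The same loop, applied per row with a step that varies across the image, is one way to build a lens-like distortion without floating-point math in the inner loop.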
CImg library has a fisheye sample, in examples\CImg_demo.cpp. The core algorithm seems very simple (and fast, as generally this library). I think it's an approximation of the real optical effect, but could be modified to handle the convex mirroring. I don't know if it could be extended to handle 'negative' curvature.
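For comparison, a simple radial remap looks like the sketch below, in Python. This is not the CImg algorithm, just a common approximation written for illustration: each output pixel samples the source at a radius changed by a power curve, so one parameter gives either a bulge or a pinch. Which of the two you call "convex" or "concave" depends on the look you want.

```python
import math

def radial_warp(image, strength):
    """Remap a 2D list of pixels radially around the image centre.

    strength > 1 magnifies the centre (bulge), strength < 1 shrinks it
    (pinch), and strength == 1 leaves the image unchanged.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    max_r = math.hypot(cx, cy) or 1.0
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            r = math.hypot(dx, dy) / max_r  # normalised radius in [0, 1]
            if r == 0:
                continue  # the centre pixel maps to itself
            scale = r ** (strength - 1)     # source radius / output radius
            # Nearest-neighbour lookup, clamped to the image bounds.
            sx = min(max(round(cx + dx * scale), 0), w - 1)
            sy = min(max(round(cy + dy * scale), 0), h - 1)
            out[y][x] = image[sy][sx]
    return out
```

Applying the remap directly to the image data like this is the usual approach; a transparent overlay cannot move pixels, so it cannot produce the distortion itself.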
You can use a pre-calculated sin() table and interpolate values to match the size of your bitmap. The inverse effect is achieved by either using an offset or a larger table.
Reminds me of the (great times of the) DOS demos in the 80s...