What is the intended use case of ARKit's tracked raycast?

I'm developing an AR app for iOS that lets the user place a model in the physical world, using ARKit and SceneKit.
I've looked at this Apple sample code for inspiration. In the sample project, they use tracked raycasts to position 3D models in a scene. This is close to what I want, and what led me to assume I need to do the same to achieve the most accurate positioning.
However, when I use a tracked raycast to position my model, the model drifts around the scene a lot as ARKit updates the position of the raycast.
I get much more stable positioning when using a non-tracked raycast.
That makes me ask: what actually is the intended use case for a tracked raycast? Am I understanding this API wrong?
I've tried:
Positioning the model using an image anchor. This is very stable.
Positioning the model using a non-tracked raycast. This is about as stable as the image anchor.
Positioning the model using a tracked raycast. This drifts all over the scene.
I also understand what an AR raycast in general is for: getting the intersection of a 2D point on the screen with the 3D geometry that ARKit is tracking, as this post has already explained.
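For context, here is roughly the difference between the two approaches in code (a simplified sketch; `sceneView` and `modelNode` are placeholders for my actual view and model):

```swift
import ARKit
import SceneKit

// One-shot raycast: read the result once and leave the node where it lands.
func placeWithOneShotRaycast(at point: CGPoint, sceneView: ARSCNView, modelNode: SCNNode) {
    guard let query = sceneView.raycastQuery(from: point,
                                             allowing: .estimatedPlane,
                                             alignment: .horizontal),
          let result = sceneView.session.raycast(query).first else { return }
    modelNode.simdWorldTransform = result.worldTransform
}

// Tracked raycast: ARKit keeps calling the update handler as its world
// understanding improves, repositioning the node each time (this is where I see the drift).
func placeWithTrackedRaycast(at point: CGPoint, sceneView: ARSCNView, modelNode: SCNNode) -> ARTrackedRaycast? {
    guard let query = sceneView.raycastQuery(from: point,
                                             allowing: .estimatedPlane,
                                             alignment: .horizontal) else { return nil }
    return sceneView.session.trackedRaycast(query) { results in
        guard let result = results.first else { return }
        modelNode.simdWorldTransform = result.worldTransform
    }
}
```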

In Apple's example app you mentioned, raycasting is used to continuously update the FocusSquare. You don't really need it for placing your model. You can get a specific real-world position (via the FocusSquare) and place the model at that exact location: just read the FocusSquare's static position data at the moment you add your model to the scene. I hope I understood correctly what you want.
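For example (a minimal sketch, assuming a `focusSquare` node like the one in Apple's sample project and a `modelNode` you have already loaded):

```swift
import ARKit
import SceneKit

// Sketch only: `focusSquare` and `modelNode` are assumed to exist,
// as in Apple's object-placement sample project.
func placeModel(_ modelNode: SCNNode, in sceneView: ARSCNView, focusSquare: SCNNode) {
    // Read the focus square's position once, at the moment of placement...
    let position = focusSquare.simdWorldPosition
    // ...and pin the model there. Nothing updates it afterwards,
    // so it won't drift as ARKit refines later raycast results.
    modelNode.simdWorldPosition = position
    sceneView.scene.rootNode.addChildNode(modelNode)
}
```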

Related

Google model-viewer - How to fully rotate around the model?

When using Google's model-viewer, it's not possible to see the model from above; the orbit of the camera seems limited to some arbitrary angle.
I've tried to set max-camera-orbit but without success.
Can you share a link? The current limit seems to be close to the normal at the polar north of the model. Do you want to be able to go beyond that (i.e. view the model upside down)?

What is the best approach to render 3D model in iOS after facial landmark detection?

I would like to place a hairstyle after facial landmark detection. I'm able to render 2D images properly, and now I would like to render a 3D model; I thought of using SceneKit for that. I would also like to know how Instagram, Snapchat, and other face-filter apps render 3D models. I've noticed that SceneKit's coordinate system is different from UIKit's, and I haven't been able to find how to convert between the two. Could anyone help? Thanks.
Look at the worldUp and simdWorldUp instance properties to understand how ARKit constructs a scene coordinate system based on real-world device motion (you can also inspect the ARConfiguration.WorldAlignment enum).
Please look at this SO post: Understand coordinate spaces in ARKit for complete info.
And remember, ARAnchor is your best friend when placing a 3D object. Click here for further details.
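For instance (a rough sketch, assuming a running ARFaceTrackingConfiguration session and a `hairNode` you have already loaded; all names are illustrative):

```swift
import ARKit
import SceneKit

// Sketch: attach the model to the anchor ARKit tracks for the face,
// instead of converting UIKit coordinates by hand.
class FaceOverlayDelegate: NSObject, ARSCNViewDelegate {
    let hairNode: SCNNode  // assumed: a 3D hairstyle loaded elsewhere

    init(hairNode: SCNNode) {
        self.hairNode = hairNode
    }

    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        guard anchor is ARFaceAnchor else { return }
        // The node ARKit provides already follows the face in world space;
        // parent the model to it and offset it locally (e.g. upward onto the head).
        hairNode.position = SCNVector3(0, 0.1, 0)  // illustrative offset in metres
        node.addChildNode(hairNode)
    }
}
```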

Is it possible to get a "SCNVector3" position of a World object using CoreML and ARKit?

I am working on an AR-based solution in which I am rendering some 3D models using SceneKit and ARKit. I have also integrated CoreML to identify objects and render corresponding 3D objects in the scene.
But right now I am just rendering the model in the center of the screen as soon as I detect the object (only for the list of objects that I have). Is it possible to get the position of the real-world object so that I can show an overlay above it?
That is, if I have scanned a water bottle, I should be able to get the position of the water bottle. It could be anywhere on the water bottle, but it shouldn't go outside of it. Is this possible using SceneKit?
All parts of what you ask are theoretically possible, but a) for several parts, there’s no integrated API to do things for you, and b) you’re probably signing yourself up for a more difficult problem than you think.
What you presumably have with your Core ML integration is an image classifier, as that’s what most of the easy to find ML models do. Image classification answers one question: “what is this a picture of?”
What you’re looking for involves at least two additional questions:
“Given that this image has been classified as containing (some specific object), where in the 2D image is that object?”
“Given the position of a detected object in the 2D video image, where is it in the 3D space tracked by ARKit?”
Question 1 is pretty reasonable. There are models that do both classification and detection (location/bounds within an image) in the ML community. Probably the best known one is YOLO — here’s a blog post about using it with Core ML.
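As a rough illustration of question 1 (assuming you have already converted a detection model such as YOLO to Core ML; `yoloModel` is a placeholder name):

```swift
import Vision
import CoreML

// Sketch: run an object-detection model (not just a classifier) on a frame
// and get labelled 2D bounding boxes back.
func detectObjects(in pixelBuffer: CVPixelBuffer, yoloModel: MLModel,
                   completion: @escaping ([VNRecognizedObjectObservation]) -> Void) {
    guard let visionModel = try? VNCoreMLModel(for: yoloModel) else { return }
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        // Each observation carries class labels plus a normalized bounding box.
        completion(request.results as? [VNRecognizedObjectObservation] ?? [])
    }
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try? handler.perform([request])
}
```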
Question 2 is the “research team and five years” part. You’ll notice in the YOLO papers that it gives you only coarse bounding boxes for detected objects — that is, it’s working in 2D image space, not doing 3D scene reconstruction.
To really know the shape, or even the 3D bounding box of an object means integrating object detection with scene reconstruction. For example, if an object has some height in the 2D image, are you looking at a 3D object that’s tall with a small footprint, or one that’s long and low, receding into the distance? Such integration would require taking apart the inner workings of ARKit, which nobody outside Apple can do, or recreating an ARKit-alike from scratch.
There might be some assumptions you can make to get very rough estimates of 3D shape from a 2D bounding box, though. For example, if you do AR hit tests on the lower corners of a box and find that they’re on a horizontal plane, you can guess that the 2D height of the box is proportional to the 3D height of the object, and that its footprint on the plane is proportional to the box’s width. You’d have to do some research and testing to see if assumptions like that hold up, especially in whatever use cases your app covers.
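To illustrate that last idea (a sketch only; it assumes the object rests on a detected horizontal plane and that `boundingBox` has already been converted into the ARSCNView's coordinate space):

```swift
import ARKit
import SceneKit

// Sketch: estimate where a detected object sits in 3D by hit-testing the
// lower corners of its 2D bounding box against planes ARKit already knows about.
func estimateBasePosition(of boundingBox: CGRect, in sceneView: ARSCNView) -> SCNVector3? {
    let lowerLeft = CGPoint(x: boundingBox.minX, y: boundingBox.maxY)
    let lowerRight = CGPoint(x: boundingBox.maxX, y: boundingBox.maxY)

    let hits = [lowerLeft, lowerRight].compactMap {
        sceneView.hitTest($0, types: .existingPlaneUsingExtent).first
    }
    guard hits.count == 2 else { return nil }

    // Average the two plane intersections for a rough centre of the object's footprint.
    let a = hits[0].worldTransform.columns.3
    let b = hits[1].worldTransform.columns.3
    return SCNVector3((a.x + b.x) / 2, (a.y + b.y) / 2, (a.z + b.z) / 2)
}
```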

Adding hundreds of pushpins programatically to bing maps freezes the WP7 UI

I'm working on a WP7 app that uses Bing Maps to display ~600 pushpins. When I add them to the map using map.Children.Add(pushpin), the UI freezes for ~200 ms. I've seen that in Silverlight you can use Microsoft.Maps.EntityCollection to add pins to a map, but unfortunately I couldn't find how to use the assembly on WP7. Does anyone know a solution to this?
Maybe you're looking at the problem the wrong way round. WP7 is a compact (though powerful) platform that excels at showing the user what they want to know quickly (when the apps are written properly).
The user can't possibly see 600 pushpins in one go on a device that small, so why not just show them pushpins that are in the viewable area (or close to it) and add pushpins as the user pans around the map?
Alternatively you could "trickle" feed the pushpins by adding them one (or more) at a time using the DispatcherTimer so that the user sees pushpins being gradually added without drastically affecting performance.
Another possibility (which is what I usually do) is to add a MapItemsControl with the DataTemplate set to a Pushpin and to bind the collection to your collection of pushpin locations. If the binding is to an ObservableCollection you can "trickle" feed it as mentioned above if perf is an issue.
In a viewpoint similar to Derek's, I find it highly unlikely that you seriously want to put 600 pins on the screen at the same time. I'm guessing that they span a large geographic area and the user is unlikely to see more than a handful at a time.
If this is the case, you can trivially apply a cliprect to cull your points, then add the resultant modest list to a layer, and Presto! High performance.
In addition, there is the issue of what to do when the user zooms a long way out, bringing so many pins into view that they merge into one big useless but brightly coloured blob. This is a more complex problem traditionally solved with a quadtree, and I have a suspicion that you just said "a what?" but luckily Google is your friend.
Oh, and to address your stated problem - don't add the pins directly to a map. Add them to a MapLayer and then add that.

What is the best approach to render charts in WPF?

What is the best approach to render charts and then save them on a hard drive for further distribution using WPF?
I found a number of ways to accomplish this by using the following types:
DrawingVisual - creating an object of this type and then rendering graphics on its context;
Shape - deriving from the Shape class and then overriding its DefiningGeometry property where the actual rendering is happening;
PathFigure - adding LineSegment-s to an instance of this class and then adding this instance to a Canvas;
Adorner - deriving from it and then overriding its OnRender method;
WriteableBitmap - rendering on it and then adding the bitmap to a Canvas.
Of course I'm going to write an app to test how fast each of these will be. But can anybody tell me:
whether I am on the right track?
are there any other means to do such rendering?
which one of them is best in terms of performance?
It all depends on your actual usage. In your case you mention saving to the hard drive for "further distribution" - I'm going to assume you are saving them as an image (JPG or PNG) and not as WPF objects (XAML).
You should consider whether WPF is the right tool for the job: WPF is a UI framework, not a generic image-processing library, and it may be best to use something else entirely for generating images.
For a reasonable number of points your performance bottleneck will be encoding the image and saving it to disk - not actually rendering it - so you should choose the method that is easiest for you to code.
All the articles about high performance WPF charts are a: about charts with 10,000 points and more (because that is where the performance problems are), b: about charts you display in your GUI (because otherwise you can use an image processing library to create the bitmap) and c: charts that change all the time (so they work nicely with data binding) - there's a reason why they don't talk about saving charts to disk.
For a very large number of points:
The fastest way to draw in WPF is to inherit from FrameworkElement (not Adorner) and override OnRender.
When the data changes often it is recommended to use multiple DrawingVisual objects, because then you don't have to re-render everything when one value changes - but this is not relevant for you, since the image won't change after you save it anyway.
WriteableBitmap is used for raw bitmap access; you use it when you decide to give up all the nice layout and drawing WPF gives you because you can't take the overhead. If this is the case, you should re-read my first point above.
So, to summarize, you are asking the wrong question :-) If you need to save images to disk, then either the WPF rendering speed is not your bottleneck or you shouldn't be using WPF to begin with. If you do use WPF, just pick whatever is easiest for you to code.
BTW: Adorners are used to display "floating" elements above the normal UI. You can use them for tooltip-like features, but not for the main chart rendering (and you probably don't want them at all, since your main usage is saving the image to disk); FrameworkElement is the base class you are looking for.
