Visualization for Indoor Positioning System in Real Time - maps

I want to visualize moving assets on an indoor map of a facility.
I think the first step would be to trace the floor plan image into some form of accurate vector drawing, with precise lengths for all the structures, to create a digitized version of the facility.
The hardware setup gives me relative x,y positioning, for example within a 50 x 50 meter bounding box where coordinates run from 0,0 (bottom left) to 50,50 (top right).
The accuracy of the indoor map drawing is critical for the application, as I need to plot moving objects. I came across OpenStreetMap indoor map libraries like openindoor6. They look good for static maps that show the internal structure of buildings, but I have doubts about the measurement accuracy of the structures (length of walls, room sizes, etc.), as I would have to manually georeference the floor plan and then map the x, y coordinates that I obtain from the hardware to lat/lon.
In short, I need tools that will help me draw accurate indoor maps with a reliable coordinate system and do some layering, such as placing markers, marking zones and indoor geofencing.
I'm looking for open source tools if possible to achieve all this. Any suggestions? TIA.
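For the step of mapping the hardware's local x, y coordinates to lat/lon, a minimal sketch might look like the following. It assumes a flat-earth (local tangent plane) approximation, which should be adequate over a 50 m box, and it assumes you have surveyed the lat/lon of the local origin and the rotation of the local x axis relative to true east. The function name and parameters are hypothetical, not part of any library.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 equatorial radius, used as a sphere here

def local_to_latlon(x_m, y_m, origin_lat, origin_lon, rotation_deg=0.0):
    """Convert local metres to (lat, lon) with a small-area approximation.

    rotation_deg is the counter-clockwise angle of the local x axis
    measured from true east (an assumed survey input).
    """
    theta = math.radians(rotation_deg)
    # Rotate the local offset into east/north components.
    east = x_m * math.cos(theta) - y_m * math.sin(theta)
    north = x_m * math.sin(theta) + y_m * math.cos(theta)
    # Convert metre offsets to degree offsets around the origin.
    lat = origin_lat + math.degrees(north / EARTH_RADIUS_M)
    lon = origin_lon + math.degrees(
        east / (EARTH_RADIUS_M * math.cos(math.radians(origin_lat)))
    )
    return lat, lon
```

If the map is only ever displayed in its own local frame, this conversion can be skipped entirely and the assets plotted directly in metres.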


How to account for camera rotation when building a trajectory from an overhead camera?

I am trying to build the trajectory of a moving camera that is looking downwards. It works perfectly when the camera is just translating, but fails when the camera rotates. How do I take the camera heading into account?
I am using feature matching, which tells me where a specific patch is in my image and identifies its coordinates. I am tracking that patch, and it gives me the trajectory of the camera (if the camera is not rotating). But when the camera rotates in place, it identifies the patch at the same position, and when the camera starts moving again the rotation is not taken into account.
For example, if my camera is moving forward in the north direction, then rotates and starts moving forward again, my algorithm will not recognize the turn. It builds the trajectory as a straight line instead of a right angle.
How do I take the camera rotation into account?
It works perfectly when the camera is just translating, but fails when the camera rotates. How do I take the camera heading into account?
Direct approach (probably not possible)
Something must be responsible for the camera rotating. This something may know how much the camera rotates and may be able to tell you. I guess that in your case this information might not be readily available though.
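If a heading change per frame is available from some source, the trajectory can be built by rotating each measured displacement into the world frame before adding it. The following is a minimal sketch of that idea; the step format and function name are assumptions made for illustration.

```python
import math

def integrate_trajectory(steps, start=(0.0, 0.0), heading_deg=0.0):
    """Accumulate per-frame camera motion into a world-frame path.

    steps: iterable of (dtheta_deg, dx, dy). dtheta_deg is the heading
    change for that frame (counter-clockwise positive) and (dx, dy) is
    the displacement in the camera's own frame, with dy meaning "forward".
    """
    x, y = start
    heading = math.radians(heading_deg)
    path = [(x, y)]
    for dtheta_deg, dx, dy in steps:
        heading += math.radians(dtheta_deg)
        # Rotate the camera-frame displacement into the world frame.
        x += dx * math.cos(heading) - dy * math.sin(heading)
        y += dx * math.sin(heading) + dy * math.cos(heading)
        path.append((x, y))
    return path

# Two steps forward, a 90 degree left turn, then two more steps forward.
path = integrate_trajectory([(0, 0, 1), (0, 0, 1), (90, 0, 1), (0, 0, 1)])
```

Without the rotation step, all four displacements would be added along the same axis, which is the straight-line result described in the question.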
Feature based image registration
A single feature is not sufficient to detect all affine transformations (translation, rotation, scaling, ..). You would need to take two features at least (for translation and rotation) or better three features (for full affine transformations) into account.
In the case of two features with translation and rotation only, the displacement of the midpoint between the two features gives the translation, and the change in orientation of the line connecting the two features gives the rotation.
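A minimal sketch of the two-feature case, assuming the same two features (here called A and B) have been matched in both frames. The function name is hypothetical. Note that this gives the apparent motion of the features in the image; the camera motion is the inverse.

```python
import math

def rigid_from_two_features(a0, b0, a1, b1):
    """Estimate rotation (degrees) and midpoint translation between frames.

    a0, b0: (x, y) of features A and B in the first frame.
    a1, b1: (x, y) of the same features in the second frame.
    """
    angle0 = math.atan2(b0[1] - a0[1], b0[0] - a0[0])
    angle1 = math.atan2(b1[1] - a1[1], b1[0] - a1[0])
    rot = angle1 - angle0
    # Wrap the difference into (-pi, pi] so small turns stay small.
    rot = math.atan2(math.sin(rot), math.cos(rot))
    mid0 = ((a0[0] + b0[0]) / 2, (a0[1] + b0[1]) / 2)
    mid1 = ((a1[0] + b1[0]) / 2, (a1[1] + b1[1]) / 2)
    return math.degrees(rot), (mid1[0] - mid0[0], mid1[1] - mid0[1])
```

With more than two matched features, the same quantities are usually estimated by a least-squares fit rather than from a single pair, which is less sensitive to a single bad match.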
Frequency domain, intensity based image registration
Cross-correlation (via FFT) is fast at detecting translations. However, you can also use this technique to detect rotation and scaling (see An FFT-based technique for translation, rotation, and scale-invariant image registration or Robust image registration using Log-Polar Transform).
Improved accuracy
Instead of comparing consecutive camera frames with features or intensity based techniques, compare all possible frame combinations within a certain time window (for example the time to move half a frame away), then find the single trajectory that fits all the transformations for all combinations. Computationally more expensive but more accurate.
Some words of caution
If the direct approach fails, you may be fooled by the image structures. In certain cases (uniform images, rotationally symmetric images, ...) it just won't work without an independent confirmation.

How to measure a horizontal plane surface (visible in camera) using ARKit-SceneKit before placing objects?

I want to measure the horizontal plane surface to find out whether it fits the object that I am going to place. For example, say I am going to place a cot 3D model (with a fixed size) in a room using iOS 11 ARKit.
First, I want to detect whether the room surface is sufficient to place my 3D model by measuring the surface area (width, height, etc.).
Second, if the user tries to place it without sufficient space, I should not allow them to place the cot and should show an error message.
I created a sample POC with which I am able to detect the horizontal plane and place the cot. But the issue is that, whatever the surface may be, the user is able to place the cot, which shouldn't be allowed.
I saw a couple of demos in which they say we can measure the size of the room or a horizontal plane.
I am using ARKit with SceneKit in order to achieve this, and I am new to AR and SceneKit. I need to know if this is doable, and if so, how to achieve it.
You could estimate the size of a detected plane by inspecting its dimensions. But you shouldn't.
ARKit has plane estimation, not scene reconstruction. That is, it'll tell you there's a flat surface at (some point) and that said surface probably extends at least (some distance) from that point. It doesn't know exactly how big the surface is (it's even refining its estimate over time), and it doesn't tell you where there are interruptions in that continuous surface, much less the size and shape of such interruptions.
In fact, if you're looking at the floor and moving around, and you see one patch of floor, then another patch of floor on the other side of a solid wall from the first, ARKit will happily recognize that those two patches are coplanar and merge them into the same anchor. At the same time, neither detected patch may cover the entire extent of the floor around it.
If you attempt to restrict where the user can place virtual objects in AR based on plane estimates, you're likely to frustrate them with two kinds of error: you'll have areas where it looks to the user like they can place something but that don't allow it, and you'll have areas that look like they should be off-limits that do allow placing things.
Instead, design your experience to involve the user in deciding where the sensible places for content are. See this demo for example — ARKit detects the level of the floor (not its boundaries), then uses that to show UI indicating the size/shape of objects to be placed. It's up to the user to make sure there's enough room for the couch, etc.
As for the technical how-to on what you probably shouldn't do: The docs for ARPlaneAnchor.extent say that the x and z coordinates of that vector are the width and length of the estimated plane. And all units in ARKit are meters. (Which is width and which is length? It's a matter of perspective. And of the rotation encoded in the anchor's transform.)
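The comparison itself is simple once the extent has been read. The sketch below shows only the language-neutral logic in Python, not ARKit code: extent_x and extent_z stand for the x and z components of ARPlaneAnchor.extent in meters, and the function name and margin parameter are hypothetical. The caveats above still apply, since the extent is only an estimate.

```python
def footprint_fits(extent_x, extent_z, obj_width, obj_depth, margin=0.0):
    """Return True if the object's footprint fits the estimated plane.

    All lengths are in metres. Both axis-aligned orientations of the
    object are tried, since which axis is "width" depends on the
    anchor's rotation.
    """
    w = obj_width + 2 * margin
    d = obj_depth + 2 * margin
    return (w <= extent_x and d <= extent_z) or (d <= extent_x and w <= extent_z)
```

Because ARKit refines its estimate over time, a check like this would need to be repeated whenever the anchor is updated.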

Detect road surface in a traffic scene point cloud

I want to analyze a traffic scene. My source data is a point cloud like this one (see images at the bottom of that post). I want to be able to detect objects that are on the road (cars, cyclists etc.). So first of all I need know where the road surface is so that I can remove or ignore these points or simply just run a detection above the surface level.
What are the ways to detect such a road surface? The easiest scenario is a straight and flat road. I guess I could try to fit a simple plane to the approximate position of the surface (I am fairly sure it begins just in front of the car), and because the road surface is not a perfect plane I would have to allow some tolerance around the plane.
More difficult scenario would be a curvy and wavy (undulated?) road surface that would form some kind of a 3D curve... I will appreciate any inputs.
A relatively simple starting point:
If you can assume that the road surface starts directly in front of the camera then you can use a region growing algorithm to find a region such that the curvature does not change so much within the region (thereby using sharp edges to delineate the region). This would involve calculating the curvature first. This can make a first approximation; there will be issues with occluding objects and other artefacts I am sure.
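A minimal sketch of region growing, under some loud assumptions: the point cloud has already been rasterised into a 2D grid of cell heights, and the height step between neighbouring cells is used as a crude stand-in for the curvature test described above. The grid, threshold and function name are all hypothetical.

```python
from collections import deque

def grow_road_region(heights, seed, max_step=0.05):
    """Grow a region from seed over cells with small height changes.

    heights: 2D list of cell heights in metres (None for empty cells).
    seed: (row, col) of a cell assumed to be road, e.g. just in front
    of the sensor. Returns the set of (row, col) cells in the region.
    """
    rows, cols = len(heights), len(heights[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if (nr, nc) in region or heights[nr][nc] is None:
                continue
            # A sharp step (curb, car, wall) stops the growth here.
            if abs(heights[nr][nc] - heights[r][c]) <= max_step:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```

Because each cell is compared only with its neighbour, a gently undulating road can still be followed, whereas a single fitted plane with a fixed tolerance could not follow it.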

Blob detection in C (not with OPENCV)

I am trying to write my own blob detection, which will receive real-time video and try to detect a white sheet of paper.
It should work even if something is written on the paper. I need to detect the paper and its corners, because what I really want is to draw an OpenGL polygon over the paper, where each corner of the paper is a corner of the polygon. Then I need the coordinates of the paper to do other things.
So I need to:
- detect a square white blob.
- get the coordinates of the corners
- draw a polygon over the white sheet.
Any ideas how I can do that?
Much depends on context. For example, suppose that you:
- know that the paper is always roughly centered (i.e. W/2, H/2 is always inside the blob), and rotated by no more than 45 degrees (30 would be better)
- have a suitable border around the sheet so that the corners never touch the edges of the FOV
- are able (through analysis of local variance, or if you're lucky, a check of background color or luminance) to say whether a point is inside or outside the blob
- can rely on the inside/outside function never failing (except possibly in the close vicinity of a border)
then you could walk a line from a point on the border (surely outside) to the center (surely inside), even through bisection, and find a point, or rather an areal, on the edge.
Two edge points give a straight line (two areals give a beam), two lines give an intersection (two beams give a larger areal), and there's your corner. You should carry along the detection uncertainty (areal radius) in order to validate corners (another, less elegant approach is to roughly calculate where the corner is, and pinpoint it with a spiral search or drunkard's walk).
This algorithm is amenable to parallelization and, as long as the hypotheses hold, should be really fast.
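The bisection step can be sketched as follows. The is_inside callback stands for whatever inside/outside test you have (local variance, luminance, ...); it and the function name are assumptions for illustration. The logic uses only arithmetic and a loop, so it should port to C directly.

```python
import math

def find_edge_point(is_inside, outside, inside, tol=0.5):
    """Bisect the segment from an outside point to an inside point.

    Returns ((x, y), radius): the estimated edge point and the remaining
    uncertainty in pixels (the "areal" radius).
    """
    (ox, oy), (ix, iy) = outside, inside
    while math.hypot(ix - ox, iy - oy) > tol:
        mx, my = (ox + ix) / 2, (oy + iy) / 2
        if is_inside(mx, my):
            ix, iy = mx, my   # edge lies between outside and midpoint
        else:
            ox, oy = mx, my   # edge lies between midpoint and inside
    return ((ox + ix) / 2, (oy + iy) / 2), math.hypot(ix - ox, iy - oy) / 2
```

Running this from two border points on the same side of the sheet gives two edge points and therefore one edge line; repeating it for an adjacent side and intersecting the two lines gives a corner.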
All that said, it remains a hack -- I agree with unwind, why reinvent the wheel? If you have memory or CPU constraints (embedded systems, etc.), I believe there ought to be OpenCV and e-Vision "lite" ports also for ARM and embedded platforms.
(Sorry for my terminology - I'm monkey-translating from Italian. "Areal" is likely to correspond to your "blob", a beam is the family of lines joining all pairs of points in two different blobs, line intensity being the product of distance from a point from its areal's center)
I am trying to write my own blob detection, which will receive real-time video and try to detect a white sheet of paper.
Your first shot could be a simple flood-fill. That is, select a good threshold to binarize the image and apply the algorithm. The threshold can be fixed if you know the paper is always brighter than X and the background is always darker than this. Or this can be an adaptive threshold, for example Otsu's method. OpenCV offers this for free.
If you need to speed it up, you could use a union-find data structure.
Finally, you'd need to come up with some heuristic to identify the corners (e.g. the four extreme values in x/y direction).
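A minimal sketch of the flood-fill idea with a fixed threshold, assuming an 8-bit grayscale image stored as a list of rows and a seed pixel known to be on the paper. For the corner heuristic it uses the extremes of x + y and x - y rather than plain x/y extremes; that is a swapped-in variant which also behaves for an axis-aligned sheet, and it assumes the sheet is rotated by less than 45 degrees. All names are hypothetical, and the logic is simple enough to port to C.

```python
from collections import deque

def flood_fill_blob(gray, seed, threshold):
    """Return the set of (x, y) pixels connected to seed with value >= threshold."""
    h, w = len(gray), len(gray[0])
    sx, sy = seed
    if gray[sy][sx] < threshold:
        return set()
    blob = {seed}
    queue = deque([seed])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < w and 0 <= ny < h
                    and (nx, ny) not in blob
                    and gray[ny][nx] >= threshold):
                blob.add((nx, ny))
                queue.append((nx, ny))
    return blob

def corners_from_blob(blob):
    """Return (top_left, top_right, bottom_right, bottom_left), y pointing down."""
    top_left = min(blob, key=lambda p: p[0] + p[1])
    bottom_right = max(blob, key=lambda p: p[0] + p[1])
    top_right = max(blob, key=lambda p: p[0] - p[1])
    bottom_left = min(blob, key=lambda p: p[0] - p[1])
    return top_left, top_right, bottom_right, bottom_left
```

Text written on the paper shows up as holes in the blob, which does not affect the corner heuristic since only the outermost pixels matter.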
Then I need [...] the coordinates of the corners [...]
Then you don't need blob detection, but corner detection or contour detection in the first place. OpenCV has some nice functionality for exactly this.
If you can't use it, I would suggest binarizing the image as above and using a Harris corner detector to find the corners of the object.
OpenCV's TBB support could also come in quite handy if you do use it and have trouble meeting your real-time requirements.

What is a dataset bounding box?

I am a bit new to imaging and want to understand the following:
1. What is the bounding box of a dataset and why is it needed? Does it represent a measurement in the real world, or is it just for the computer screen where the data is displayed? How is it related to the image size specified in pixels?
2. What do WMTS layer zoom levels and matrix sets mean? I understand that WMTS works by fetching tiles of the dataset. Also, I see that the GetCapabilities response for a specific WMTS dataset returns matrix sets in the XML, which I don't understand.
3. What do the matrix sets and zoom levels signify, and how can I understand them as a layman?
I have tried googling a bit, but the articles seem to assume some technical knowledge of this already, which I am still trying to gather.
1. A bounding box is the (imaginary) rectangle that you can draw around a dataset (or feature) that touches its maximums and minimums in both the X and Y directions. It is measured in the same units as the geometry. It is related to the image size in pixels through the resolution or scale, which are bbox.width/image.width or its inverse, in units of metres/pixel or pixels/metre (or degrees or feet).
2. A WMTS layer is a set of pre-rendered tiles that have been produced at a fixed set of scales and over a fixed area. These are listed in the matrix sets of a WMTS layer. The zoom level is how far down that set of scales you have traversed, with 0 being the top and an arbitrary number (usually between 15 and 20 for global datasets) being the lowest (most detailed) level.
3. See 2. You should not really need to understand them in detail, as your client library will handle all of that for you.
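To make the arithmetic concrete, here is a small sketch of both ideas. The resolution function follows the bbox.width/image.width relation directly. The tile functions assume the widely used Web Mercator pyramid with 256-pixel tiles, where each zoom level doubles the number of tiles per axis; a given WMTS service may define its matrix sets differently, so treat those numbers as one common case rather than a rule. All function names are hypothetical.

```python
import math

def resolution(bbox, image_width_px, image_height_px):
    """bbox = (min_x, min_y, max_x, max_y) in map units.

    Returns (x_res, y_res) in map units per pixel.
    """
    min_x, min_y, max_x, max_y = bbox
    return (max_x - min_x) / image_width_px, (max_y - min_y) / image_height_px

# Width of the world in Web Mercator metres (assumed tiling scheme).
WEB_MERCATOR_WIDTH_M = 2 * math.pi * 6378137.0

def tiles_per_axis(zoom):
    """Number of tiles along each axis at a zoom level (assumed pyramid)."""
    return 2 ** zoom

def tile_resolution(zoom, tile_size_px=256):
    """Metres per pixel at the equator for a zoom level (assumed pyramid)."""
    return WEB_MERCATOR_WIDTH_M / (tile_size_px * tiles_per_axis(zoom))
```

For example, a dataset whose bounding box is 1000 m wide rendered into a 500-pixel-wide image has a resolution of 2 m per pixel, and under the assumed pyramid each additional zoom level halves the metres covered by one pixel.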
