Obtaining meaningful velocity information from noisy position data - video-processing

If this question is posted on the wrong stackexchange site - please suggest where I can migrate it to!
I'm studying the velocity of an object that undergoes multiple conditions with walls as well as with other objects. The raw data of the position of the object is slightly noisy, for two reasons: firstly, the resolution of the video is limited, and secondly, my tracking software also has some error in tracking the object (as the image of the object changes slightly over time).
If the velocity of the object is calculated simply by using the raw data of the position of the object, there is significant error (more than that of the velocity) as the object is being tracked at a high frame rate.
I am most interested in the velocity of the object during the time right before and after collisions, and this is thus a significant problem.
Possible options I've considered / attempted.
Applying a discrete Kalman Filter on the position data: This is a solution that comes out pretty often in posts about relate question. However, given that we have all the data when we begin to smooth our data, is the Kalman Filter the best way to make use of the available data? My understanding is that the filter is designed for data that is coming in over time (e.g. position data that is being received in real-time rather than a full set of position data).
Applying Savitsky-Golay smoothing on the position data: When I tried this on my data, I found that significant artifacts were introduced in the area of ±10 data points after each collision. I suspect this has something to do with the significant acceleration at collision, but after trying a range of parameters for the SG smoothing, I am unable to eliminate the artifacts.
Separating the data at collision, then smoothing velocities using a moving average: To overcome the problem introduced by the acceleration at each collision, I separated the data into multiple series at each collision point. For example, if there were three collisions, the data would be separated into four series. The velocity for each data series was then calculated and smoothed using a moving average.
In addition, some of my colleagues suggested passing the velocity information through a low-pass filter, which I have not attempted.
The two questions below are related to mine, and are provided as a reference.
Smooth of series data
Smooth GPS data
In addition, the paper below also seems to provide a good suggestion of how to implement the Kalman Filter, albeit for real-time data.
http://transportation.ce.gatech.edu/sites/default/files/files/smoothing_methods_designed_to_minimize_the_impact_of_gps_random_error_on_travel_distance_speed_and_acceleration_profile_estimates-trr.pdf

Choosing an appropriate filtering algorithm depends mainly on the behavior of your object and your measurement errors (or noise). So I can only give some generic tips:
Differentiation, i.e., calculating the velocity from position data amplifies noise considerably. So probably you do need some kind of smoothing. My ad-hoc approach would be: Fourier-Transform your position data, do the derivative in Fourier space and play around to find an appropriate boundaries for low-path filtering. Applying other transfer functions to your transformed positioned data can be interpreted as kernel smoothing (though some mathematical insight in kernel methods is needed to do that properly).
The Kalman filter is state estimator, which works recursively. If you have a proper (discrete time) motion model & measurement model, it will yield in good results and give you a direct estimate for the velocity. The rules of thumb for such an approach:
Model in the 3D space or 6D space if your object has rotational degrees of freedom, not in image space (the noise behaves differently)
Carefully investigate your projection errors (camera calibration) & carefully choose the noise parameters
Use an Unscented Kalman filter (it is much better than an Extended Kalman filter) if non-linearities occur
Kalman filtering and low path filtering are closely related. For many simple applications the Kalman Filter can be thought of an adaptive low path filter, which does smoothing.
The non-recursive Kalman Filter is a called Gaussian process - though I only see advantages over the Kalman filter if your trajectories have a small number data points. Their application is not as straight forward as the KF.

Related

Motion recognition with Artificial Neural Networks

I am developing (for my senior project) a dumbbell that is able to classify and record different exercises. The device has to be able to classify a range of these exercises based on the data given from an IMU (Inertial Measurement Unit). I have acceleration, gyroscope, compass, pitch, yaw, and roll data.
I am leaning towards using an Artificial Neural Network in order to do this, but am open to other suggestions as well. Ultimately I want to pass in the IMU data into the network and have it tell me what kind of exercise it is (Bicep curl, incline fly etc...).
If I use an ANN, what kind should I use (recurrent or not) and how should I implement it? I am not sure how to get the network to recognize an exercise when I am passing it a continuous stream of data. I was thinking about constantly performing an FFT on a portion of the inputs and sending a set number of frequency magnitudes into the network, but am not sure if that will work either. Any suggestions/comments?
Your first task should be to collect some data from the dumbbell. There are many, many different schemes that could be used to classify the data, but until you have some sample data to work with, it is hard to predict exactly what will work best.
If you get 5 different people to do all of the exercises and look at the resulting data yourself (e.g. pilot the different parts of the data collected), can you distinguish which exercise is which? This may give you hints on what pre-processing you might want to perform on the data before sending it to a classifier.
First you create a large training set.
Then you train it, telling it what actually happens.
And you might uses averages of data as well.
Perhaps use actual movement and movement that is averaged over 2 sec 5 sec and 10 sec. use those too as for input nodes.
while exercising the trained network can be feeded with the averaged data as well ea (the last x samples divided by x), this will give you a stable approach. Otherwise the neural network can become hectic erratic.
Notice the training set might require averaged data as well and thus you will need a large training set.

Open Street Map enclosing polygons

I am working on an Android application that uses the Overpass API at [1]. My goal is to get all circular ways that enclose a certain lat-long point.
In order to do so I build a request for a rectangle that contains my location, then parse the response XML and run a ray-casting algorithm to filter the ways that enclose the given lat-long position. This is too slow for the purpose of my application because sometimes the response has tens or hundreds of MB.
Is there any OSM API that I can call to get all ways that enclose a certain location? Otherwise, how could I optimize the process?
Thanks!
[1] http://overpass-api.de/
To my knowledge, there is no standard API in OSM to do this (it is indeed a very uncommon usecase).
I assume you define enclose as the point representing the current location is inside the inner area of the polygon. Furthermore I assume optimizing the process might including changing the entire concept of the algorithm.
First of all, you need to define the rectangle to fetch data. For that, you need to consider that querying a too large rectangle would yield too much data. As far as I know there is no specific API to query circular ways only, and even if there is, querying a too large rectangle would probably denied by the server, because the server load would be enormous.
Server-side precomputation / prefiltering
Therefore I suggest the first optimization: Instead of querying an API that is not specifically suited for your purpose, use an offline database saved on the Android device. OsmAnd and others save the whole database for a country offline, but in your specific usecase you only need to save a pre-filtered database of circular ways.
As far as I know, only a small fraction of the ways in OSM is circular. Therefore I suggest writing a script that regularly downloads OSM dumps e.g. from Geofabrik, remove non-circular ways (e.g. you could check if the last node ID in a way is equal to the first node ID, but you'd need to check if that captures any way you would define as circular). How often you would run it depends on your usecase.
This optimization solves:
The issue of downloading a large amount of data
The issue of overloading the API with large request
The issue of not being able to request large chunks of data
If that is not suitable for your usecase, I suggest to build a simple API for that on your server.
Re-chunking the data into appriopriate grids
However, you still would need to filter a large amount of data. In order to partially solve this, I suggest the second optimization: Re-chunk your data. For example, if your current location is in Virginia, you would not need to filter circular ways that have an area not beyond Texas. Because filtering by state etc. would by highly country-dependent and difficult (CPU-intensive), I suggest to choose a grid, say e.g. 0.05 lat/lon degree (I'd choose a equirectangular projection because it's easy to calculate if you already have lat/lon coordinates).
The script that preprocessed that data shall then create one chunk of data (that could be a file, but we don't know enough about your usecase to talk about specific data strucutres) for any rectangle in the area you want to use. A circular way is included in this chunk if and only if it has at least one node that is inside the chunk area.
You would then only request / filter the specific chunk your position is currently in. Choose the chunk size appropriately for your application (preferably rather small, but that depends on numerous factors!).
This optimization solves:
Assuming most of the circular ways are quite small in terms of their bounding rectangles, you only need to filter a tiny fraction of the overall ways
IO is minimized, especially if you
Hysteretic heuristics
If the aforementioned optimizations do not sufficiently reduce your computation time, I'd suggest the third optimization that depends on how many circular ways you want to find (if you really need to find all, it won't help at all): Use hysteresis. Save the circular ways you were inside of during the last computation (assuming the new current location is near to the last location) and check them first. If your location didn't change too much, you have a high chance of hitting a way you're inside of during the first few raycasts.
Leveraging relations between different circular ways
Also, a fourth optimization is possible: There will be some circular ways that are fully enclosed in another circular way. You could code your program so that it knows about that relation and checks the inner circular way first. If this check succeeds, you automatically now that the current position is also contained in the outer circular way. I think computing the information (server-side) could be incredibly CPU-intensive and implementing it might also be a hard task, so I'd suggest to use this optimization only if not avoidable.
Tuning the parameters of these optimizations should be sufficient to decrease the CPU time needed for your computation significantly. Please feel free to comment/ask if you have further questions regarding these suggestions.

Which DSP filter-algorithms are simple to implement?

I have a datalogger with very little memory, and I want to implement support for filters in the firmware.
Which types of filters can I implement easily without the need for buffers or huge functions?
One that comes to mind is an exponential moving average, something like:
sample = (alpha * new_sample) + (1.0 - alpha) * sample
Are there any other well-known DSP-filters that could be accomplished in a few lines?
It is impossible to implement frequency selective filtering without some buffering. Even the example you give requires a buffer of one sample. First off, forget about FFT filtering for most realtime data. For most filtering applications, and certainly applications where you are concerned about memory, you will want to use a time-domain filter.
Time-domain filters generally come in two favors, IIR and FIR. Filters are also distinguished based on their "order". The example you gave above is a first order IIR filter. The relevant facts are:
Broadly speaking, IIR filters require a lower order than FIR for a given response.
A filter can be implemented with a number of memory locations equal to the order of the filter. This is not always the best way to implement a filter, but it can be done using something called Direct Form II.
For a broad range of applications, second order IIR filters (sometimes called "biquads") are an excellent choice. I have a tutorial on second order biquads here. It is oriented towards audio applications, but you will probably find it useful. Keep in mind this tutorial uses Direct Form I which is more numerically stable, but requires more memory locations. At four locations, though, I don't think it's much to fret about even for a highly memory starved application.

SIFT- image comparison

I am comparing two images using SIFT in java using sift implementation by Stephan Saalfeld-- http://fly.mpi-cbg.de/~saalfeld/Projects/javasift.html. But due to lack of proper example,i am finding difficult in using it. I am able to get the descriptors for the two images, then their corresponding matching descriptors and finally applying RANSAC to neglect the false matches. Now, i am left with a number of inliers. But I am confused how to conclude if two images are similar or not?
RANSAC gives you the transformation matrix (including translation,rotation, and scaling values). Using this information you can try to fit images on each other in order to see the matches that are found by SIFT.
An advantage of RANSAC is its ability to do robust estimation of the model parameters, i.e., it can estimate the parameters with a high degree of accuracy even when a significant number of outliers are present in the data set. A disadvantage of RANSAC is that there is no upper bound on the time it takes to compute these parameters. When the number of iterations computed is limited the solution obtained may not be optimal, and it may not even be one that fits the data in a good way. In this way RANSAC offers a trade-off; by computing a greater number of iterations the probability of a reasonable model being produced is increased. Another disadvantage of RANSAC is that it requires the setting of problem-specific thresholds.
RANSAC can only estimate one model for a particular data set. As for any one-model approach when two (or more) model instances exist, RANSAC may fail to find either one. The Hough transform is an alternative robust estimation technique that may be useful when more than one model instance is present.
Concluding, you can say how much two images are similar. It cannot always tell you that it is a total match or a total difference. So you will get the matches after applying RANSAC. Then you can find out that the percentage of the good matches over total matches, and then you need to decide according to this information.

AI Behavior Decision making

I am running a physics simulation and applying a set of movement instructions to a simulated skeleton. I have a multiple sets of instructions for the skeleton consisting of force application to legs, arms, torso etc. and duration of force applied to their respective bone. Each set of instructions (behavior) is developed by testing its effectiveness performing the desired behavior, and then modifying the behavior with a genetic algorithm with other similar behaviors, and testing it again. The skeleton will have an array behaviors in its set list.
I have fitness functions which test for stability, speed, minimization of entropy and force on joints. The problem is that any given behavior will work for a specific context. One behavior works on flat ground, another works if there is a bump in front of the right foot, another if it's in front of the left, and so on. So the fitness of each behavior varies based on the context. Picking a behavior simply on its previous fitness level won't work because that fitness score doesn't apply to this context.
My question is, how do I program to have the skeleton pick the best behavior for the context? Such as picking the best walking behavior for a randomized bumpy terrain.
In a different answer I've given to this question, I assumed that the "terrain" information you have for your model was very approximate and large-grained, e.g., "smooth and flat", "rough", "rocky", etc. and perhaps only at a grid level. However, if the world model is in fact very detailed, such as from a simulated version of a 3-D laser range scanner, then algorithmic and computational path/motion planning approaches from robotics are likely to be more useful than a machine-learning classifier system.
PATH/MOTION PLANNING METHODS
There are a fairly large number of path and motion planning methods, including some perhaps more suited to walking/locomotion, but a few of the more general ones worth mentioning are:
Visibility graphs
Potential Fields
Sampling-based methods
The general solution approach would be use a path planning method to determine the walking trajectory that your skeleton should follow to avoid obstacles, and then use your GA-based controller to achieve the appropriate motion. This is very much at the core of robotics: sense the world and determine actions and motor control required to achieve some goal(s).
Also, a quick literature search turned up the following papers and a book as a source of ideas and starting points for further investigation. The paper on legged robot motion planning may be especially useful as it discusses several motion planning strategies.
Reading Suggestions
Steven Michael LaValle (2006). Planning Algorithms, Cambridge University Press.
Kris Hauser, Timothy Bretl, Jean-Claude Latombe, Kensuke Harada, Brian Wilcox (2008). "Motion Planning for Legged Robots on Varied Terrain", The International Journal of Robotics Research, Vol. 27, No. 11-12, 1325-1349,
DOI: 10.1177/0278364908098447
Guilherme N. DeSouza and Avinash C. Kak (2002). "Vision for Mobile Robot Navigation: A Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 2, February, pp 237-267.
Why not test the behaviors against a randomized bumpy terrain? Just set the parameters of the GA so that it's a little forgiving, and won't condemn a behavior for one or two failures.
You have two problems:
Bipedal locomotion without senses is very difficult. I've seen good robotic locomotion over rough terrain without senses, but never with only two legs. So the best solution you can possibly find this way might not be very good.
Running a GA is as much art as science. There are a lot of knobs you can turn, and it's hard to find parameters that will allow novelty to grow without drowning it in noise.
Starting simple (e.g. crawling) will help with both of these.
EDIT:
Wait... you're training it over and over on the same randomized terrain? Well no wonder you're having trouble! It's optimizing for that particular layout of rocks and bumps, which is much easier than generalizing. Depending on how your GA works, you might get some benefit from making the course really long, but a better solution is to randomize the terrain for every pass. When it can no longer exploit specific features of the terrain, it will have an evolutionary incentive to generalize. Since this is a more difficult problem it will not learn as quickly as it did before, and it might not be able to get very good at all with its current parameters; be prepared to tinker.
There are three aspects to my answer: (1) control theory, (2) sensing, and (3) merging sensing and action.
CONTROL THEORY
The answer to your problem depends partially on what kind of control scheme you are using: is it feed-forward or feedback control? If the latter, what simulated real-time sensors do you have other than terrain information?
Simply having terrain information and incorporating it into your control strategy would not mean you are using feedback control. It is possible to use such information to select a feed-forward strategy, which seems closest to the problem that you have described.
SENSING
Whether you are using feed-forward or feedback control, you need to represent the terrain information and any other sensory data as an input space for your control system. Part of training your GA-based motion controller should be moving your skeleton through a broad range of random terrain in order to learn feature detectors. The feature detectors classify the terrain scenarios by segmenting the input space into regions critical to deciding what is the best action policy, i.e., what control behavior to employ.
How to best represent the input space depends on the level of granularity of the terrain information you have for your simulation. If it's just a discrete space of terrain type and/or obstacles in some grid space, you may be able to present it directly to your GA without transformation. If, however, the data is in a continuous space such as terrain type and obstacles at arbitrary range/direction, you may need to transform it into a space from which it may be easier to infer spatial relationships, such as coarse-coded range and direction, e.g., near, mid, far and forward, left-forward, left, etc. Gaussian and fuzzy classifiers can be useful for the latter approach, but discrete-valued coding can also work.
MERGING SENSING AND ACTION
Using one of the input-space-encoding approaches above, you have a few options for how to connect behavior selection search space and motion control search space:
Separate the two spaces into two learning problems and use a separate GA to evolve the parameters of a standard multi-layer perceptron neural network. The latter would have your sensor data (perhaps transformed) as inputs and your set of skeleton behaviors as outputs. Instead of using back-propagation or some other ANN-learning method to learn the network weights, your GA could use some fitness function to evolve the parameters over a series of simulated trials, e.g., fitness = distance traveled in a fixed time period toward point B starting from point A. This should evolve over successive generations from completely random selection of behaviors to something more coordinated and useful.
Merge the two search spaces (behavior selection and skeleton motor control) by linking a multi-layer perceptron network as described in (1) above into the existing GA-based controller framework that you have, using the skeleton behavior set as the linkage. The parameter space that will be evolved will be both the neural network weights and whatever your existing controller parameter space is. Assuming that you are using a multi-objective genetic algorithm, such as the NSGA-II algorithm, (since you have multiple fitness functions), the fitness functions would be stability, speed, minimization of entropy, force on joints, etc, plus some fitness function(s) targeted at learning the behavior-selection policy, e.g., distance moved toward point B starting from point A in a fixed time period.
The difference between this approach and (1) above is that you may be able to learn both better coordination of behaviors and finer-grain motor control since the parameter space is likely to be better explored when the two problems are merged as opposed to being separate. The downside is that it may take much longer to converge on reasonable parameter solutions(s), and not all aspects of motor control may be learned as well as they would if the two learning problems were kept separate.
Given that you already have working evolved solutions for the motor control problem, you are probably better off using approach (1) to learn the behavior-selection model with a separate GA. Also, there are many alternatives to the hybrid GA-ANN scheme I described above for learning the latter model, including not learning a model at all and instead using a path planning algorithm as described in a separate answer from me. I simply offered this approach since you are already familiar with GA-based machine learning.
The action selection problem is a robust area of research in both machine learning and autonomous robotics. It's probably well-worth reading up on this topic in itself to gain better perspective and insight into your current problem, and you may be able to devise a simpler strategy than anything I've suggested so far by viewing your problem through the lens of this paradigm.
You're using a genetic algorithm to modify the behavior, so that must mean you have devised a fitness function for each combination of factors. Is that your question?
If yes, the answer depends on what metrics you use to define best walking behavior:
Maximize stability
Maximize speed
Minimize forces on joints
Minimize energy or entropy production
Or do you just try a bunch of parameters, record the values, and then let the genetic algorithm drive you to the best solution?
If each behavior works well in one context and not another, I'd try quantifying how to sense and interpolate between contexts and blend the strategies to see if that would help.
It sounds like at this point you have just a classification problem. You want to map some knowledge about what you are currently walking on to one of a set of classes. Knowing the class of the terrain allows you to then invoke the proper subroutine. Is this correct?
If so, then there are a wide array of classification engines that you can use including neural networks, Bayesian networks, decision trees, nearest neighbor, etc. In order to pick the best fit, we will need more information about your problem.
First, what kind of input or sensory data do you have available to help you identify the behavior class you should invoke? Second, can you describe the circumstances in which you will be training this classifier and what the circumstances are during runtime when you deploy it, such as any limits on computational resources or requirements of robustness to noise?
EDIT: Since you have a fixed number of classes, and you have some parameterized model for generating all possible terrains, I would consider using k-means clustering. The principle is as follows. You cluster a whole bunch of terrains into k different classes, where each cluster is associated with one of your specialized subroutines that performs best for that cluster of terrains. Then when a new terrain comes in, it will probably fall near one of these clusters. You then invoke the corresponding specialized subroutine to navigate that terrain.
Do this offline: Generate enough random terrains to sufficiently sample the parameter space, map these terrains to your sensory space (but remember which points in sensory space correspond to which terrains), and then run k-means clustering on this sensory space corpus where k is the number of classes you want to learn. Your distance function between a class representative C and a point P in sensory space would be simply the fitness function of letting algorithm C navigate the terrain that generated P. You would then get a partitioning of your sensory space into k clusters, each cluster mapping to the best subroutine that you've got. Each cluster will have a representative point in sensory space.
Now during runtime: You will get some unlabeled point in sensory space. Use a different distance function to find the closest representative point to this new incoming point. That tells you what class the terrain is.
Note that the success of this method depends on the quality of the mapping from the parameter space of terrain generation to sensory space, from sensory space to your fitness functions, and the eventual distance function you use to compare points in sensory space.
Note also that if you had enough memory, instead of only using the k representative sensory points to tell you which class an unlabeled sensory point belongs to, you might go through your training set and label all points with the learned class. Then during runtime you pick the nearest neighbor, and conclude that your unlabeled point in sensory space is in the same class as that neighbor.

Resources