Distance Threshold using Bluetooth - mobile

I know many people have asked this about larger distances, but is there a way to determine using NFC/BLE/Bluetooth whether two phones are within a small distance of one another? (Say on the order of about 2-3 inches.) Determining the precise location would be interesting to see, but it's not critical.

At a very low level, you can check the intensity of the signal.
I don't think you can write an app that gives you the precise distance, because it depends on the power of the emitter and the power of the receiver.
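Signal strength is about the only handle you have at that range. A rough sketch of the usual log-distance path-loss conversion from RSSI to distance, assuming your platform's BLE scan gives you an RSSI reading and you have calibrated the expected RSSI at 1 m (all constants here are illustrative, not from the question):

```python
def estimate_distance_m(rssi_dbm, tx_power_dbm=-59, path_loss_exponent=2.0):
    """Rough distance estimate from RSSI using the log-distance path-loss model.

    tx_power_dbm is the RSSI expected at 1 m (device/antenna dependent);
    path_loss_exponent is ~2 in free space, higher indoors. Both values are
    illustrative and must be calibrated for your hardware.
    """
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def within_threshold(rssi_dbm, threshold_m=0.07):
    """True if the estimated distance is under roughly 2-3 inches (0.07 m)."""
    return estimate_distance_m(rssi_dbm) <= threshold_m


# Example: an RSSI of -45 dBm on a device calibrated to -59 dBm at 1 m
print(estimate_distance_m(-45))   # ~0.2 m
```

In practice RSSI at 2-3 inches fluctuates so much that, as the answer says, you can at best threshold it rather than recover a precise distance.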

Related

Gyroscope drift on mobile phones

Lots of posts talk about the gyro drift problem. Some say the gyro reading itself has drift, while others say the drift comes from the integration.
The raw gyro reading has drift [link].
The integration has drift [link] (Answer 1).
So I conducted an experiment. The next two figures are what I got. The first figure shows that the gyro reading doesn't drift at all, but it does have an offset. Because of the offset, the integration is horrible. So it seems that the integration is where the drift comes from, doesn't it?
The next figure shows that when the offset is reduced, the integration doesn't drift at all.
In addition, I conducted another experiment. First, I left the phone stationary on the desk for about 10 s. Then I rotated it to the left and back, then to the right and back. The following figure tracks the angle quite well. All I did was subtract the offset and then integrate.
So my big question is: is the offset perhaps the essence of the gyro drift (integration drift)? Can a complementary filter or Kalman filter be applied to remove the gyro drift in this situation?
Any help is appreciated.
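For reference, the "subtract the offset, then integrate" procedure described above might look roughly like this in Python/NumPy; the sampling rate and the length of the stationary calibration window are assumptions:

```python
import numpy as np


def integrate_gyro(gyro_rad_s, dt, calibration_samples=1000):
    """Estimate the bias from an initial stationary period, subtract it,
    then integrate the corrected rate to get an angle.

    gyro_rad_s: 1-D array of gyro readings (rad/s) for one axis
    dt: sample period in seconds (e.g. 0.01 for 100 Hz)
    calibration_samples: number of initial samples known to be stationary
    """
    bias = np.mean(gyro_rad_s[:calibration_samples])   # the "offset"
    corrected = gyro_rad_s - bias
    angle = np.cumsum(corrected) * dt                   # numerical integration
    return angle

# Any error in the bias estimate, plus the integrated noise, still
# accumulates over time -- which is the drift the answers below discuss.
```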
If the gyro reading has "drift", it is called bias and not drift.
The drift is due to the integration and it occurs even if the bias is exactly zero. The drift is because you are accumulating the white noise of the reading by integration.
For drift cancellation, I highly recommend the Direction Cosine Matrix IMU: Theory manuscript; I have implemented sensor fusion for Shimmer 2 devices based on it.
(Edit: The document is from the MatrixPilot project, which has since moved to Github, and can be found in the Downloads section of the wiki there.)
If you insist on the Kalman filter then see https://stackoverflow.com/q/5478881/341970.
But why are you implementing your own sensor fusion algorithm?
Both Android (SensorManager, under Sensor.TYPE_ROTATION_VECTOR) and iPhone (Core Motion) offer their own.
Dear Ali wrote something that is really questionable and imprecise (wrong).
The drift is the integration of the bias; it is the visible "effect" of the bias when you integrate. Noise - any kind of stationary noise - that has zero mean consequently has zero integral (I am not talking about the integral of the PSD, but about the additive noise of the signal integrated over time).
The bias changes over time, as a function of voltage and operating temperature. E.g. if the voltage changes (and it does change), the bias changes. The bias is neither fixed nor "predictable".
That is why you cannot eliminate the bias simply by subtracting an estimated bias from the signal. Also, any estimate has an error, and this error accumulates over time. If the error is smaller, the effect of the accumulation (the drifting) only becomes visible over a longer interval, but it still exists.
Theory says that a total elimination of the bias is not possible at present. At the state of the art, no one has yet found a way - based only on gyroscopes, accelerometers and magnetometers - to filter all of the bias out.
Android and iPhone ship limited implementations of bias-elimination algorithms. They are not totally free of bias effects (e.g. over short intervals). For some applications this can cause severe problems and unpredictable results.
In this discussion, Ali and Stefano have raised two fundamental aspects of drift due to ideal integration.
Basically, zero-mean white noise is an idealized concept, and even for such ideal noise, integration has higher gain over the lower-frequency components of the noise, which introduces a low-frequency drift in the integrated signal. In theory, zero-mean noise should not cause any drift if observed over a sufficiently long time, but in practice ideal integration never works that way.
On the other hand, even a minor DC offset in the reading (input signal) can cause a significant drift over time if ideal integration (lossless summation) is performed on it. Ideal integration has infinite gain at the DC component of the input signal, so it ramps up even very small DC offsets in the system. Therefore, for practical purposes, we substitute ideal integration with a low-pass filter whose cut-off can be as low as required, but cannot be zero or too low in practice.
Motivated by Ali's reply (thanks Ali!), I did some reading and some numerical experiments and decided to post my own reply about the nature of gyro drift.
I've written a simple Octave script (run online) plotting white noise and integrated white noise:
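The original Octave snippet isn't reproduced here; a roughly equivalent sketch in Python/NumPy that produces the same two plots (white noise and its cumulative sum, i.e. a random walk) would be:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 10000
white_noise = rng.standard_normal(n)     # zero-mean white noise
random_walk = np.cumsum(white_noise)     # "integrated" white noise

fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(white_noise)
ax1.set_title("White noise")
ax2.plot(random_walk)
ax2.set_title("Integrated white noise (random walk)")
plt.tight_layout()
plt.show()
```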
The angle plot with reduced offset shown in the question seems to resemble a typical random walk. A mathematical random walk has zero mean, so that by itself cannot account for drift. However, I believe numerical integration of white noise leads to a non-zero mean (as can be seen in the histogram plot for the random walk below). This, together with the linearly increasing variance, could be associated with the so-called gyro drift.
There is a great introduction to errors arising from gyroscopes and accelerometers here. In any case, I still have much to learn, so I could be wrong.
Regarding the complementary filter, there's some discussion here, showing how it reduces the gyro drift. The article is very informal, but I found it interesting.
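For completeness, a minimal complementary-filter sketch along the lines of that article, assuming you have a gyro rate and an absolute (but noisy) angle reference such as one derived from the accelerometer; the blend factor and sample period are illustrative:

```python
def complementary_filter(gyro_rate, accel_angle, dt, angle=0.0, alpha=0.98):
    """One update step: trust the gyro for short-term changes and the
    accelerometer-derived angle as the long-term reference, which keeps
    the integrated gyro angle from drifting away."""
    return alpha * (angle + gyro_rate * dt) + (1.0 - alpha) * accel_angle

# Typical loop (illustrative):
# for gyro_rate, accel_angle in samples:
#     angle = complementary_filter(gyro_rate, accel_angle, dt=0.01, angle=angle)
```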

A.I.: How would I train a Neural Network across multiple machines?

Larger networks with large data sets take a while to train. It would be awesome if there were a way to share the computing time across multiple machines. The issue is that while a neural network is training, the weights are constantly being altered every iteration, and each iteration is more or less based on the last -- which makes the idea of distributing the computation at the very least a challenge.
I've thought that the server could send maybe 1,000 training examples for each portion of the network... but... you'd end up with roughly the same computing time, since I wouldn't be able to train on different sets of data simultaneously (which is what I want to do).
But even if I could split the training up into blocks of different data sets, how would I know when I'm done with that set of data? Especially if the amount of data sent to the client machine isn't enough to achieve the desired error?
I welcome all ideas.
Quoting http://en.wikipedia.org/wiki/Backpropagation#Multithreaded_Backpropagation:
When multicore computers are used multithreaded techniques can greatly decrease the amount of time that backpropagation takes to converge. If batching is being used, it is relatively simple to adapt the backpropagation algorithm to operate in a multithreaded manner.
The training data is broken up into equally large batches for each of the threads. Each thread executes the forward and backward propagations. The weight and threshold deltas are summed for each of the threads. At the end of each iteration all threads must pause briefly for the weight and threshold deltas to be summed and applied to the neural network.
which is essentially what other answers here describe.
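A toy sketch of the scheme the quote describes: the per-batch weight deltas are computed in parallel and summed before a single update. The linear "network" and local process pool here are stand-ins for a real backprop implementation and a cluster:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor


def batch_gradient(weights, X, y):
    """Squared-error gradient for a linear model -- a stand-in for the
    forward + backward pass of a real network on one batch."""
    return X.T @ (X @ weights - y) / len(y)


def parallel_epoch(weights, X, y, n_workers=4, learning_rate=0.1):
    """Split the data into one batch per worker, compute the weight deltas
    in parallel, sum them, then apply a single update -- the scheme the
    quoted passage describes."""
    X_parts = np.array_split(X, n_workers)
    y_parts = np.array_split(y, n_workers)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        grads = list(pool.map(batch_gradient,
                              [weights] * n_workers, X_parts, y_parts))
    return weights - learning_rate * np.sum(grads, axis=0) / n_workers


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 5))
    y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0])
    w = np.zeros(5)
    for _ in range(100):          # 100 "epochs"
        w = parallel_epoch(w, X, y)
    print(w)                      # close to the true weights
```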
Depending on your ANN model, you can exploit some parallelism across multiple machines by running the same model with the same training and validation data on each machine, but with different ANN properties: initial values, ANN parameters, noise, etc., for the different runs.
I used to do this a lot to make sure I had explored the problem space effectively and wasn't stuck in local minima, etc. This is a very easy way to take advantage of multiple machines without having to recode your algorithm. Just another approach you might want to consider.
My assumption is you have more than 1 training set, and you have a gold standard. Also, I assume you have some way of storing the state of the neural network (whether it's a list of probability weights for each node, or something along those lines).
Using as many compute nodes in a cluster as you can, launch the program on a data set on each node. Save the results for each, and test them against the gold standard. Whichever neural network state performs best, set it as the input for the next round of training. Repeat as many times as you see fit.
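A rough sketch of one round of that restart-and-keep-the-winner loop. Here train_fn and score_fn are placeholders for your existing training and gold-standard scoring routines, and a local process pool stands in for a cluster's job scheduler:

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def _train_and_score(seed, train_fn, score_fn, train_data, gold_standard):
    """Train from one seed-dependent starting state and score the result."""
    state = train_fn(train_data, seed)
    return score_fn(state, gold_standard), state


def best_of_restarts(train_fn, score_fn, seeds, train_data, gold_standard):
    """Run one training job per seed (e.g. one per node or core) and keep
    the trained state that scores best on the gold standard.

    train_fn(train_data, seed) -> trained network state (your routine)
    score_fn(state, gold_standard) -> numeric score, higher is better
    Both must be module-level functions so the process pool can pickle them.
    """
    job = partial(_train_and_score, train_fn=train_fn, score_fn=score_fn,
                  train_data=train_data, gold_standard=gold_standard)
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(job, seeds))
    return max(results, key=lambda r: r[0])   # (best_score, best_state)
```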
If I understand correctly, you're trying to figure out a way to train an ANN on a cluster of machines? As you stated, partitioning the network isn't the right approach, and as far as I know it seems infeasible for most models. A possible approach might be to partition the training sets, run local copies of your network, and then merge the results. An intuitive way to do this, and gain some validation along the way, would be cross-validation. As you stated, knowing when the network has had the right amount of training is a problem, but that variability is inherent to neural nets in general, not to parallelizing the work.
As you also stated, the updates that happen during each iteration of training depend on the current state of the weights, but without mixing up training and validation sets you're likely to overfit. This is why CV is nice: all of your training sets get a chance to play a role in both training and validation, across multiple samples.
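A minimal sketch of generating such cross-validation partitions so each machine gets a different train/validation split; plain NumPy, with the fold count as an assumption:

```python
import numpy as np


def k_fold_partitions(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for k folds, so each
    machine can train a local copy on one partition and validate on the
    held-out fold before the results are merged or compared."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    folds = np.array_split(indices, k)
    for i in range(k):
        validation = folds[i]
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, validation


# Example: 5 folds over 1000 samples, one fold per worker machine
for train_idx, val_idx in k_fold_partitions(1000, k=5):
    pass  # ship (train_idx, val_idx) to a worker, train, record validation score
```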
If you do batch training, the weights are only altered after you have been through the entire dataset. You can compute the weight update vector for each data point in the set on a separate machine/core and add them up at the end, then proceed with the next epoch.
Here is a link to a question about batch training.

Efficient dataset size for feed-forward neural network training

I'm using a feed-forward neural network in Python, using the pybrain implementation. For training, I'll be using the back-propagation algorithm. I know that with neural networks we need just the right amount of data in order not to under- or over-train the network. I can get about 1200 different templates of training data for the datasets.
So here's the question:
How do I calculate the optimal amount of data for my training? Since I've tried with 500 items in the dataset and it took many hours to converge, I would prefer not to have to try too many sizes. The results were quite good with this last size, but I would like to find the optimal amount. The neural network has about 7 inputs, 3 hidden nodes and one output.
How do I calculate the optimal amount of data for my training?
It's completely solution-dependent. There's also a bit of art with the science. The only way to know if you're into overfitting territory is to be regularly testing your network against a set of validation data (that is data you do not train with). When performance on that set of data begins to drop, you've probably trained too far -- roll back to the last iteration.
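That "test against validation data and roll back when performance drops" strategy is early stopping; a minimal sketch, where train_one_epoch and validation_error stand in for whatever training and scoring calls you already have (e.g. in pybrain):

```python
import copy


def train_with_early_stopping(network, train_one_epoch, validation_error,
                              max_epochs=1000, patience=5):
    """Keep training while validation error improves; once it has not
    improved for `patience` epochs, return the best snapshot seen so far.

    train_one_epoch(network) trains the network in place for one epoch;
    validation_error(network) returns the error on held-out data.
    """
    best_error = float("inf")
    best_network = copy.deepcopy(network)
    epochs_without_improvement = 0
    for _ in range(max_epochs):
        train_one_epoch(network)
        error = validation_error(network)
        if error < best_error:
            best_error = error
            best_network = copy.deepcopy(network)   # "last good iteration"
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                               # performance started dropping
    return best_network, best_error
```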
The results were quite good with this last size but I would like to find the optimal amount.
"Optimal" isn't necessarily possible; it also depends on your definition. What you're generally looking for is a high degree of confidence that a given set of weights will perform "well" on unseen data. That's the idea behind a validation set.
The diversity of the dataset is much more important than the quantity of samples you are feeding to the network.
You should customize your dataset to include and reinforce the data you want the network to learn.
After you have crafted this custom dataset, you have to start playing with the number of samples, as it is completely dependent on your problem.
For example: if you are building a neural network to detect the peaks of a particular signal, it would be completely useless to train your network with a zillion samples of signals that do not have peaks. Therein lies the importance of customizing your training dataset, no matter how many samples you have.
Technically speaking, in the general case, and assuming all examples are correct, more examples are always better. The real question is: what is the marginal improvement (the first derivative of answer quality)?
You can test this by training it with 10 examples, checking quality (say 95%), then 20, and so on, to get a table like:
Examples    Quality
10          95%
20          96%
30          96.5%
40          96.55%
50          96.56%
You can then clearly see your marginal gains and make your decision accordingly.
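A sketch of building that table programmatically (a learning curve); train_fn and quality_fn are placeholders for your own training and evaluation routines:

```python
def learning_curve(train_fn, quality_fn, dataset, sizes=(10, 20, 30, 40, 50)):
    """Train on progressively larger prefixes of the dataset and record the
    resulting quality, so the marginal gain of more data becomes visible.

    train_fn(examples) -> trained model
    quality_fn(model) -> quality score (e.g. accuracy on a validation set)
    """
    table = []
    for n in sizes:
        model = train_fn(dataset[:n])
        table.append((n, quality_fn(model)))
    return table

# e.g. [(10, 0.95), (20, 0.96), (30, 0.965), ...] -- stop adding data when
# the improvement per extra example no longer justifies the training time.
```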

hardware specialized for bitmap indexes?

This is just an out-of-curiosity question. Let's say you have a database table with 1M rows in it, and you often want to run queries like male or female, US or non-US, voter or non-voter, etc. It's clearly very efficient to define a bitmap index for the table, in which each bit represents one either-or condition.
However, to execute the query, you still have to scan through (probably) all of the index, doing a bitwise AND to select matching rows.
My question is: is there some kind of bitmap-optimized storage in which the bit 'channels' are pre-created in the hardware? I'm envisaging something similar to knitting needles lifting punched cards out of an old library catalog system. In other words, rather than going row by row through memory locations, can the chip just pull out the matching rows electronically, because there are hardware connections for each bit channel?
I have a feeling the brain must work something like this. If I think of 'all blue objects', and then restrict that to 'all long blue objects' and then 'all long blue heavy objects', my brain does it effortlessly, and I'm sure it's not scanning through all the objects I know about every time. It seems as though there are neurons that provide pathways for different dimensions for quick retrieval. I'm just wondering if there's anything like this in the hardware world?
Thanks!
Why invent something that's already there?
Content-addressable memory
You could certainly wire up some logic to perform this (e.g. using programmable logic devices) but you'll need a large number of logic elements and connections, making such circuits probably expensive to build for large databases.
For example, one would have to build matching logic (is this bit being selected on? what is the required value?) into each 'row', giving you one signal (selected/not selected) per row.
You would then have a logic circuit with one million output lines (telling you which records were selected), which you would probably have to 'serialize' at some point anyway, e.g. when you interface with the PCI bus inside a computer (i.e. first transmit the result for record 0, then record 1, etc., or transmit the numbers of the selected records).
As bitwise operations in modern CPUs are fast (they should take only one clock cycle for logical operations such as bitwise AND, OR and XOR), you're probably not gaining much with such a custom circuit compared to optimized software (not to mention the 'hardware' development and testing effort), unless you have a very special use case.
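In software, the whole query is just a few word-wide ANDs per machine word of the bitmap; a sketch with NumPy packed bit arrays, where the column names and random data are of course made up:

```python
import numpy as np

n_rows = 1_000_000
rng = np.random.default_rng(0)

# One packed bitmap per either/or attribute: bit i set => row i has the property.
male  = np.packbits(rng.integers(0, 2, n_rows, dtype=np.uint8))
us    = np.packbits(rng.integers(0, 2, n_rows, dtype=np.uint8))
voter = np.packbits(rng.integers(0, 2, n_rows, dtype=np.uint8))

# "male AND us AND voter": three bitwise ANDs over ~125 KB each.
matches = male & us & voter
row_ids = np.flatnonzero(np.unpackbits(matches)[:n_rows])
print(len(row_ids), "matching rows")
```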

How do you measure SQL Fill Factor value

Usually when I'm creating indexes on tables, I set the Fill Factor based on an educated guess of how the table will be used (many reads or many writes).
Is there a more scientific way to determine a more accurate Fill Factor value?
You could try running a big list of realistic operations and looking at IO queues for the different actions.
There are a lot of variables that govern it, such as the size of each row and the number of writes vs reads.
Basically: high fill factor = quicker read, low = quicker write.
However it's not quite that simple, as almost all writes will be to a subset of rows that need to be looked up first.
For instance: set a fill factor to 10% and each single-row update will take 10 times as long to find the row it's changing, even though a page split would then be very unlikely.
Generally you see fill factors from 70% (very high write) to 95% (very high read).
It's a bit of an art form.
I find that a good way of thinking of fill factors is as pages in an address book - the more tightly you pack the addresses the harder it is to change them, but the slimmer the book. I think I explained it better on my blog.
I would tend to be of the opinion that if you're after performance improvements, your time is much better spent elsewhere, tweaking your schema, optimising your queries and ensuring good index coverage. Fill factor is one of those things that you only need to worry about when you know that everything else in your system is optimal. I don't know anyone that can say that.
