How to add noise to the data set in dB? - dataset

I have a dataset and I need to plot SNR vs. MSE for a deep learning model. How can I add noise to the dataset based on an SNR value, so that we can say the dataset has an SNR of 10 dB or 20 dB?

You may want to try a jitter function.
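If the noise level has to correspond to a specific SNR, the noise power can be derived from the signal power. Below is a minimal Python sketch, assuming additive white Gaussian noise and the usual definition SNR_dB = 10 * log10(P_signal / P_noise), with power taken as the mean of the squared samples. The function names add_noise_at_snr and measured_snr_db are illustrative and not from any library, and the sine wave is only placeholder data.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return a noisy copy of `signal` whose SNR is approximately `snr_db`."""
    rng = rng or random.Random()
    # Average signal power: mean of the squared samples.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise_std = math.sqrt(noise_power)
    # Zero-mean Gaussian noise with the required standard deviation.
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Placeholder data: a sine wave standing in for one feature of the dataset.
clean = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(10000)]
noisy_10db = add_noise_at_snr(clean, snr_db=10, rng=random.Random(0))
noisy_20db = add_noise_at_snr(clean, snr_db=20, rng=random.Random(0))
```

Repeating this for each SNR value of interest gives one noisy copy of the data per point on the SNR axis, and the MSE can then be computed for each copy. Whether the power should be computed per feature or over the whole dataset depends on the application.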

Related

How can you exclude a large initial value from a running delta calculation?

I'm trying to use a running delta calculation to graph how much additional storage is used per hour, from a field that contains how much storage is used. Let's say I have a field like disk_space_used_mb. If I have the values 50000, 50100, 50300, the running delta would be 50000, 100, 200, but I don't really care about the first value, and it throws off my graph. I can of course set the max value of the y axis manually, but that isn't dynamic.
How can I prevent this first large value from throwing off my graph? Is there a way to force it to 0?
Here's an example of why this is a problem (with different numbers):
Sadly, this is currently not possible and it is a very common problem when plotting running delta.
As a workaround, if your initial value is static, you can create a new calculated field where you subtract the initial value from all rows (so the initial value will always be zero). But obviously, this is not an elegant solution and your chart's Y-axis values will be different from the real values.
But if the initial value can be changed by the user (it is dynamic), you're really out of luck. The only solution I can imagine is to search for an alternative visualization that supports this feature or to develop your own visualization.
The second option probably solves your problem, but the development of community visualizations is far from being an easy task.
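If the data can be preprocessed before it reaches the charting tool (an assumption, since the question does not say whether that is possible), the running delta itself is easy to compute with the first value forced to 0. A minimal Python sketch, where running_delta is an illustrative helper name:

```python
def running_delta(values):
    """Differences between consecutive readings, with the first delta forced to 0."""
    if not values:
        return []
    deltas = [0]  # the first reading has no predecessor, so report no change
    for previous, current in zip(values, values[1:]):
        deltas.append(current - previous)
    return deltas


disk_space_used_mb = [50000, 50100, 50300]
print(running_delta(disk_space_used_mb))  # [0, 100, 200]
```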

Kalman filter - quaternions - angle sensor

Kalman filters and quaternions are something new for me.
I have a sensor whose output voltage changes as a function of its inclination on the x, y and/or z axis, i.e. an angle sensor.
My questions:
Is it possible to apply a Kalman filter to smooth the results and avoid any noise on the measurements?
I will then only have a single 3D vector. What kind of operations with quaternions could I use with this 3D vector to learn more about quaternions?
You can apply a Kalman filter to accelerometer data. It's a powerful technique, though, and there are lots of ways to do it wrong. If your goal is to learn about the filter then go for it - the discussion here might be helpful.
If you just want to smooth the data and get on with the next problem then you might want to start with a moving average filter, or traditional lowpass/bandpass filters.
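As a rough illustration of the moving average option, here is a minimal Python sketch of a trailing moving average over one axis of the sensor readings. The function name moving_average and the sample values are made up for the example, and the window length is a parameter you would tune for your sensor.

```python
def moving_average(samples, window):
    """Smooth `samples` with a trailing moving average of length `window`."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(samples):
        running_sum += value
        if i >= window:
            # Drop the sample that just left the window.
            running_sum -= samples[i - window]
        # Until the window is full, average over the samples seen so far.
        count = min(i + 1, window)
        smoothed.append(running_sum / count)
    return smoothed


readings = [1.0, 2.0, 3.0, 4.0, 5.0]
print(moving_average(readings, 3))  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

A longer window smooths more but also delays the response to real changes in inclination.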
After applying a Kalman filter you will still have a sequence of data - it won't reduce it to a single vector. If this is your goal you might as well take the mean of each coordinate sequence.
As for quaternions, you could probably come up with a way of performing quaternion operations on your accelerometer data but the challenge would be to make it meaningful. For the purposes of learning about the concept you really need it to make some sense, so that you can visualise the results and interpret them.
I would be tempted to write some functions to implement quaternion operations instead - multiplication is the strange one. This will be a good introduction to the way they work, and then when you find an application that calls for them you can use your functions and you'll already know how the mechanics work.
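A minimal Python sketch of such functions, assuming quaternions are stored as (w, x, y, z) tuples with w as the scalar part. The function names are illustrative. The rotation helper assumes a unit quaternion and rotates a 3D vector v by computing q * v * conjugate(q).

```python
import math


def q_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )


def q_conjugate(q):
    """Conjugate: negate the vector part."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def rotate_vector(q, v):
    """Rotate 3D vector `v` by unit quaternion `q` via q * v * conjugate(q)."""
    pure = (0.0, v[0], v[1], v[2])  # embed the vector as a pure quaternion
    _, x, y, z = q_mul(q_mul(q, pure), q_conjugate(q))
    return (x, y, z)


i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
print(q_mul(i, j))  # (0, 0, 0, 1), i.e. i * j = k
print(q_mul(j, i))  # (0, 0, 0, -1), multiplication is not commutative

# Unit quaternion for a 90 degree rotation about the z axis.
half = math.radians(90) / 2
q_z90 = (math.cos(half), 0.0, 0.0, math.sin(half))
rotated = rotate_vector(q_z90, (1.0, 0.0, 0.0))  # approximately (0, 1, 0)
```

The i * j versus j * i example shows why multiplication is "the strange one": the order of the factors matters.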
If you want to read about the most famous use of quaternions, have a look at Maxwell's equations in their original quaternion form, before Heaviside dramatically simplified them and put them in the vector notation we use today.
Also a lot of work is done using tensors these days and if you're interested in the more complex mathematical datatypes that would be a worthwhile one to look into.

Generate a 2D/3D/n-dimensional uncertain data set with/without label

I am going to work on uncertainty visualization. My main problem is finding/generating a 2D/3D/n-dimensional data set with uncertain data.
How can I generate/create a data set which includes uncertain data (with and/or without labels)? Is there a benchmark data set?
After a lot of work and searching, I found some results:
Actually, there is no benchmark data set with uncertainty features. One solution is to add noise to the original data set, so that the noise itself introduces the uncertainty.
The ideal way is to apply white Gaussian noise. Three ways are as follows:
(1) MATLAB supports this with the function wgn.
(2) Use the randn function from MATLAB.
(3) My suggestion is to use Mean + Standard Deviation * randn(n,1), which adds noise to your data set with your preferred mean and Std. (standard deviation).
I hope my recommendation is useful.
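For readers without MATLAB, the same idea can be sketched in Python with the standard library. This is only an illustration under stated assumptions: the two class centres at 0.0 and 1.0, the dictionary layout, and the name make_uncertain_dataset are all made up for the example. Noise is drawn as noise_mean + noise_std * N(0, 1), and the standard deviation is stored with each sample so it can be visualized as the uncertainty.

```python
import random


def make_uncertain_dataset(n_samples, n_dims, noise_mean=0.0, noise_std=0.1, seed=None):
    """Generate labeled n-dimensional points with additive Gaussian noise."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n_samples):
        label = rng.randrange(2)
        # Hypothetical clean data: class 0 sits at 0.0 and class 1 at 1.0 on every axis.
        clean = [float(label)] * n_dims
        # noise = mean + std * standard normal sample
        noise = [noise_mean + noise_std * rng.gauss(0.0, 1.0) for _ in range(n_dims)]
        noisy = [c + e for c, e in zip(clean, noise)]
        dataset.append({"label": label, "clean": clean, "noisy": noisy, "std": noise_std})
    return dataset


data_2d = make_uncertain_dataset(n_samples=100, n_dims=2, noise_std=0.2, seed=42)
data_3d = make_uncertain_dataset(n_samples=100, n_dims=3, noise_std=0.2, seed=42)
```

Dropping the "label" key gives the unlabeled variant, and the clean points can be replaced with any existing data set.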

Dataset for Apriori algorithm

I am going to develop an app for Market Basket Analysis (using the Apriori algorithm) and I found a dataset which has more than 90,000 transaction records.
The problem is that this dataset doesn't contain the names of the items, only their barcodes.
I have just started the project and am researching the Apriori algorithm. Can anyone help me with this case? What is the best way to implement this algorithm using the following dataset?
These kinds of datasets are considered critical information and chain stores will not give you this information, but you can generate a sample dataset yourself using SQL Server.
The algorithm is defined independently of the identifiers used for the objects. Also, you didn't post the 'following data set' :P If your problem is that the algorithm expects your items to be numbered 0,1,2,... then just scan your data set and map each individual barcode to a number.
If you're interested, there have been some papers on how to represent frequent item sets very efficiently: http://www.google.de/url?sa=t&source=web&cd=1&ved=0CB8QFjAA&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.163.4827%26rep%3Drep1%26type%3Dpdf&ei=QdVuTsn7Cc6WmQWD7sWVCg&usg=AFQjCNGDG8etNN2B4GQ52pSNIfQaTH7ajQ&sig2=7r3buh8AcfJmn2CwjjobAg
The algorithm does not need the name of the items.
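The barcode-to-number mapping suggested above can be sketched in Python as follows. The barcodes in the example are made-up placeholders, and encode_transactions is an illustrative name. IDs are assigned in order of first appearance, and the returned dictionary lets you translate frequent item sets back to barcodes afterwards.

```python
def encode_transactions(transactions):
    """Map each distinct barcode to an integer ID 0, 1, 2, ... and encode the baskets."""
    barcode_to_id = {}
    encoded = []
    for basket in transactions:
        encoded_basket = set()  # a basket is a set: duplicates within it are ignored
        for barcode in basket:
            if barcode not in barcode_to_id:
                barcode_to_id[barcode] = len(barcode_to_id)
            encoded_basket.add(barcode_to_id[barcode])
        encoded.append(sorted(encoded_basket))
    return encoded, barcode_to_id


# Placeholder barcodes, not taken from any real dataset.
transactions = [
    ["5901234123457", "4006381333931"],
    ["4006381333931", "9780201379624"],
]
encoded, barcode_to_id = encode_transactions(transactions)
print(encoded)  # [[0, 1], [1, 2]]
```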

What optimization problems do you want to have solved?

I love to work on AI optimization software (Genetic Algorithms, Particle Swarm, Ant Colony, ...). Unfortunately I have run out of interesting problems to solve. What problem would you like to have solved?
This list of NP-complete problems should keep you busy for a while...
How about the Hutter Prize?
From the entry on Wikipedia:
The Hutter Prize is a cash prize funded by Marcus Hutter which rewards data compression improvements on a specific 100 MB English text file.
[...]
The goal of the Hutter Prize is to encourage research in artificial intelligence (AI). The organizers believe that text compression and AI are equivalent problems.
Basically the idea is that in order to make a compressor which is able to compress data most efficiently, the compressor must be, in Marcus Hutter's words, "smarter". For more information on the relation between artificial intelligence and compression, see the Motivation and FAQ sections of the Hutter Prize website.
Does the Netflix Prize count?
I would like my bank balance optimised so that there is as much money as possible left at the end of the month, instead of the other way round.
What about the game of Go?
Here's an interesting practical problem I came across while tinkering with color quantization and image compression.
The basic idea is that I would like a program to which I give a picture and it reduces the number of colors in it as much as possible without me noticing it. Since every person's eyes have a different sensitivity (and eyes have different sensitivities to red/green/blue intensities), it should be possible to specify this sensitivity threshold in some way.
In other words, in a truecolor picture, replace every pixel's color with another color so that:
The total count of different colors in a picture would be the smallest possible; and
Every new pixel would have its color no further from the original color than some user-specified value D.
The D can be defined in different ways, pick your favorite. For example:
Separate red, green and blue components for specifying the maximum possible deviation for each of them (for every pixel you get a rectangular cuboid of valid replacement values);
A real number which would represent the maximum allowable distance in the RGB cube (for every pixel you get a sphere of valid replacement values);
Something in between or completely different.
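To make the constraint concrete, here is a naive greedy baseline in Python, assuming D is the Euclidean distance in the RGB cube (the "sphere" variant). It guarantees that every replacement color is within D of the original, but it does not minimize the number of colors, which is the actual optimization problem being posed. The name reduce_colors and the sample pixels are illustrative.

```python
import math


def reduce_colors(pixels, max_distance):
    """Greedy baseline: reuse an existing representative color whenever one lies within D."""
    representatives = []
    mapping = {}  # original color -> replacement color
    for color in pixels:
        if color in mapping:
            continue
        for rep in representatives:
            if math.dist(color, rep) <= max_distance:
                mapping[color] = rep
                break
        else:
            # No representative is close enough, so this color becomes a new one.
            representatives.append(color)
            mapping[color] = color
    return [mapping[color] for color in pixels], representatives


pixels = [(255, 0, 0), (250, 2, 1), (0, 0, 255), (254, 1, 0)]
new_pixels, palette = reduce_colors(pixels, max_distance=10)
print(palette)  # [(255, 0, 0), (0, 0, 255)]
```

The result depends on the order in which colors are visited, which is one reason a smarter search (for example a genetic algorithm or swarm method) could find a smaller palette.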
Most efficient solution to a given set of Sudoku puzzles. (excluding brute-force methods)

Resources