Fixed Place Permutation/Combination - permutation

I am looking for a way to generate different combinations of the elements of 4 sets such that each set's element has a fixed place in the final combination.
To explain my requirement better, let me give a sample of those 4 sets and then show what I am looking for:
Set#1(Street Pre Direction) { N, S }
Set#2(Street Name) {Frankford, Baily}
Set#3(Street Type) {Ave, St}
Set#4(Street Post Direction) {S}
Let me list a few expected combinations:
N Baily Ave S
S Frankford St S
S Baily Ave S
As you can see, every set's element falls into its place:
Pre Direction is in Place 1
Street Name is in Place 2
Street Type is in Place 3
Post Direction is in Place 4
I am looking for the most efficient way to carry out this task. One way to do it is to work on 2 sets at a time, like this:
Make Combination of Set 1 and Set 2 --> create a new set 5 of resulting combinations
Make Combination of Set 5 and Set 3 --> create a new set 6 of resulting combinations
Make Combination of Set 6 and Set 4 --> This will give me the final combinations
Is there a better way to do this? Kindly help. I would prefer C# or Java.

Here's some LINQ (C#) that gives you all combinations; it is not "the most efficient way".
var query =
from d in predirections
from n in names
from t in types
from s in postdirections
select new {d, n, t, s};

It sounds like you're looking for the Cartesian product of some sets. You can do it using nested for loops. Here's the Haskell code, which you didn't ask for.
Prelude> [[x,y] | x <- ['1'..'3'], y <- ['A'..'C']]
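For comparison, here is a minimal Python sketch of the same Cartesian product over the four sample sets from the question. The variable names are illustrative, not from the original posts. It also shows the two-sets-at-a-time fold described in the question, which produces the same result in the same order.

```python
from functools import reduce
from itertools import product

# Sample sets from the question; each list fills one fixed position.
pre_directions = ["N", "S"]
names = ["Frankford", "Baily"]
types = ["Ave", "St"]
post_directions = ["S"]

# itertools.product keeps the argument order, so every element
# stays in the slot that belongs to its set.
addresses = [
    " ".join(parts)
    for parts in product(pre_directions, names, types, post_directions)
]


def combine(partials, next_set):
    """Extend every partial combination with every element of next_set."""
    return [partial + [item] for partial in partials for item in next_set]


# The pairwise approach from the question: fold the sets in one at a time.
folded = reduce(combine, [pre_directions, names, types, post_directions], [[]])
folded_addresses = [" ".join(parts) for parts in folded]

for address in addresses:
    print(address)
```

Both versions do work proportional to the number of combinations, which is unavoidable if every combination has to be listed.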

@David B
What if the predirections list is empty? Is there a way to still get the combinations? With your approach, no Cartesian product is returned in that case.
David B here:
var query =
from d in predirections.DefaultIfEmpty()
from n in names.DefaultIfEmpty()
from t in types.DefaultIfEmpty()
from s in postdirections.DefaultIfEmpty()
select new {d, n, t, s};
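The same idea can be sketched in Python, assuming an empty set should contribute a placeholder rather than wipe out the whole product. The helper names default_if_empty and fixed_place_combinations are hypothetical, chosen only to mirror the LINQ call above.

```python
from itertools import product


def default_if_empty(items, default=None):
    """Return the items as a list, or a one-element list holding the default."""
    items = list(items)
    return items or [default]


def fixed_place_combinations(*sets):
    """Join one element per set, skipping slots whose set was empty."""
    slots = [default_if_empty(s) for s in sets]
    return [
        " ".join(part for part in parts if part is not None)
        for parts in product(*slots)
    ]


# Assumption for illustration: the pre-direction set is empty.
combos = fixed_place_combinations([], ["Frankford", "Baily"], ["Ave", "St"], ["S"])
for combo in combos:
    print(combo)
```

Without the placeholder, a single empty input makes the product empty, which is the behaviour the question above ran into.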


#VALUE! while using IF and OR together

I have a file in the following format:
Name Number Position
A 1
B 2
C 3
D 4
Now at position A3, I applied =IF(B2=1,"Goal Keeper",OR(IF(B2=2,"Defender",OR(IF(B2=3,"MidField","Striker"))))) but it is giving me a #VALUE! error.
I looked it up on Google, and my formula looks correct.
What I basically want is:
1- Goalkeeper 2-Defender 3-Midfield 4-Striker
Yes, the other way is to just filter on the number and copy-paste the text.
But I want to do it with a formula, and I want to know where I went wrong.
Your immediate problem lies with the expression (for example):
OR(IF(B2=3,"MidField","Striker"))
   |  \__/ \________/ \_______/|
   |  bool   string    string  |
The OR function expects a series of boolean values (true or false) and you're giving it a string value from the inner IF.
You don't actually need the OR bits in this specific case, since the IF is a full if-else. So you can just use:
=IF(B2=1,"Goal Keeper",IF(B2=2,"Defender",IF(B2=3,"MidField","Striker")))
This means that B2=1 will result in "Goal Keeper", otherwise it will evaluate IF(B2=2,"Defender",IF(B2=3,"MidField","Striker")).
Then that means that, if B2=2, it will result in "Defender", otherwise it will evaluate IF(B2=3,"MidField","Striker").
Finally, that means that B2=3 will result in "MidField", and anything else will give "Striker".
The only situation I can envisage where OR would come in handy here is when two different numbers should generate the same string. Let's say both 1 and 4 should give "Goalie": you could then use OR(B2=1,B2=4) as the condition of the outer IF.
Keep in mind that a more general solution would be better implemented with the Excel lookup functions, ones that would search a table (on the spreadsheet somewhere) which mapped the integers to strings. Then, if the mapping needed to change, you would just update the table rather than going back and changing the formula in every single row.
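The lookup-table idea is not specific to Excel. As a rough Python illustration, the mapping lives in one table and the lookup never changes. The "Unknown" fallback is an assumption made here; the nested IF above returns "Striker" for anything that is not 1, 2 or 3.

```python
# The mapping table: change it here and every lookup follows.
POSITIONS = {
    1: "Goal Keeper",
    2: "Defender",
    3: "MidField",
    4: "Striker",
}


def position_for(number, default="Unknown"):
    """Look a number up in the table; default is an assumed fallback."""
    return POSITIONS.get(number, default)


for number in (1, 2, 3, 4):
    print(number, position_for(number))
```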
If you are actually tasked with solving the problem by using the IF and OR functions within the same formula, this is the only way I can see how:
=IF(OR(B2=1, B2=2, B2=3, B2=4),IF(B2=1,"Goal Keeper",IF(B2=2,"Defender",IF(B2=3,"MidField","Striker"))))
If B2 is not one of 1-4, the OR function returns FALSE and the nested IF statements are bypassed entirely; since no else branch is given, the formula itself then evaluates to FALSE.

Split matrix into several depending on value in Matlab

I have a cell array that I need to split into several matrices so that I can take the sum of subsets of the data. This is a sample of what I have:
A = {'M00.300', '1644.07';...
'M00.300', '9745.42'; ...
'M00.300', '2232.88'; ...
'M00.600', '13180.82'; ...
'M00.600', '2755.19'; ...
'M00.600', '15800.38'; ...
'M00.900', '18088.11'; ...
'M00.900', '1666.61'};
I want the sum of the second columns for each of 'M00.300', 'M00.600', and 'M00.900'. For example, to correspond to 'M00.300' I would want 1644.07 + 9745.42 + 2232.88.
I don't want to just hard code it because each data set is different, so I need the code to work for different size cell arrays.
I'm not sure of the best way to do this. I was going to begin by looping through A, comparing the strings in the first column and creating matrices within that loop, but that sounded messy and inefficient.
Is there a simpler way to do this?
This is a classic use of accumarray. You would use the first column as an index and the second column as the values associated with each index. accumarray groups the values that share the same index and applies a function to each group. In your case, you'd use the default behaviour, which sums each group.
However, you'll need to convert the first column into numeric labels. The third output of unique will help you do this. You'll also need to convert the second column into a numeric array and so str2double is a perfect way to do this.
Without further ado:
[val,~,id] = unique(A(:,1)); %// Get unique values and indices
out = accumarray(id, str2double(A(:,2))); %// Aggregate the groups and sum
format long g; %// For better display of precision
T = table(val, out) %// Display on a nice table
I get this:
>> T = table(val, out)
T =
val out
_________ ________
'M00.300' 13622.37
'M00.600' 31736.39
'M00.900' 19754.72
The above uses the table class that is available from R2013b and onwards. If you don't have this, you can perhaps use a for loop and print out each cell and value separately:
for idx = 1 : numel(out)
    fprintf('%s: %f\n', val{idx}, out(idx));
end
We get:
M00.300: 13622.370000
M00.600: 31736.390000
M00.900: 19754.720000
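To make the grouping step concrete outside MATLAB, here is a small Python sketch of what the unique/accumarray pair does: map each label to a group, then sum the values in each group. It uses the sample data from the question. The dictionary-based approach is a stand-in for illustration, not a description of how accumarray is implemented.

```python
from collections import defaultdict

# Same sample data as the cell array A in the question.
A = [
    ("M00.300", "1644.07"),
    ("M00.300", "9745.42"),
    ("M00.300", "2232.88"),
    ("M00.600", "13180.82"),
    ("M00.600", "2755.19"),
    ("M00.600", "15800.38"),
    ("M00.900", "18088.11"),
    ("M00.900", "1666.61"),
]

# Group by label and accumulate, converting each string to a number first
# (the role str2double plays in the MATLAB version).
totals = defaultdict(float)
for label, amount in A:
    totals[label] += float(amount)

for label in sorted(totals):
    print(f"{label}: {totals[label]:.2f}")
```

Because the labels are used as keys directly, the sketch works for any number of rows and any number of distinct labels.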

MATLAB cell array indexing and looping

I'm trying to create a script that reads data from a text file, and plots the data onto a scatter plot.
For example, say the file name is prices.txt and contains:
Pens 2 4
Pencils 1.5 3
Rulers 3 3.5
Sharpeners 1 3
Highlighters 3 4
Where columns 2 and 3 are prices of the items for two different stores.
What my script should do is read the prices, calculate (using another function) the future prices at the stores, and plot these prices on a scatter plot where x is one store and y is another. This is a silly example, I know, but it fits the description.
Don't worry too much about the other function that does the calculation; just assume it does what it's supposed to.
Basically, I've come up with the following:
pricesfile = fopen('Prices.txt');
prices = textscan(pricesfile, '%s %d d');
count = 1;
while count <= length(prices{1})
for item = constants{1}
name = constants{1}{count};
store_A = prices{2}{count};
store_B = prices{3}{count};
(...other function goes here...)
After doing this I'm completely stuck. My thought process behind this was to go through each item name, and create a vector that's assigned to this name with its two corresponding prices as items in the vector eg:
pens = [2 4]
pencils = [1.5 3]
etc. Then, I would somehow plot those items in the vector on a scatter plot and use the name of the vector as a label.
I'm not too sure how to carry out the rest of my code or even if what I've written will get me to the solution.
Please help and thanks in advance.
pricesfile = fopen('Prices.txt');
data = textscan(pricesfile, '%s %f %f'); % %f rather than %d: some prices are fractional
fclose(pricesfile);
You were on the right track but after this (through a bit of hackery) you don't actually need a loop:
plot(repmat(data{2},1,2)', repmat(data{3},1,2)', '.')
What you DO NOT want to do is create variables named after strings. Rather, store the values in an array alongside an array of the names (which is basically what your textscan code gives you). Matlab is very good at handling matrices/arrays.
You could also split your price array up for example:
names = data{1};
prices = [data{2:3}];
now you can perform calculations on prices quite easily like
prices_cents = prices*100;
plot(prices_cents(:,[1,1])', prices_cents(:,[2,2])', '.')
Note that the [1,1] etc above is just using indexing as a short hand to achieve what repmat does...
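The "parallel arrays instead of one variable per item" advice can be sketched in Python as well. This is only an illustration of the data layout: the function name parse_prices is hypothetical, and the text is inlined here rather than read from prices.txt.

```python
# Contents of prices.txt from the question, inlined for the example.
SAMPLE = """Pens 2 4
Pencils 1.5 3
Rulers 3 3.5
Sharpeners 1 3
Highlighters 3 4
"""


def parse_prices(text):
    """Split each line into a name and two prices, kept in parallel lists."""
    names, store_a, store_b = [], [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, price_a, price_b = line.split()
        names.append(name)
        store_a.append(float(price_a))
        store_b.append(float(price_b))
    return names, store_a, store_b


names, store_a, store_b = parse_prices(SAMPLE)

# Calculations now apply to whole columns, e.g. converting to cents.
store_a_cents = [price * 100 for price in store_a]

for name, x, y in zip(names, store_a, store_b):
    print(name, x, y)
```

The three lists stay aligned by position, so names[i] can label the point (store_a[i], store_b[i]) in whatever plotting tool is used.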

Stata: Count a variable by another one?

My little Stata Problem:
I have a table like this:
I want to create a variable that counts the number of distinct cat values for each citing. For example, for citing A there are 2 distinct cat values, 3 and 6, so I want another variable (dif_cat) that holds 2 in both rows for A.
For this sample it would look something like this:
I have tried different methods, and I always feel like I am getting close, but then I can't get it to work.
I tried bysort with preserve and restore, but I don't seem to get there.
One attempt was:
egen tag = tag(cat citing)
egen distinct = total(tag), by(citing)
Can you help me?
PS: I know this has nothing to do with Stata (but it may inspire someone). In an actual programming language I would try something like this:
Have a loop go down the citing column, checking whether each value equals the one before.
Have an auxiliary empty vector.
Have a second loop inside the first that checks whether the current cat is already in the vector and, if not, adds it.
When the citing changes, I would count the length of the auxiliary vector, reset it, and start again. The problem is that I need this in Stata code :S
One way (from Stata FAQ) is:
clear all
set more off
input ///
str1 citing cat
A 3
A 6
B 5
B 2
B 5
B 2
C 2
C 4
C 3
D 5
E 1
E 1
end
list, sepby(citing)
bysort citing cat: gen numvals = (_n == 1)
by citing: replace numvals = sum(numvals)
by citing: replace numvals = numvals[_N]
list, sepby(citing)
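To show what the three Stata lines compute, here is the same logic as a Python sketch using the sample data above: tag the first occurrence of each (citing, cat) pair, then give every row the count of tagged pairs in its citing group. The name dif_cat is taken from the question; everything else is illustrative.

```python
from collections import defaultdict

# Same rows as the Stata input block.
rows = [
    ("A", 3), ("A", 6),
    ("B", 5), ("B", 2), ("B", 5), ("B", 2),
    ("C", 2), ("C", 4), ("C", 3),
    ("D", 5),
    ("E", 1), ("E", 1),
]

# Collect the distinct cat values seen for each citing.
distinct = defaultdict(set)
for citing, cat in rows:
    distinct[citing].add(cat)

# Every row gets the number of distinct cat values in its group.
dif_cat = [len(distinct[citing]) for citing, _ in rows]

for (citing, cat), count in zip(rows, dif_cat):
    print(citing, cat, count)
```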

Growing arrays in Haskell

I have the following (imperative) algorithm that I want to implement in Haskell:
Given a sequence of pairs [(e0,s0), (e1,s1), (e2,s2),...,(en,sn)], where both the "e" and "s" parts are natural numbers, not necessarily distinct, at each time step one element of this sequence is randomly selected, let's say (ei,si), and based on the values of (ei,si), a new element is built and added to the sequence.
How can I implement this efficiently in Haskell? The need for random access would make it bad for lists, while the need for appending one element at a time would make it bad for arrays, as far as I know.
Thanks in advance.
I suggest using either Data.Set or Data.Sequence, depending on what you're needing it for. The latter in particular provides you with logarithmic index lookup (as opposed to linear for lists) and O(1) appending on either end.
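The access pattern in the question (index a random element, then append one) can be sketched in Python, where the built-in list gives O(1) indexing and amortized O(1) append. The same loop shape carries over to Data.Sequence using index and |>. The rule for building the new element is not given in the question, so example_rule below is purely a placeholder.

```python
import random


def grow(pairs, steps, make_new, rng):
    """Repeatedly pick a random pair and append a new pair derived from it."""
    sequence = list(pairs)
    for _ in range(steps):
        e, s = sequence[rng.randrange(len(sequence))]  # random access by index
        sequence.append(make_new(e, s))                # append at the end
    return sequence


def example_rule(e, s):
    """Hypothetical rule; the question does not say how new pairs are built."""
    return (e + 1, s)


result = grow([(1, 2), (3, 4)], steps=5, make_new=example_rule, rng=random.Random(0))
print(result)
```

Passing the random generator in explicitly keeps the function deterministic for a given seed, which is close to how a pure Haskell version would thread its generator.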
"while the need for appending one element at a time would make it bad for arrays" Algorithmically, it seems like you want a dynamic array (aka vector, array list, etc.), which has amortized O(1) time to append an element. I don't know of a Haskell implementation of it off-hand, and it is not a very "functional" data structure, but it is definitely possible to implement it in Haskell in some kind of state monad.
If you know approximately how many elements you will need in total, you can create an array of that size which is "sparse" at first, and then put elements into it as needed.
Something like below can be used to represent this new array:
data MyArray = MyArray (Array Int Int) Int
(where the last Int represent how many elements are used in the array)
If you really need stop-and-start resizing, you could think about using the simple-rope package along with a StringLike instance for something like Vector. In particular, this might accommodate scenarios where you start out with a large array and are interested in relatively small additions.
That said, adding individual elements into the chunks of the rope may still induce a lot of copying. You will need to try out your specific case, but you should be prepared to use a mutable vector as you may not need pure intermediate results.
If you can build your array in one shot and just need the indexing behavior you describe, something like the following may suffice,
import Data.Array.IArray
test :: Array Int (Int,Int)
test = accumArray (flip const) (0,0) (0,20) [(i, f i) | i <- [0..19]]
  where f 0 = (1,0)
        f i = let (e,s) = test ! (i `div` 2) in (e*2,s+1)
Taking a note from ivanm, I think Sets are the way to go for this.
import Data.Set as Set
import System.Random (RandomGen, getStdGen)
startSet :: Set (Int, Int)
startSet = Set.fromList [(1,2), (3,4)] -- etc. Whatever the initial set is
-- grow the set by randomly producing "n" elements.
growSet :: (RandomGen g) => g -> Set (Int, Int) -> Int -> (Set (Int, Int), g)
growSet g s n | n <= 0 = (s, g)
              | otherwise = growSet g'' s' (n-1)
  where s' = Set.insert (x,y) s
        ((x,_), g') = randElem s g
        ((_,y), g'') = randElem s g'
randElem :: (RandomGen g) => Set a -> g -> (a, g)
randElem = undefined
main = do
  g <- getStdGen
  let (grownSet,_) = growSet g startSet 2
  print $ grownSet -- or whatever you want to do with it
This assumes that randElem is an efficient, definable method for selecting a random element from a Set. (I asked this SO question regarding efficient implementations of such a method). One thing I realized upon writing up this implementation is that it may not suit your needs, since Sets cannot contain duplicate elements, and my algorithm has no way to give extra weight to pairings that appear multiple times in the list.
