Insert a space to separate a database - database

Good morning, I have the following data set, but with thousands more lines:
215 22221121110110110101
212 22221121110110110101
468 22221121110110110101
1200 22221121110110110101
400 22221121110110110101
100 22221121110110110101
200 22221121110110110101
And I need to separate it into columns this way:
215 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
212 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
468 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
1200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
400 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
100 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
I tried a simple sed command, but it doesn't work:
sed -i -e 's// /g'

Perl to the rescue!
perl -lane 'push @F, split //, pop @F; print "@F"'
-n reads the input line by line
-l removes newlines from input and adds them back to output
-a splits each line on whitespace into the @F array
pop removes the last element of an array and returns it, in this case it returns the second "word"
split turns a string into a list, with // it splits the string into individual characters
push is dual to pop, it adds the elements to the end of an array (in this case, it adds individual characters to the array currently containing only the first column)
when printing an array in double quotes, by default the members are separated by spaces.
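For readers who do not use Perl, the same pop/split/push idea can be sketched in Python. This is only an illustration under the assumption that every line has exactly two whitespace-separated fields; the function name `space_out_second_field` is made up for this example.

```python
def space_out_second_field(line):
    """Keep the first field and put a space between the characters of the second."""
    # Assumes exactly two whitespace-separated fields per line.
    first, second = line.split()
    # Unpacking a string with * yields its individual characters,
    # much like Perl's split // does.
    return " ".join([first, *second])


print(space_out_second_field("215 22221121110110110101"))
```

For a whole file, the function could be applied to each line read from the input.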

You can use GNU awk's gensub function.
gawk '{$2=gensub(/./, "& ", "g", $2)}1' file

To eliminate the extra space that other solutions leave at the end of each line, you can use this:
$ awk '{print $1 gensub(/./," &","g",$2)}'

Could you please try the following with GNU awk and let me know if it helps.
awk '{num=split($2,a,"");printf $1;for(i=0;i<=num;i++){printf("%s%s",a[i],i==num?RS:FS)};}' Input_file

Using awk's gsub(regexp, replacement [, target])
awk '{gsub(/./," &",$2); print $1 $2}' infile
Explanation:
gsub(/./," &",$2) matches any character (except line terminators) in the second column of the current record and replaces it with a single space followed by the same character.
The Dot Matches (Almost) Any Character. In regular expressions, the dot or period is one of the most commonly used metacharacters. The dot matches a single character, without caring what that character is. The only exception are line break characters.
If the special character & appears in replacement, it stands for the precise substring that was matched by regexp.
Test Results:
$ cat infile
215 22221121110110110101
212 22221121110110110101
468 22221121110110110101
1200 22221121110110110101
400 22221121110110110101
100 22221121110110110101
200 22221121110110110101
$ awk '{gsub(/./," &",$2); print $1 $2}' infile
215 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
212 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
468 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
1200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
400 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
100 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
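The role of `&` in the awk replacement can also be seen in a regex substitution in Python, where `\g<0>` stands for the whole match. This is a rough sketch, not a translation of the awk program; the function name is hypothetical and two fields per line are assumed.

```python
import re


def space_out_with_regex(line):
    """Prefix every character of the second field with a space, as gsub(/./," &",$2) does."""
    # Assumes exactly two whitespace-separated fields per line.
    first, second = line.split()
    # "." matches any single character; \g<0> inserts the matched text,
    # which is the job "&" does in awk's replacement string.
    return first + re.sub(r".", r" \g<0>", second)


print(space_out_with_regex("468 22221121110110110101"))
```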

speed comparison of some of the answers
$ perl -0777 -ne 'print $_ x 1000000' ip.txt > f1
$ du -h f1
169M f1
time given for two consecutive runs
$ time perl -lane 'push @F, split //, pop @F; print "@F"' f1 > t1
real 0m34.004s
real 0m33.729s
$ time perl -lane 'print join " ",$F[0],split //,$F[1]' f1 > t2
real 0m23.291s
real 0m23.935s
$ time LC_ALL=C awk '{gsub(/./," &",$2); print $1 $2}' f1 > t3
real 0m30.834s
real 0m30.723s
$ diff -s t1 t2
Files t1 and t2 are identical
$ diff -s t1 t3
Files t1 and t3 are identical

Another approach with bash
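while read -r a b; do
    printf '%s' "$a"
    while read -r -n1 c; do
        # the here-string appends a newline, which read returns as an empty string
        [ -n "$c" ] && printf ' %c' "$c"
    done <<< "$b"
    echo
done < lefile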

This might work for you (GNU sed):
sed 's/ /\n/;h;s/\B/ /g;H;g;s/\n.*\n/ /' file
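Replace the first space with a newline, copy the line to the hold space, insert a space at every non-word boundary (that is, between adjacent digits), append the changed line to the copy, then fetch the result and replace everything from the first newline to the last with a single space.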

How about coreutils:
paste -d '' \
<(cut -d' ' -f1 infile ) \
<(cut -d' ' -f2 infile | sed 's/./ &/g')
Output:
215 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
212 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
468 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
1200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
400 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
100 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1
200 2 2 2 2 1 1 2 1 1 1 0 1 1 0 1 1 0 1 0 1

Try
sed -i -e 's/\(.\)/\1 /g'
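That is, capture character by character, then replace the capture with itself, plus a space. Note that this spaces out every character on the line, including those in the first column, and that -i needs a file name.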

Related

Summarizing dataframe and sorting with inserting 0 if the target words doesn't exist

I have a tab-delimited data frame like this:
Sample 1
A 2
B 1
D 2
Sample 2
C 1
D 1
E 2
Sample 3
D 1
E 3
Sample 4
A 1
E 3
Sample 5
A 2
B 3
Sample 6
C 1
D 2
Sample 7
D 1
E 3
Sample 8
A 3
D 2
Sample 9
A 1
Sample 10
A 1
C 2
E 3
and I would like to transpose it and link the letters with their associated values. If a sample does not contain a letter, I want "0" inserted.
So the modified dataframe I desire is
Sample A B C D E
1 2 1 0 2 0
2 0 0 1 1 2
3 0 0 0 1 3
4 1 0 0 0 3
5 2 3 0 0 0
6 0 0 1 2 0
7 0 0 0 1 3
8 3 0 0 2 0
9 1 0 0 0 0
10 1 0 2 0 3
I tried to summarize the data frame and then transpose it. When I used
awk 'NR==1{print} NR>1{a[$1]=a[$1]" "$2}END{for (i in a){print i " " a[i]}}' TEST
the output data was
Sample 1
A 2 1 2 3 1 1
B 1 3
C 1 1 2
D 2 1 1 2 1 2
E 2 3 3 3 3
Sample 2 3 4 5 6 7 8 9 10
Sample 1 was isolated, and no placeholder was left where a sample did not include a letter.
I hope that makes sense.
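One possible way to build the desired table is sketched below in Python. It is not a fix for the awk attempt; it assumes each line holds two whitespace-separated fields and that a line whose first field is `Sample` starts a new sample. The name `pivot_samples` is made up for this sketch.

```python
def pivot_samples(lines):
    """Turn 'Sample N' blocks of 'letter value' lines into one row per sample."""
    samples = {}      # sample id -> {letter: value}; dicts keep insertion order
    current = None
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue                      # skip blank or malformed lines
        key, value = fields
        if key == "Sample":
            current = value               # a new sample block starts here
            samples[current] = {}
        elif current is not None:
            samples[current][key] = value
    letters = sorted({letter for counts in samples.values() for letter in counts})
    rows = [["Sample", *letters]]
    for sample_id, counts in samples.items():
        # Letters missing from a sample are filled with "0".
        rows.append([sample_id, *(counts.get(letter, "0") for letter in letters)])
    return rows


data = ["Sample 1", "A 2", "B 1", "D 2",
        "Sample 2", "C 1", "D 1", "E 2",
        "Sample 3", "D 1", "E 3"]
for row in pivot_samples(data):
    print("\t".join(row))
```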

Matrix Permutations with Constraint

I have the following matrix:
1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2
0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2
I'd like to randomly permute the columns, with the constraint that every four numbers in the second row should contain some form of
0 0 1 2
e.g. Columns 1:4, 5:8, 9:12, 13:16, 17:20, 21:24 in the example below each contain the numbers 0 0 1 2.
0 1 0 2 2 0 1 0 0 0 2 1 1 2 0 0 2 0 1 0 1 0 0 2
Every column in the permuted matrix should have a corresponding one in the first matrix. In other words, nothing should be altered within a column.
I can't seem to think of an intuitive solution to this - Is there another way of coming up with some form of the initial matrix that both satisfies the constraint and retains the integrity of the columns? Each column represents conditions in an experiment, which is why I'd like them to be balanced.
You can compute the permutations directly in the following manner: First, permute all columns with 0 in the second row among themselves, then all 1s among themselves, and finally all 2s among themselves. This ensures that, for example, any two 0 columns are equally likely to be the first two columns in the resulting permutation of A.
The second step is to permute all columns in blocks of 4: permute columns 1-4 randomly, permute columns 5-8 randomly, etc. Once you do this, you have a matrix that maintains the (0 0 1 2) pattern for every block of 4 columns, but each set of (0 0 1 2) is equally likely to be in any given block of 4, and the (0 0 1 2) are equally likely to be in any order.
A = [1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2
0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2];
%% Find the indices of the zeros and generate a random permutation with that size
zeroes = find(A(2,:)==0);
perm0 = zeroes(randperm(length(zeroes)));
%% Find the indices of the ones and generate a random permutation with that size
wons = find(A(2,:) == 1);
perm1 = wons(randperm(length(wons)));
%% NOTE: the spelling of `zeroes` and `wons` is to prevent overwriting
%% the MATLAB builtin functions `zeros` and `ones`
%% Find the indices of the twos and generate a random permutation with that size
twos = find(A(2,:) == 2);
perm2 = twos(randperm(length(twos)));
%% permute the zeros among themselves, the ones among themselves and the twos among themselves
A(:,zeroes) = A(:,perm0);
A(:,wons) = A(:,perm1);
A(:,twos) = A(:,perm2);
%% finally, permute each block of 4 columns, so that the (0 0 1 2) pattern is preserved, but each column still has an
%% equi-probable chance of being in any position
for i = 1:size(A,2)/4
    perm = randperm(4) + 4*i-4;
    A(:, 4*i-3:4*i) = A(:,perm);
end
Example result:
A =
Columns 1 through 15
1 1 2 2 2 2 1 1 2 2 1 2 2 1 2
0 0 2 1 0 2 0 1 0 2 1 0 1 2 0
0 1 2 2 2 0 1 1 1 1 2 0 0 2 0
Columns 16 through 24
2 1 1 1 1 1 2 2 1
0 2 0 0 1 0 0 1 2
1 1 2 2 0 0 2 1 0
I was able to generate 100000 constrained permutations of A in about 9.32 seconds running MATLAB 2016a, to give you an idea of how long this code takes. There are certainly ways to optimize the permutation selection so you don't have to make quite so many random draws, but I always prefer the simple, straightforward approach until it proves insufficient.
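For comparison, the same two-step idea (shuffle within each value group, then shuffle within each block of 4) can be sketched with only the Python standard library. This is an illustration, not a port of the MATLAB code; `constrained_shuffle` and its parameters are names invented here, and columns are represented as tuples.

```python
import random


def constrained_shuffle(columns, key_row=1, block=4, rng=random):
    """Shuffle columns so each block keeps the key-row values it started with."""
    cols = list(columns)
    # Step 1: group column positions by their key-row value and
    # shuffle the columns of each group among those positions.
    positions = {}
    for idx, col in enumerate(cols):
        positions.setdefault(col[key_row], []).append(idx)
    result = cols[:]
    for idxs in positions.values():
        shuffled = idxs[:]
        rng.shuffle(shuffled)
        for dst, src in zip(idxs, shuffled):
            result[dst] = cols[src]
    # Step 2: shuffle the columns inside every block of `block` columns.
    for start in range(0, len(result), block):
        chunk = result[start:start + block]
        rng.shuffle(chunk)
        result[start:start + block] = chunk
    return result


A = [
    [1] * 12 + [2] * 12,
    [0, 0, 1, 2] * 6,
    [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2] * 2,
]
columns = list(zip(*A))            # one tuple per column
permuted = constrained_shuffle(columns)
```

Because whole columns are moved, nothing changes inside a column, and every block of 4 still holds the values 0 0 1 2 in the key row.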
You could use a rejection method: keep trying random permutations, chosen equiprobably, until one satisfies the requirement. This guarantees that all valid permutations have the same probability of being picked.
A = [ 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2
0 0 0 0 1 1 1 1 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2 ]; % data matrix
required = [0 0 1 2]; % restriction
row = 2; % row to which the restriction applies
sorted_req = sort(required(:)); % sort required values
done = false; % initialize
while ~done
    result = A(:, randperm(size(A,2))); % random permutation of columns of A
    test = sort(reshape(result(row,:), numel(required), []), 1); % reshape row
    % into blocks, each block in a column; and sort each block
    done = all(all(bsxfun(@eq, test, sorted_req))); % test if valid
end
Here's an example result:
result =
2 1 1 1 1 2 1 2 1 2 2 1 2 2 1 2 2 2 1 1 1 2 1 2
2 0 0 1 2 1 0 0 0 1 0 2 2 0 1 0 1 2 0 0 2 0 1 0
2 1 2 2 1 2 2 0 1 1 1 2 1 1 0 0 0 0 0 0 0 2 1 2

How to find all combinations of multiple 2D arrays(matrix) , rotation allowed

I have three 2D arrays (matrices) of 0s and 1s.
For each array, I rotate it 4 times clockwise and 4 times anticlockwise, then flip it and repeat the rotations. For each orientation I repeat the same steps for the next array, and so on, to combine the arrays into a symmetric shape, a kind of Rubik's cube with 5 elements on each side. To join two arrays, a 1 in Array 1 must fit against a 0 in Array 2.
It has the following kind of structure:
These are my three arrays:
0 0 1 0 1
1 1 1 1 1
0 1 1 1 0
1 1 1 1 1
0 1 0 1 1
-------------
0 1 0 1 0
0 1 1 1 0
1 1 1 1 1
0 1 1 1 0
0 0 1 0 0
-------------
1 0 1 0 0
1 1 1 1 1
0 1 1 1 0
1 1 1 1 1
0 1 0 1 0
-------------
This problem evolved from an earlier question of mine, "How to solve 5 * 5 Cube in efficient easy way".
Assume my rotation methods are as follows:
rotateLeft()
rotateRight()
flipSide()
for (firstArray) {
    element = single.rotateLeft();
    for (secondArray) {
        element2 = single.rotateLeft();
        if (element.combine(element2)) {
            for (thirdArray) {
            }
        }
    }
}
For now I have fixed the number of arrays at 3, but how should I solve this problem correctly and efficiently?
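One observation that may reduce the search: a square array has at most 8 distinct orientations (4 rotations, each with or without a flip), and anticlockwise rotations are already among the clockwise ones. The Python sketch below enumerates them; the function names are invented for illustration, and the fitting rule between arrays (the `combine` step) is deliberately left out because the question does not fully specify it.

```python
from itertools import product


def rotate_clockwise(m):
    """Rotate a square matrix 90 degrees clockwise."""
    return [list(row) for row in zip(*m[::-1])]


def flip(m):
    """Mirror a matrix left to right."""
    return [row[::-1] for row in m]


def orientations(m):
    """Return the distinct orientations of m: rotations of m and of its mirror image."""
    seen = []
    for start in (m, flip(m)):
        current = [list(row) for row in start]
        for _ in range(4):
            if current not in seen:
                seen.append(current)
            current = rotate_clockwise(current)
    return seen


def candidate_triples(a, b, c):
    """Every way of choosing one orientation for each of three arrays."""
    return product(orientations(a), orientations(b), orientations(c))


first = [
    [0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 1, 1],
]
all_orientations = orientations(first)
```

Each triple from `candidate_triples` could then be passed to whatever edge-matching test the puzzle requires.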

Actionscript 3.0 Cube Crash like game

I'm trying to build game like http://games.yahoo.com/game/bricks-breaking in actionscript 3 (flash builder).
I am able to create an array of bricks (that are visible on game start), but I have no idea how to find a group of bricks in array.
Let's say we have an array like this:
1 2 2 1 3 3 1 1 1 1 1 1 1
1 2 1 1 1 3 1 1 1 1 1 1 1
1 2 1 1 1 3 1 1 1 1 1 1 3
1 1 2 1 1 3 3 3 1 1 1 1 3
1 1 1 2 1 3 1 3 3 1 1 1 3
1 1 1 3 3 3 1 3 3 1 1 1 3
1 1 1 1 1 1 1 3 3 1 1 1 1
When the user clicks any red brick (say red is 3 in the array), the array after removing that group of 3s will look like this:
1 2 2 0 0 0 0 0 0 1 1 1 1
1 2 1 1 0 0 1 0 0 1 1 1 1
1 2 1 1 1 0 1 0 0 1 1 1 3
1 1 2 1 1 0 1 0 1 1 1 1 3
1 1 1 1 1 0 1 1 1 1 1 1 3
1 1 1 2 1 0 1 1 1 1 1 1 3
1 1 1 1 1 1 1 1 1 1 1 1 1
Basically, I want to remove all the items that are in the same group and have the same color.
Any suggestions on how to do that?
Is there an algorithm I should use?
Thanks for any advice.
A simple way to remove elements is to use a recursive function. It's not the only way (or even a good one) but it should be enough for this kind of game. Basically something like this:
function breakBricks(x:int, y:int, color:int):void {
    if(bricks[y][x] != color) return;
    bricks[y][x] = 0;
    breakBricks(x + 1, y, color);
    breakBricks(x, y + 1, color);
    breakBricks(x - 1, y, color);
    breakBricks(x, y - 1, color);
}
Begin with the position that the user clicked and the colour at that position. If the colour matches, the function sets that entry to 0; if not, it leaves the element alone. It then recursively does the same for all neighbouring elements. This code is missing boundary checks, which you need to add.
In the next step you could iterate over each of the array's columns from bottom to top, keep a reference to the position of the first 0 element you find, and move any non-empty values you find after that to the lowest empty row position.
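Both steps, including the boundary checks, can be sketched in Python as follows. This is an illustrative version rather than ActionScript: it uses an explicit stack instead of recursion, treats 0 as "empty", and the names `break_bricks` and `collapse_columns` are made up for the example.

```python
def break_bricks(grid, x, y):
    """Clear the connected group of same-coloured bricks containing (x, y)."""
    color = grid[y][x]
    if color == 0:
        return 0                      # clicking an empty cell removes nothing
    stack = [(x, y)]
    removed = 0
    while stack:
        cx, cy = stack.pop()
        # Boundary check: ignore positions outside the grid.
        if not (0 <= cy < len(grid) and 0 <= cx < len(grid[cy])):
            continue
        if grid[cy][cx] != color:
            continue
        grid[cy][cx] = 0
        removed += 1
        stack.extend([(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)])
    return removed


def collapse_columns(grid):
    """Let the remaining bricks fall so the empty cells end up at the top."""
    rows = len(grid)
    for x in range(len(grid[0])):
        kept = [grid[y][x] for y in range(rows) if grid[y][x] != 0]
        column = [0] * (rows - len(kept)) + kept
        for y in range(rows):
            grid[y][x] = column[y]


bricks = [
    [1, 3, 1],
    [3, 3, 2],
    [1, 3, 3],
]
removed_count = break_bricks(bricks, 1, 0)   # click the 3 in the top row
collapse_columns(bricks)
```

The explicit stack avoids deep recursion on large groups; a recursive version with the same bounds test would behave the same way.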

Comparing adjacent elements in MATLAB

Does anyone know how I can compare the elements in an array with the adjacent elements?
For example, if I have an array:
0 0 0 1 1 1 1 0
0 1 1 1 1 1 1 0
0 1 0 1 1 1 1 0
0 1 1 1 1 1 0 0
0 0 0 0 1 1 1 1
1 1 1 1 1 1 1 1
Is there a way to cycle through each element and perform a logical test of whether the elements around it are equal to 1?
Oops, it looks like someone is doing a homework assignment. Game of life maybe?
There are many ways to do such a test. But learn to do it in a vectorized form. This involves understanding how MATLAB does indexing, and how the elements of a 2-D array are stored in memory. That will take some time to explain in detail, more than I want to do at this exact moment. I would definitely recommend you learn it though.
Until then, I'll just suggest that if you really are doing the game of life, then the best trick is to use conv2. Thus,
A =[0 0 0 1 1 1 1 0
0 1 1 1 1 1 1 0
0 1 0 1 1 1 1 0
0 1 1 1 1 1 0 0
0 0 0 0 1 1 1 1
1 1 1 1 1 1 1 1];
B = conv2(A,[1 1 1;1 0 1;1 1 1],'same')
B =
1 2 4 4 5 5 3 2
2 2 5 6 8 8 5 3
3 4 8 7 8 7 4 2
2 2 4 5 7 7 6 3
3 5 6 7 7 7 6 3
1 2 2 3 4 5 5 3
Loren has recently posted about this very issue: http://blogs.mathworks.com/loren/2010/01/19/mathematical-recreations-tweetable-game-of-life/ - lots of interesting things can be learned by studying the code in that post and its comments
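To make explicit what the conv2 call computes, here is a plain-Python sketch that counts the 8 neighbours of every cell, treating cells outside the array as 0 (which is what the 'same' option does at the border). It is a loop-based illustration, not a vectorized solution, and `neighbour_counts` is a name chosen for this example.

```python
def neighbour_counts(grid):
    """For each cell, sum its up-to-8 neighbours; cells outside the grid count as 0."""
    rows, cols = len(grid), len(grid[0])
    counts = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue          # the centre weight of the kernel is 0
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        total += grid[nr][nc]
            counts[r][c] = total
    return counts


A = [
    [0, 0, 0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
B = neighbour_counts(A)
print(B[0])   # first row of the neighbour counts
```

A test such as "are all neighbours equal to 1" then becomes a comparison of each count with the number of neighbours that cell has.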
