Merge 2 text files with the same first column - arrays

I need to merge these 2 files
File1
1
1
2
2
2
3
4
4
4
File2
1 A 0.2 0.8 0.3
2 B 0.4 0.3 0.2
3 C 0.8 0.9 0.5
4 D 0.6 0.7 0.8
Output should be
1 A 0.2 0.8 0.3
1 A 0.2 0.8 0.3
2 B 0.4 0.3 0.2
2 B 0.4 0.3 0.2
2 B 0.4 0.3 0.2
3 C 0.8 0.9 0.5
4 D 0.6 0.7 0.8
4 D 0.6 0.7 0.8
4 D 0.6 0.7 0.8

If you are using Python with pandas, this is a left merge on the first column:
import pandas as pd
d1 = pd.read_csv('doc1.txt', sep=" ", header=None)
d2 = pd.read_csv('doc2.txt', sep=" ", header=None)
# left join on column 0 keeps every row of the first file
data = d1.merge(d2, on=[0], how='left')
print(data)
There will be NaN values in data wherever the second file has no row for an index in the first file. If you don't want that, change the type of join (for example how='inner').
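To illustrate the difference between join types, here is a small sketch that reads the sample data from in-memory strings instead of files. The extra key 5 in the first input is an assumption added only to show a non-matching row; it is not part of the question's data.

```python
import io
import pandas as pd

# Sample data from the question; key 5 is added here (an assumption)
# so that one row has no match in the second input.
FILE1 = "1\n1\n2\n2\n2\n3\n4\n4\n4\n5\n"
FILE2 = (
    "1 A 0.2 0.8 0.3\n"
    "2 B 0.4 0.3 0.2\n"
    "3 C 0.8 0.9 0.5\n"
    "4 D 0.6 0.7 0.8\n"
)

d1 = pd.read_csv(io.StringIO(FILE1), sep=" ", header=None)
d2 = pd.read_csv(io.StringIO(FILE2), sep=" ", header=None)

# Left join: every row of d1 is kept; key 5 gets NaN in the other columns.
left = d1.merge(d2, on=0, how="left")

# Inner join: rows without a match in d2 are dropped.
inner = d1.merge(d2, on=0, how="inner")

print(left)
print(inner)
```

With the inner join the unmatched key disappears, so the result contains only the nine rows shown in the expected output.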

Related

Alternative to multiple padarray calls to get a perimeter mask for image

I have an array of doubles img which I multiply by a mask, mask.*img, where the mask has values of 1 in the middle but goes linearly to 0 at the borders, e.g. for a 5x5 mask it would be something like
0.1 0.1 0.1 0.1 0.1
0.1 0.5 0.5 0.5 0.1
0.1 0.5 1 0.5 0.1
0.1 0.5 0.5 0.5 0.1
0.1 0.1 0.1 0.1 0.1
My idea for this currently is to create the center using x = ones(M)
and then create a sequence of decreasing values y = [0.9 0.5 0.3 0.1]
and then do
for k = 1:numel(y)
x = padarray(x, [1 1], y(k));
end
which will add the values of y as a perimeter around x multiple times, one at a time. Is there a more clever way to create this kind of mask that tapers off at the perimeter?
One way to get a similar mask is to take the element-wise minimum of a taper vector and its transpose (this relies on implicit expansion, available from MATLAB R2016b). Here the vector Taper is the centre row of the 5 by 5 matrix, and each row of the result is the minimum of Taper and the corresponding element of its transpose, Taper.'.
Broken down into steps:
Row 1: min([0.1 0.5 1 0.5 0.1],[0.1]); → [0.1 0.1 0.1 0.1 0.1]
Row 2: min([0.1 0.5 1 0.5 0.1],[0.5]); → [0.1 0.5 0.5 0.5 0.1]
Row 3: min([0.1 0.5 1 0.5 0.1],[1]); → [0.1 0.5 1 0.5 0.1]
Row 4: min([0.1 0.5 1 0.5 0.1],[0.5]); → [0.1 0.5 0.5 0.5 0.1]
Row 5: min([0.1 0.5 1 0.5 0.1],[0.1]); → [0.1 0.1 0.1 0.1 0.1]
Taper = [0.1 0.5 1 0.5 0.1];
Result = min(Taper, Taper.');
Result
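For comparison, the same min-of-taper idea can be sketched in Python with NumPy. The helper name taper_mask and its parameters are hypothetical; it assumes the ramp values are listed from the innermost ring to the outermost, as in the padarray loop from the question.

```python
import numpy as np

def taper_mask(m, ramp):
    """Build a square mask with an m-by-m centre of ones.

    ramp lists the perimeter values from the innermost ring outwards
    (hypothetical helper, not part of the original answer).
    """
    ramp = np.asarray(ramp, dtype=float)
    # 1-D profile: rising ramp, flat centre, falling ramp.
    taper = np.concatenate([ramp[::-1], np.ones(m), ramp])
    # Element-wise minimum of every pair (taper[i], taper[j]).
    return np.minimum.outer(taper, taper)

mask = taper_mask(1, [0.5, 0.1])
print(mask)
```

Calling taper_mask(1, [0.5, 0.1]) reproduces the 5 by 5 example; a larger centre or a longer ramp only changes the arguments.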

How to set this array based on 2 other arrays in Amibroker?

I have these 2 arrays signal_arr and value_arr in Amibroker.
From these 2 arrays, I want to output an array output_arr such that when signal_arr is 1, it will follow the value of value_arr. When signal_arr is 0, output_arr will retain the value of value_arr when signal_arr was last 1.
This is best illustrated by an example.
signal_arr = [ 1 0 0 0 1 0 0 1 0 0 ]
value_arr = [0.5 0.6 0.4 0.2 0.8 0.7 0.6 0.2 0.3 0.4]
output_arr = [0.5 0.5 0.5 0.5 0.8 0.8 0.8 0.2 0.2 0.2]
Use ValueWhen.
output_arr = ValueWhen(signal_arr, value_arr);
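To make the intended behaviour concrete, here is a plain Python sketch of the "hold the last signalled value" logic from the example. The function name value_when is hypothetical and this is not AmiBroker's implementation; positions before the first signal are filled with None here, which is an assumption.

```python
def value_when(signal, values):
    """Carry forward the value seen at the most recent non-zero signal."""
    out = []
    last = None  # assumption: nothing to report before the first signal
    for s, v in zip(signal, values):
        if s:
            last = v
        out.append(last)
    return out

signal_arr = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
value_arr = [0.5, 0.6, 0.4, 0.2, 0.8, 0.7, 0.6, 0.2, 0.3, 0.4]
output_arr = value_when(signal_arr, value_arr)
print(output_arr)
```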

Comparing two columns and summing the values in Matlab

I have 2 columns like this:
0.0 1.2
0.0 2.3
0.0 1.5
0.1 1.0
0.1 1.2
0.1 1.4
0.1 1.7
0.4 1.1
0.4 1.3
0.4 1.5
In the 1st column, 0.0 is repeated 3 times. I want to sum corresponding elements
(1.2 + 2.3 + 1.5) in the 2nd column. Similarly, 0.1 is repeated 4 times in the 1st
column. I want to sum the corresponding elements (1.0 + 1.2 + 1.4 + 1.7) in the 2nd
column and so on.
I am trying like this
for i = 1:length(col1)
for j = 1:length(col2)
if col2(j) == col1(i)
% to do
end
end
end
This is a classical use of unique and accumarray:
x = [0.0 1.2
0.0 2.3
0.0 1.5
0.1 1.0
0.1 1.2
0.1 1.4
0.1 1.7
0.4 1.1
0.4 1.3
0.4 1.5]; % data
[~, ~, w] = unique(x(:,1)); % labels of unique elements
result = accumarray(w, x(:,2)); % sum using the above as grouping variable
You can also use the newer splitapply function instead of accumarray:
[~, ~, w] = unique(x(:,1)); % labels of unique elements
result = splitapply(@sum, x(:,2), w); % sum using the above as grouping variable
a=[0.0 1.2
0.0 2.3
0.0 1.5
0.1 1.0
0.1 1.2
0.1 1.4
0.1 1.7
0.4 1.1
0.4 1.3
0.4 1.5]
% Get unique col1 values, and indices
[uniq,~,ib]=unique(a(:,1));
% for each unique value in col1
for ii=1:length(uniq)
% sum all col2 values that correspond to the current index of the unique value
s(ii)=sum(a(ib==ii,2));
end
Gives:
s =
5.0000 5.3000 3.9000
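The same unique-then-accumulate pattern can be sketched in Python with NumPy, where np.unique with return_inverse plays the role of unique's third output and np.bincount with weights plays the role of accumarray. This is only an illustration of the technique, not a translation of either answer.

```python
import numpy as np

x = np.array([
    [0.0, 1.2],
    [0.0, 2.3],
    [0.0, 1.5],
    [0.1, 1.0],
    [0.1, 1.2],
    [0.1, 1.4],
    [0.1, 1.7],
    [0.4, 1.1],
    [0.4, 1.3],
    [0.4, 1.5],
])

# keys: sorted unique values of column 1
# labels: for each row, the index of its key in keys
keys, labels = np.unique(x[:, 0], return_inverse=True)

# Sum column 2 within each group of equal labels.
sums = np.bincount(labels, weights=x[:, 1])

print(keys)
print(sums)
```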

SQL Server : adding rows for each row?

I have a table in SQL Server like this:
Col1 Col2 Col3
----- ---- -----
1 1 1
0.5 0.5 2
0.3 0.1 3
What I would like to do is that for each value in Col 3, so 1,2,3, add a 4th column that contains the numbers 1-53 in sequence. So, something like:
Col1 Col2 Col3 Col 4
----- ---- ----- ------
1 1 1 1
1 1 1 2
1 1 1 3
And so forth.
How could I accomplish this in T-SQL / Microsoft SQL Server 2016?
Thanks!
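Are these the results you're trying to get? (dbo.tfn_Tally below is a user-defined tally function that is not shown here; it is assumed to return the numbers 1 through 53 in a column n.)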
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
DROP TABLE #TestData;
CREATE TABLE #TestData (
Col1 DECIMAL(9,1) NOT NULL,
Col2 DECIMAL(9,1) NOT NULL,
Col3 INT NOT NULL
);
INSERT #TestData (Col1, Col2, Col3) VALUES
(1, 1 ,1), (0.5,0.5,2), (0.3,0.1,3);
SELECT
td.Col1, td.Col2, td.Col3, Col4 = t.n
FROM
#TestData td
CROSS APPLY dbo.tfn_Tally(53, 1) t;
Results...
Col1 Col2 Col3 Col4
----- ----- ---- -----
1.0 1.0 1 1
0.5 0.5 2 1
0.3 0.1 3 1
1.0 1.0 1 2
0.5 0.5 2 2
0.3 0.1 3 2
1.0 1.0 1 3
0.5 0.5 2 3
0.3 0.1 3 3
1.0 1.0 1 4
0.5 0.5 2 4
0.3 0.1 3 4
1.0 1.0 1 5
0.5 0.5 2 5
0.3 0.1 3 5
1.0 1.0 1 6
0.5 0.5 2 6
0.3 0.1 3 6
1.0 1.0 1 7
0.5 0.5 2 7
0.3 0.1 3 7
1.0 1.0 1 8
0.5 0.5 2 8
0.3 0.1 3 8
1.0 1.0 1 9
0.5 0.5 2 9
0.3 0.1 3 9
1.0 1.0 1 10
0.5 0.5 2 10
0.3 0.1 3 10
1.0 1.0 1 11
0.5 0.5 2 11
0.3 0.1 3 11
1.0 1.0 1 12
0.5 0.5 2 12
0.3 0.1 3 12
1.0 1.0 1 13
0.5 0.5 2 13
0.3 0.1 3 13
1.0 1.0 1 14
0.5 0.5 2 14
0.3 0.1 3 14
1.0 1.0 1 15
0.5 0.5 2 15
0.3 0.1 3 15
1.0 1.0 1 16
0.5 0.5 2 16
0.3 0.1 3 16
1.0 1.0 1 17
0.5 0.5 2 17
0.3 0.1 3 17
1.0 1.0 1 18
0.5 0.5 2 18
0.3 0.1 3 18
1.0 1.0 1 19
0.5 0.5 2 19
0.3 0.1 3 19
1.0 1.0 1 20
0.5 0.5 2 20
0.3 0.1 3 20
1.0 1.0 1 21
0.5 0.5 2 21
0.3 0.1 3 21
1.0 1.0 1 22
0.5 0.5 2 22
0.3 0.1 3 22
1.0 1.0 1 23
0.5 0.5 2 23
0.3 0.1 3 23
1.0 1.0 1 24
0.5 0.5 2 24
0.3 0.1 3 24
1.0 1.0 1 25
0.5 0.5 2 25
0.3 0.1 3 25
1.0 1.0 1 26
0.5 0.5 2 26
0.3 0.1 3 26
1.0 1.0 1 27
0.5 0.5 2 27
0.3 0.1 3 27
1.0 1.0 1 28
0.5 0.5 2 28
0.3 0.1 3 28
1.0 1.0 1 29
0.5 0.5 2 29
0.3 0.1 3 29
1.0 1.0 1 30
0.5 0.5 2 30
0.3 0.1 3 30
1.0 1.0 1 31
0.5 0.5 2 31
0.3 0.1 3 31
1.0 1.0 1 32
0.5 0.5 2 32
0.3 0.1 3 32
1.0 1.0 1 33
0.5 0.5 2 33
0.3 0.1 3 33
1.0 1.0 1 34
0.5 0.5 2 34
0.3 0.1 3 34
1.0 1.0 1 35
0.5 0.5 2 35
0.3 0.1 3 35
1.0 1.0 1 36
0.5 0.5 2 36
0.3 0.1 3 36
1.0 1.0 1 37
0.5 0.5 2 37
0.3 0.1 3 37
1.0 1.0 1 38
0.5 0.5 2 38
0.3 0.1 3 38
1.0 1.0 1 39
0.5 0.5 2 39
0.3 0.1 3 39
1.0 1.0 1 40
0.5 0.5 2 40
0.3 0.1 3 40
1.0 1.0 1 41
0.5 0.5 2 41
0.3 0.1 3 41
1.0 1.0 1 42
0.5 0.5 2 42
0.3 0.1 3 42
1.0 1.0 1 43
0.5 0.5 2 43
0.3 0.1 3 43
1.0 1.0 1 44
0.5 0.5 2 44
0.3 0.1 3 44
1.0 1.0 1 45
0.5 0.5 2 45
0.3 0.1 3 45
1.0 1.0 1 46
0.5 0.5 2 46
0.3 0.1 3 46
1.0 1.0 1 47
0.5 0.5 2 47
0.3 0.1 3 47
1.0 1.0 1 48
0.5 0.5 2 48
0.3 0.1 3 48
1.0 1.0 1 49
0.5 0.5 2 49
0.3 0.1 3 49
1.0 1.0 1 50
0.5 0.5 2 50
0.3 0.1 3 50
1.0 1.0 1 51
0.5 0.5 2 51
0.3 0.1 3 51
1.0 1.0 1 52
0.5 0.5 2 52
0.3 0.1 3 52
1.0 1.0 1 53
0.5 0.5 2 53
0.3 0.1 3 53
You'll need to generate a table of numbers, for example with a recursive CTE:
WITH nums as(
SELECT 1 as num
UNION ALL
SELECT num + 1 FROM nums
WHERE num < 53 -- stop once 53 has been generated
)
SELECT yourtable.*, num as col4 FROM
Yourtable
CROSS JOIN
nums
You can use the code below. There are many ways to generate a sequence (you can store it in a temp table or use a CTE); this one reads it from master..spt_values:
CREATE TABLE temp
(
Col1 DECIMAL(10,1),
Col2 DECIMAL(10,1),
Col3 INT
)
INSERT INTO temp
VALUES
(1,1,1)
,(0.5,0.5,2)
,(0.3,0.1,3)
DECLARE @Start INT = 1
, @End INT = 53
SELECT
t.*
, seq.n AS Col4
FROM temp t
CROSS APPLY
(
SELECT DISTINCT n = number
FROM master..[spt_values]
WHERE number BETWEEN @Start AND @End
) seq
RESULT:
Col1 Col2 Col3 Col4
--------------------------------------- --------------------------------------- ----------- -----------
1.0 1.0 1 1
1.0 1.0 1 2
1.0 1.0 1 3
1.0 1.0 1 4
1.0 1.0 1 5
1.0 1.0 1 6
1.0 1.0 1 7
1.0 1.0 1 8
1.0 1.0 1 9
1.0 1.0 1 10
1.0 1.0 1 11
1.0 1.0 1 12
1.0 1.0 1 13
1.0 1.0 1 14
1.0 1.0 1 15
1.0 1.0 1 16
1.0 1.0 1 17
1.0 1.0 1 18
1.0 1.0 1 19
1.0 1.0 1 20
1.0 1.0 1 21
1.0 1.0 1 22
1.0 1.0 1 23
1.0 1.0 1 24
1.0 1.0 1 25
1.0 1.0 1 26
1.0 1.0 1 27
1.0 1.0 1 28
1.0 1.0 1 29
1.0 1.0 1 30
1.0 1.0 1 31
1.0 1.0 1 32
1.0 1.0 1 33
1.0 1.0 1 34
1.0 1.0 1 35
1.0 1.0 1 36
1.0 1.0 1 37
1.0 1.0 1 38
1.0 1.0 1 39
1.0 1.0 1 40
1.0 1.0 1 41
1.0 1.0 1 42
1.0 1.0 1 43
1.0 1.0 1 44
1.0 1.0 1 45
1.0 1.0 1 46
1.0 1.0 1 47
1.0 1.0 1 48
1.0 1.0 1 49
1.0 1.0 1 50
1.0 1.0 1 51
1.0 1.0 1 52
1.0 1.0 1 53
0.5 0.5 2 1
0.5 0.5 2 2
0.5 0.5 2 3
0.5 0.5 2 4
0.5 0.5 2 5
0.5 0.5 2 6
and so on...
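All of the answers above boil down to a cross join between the table and a sequence of the numbers 1 to 53. As a language-neutral illustration, the same idea can be sketched in Python with pandas; this assumes pandas 1.2 or later, where merge accepts how="cross".

```python
import pandas as pd

# The three rows from the question.
table = pd.DataFrame({
    "Col1": [1.0, 0.5, 0.3],
    "Col2": [1.0, 0.5, 0.1],
    "Col3": [1, 2, 3],
})

# The sequence 1..53 as its own one-column table.
nums = pd.DataFrame({"Col4": range(1, 54)})

# Cross join: every row of table paired with every number.
result = table.merge(nums, how="cross")

print(result.head())
print(len(result))
```

Each of the 3 input rows appears 53 times, giving 159 rows in total.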

merge / append files and re-number first column in unix

I have many (3 is just an example) text files in different directories (3 different names) like the following:
Directory: A, file name: run.txt format: txt tab delimited
; file one
10 0.2 0.5 0.3
20 0.1 0.6 0.8
30 0.2 0.1 0.1
40 0.1 0.5 0.3
Directory: B, file name: run.txt format: txt tab delimited
; file two
10 0.2 0.1 0.2
30 0.1 0.6 0.8
50 0.2 0.1 0.1
70 0.3 0.4 0.4
Directory: C, file name: run.txt format: txt tab delimited
; file three
10 0.3 0.3 0.3
20 0.3 0.6 0.8
30 0.1 0.1 0.1
40 0.2 0.2 0.3
I want to combine all three run.txt files into a single file and renumber the first column. The resulting new file will look like:
; file combined
10 0.2 0.5 0.3
20 0.1 0.6 0.8
30 0.2 0.1 0.1
40 0.1 0.5 0.3
50 0.2 0.1 0.2
70 0.1 0.6 0.8
90 0.2 0.1 0.1
110 0.3 0.4 0.4
120 0.3 0.3 0.3
130 0.3 0.6 0.8
140 0.1 0.1 0.1
150 0.2 0.2 0.3
This is what I have so far:
cat A/run.txt B/run.txt C/run.txt > combined.txt
(1) I do not know how to take care of renumbering the first column
(2) Also I do not know how to take care of the comment lines starting with ";"
Edit:
Let me be clear about the numbering scheme:
A/run.txt, B/run.txt and C/run.txt are actually parallel runs to be combined into one,
so each has stored samples with a run number. However, the gap can be uneven among the runs.
(1) for the first file A/run.txt (gap is 10: 20-10, 30-20)
10, 10+10, 20+10, 30+10
(2) the second file B/run.txt starts from 10 but has a gap of 20
(e.g. 30-10, 50-30, 70-50)
40 (from last line of the first file) + 10 (first in file two) = 50,
50 + 20 = 70, 70 + 20 = 90, 90 + 20 = 110
(3) file C/run.txt starts from 10 and the increment is 10
110 (last number in file 2) + 10 = 120, 120 + 10 = 130,
130 + 10 = 140, 140 + 10 = 150
You could use awk:
awk 'BEGIN{l=0;print "; file combined"}; {if($1!=";")print l,$2,$3,$4;l=l+10}' A/run.txt B/run.txt C/run.txt > combined.txt
EDIT
I made a guess about your numbering scheme (you still haven't provided a spec) and came up with:
awk 'BEGIN{line=0;last=0;print "; file combined"}; !/^;/{if($1<last){line=last+$1}else{line=line+$1-last;last=$1};print line,$2,$3,$4}' \
A/run.txt B/run.txt C/run.txt > combined.txt
Is it what you mean?
#!/usr/bin/awk -f
BEGIN {
OFS = "\t"
printf "%s\n", "; file combined"
}
! /^;/ {
if (FILENAME != prevfile) {
prevnum = $1
prevfile = FILENAME
interval = 10
c = 0
}
c++
if (c == 2) {
interval = $1 - prevnum
}
$1 = (i += interval)
print
}
To run it:
$ ./renumber {A,B,C}/run.txt
Given your sample input, it produces output that exactly matches your sample.
awk '{$1="";print NR"0",$0}' A/run.txt B/run.txt C/run.txt > combined.txt
This might work for you:
awk -F'[\t]' 'lastfile!=FILENAME{lastfile=FILENAME;i=l};{$1+=i;l=$1};1' A/run.txt B/run.txt C/run.txt > combined.txt
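The numbering scheme from the question's edit can also be sketched in Python. This is one reading of that scheme, not a port of any awk answer above: each data row advances the running number by its distance from the previous row in the same file, and the first row of a file advances it by its own value. The function name renumber is hypothetical, and the inputs are given as in-memory lists of lines rather than read from A/run.txt, B/run.txt and C/run.txt.

```python
def renumber(files):
    """files: list of lists of text lines, one inner list per run file."""
    rows = []
    last_out = 0  # last number written to the combined output
    for lines in files:
        prev_in = 0  # previous first-column value within this file
        for line in lines:
            stripped = line.strip()
            # Skip blank lines and ";" comment lines.
            if not stripped or stripped.startswith(";"):
                continue
            fields = stripped.split()
            cur = int(fields[0])
            last_out += cur - prev_in
            prev_in = cur
            rows.append("\t".join([str(last_out)] + fields[1:]))
    return rows

A = ["; file one", "10\t0.2\t0.5\t0.3", "20\t0.1\t0.6\t0.8",
     "30\t0.2\t0.1\t0.1", "40\t0.1\t0.5\t0.3"]
B = ["; file two", "10\t0.2\t0.1\t0.2", "30\t0.1\t0.6\t0.8",
     "50\t0.2\t0.1\t0.1", "70\t0.3\t0.4\t0.4"]
C = ["; file three", "10\t0.3\t0.3\t0.3", "20\t0.3\t0.6\t0.8",
     "30\t0.1\t0.1\t0.1", "40\t0.2\t0.2\t0.3"]

combined = ["; file combined"] + renumber([A, B, C])
print("\n".join(combined))
```

Because the step is taken from consecutive rows rather than from a single per-file interval, this version also copes with gaps that vary within one file.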
