Read from txt file to matrix after a specific expression - file

I want to use MATLAB to read a table of data from a txt file, starting after a specific expression and a number of unwanted lines. For example, AA.txt contains:
Information about students :
AAAA
BBBB
1 10 100
2 3 15
! ! ! a number of lines
10 6 9
The information I have is the expression 'Information about students', the number of lines to skip (2), and the size of the desired matrix (3 columns and 10 rows).

If I understand correctly, you want to skip the first 3 lines (treating them as headers) and then read the rest.
I would follow this procedure:
fid = fopen(filename,'r');
A = textscan(fid,'%f %f %f','HeaderLines',3); % columns are separated by whitespace
fclose(fid);
M = cell2mat(A); % join the three column cells into one matrix
I currently do not have access to MATLAB, but I do believe it will work.
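The answer above assumes the expression is on the first line of the file. When its position is not known in advance, the same idea can be written as: find the line containing the expression, skip the unwanted lines, then read the requested number of rows. The following is a minimal Python sketch of that logic, not MATLAB code; the function name `read_table_after` and its parameters are hypothetical, and the sample uses 3 rows instead of 10 to stay short.

```python
import io
from itertools import islice


def read_table_after(lines, marker, skip, n_rows, n_cols):
    """Return n_rows rows of n_cols floats that follow `marker` and `skip` extra lines."""
    it = iter(lines)
    # Advance until the line that contains the marker expression.
    for line in it:
        if marker in line:
            break
    else:
        raise ValueError("marker not found: %r" % marker)
    # Discard the unwanted lines that follow the marker.
    for _ in range(skip):
        next(it, None)
    table = []
    # Read at most n_rows lines and convert each field to a float.
    for line in islice(it, n_rows):
        fields = line.split()
        if len(fields) != n_cols:
            raise ValueError("expected %d columns, got %r" % (n_cols, line))
        table.append([float(field) for field in fields])
    if len(table) != n_rows:
        raise ValueError("expected %d rows, found %d" % (n_rows, len(table)))
    return table


# Sample data shaped like AA.txt, shortened to three data rows.
sample = io.StringIO(
    "Information about students :\n"
    "AAAA\n"
    "BBBB\n"
    "1 10 100\n"
    "2 3 15\n"
    "10 6 9\n"
)
matrix = read_table_after(sample, "Information about students", 2, 3, 3)
print(matrix)
```

Passing an open file object instead of the `StringIO` sample works the same way, since both yield lines when iterated.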

Related

How to read text file in dolphindb?

The dolphindb manual says it can read file like
fin = file("C:/DolphinDB/test.txt")
x=fin.readLine()
But readLine() will return a string of this row.
I have a text file like:
1 2 3 4 5 6 7 8 9
How can I get these nine digits one by one ?
After you read the line into x, split it on spaces and convert the pieces to integers:
split(x, ' ').int()

Writing my output into two columns in a text file .txt in C

I've looked around but couldn't find a satisfying solution. Basically, I wrote a function that calculates the probability distribution of the number of losses x in a portfolio of n credits, and I am trying to write the output to a text file in two columns, where the first column is X (the number of defaults) and the second column is P (the density function of each loss), something like this:
X P
1 0.005
2 0.003
3 0.005
4 0.005
5 0.005
etc.
I've looked around, and people suggested putting a minus sign in front of the width in my %d and %f when using fprintf, but no luck.
Here's a sample of my code and the output it gives me...
Code:
for(i=0;i<d+1;i++)
{
    Densite= gsl_ran_binomial_pdf(i,p,d);
    fprintf(pF,"%-5d %-20f .\n",i, Densite);
}
Output:
0 0.005921 .
1 0.031161 .
2 0.081182 .
3 0.139576 .
4 0.178143 .
5 0.180018 .
6 0.150015 .
7 0.106026 .
8 0.064871 .
9 0.034901 .
10 0.016716 .
How to remedy?
Thanks in advance! (complete noob that started coding in C like two days ago..)
Did you run the executable on Windows or Linux? On Windows, if the file was opened in binary mode, use \r\n for the new line; in text mode \n is already translated for you.

Data Entry in SAS using Loops

I just learned about the "do" loop today and would like to try using it for data entry in SAS. I have tried most examples online, but I still cannot figure it out.
My dataset is from an experiment with 6 treatments (1 to 6) using 2 sets of cues, Visual and Audio, with 3 treatments each. Lag is measured in seconds and takes the values 5, 10, and 15, once for each of the 2 sets.
Basically it looks like this:
Table
The entries I want are:
1. Obs_no, ranging from 1 to 18 (total of 18 observations, this allows me to easily delete outliers with an IF THEN)
2. Treatment type, which are Auditory and Visual.
3. Treatment number, 1 to 6, 3 sets.
4. Lag, 5, 10 or 15.
5. And the data itself
So far, my code makes 2 and 5 possible, it also makes the rest possible with an IF THEN statement and input statement, although I assume there's a way easier method:
data AVCue;
  do cue = 'Auditory','Visual';
    do i = 1 to 3;
      input AVCue @@;
      output;
    end;
  end;
datalines;
.204 .167 .202 .257 .283 .256
.170 .182 .198 .279 .235 .281
.181 .187 .236 .269 .260 .258
;
Lag and the rest were made possible using an IF THEN statement and the crude method of input:
data AVCue;
set AVCue;
IF i=1 THEN Lag=5;
IF i=2 THEN Lag=10;
IF i=3 THEN Lag=15;
input obs_no treatment;
cards;
1 1
2 2
3 3
4 4
5 5
6 6
7 1
8 2
9 3
10 4
11 5
12 6
13 1
14 2
15 3
16 4
17 5
18 6
;
proc print data=AVCue;
run;
The IF THEN should be fine, but the input statement here is just in my opinion counterproductive, and defeats the purpose of using loops, which is to me, to save time. If done this way, I might as well just put the data into excel and import it, or type everything out with ample copy and paste of the text in the
input obs_no treatment;
cards;
section.
My coding knowledge is basic, so sorry if this question sounds silly, I want to know:
1. How would I make a list of numbers using the "do" loops in SAS? I've made several attempts and all I get is a list containing the next number. I know why this happens, the loop counts to x and the value assigned would just be x. I just don't know how to get around that. Somehow this didn't happen in the datalines section, I guess SAS knows there's 18 numbers and the entry i is stored accordingly... or something?
2. How would I go about assigning in this case, the numbers 1 to 6 to each entry?
Thanks!
It is certainly much easier to read in the actual dataset instead of having to impute some of the variables based on the order the values have in the source data. You might be able to combine a SET statement and an INPUT statement in the same data step and get it to work, but it is probably NOT worth the effort. Just make two datasets and merge them.
Looking at the photograph you posted it looks like TREATMENT is not an independent variable. Instead it is just a label for the combination of CUE and LAG. To make it cycle from 1 to 6 just reset it back to 1 when it gets too large.
data AVCue;
  do cue = 'Auditory','Visual';
    do lag= 5, 10, 15 ;
      treatment+1;
      if treatment=7 then treatment=1;
      obsno+1;
      input AVCue @@;
      output;
    end;
  end;
datalines;
.204 .167 .202 .257 .283 .256
.170 .182 .198 .279 .235 .281
.181 .187 .236 .269 .260 .258
;
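For readers who do not use SAS, the counter logic of this data step can be sketched in Python. This is only an illustration of how the observation number keeps growing while the treatment number wraps from 6 back to 1; the function name `build_rows` and the dictionary layout are hypothetical and are not part of the SAS answer.

```python
CUES = ("Auditory", "Visual")
LAGS = (5, 10, 15)


def build_rows(values):
    """Attach obsno, treatment, cue and lag to each reading, in input order."""
    rows = []
    for index, value in enumerate(values):
        treatment = index % 6 + 1          # cycles 1, 2, ..., 6, 1, 2, ...
        cue = CUES[(treatment - 1) // 3]   # treatments 1-3 Auditory, 4-6 Visual
        lag = LAGS[(treatment - 1) % 3]    # 5, 10, 15 within each cue
        rows.append({
            "obsno": index + 1,            # never resets
            "treatment": treatment,
            "cue": cue,
            "lag": lag,
            "AVCue": value,
        })
    return rows


readings = [
    .204, .167, .202, .257, .283, .256,
    .170, .182, .198, .279, .235, .281,
    .181, .187, .236, .269, .260, .258,
]
rows = build_rows(readings)
for row in rows[:7]:
    print(row)
```

The seventh printed row has obsno 7 and treatment 1, which is the wrap-around that the `if treatment=7 then treatment=1;` line produces.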
You can get in trouble if you just let SAS guess at how you want to define your variables. For example if you change the order of the CUE values do cue = 'Visual','Auditory'; then SAS will make CUE with length $5 instead of $8. Add a LENGTH statement to define your variables before you use them.
length obsno 8 treatment 8 cue $8 lag 8 AVCue 8 ;
This will also let you control the order they are created in the dataset.
If you really did already have a SAS dataset and you wanted to add a variable like TREATMENT that cycles from 1 to 6 (or really any DO loop construct), then you could nest the SET statement inside the DO loop. Just remember to add the explicit OUTPUT statement.
data new ;
  do treatment=1 to 6 ;
    set old;
    output;
  end;
run;

Python: How to compare 2 file text?

I have one large text file A and one small text file B. I want to compare file B against file A to see what is unique to file B.
For example:
File A:
1
2
3
4
5
File B
2
3
6
7
==> output
6
7
What is the best solution for this? I searched some threads on the website, but I think my question is different because my file is large. Thanks
Below is my code, but it doesn't work:
with open('C:/unique.txt', 'wb') as out:
    for line in open ('C:/B.txt'):
        for line1 in open ( 'C:/A.txt' ):
            if line != line1:
                out.write(line)
I tried this and it worked for me. Hope this helps
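with open('C:/unique.txt','r') as text:
    with open('C:/unique2.txt','r') as text2:
        # Read the first file once; calling readlines() again inside the loop returns nothing.
        lines = text.readlines()
        for read2 in text2.readlines():
            # Compare whole lines against the list rather than testing for a substring.
            if read2 not in lines:
                print(read2)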
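Because file A is large, a set-based lookup may be worth considering: each membership test is then roughly constant time instead of a scan through every line of A. The following is a minimal sketch under the assumption that the distinct lines of A fit in memory; the function name `lines_only_in_b` is hypothetical, and the temporary files only stand in for A.txt and B.txt from the question.

```python
import os
import tempfile


def lines_only_in_b(path_a, path_b):
    """Return the lines of file B that never occur in file A, in B's order."""
    # Load file A once into a set so each lookup is fast.
    with open(path_a) as file_a:
        seen = {line.rstrip("\n") for line in file_a}
    unique = []
    # Stream file B and keep only the lines missing from A.
    with open(path_b) as file_b:
        for line in file_b:
            text = line.rstrip("\n")
            if text not in seen:
                unique.append(text)
    return unique


# Recreate the example from the question in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "A.txt")
    path_b = os.path.join(tmp, "B.txt")
    with open(path_a, "w") as handle:
        handle.write("1\n2\n3\n4\n5\n")
    with open(path_b, "w") as handle:
        handle.write("2\n3\n6\n7\n")
    result = lines_only_in_b(path_a, path_b)

print(result)  # ['6', '7']
```

Stripping the trailing newline before comparing avoids a false mismatch when the last line of one file has no newline at the end.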

Print words from the corresponding line numbers

Hello Everyone,
I have two files File1 and File2 which has the following data.
File1:
TOPIC:topic_0 30063951.0
2 19195200.0
1 7586580.0
3 2622580.0
TOPIC:topic_1 17201790.0
1 15428200.0
2 917930.0
10 670854.0
and so on. There are 15 topics, and each topic has its respective weights. The numbers in the first column, like 2, 1, 3, have corresponding words in file2. For example,
File 2 has:
1 i
2 new
3 percent
4 people
5 year
6 two
7 million
8 president
9 last
10 government
and so on.. There are about 10,470 lines of words. So, in short I should have the corresponding words in the first column of file1 instead of the line numbers. My output should be like:
TOPIC:topic_0 30063951.0
new 19195200.0
i 7586580.0
percent 2622580.0
TOPIC:topic_1 17201790.0
i 15428200.0
new 917930.0
government 670854.0
My Code:
import sys
d1 = {}
n = 1
with open("ap_vocab.txt") as in_file2:
    for line2 in in_file2:
        #print n, line2
        d1[n] = line2[:-1]
        n = n + 1
with open("ap_top_t15.txt") as in_file:
    for line1 in in_file:
        columns = line1.split(' ')
        firstwords = columns[0]
        #print firstwords[:-8]
        if firstwords[:-8] == 'TOPIC':
            print columns[0], columns[1]
        elif firstwords[:-8] != '\n':
            num = columns[0]
            print d1[n], columns[1]
This code runs when I type print d1[2], columns[1], but it prints the second word in file2 for every line. When I run the code above, it gives an error
KeyError: 10472
There are 10472 lines of words in file2. Please help me with what I should do to rectify this. Thanks in advance!
In your first for loop, n is incremented with each line until reaching a final value of 10472. You are only setting values for d1[n] up to 10471 however, as you have placed the increment after you set d1 for your given n, with these two lines:
d1[n] = line2[:-1]
n = n + 1
Then on the line
print d1[n], columns[1]
in your second for loop (for in_file), you are attempting to access d1[10472], which doesn't exist, because n still holds the value left over from the first loop. Note that d1 is a dictionary keyed by line number, so indexing it with an integer is fine; the problem is only which key you use. You read the number from the first column into num but never use it.
Since you want the word that belongs to that number, look up num instead of n, converting it to an integer first because split returns strings:
print d1[int(num)], columns[1]
Setting a value for d1[10472], or printing the last entry you currently have, would remove the error, but every line would then show the same word rather than the one matching its number.
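Put together, the lookup the question is after can be sketched as follows. This is a Python 3 illustration, not the asker's exact program: it assumes the vocabulary file holds one word per line (so the line number is the key) and that every non-TOPIC line starts with a number. The function names `load_vocab` and `replace_ids` are hypothetical, and lists of strings stand in for the two files.

```python
def load_vocab(lines):
    """Map each 1-based line number to the word found on that line."""
    return {number: line.strip() for number, line in enumerate(lines, start=1)}


def replace_ids(topic_lines, vocab):
    """Swap the leading word number on each non-TOPIC line for its word."""
    output = []
    for line in topic_lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if fields[0].startswith("TOPIC"):
            output.append(" ".join(fields))
        else:
            # The key comes from the line itself, not from a leftover counter.
            word = vocab[int(fields[0])]
            output.append("%s %s" % (word, fields[1]))
    return output


vocab = load_vocab([
    "i", "new", "percent", "people", "year",
    "two", "million", "president", "last", "government",
])
topics = [
    "TOPIC:topic_0 30063951.0",
    "2 19195200.0",
    "1 7586580.0",
    "3 2622580.0",
    "TOPIC:topic_1 17201790.0",
    "1 15428200.0",
    "2 917930.0",
    "10 670854.0",
]
converted = replace_ids(topics, vocab)
for line in converted:
    print(line)
```

With the real files, the lists can be replaced by the open file objects, since both functions only iterate over lines.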
