Matlab - Importing a .dat file into an array

I'm still fairly new to Matlab but for some reason the documentation hasn't been all that helpful with this.
I've got a .dat file that I want to turn into a _ row by 6 column array (the number of rows changes depending on the program that's generating the .dat file). What I need to do is get the dimensions of the image this array will be used to make from the 1st row 2nd column (x dimension) and 1st row 4th column (y dimension). When using the Import Data tool in Matlab, this works properly:
However I need the program to do it automatically. If the first line wasn't there, I'm pretty sure I could just use fscanf to put the data in the array, but the image dimensions are necessary.
Any idea what I need to use instead?

You can use textscan. The first call handles the first line (that is, it reads the dimensions) and the second call reads the remainder of the file. The second call uses repmat to build the format spec: %f (double), repeated nb_col times. The CollectOutput option concatenates all the columns into a single array. Note that textscan can read the entire file without being told the number of rows.
The code would be
fileID = fopen('yourfile.dat'); %declare a file id
C1 = textscan(fileID,'%s%f%s%f',1); %read the first line only (apply the format once)
nb_col = C1{4}; %get the number of columns (could be set by user too)
%read the remainder of the file
C2 = textscan(fileID, repmat('%f',1,nb_col), 'CollectOutput',1);
fclose(fileID); %close the connection
In the case where the number of columns is fixed, you can simply do
fileID = fopen('yourfile.dat');
C1 = textscan(fileID,'%s%f%s%f',1); %read the first line only
im_x = C1{2}; %get the x dimension
im_y = C1{4}; %get the y dimension
C2 = textscan(fileID,'%f%f%f%f%f%f%*[^\n]', 'CollectOutput',1);
fclose(fileID);
The format specification %*[^\n] skips the remainder of a line.
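For anyone who wants to try the two-call approach end to end, here is a minimal sketch. The header layout ("X 4 Y 3") and the demo values are assumptions made up for illustration, since the real file is not shown; the sketch writes a small temporary file first so that it runs on its own.

```matlab
% Hypothetical demo file: a header "X 4 Y 3", then rows of six numbers.
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
fprintf(fid, 'X %d Y %d\n', 4, 3);
fprintf(fid, '%g %g %g %g %g %g\n', reshape(1:12, 6, 2));  % two data rows
fclose(fid);

fid = fopen(fname, 'r');
hdr = textscan(fid, '%s %f %s %f', 1);   % N = 1: apply the format exactly once
im_x = hdr{2};                           % x dimension (2nd field of the header)
im_y = hdr{4};                           % y dimension (4th field of the header)
body = textscan(fid, '%f %f %f %f %f %f', 'CollectOutput', 1);
fclose(fid);
delete(fname);

data = body{1};                          % all six columns collected into one matrix
```

Without the trailing 1 in the first call, textscan would keep re-applying the header format to the numeric rows, so the second call would find nothing left to read.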

Related

Keeping values of cell arrays when exporting to Excel

I have a cell array. Some of its elements start with zeros and consist only of digits. When I export these to Excel (which I prefer), they are converted to numbers and the leading zeros are deleted.
Let's take an example to illustrate my problem. I have a cell array with 10 elements:
NodeID = {'0000006';
'0000011';
'000011R';
'000016R';
'000021R';
'B276_2';
'EB 7.55';
'EB2521';
'EllebaekOPlB1';
'EllebaekOplB10'};
The first two elements consist of zeros followed by the numbers 6 and 11, respectively, unlike the third element onward, which contain letters. So when I export NodeID to Excel (I use the writetable command, by the way), it returns this in a column:
6
11
000011R
000016R
000021R
B276_2
EB 7.55
EB2521
EllebaekOPlB1
EllebaekOplB10
Notice the removal of the zeros in the first two elements. Now, I know that Excel will keep all the content if a quote symbol ' is added at the start of the cell, e.g. '0000006 for the first element.
I have searched in many places for a solution to this. Is there a good way to stop this from happening, either by somehow adding an extra ' or by some other magical trick that I have not seen?
Thank you in advance!
One alternative (if your values are in a cell, as you say they are):
filename = 'NodeID.xlsx';
NodeID2 = cellfun(@(C) ['''',C], NodeID,'UniformOutput', false)
xlswrite(filename, NodeID2)
This gives you:
NodeID2 =
''0000006'
''0000011'
''000011R'
''000016R'
''000021R'
''B276_2'
''EB 7.55'
''EB2521'
''EllebaekOPlB1'
''EllebaekOplB10'
And an Excel file looking like this:
The cellfun line is equivalent to:
for ii = 1:numel(NodeID)
NodeID2{ii,1} = ['''', NodeID{ii}];
end
The part ['''', NodeID{ii}] inserts a single quotation mark in front of the string.
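As a side note, strcat should give the same result without cellfun or a loop, because it accepts a cell array of character vectors and prepends the prefix to each element. A small sketch on a shortened, made-up subset of the list:

```matlab
% Shortened demo list; the real NodeID has ten elements.
NodeID = {'0000006'; '000011R'; 'EB 7.55'};

viaCellfun = cellfun(@(c) ['''', c], NodeID, 'UniformOutput', false);
viaStrcat  = strcat('''', NodeID);        % prefix applied to every cell

same = isequal(viaCellfun, viaStrcat);    % true if both approaches agree
```

Both variants only build the quoted strings; the export step (xlswrite or writetable) stays the same.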

Best way to compare data from file to data in array in Matlab

I am having a bit of trouble with a specific file i/o in matlab, I am fairly new to it still so some things are still a bit of a mystery to me. The input file is structured as so:
File Name: Processed_kplr003942670-2010174085026_llc.fits.txt
File contents- 6 Header Lines then:
1, 2, 3
1, 2, 3
basically a matrix of about [1443,3] with varying values
now here is the matrix that I'm comparing it to:
[(0123456, 1, 2, 3), (0123456, 2, 3, 4), (etc..)]
Now here is my problem. First, I need to know how to read the file in a way that lets me compare the ID number (0123456) in the filename with the ID value in the matrix, so that I can then compare the other columns of both. I do not know how to achieve this in matlab. Furthermore, I need to be able to loop over every point in the matrix that matches up to the specific file, for example:
If I have 15 files ranging from 'Processed_0123456_1' to 'Processed_0123456_15' then I want to be able to read in the values contained in 'Processed_0123456_1' and compare them to ANY row in the matrix that corresponds to that ID (0123456). I don't know if maybe accumarray can be used for this, but as I said I'm not sure.
So the code must:
-Read in file
-Compare file to any point in the matrix with corresponding ID
-Do operations
-Loop over until full list of files in the directory are read in and processed, and output a matrix with the results.
Thanks for any help.
EDIT: Exact File Sample--
Kepler I.D.-----Channel
[1161345]--------[84]
-TTYPE1--------TTYPE8------------TTYPE4
['TIME']---['PDCSAP_FLUX']---['SAP_FLUX']
['BJD - 2454833']--['e-/s']--------['e-/s']
CROWDSAP --- 0.9791
630.195880143,277165.0,268233.0
630.216312946,277214.0,268270.0
630.23674585,277239.0,268293.0
630.257178554,277296.0,268355.0
630.277611357,277294.0,268364.0
630.29804426,277365.0,268441.0
630.318476962,277337.0,268419.0
630.338909764,277403.0,268481.0
630.359342667,277389.0,268463.0
630.379775369,277441.0,268508.0
630.40020817,277545.0,268604.0
There are more entries than what was just posted but they go for about 1000 lines so it is impractical to post that all here.
To get the file ID, use regular expressions, e.g.:
filename = 'Processed_0123456_1';
file_id_str = regexprep(filename, 'Processed_(\d+)_\d+', '$1');
file_num_str = regexprep(filename, 'Processed_\d+_(\d+)', '$1')
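For the real file name given in the question, a similar pattern can be used with regexp and its 'tokens' option. This is only a sketch, and it assumes the ID is the run of digits directly after 'kplr':

```matlab
% Assumption: the ID is the digit run that follows 'kplr' in the file name.
filename = 'Processed_kplr003942670-2010174085026_llc.fits.txt';

tok = regexp(filename, 'kplr(\d+)-', 'tokens', 'once');  % cell holding the capture
file_id_str = tok{1};                 % ID as text, leading zeros kept
file_id = str2double(file_id_str);    % numeric ID for comparing with a matrix column
```

Keep the string form if the leading zeros matter; use the numeric form when the comparison matrix stores the ID as a number.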
To read in the file contents, assuming that it's all comma-separated values without a header, use textscan, e.g.,
fid = fopen(filename)
C = textscan(fid, '%f,%f,%f') % Use as many %f specifiers as you have entries per line in the file
textscan also works on strings. So, for example, if your file contents was:
filestr = sprintf('1, 2, 3\n1, 3, 3')
Then running textscan on filestr works like this:
C = textscan(filestr, '%f,%f,%f')
C =
[2x1 double] [2x1 double] [2x1 double]
You can convert that to a matrix using cell2mat:
cell2mat(C)
ans =
1 2 3
1 3 3
You could then repeat this procedure for all files with the same ID and concatenate them into a single matrix, e.g.,
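files = dir('Processed_0123456_*'); % all files with the same ID
C_full = [];
for k = 1:numel(files)
fid = fopen(files(k).name);
C = cell2mat(textscan(fid, '%f,%f,%f')); % read one file as described above
fclose(fid);
C_full = [C_full; C]; % append this file's rows
end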
Then you can look for what you want in C_full.
Update based on updated OP Dec 12, 2013
Here's code to read the values from a single file. Wrap this all in the loop that I mentioned above to loop over all your files and read them all into a single matrix.
fid = fopen('/path/to/file');
% Skip over 12 header lines
for kk = 1:12
fgetl(fid);
end
% Read in values to a matrix
C = textscan(fid, '%f,%f,%f');
fclose(fid); % close the file once the values are read
C = cell2mat(C);
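As an alternative to the fgetl loop, textscan has a 'HeaderLines' option that skips a given number of lines before it starts parsing. The sketch below writes a small temporary file with six made-up header lines and two data rows so that it runs on its own; the header count and the values are assumptions, so adjust them to the real file.

```matlab
% Hypothetical demo file: six header lines, then comma-separated rows.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'header line %d\n', 1:6);
fprintf(fid, '%.1f,%.1f,%.1f\n', [630.1 277165 268233; 630.2 277214 268270]');
fclose(fid);

fid = fopen(fname, 'r');
C = textscan(fid, '%f %f %f', 'Delimiter', ',', 'HeaderLines', 6);
fclose(fid);
delete(fname);

M = cell2mat(C);    % one row per data line, one column per field
```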
I think your requirements are too complicated to write the whole script here. Nonetheless, I will try to give some pointers to help. Disclaimer: None of this is tested, just my best guess. Please expect syntax errors, etc. I hope you can figure them out :-)
1) You can use the textscan function with the delimiter option to get data from the lines of your file. Since your format varies as it does, we will probably want to use...
2) ... fgetl to read the first two lines into strings and process them separately using textscan. Such an operation might look like:
fid = fopen('file.txt','r'); %open for reading ('w' would erase the file)
tline1 = fgetl(fid);
tline2 = fgetl(fid);
fclose(fid);
C1 = textscan(tline1,'%s %d %s','delimiter','_'); %C1{2} will be the integer we want
C2 = textscan(tline2,'%s %s','delimiter',':'); %C2{2} will be the values we want, but they're still a string so...
mat = str2num(C2{2}{1}); %C2{2} is a cell array, so index into it before converting
3) Then, for the rest of the lines, we can use something like dlmread:
mat2 = dlmread('file.txt',',',2,0);
The 2,0 specifies the offset in 0-based rows,columns from the start of the file. You may need to look at something like vertcat to stitch mat and mat2 together.
4) The list of files in the directory can be found with the dir command. The filename is an attribute of the structure that's returned:
dirlist = dir;
for i = 1:length(dirlist)
filename = dirlist(i).name
%process your files
end
You can also pass matching strings to dir, like so:
dirlist = dir('*.txt');
which will find all of the files with extension .txt.
5) You can very easily loop through the comparison matrix:
sze = size(comparisonmatrix);
for i = 1:sze(1)
%compare comparisonmatrix(i,1) to C1{2}
%Perform whatever operations you need
end
Hope that helps!
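If the loop over the comparison matrix is only there to find the rows with a matching ID, logical indexing can pick them out in one step. A small sketch with a made-up comparison matrix (column 1 is assumed to hold the ID, the rest the values):

```matlab
% Hypothetical comparison matrix: column 1 is the ID, columns 2-4 are values.
comparisonmatrix = [123456 1 2 3;
                    123456 2 3 4;
                    654321 9 9 9];
file_id = 123456;                            % ID parsed from the file name

match = comparisonmatrix(:,1) == file_id;    % logical mask, one entry per row
rows  = comparisonmatrix(match, 2:end);      % value columns of the matching rows
```

Here rows holds every entry for that ID, ready to be compared against the values read from the file.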

Matlab - How to compare data in two arrays and output largest

I have a 60,000-by-2 array. The first column is data 1 and second column is data 2; both of equal length. I'm not sure how to properly write the syntax to compare data 1 to data 2, and if data 1 is larger than data 2 then write that to the third column. Or vice versa if data 2 is larger than data 1. I have begun constructing a for loop, but I'm having syntax issues comparing the columns.
No loops are needed. If you simply want to create a vector containing the largest element in each row of your 60,000-by-2 matrix you can use the max function:
A = rand(6e4,2); % Random demo data
B = max(A,[],2);
Or if you then want to put the result directly in a third column of A:
A(:,3) = max(A,[],2);
Read the documentation for max. You'll see that the 2 in the third argument applies the max function across each row of the input, A.
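If you also need to know which of the two columns held the larger value, max can return the column index as a second output. A small sketch on made-up numbers:

```matlab
A = [1 5;
     7 2;
     3 3];                           % tiny stand-in for the 60,000-by-2 data

[largest, whichCol] = max(A, [], 2); % row-wise maximum and the column it came from
A(:,3) = largest;                    % store the result in a third column
```

For a tie, as in the last row, max reports the first column in which the maximum occurs.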

Creating sub-arrays from large single array based on marker values

I need to create a 1-D array of 2-D arrays, so that a program can read each 2-D array separately.
I have a large array with 5 columns, with the second column storing 'marker' data. Depending on the marker value, I need to take the corresponding data from the remaining 4 columns and put them into a new array on its own.
I was thinking of having two for loops running, one to take the target data and write it to a cell in the 1-D array, and one to read the initial array line-by-line, looking for the markers.
I feel like this is a fairly simple issue, I'm just having trouble figuring out how to essentially cut and paste certain parts of an array and write them to a new one.
Thanks in advance.
No for loops needed, use your marker with logical indexing. For example, if your large array is A :
B=A(A(:,2)==marker,[1 3:5])
will select all rows where the marker was present, without the 2nd col. Then you can use reshape or the (:) operator to make it 1D, for example
B=B(:)
or, if you want a one-liner:
B=reshape(A(A(:,2)==marker,[1 3:5]),1,[]);
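Since the question asks for one 2-D array per marker, the same logical indexing can also fill a cell array, one cell per distinct marker value. This is only a sketch on made-up data; the variable names markers and groups are not from the question.

```matlab
% Hypothetical data: column 2 is the marker, the other four columns are data.
A = [10 1 11 12 13;
     20 2 21 22 23;
     30 1 31 32 33];

markers = unique(A(:,2));                 % distinct marker values, sorted
groups = cell(numel(markers), 1);         % one 2-D array per marker
for k = 1:numel(markers)
    groups{k} = A(A(:,2) == markers(k), [1 3:5]);   % rows for this marker, marker column dropped
end
```

Each groups{k} can then be handed to the downstream program separately.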
I am just answering my own question to show any potential future users the solution I came up with eventually.
%=======SPECIFY CSV INPUT FILE HERE========
MARKER_DATA=csvread('ESphnB2.csv'); % load data from csv file
%===================================
A=MARKER_DATA(:,2); % create 1D array for markers
A=A'; % make column into row
for i=1:length(A) % for every marker
if A(i) ~= 231 % if it is not 231 then
A(i)=0; % set value to zero
end
end
edgeArray = diff([0; (A(:) ~= 0); 0]); % +1 where a run of markers starts, -1 just after it ends
ind = [find(edgeArray > 0) find(edgeArray < 0)-1]; % first and last row index of each run
t=1; % initialize counter for trials
for j=1:size(ind,1) % for every marked run
B{t}=MARKER_DATA(ind(j,1):ind(j,2),[3:6]); % copy that run's rows (columns 3 to 6) into its own array
t=t+1; % create a new trial
end
gazeVectors=B'; % reorient and rename array of trials for saccade analysis
%======SPECIFY MAT OUTPUT FILE HERE===
save('Trial_Data_2.mat','gazeVectors'); % save array to mat file
%=====================================
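To see what the diff step does, here is the same idea on a tiny made-up marker vector. A direct comparison with 231 stands in for the zeroing loop, which should be equivalent for this purpose.

```matlab
A = [0 231 231 0 231];                  % hypothetical marker column
mask = (A(:) == 231);                   % true where the marker is present

edgeArray = diff([0; mask; 0]);         % +1 at the start of a run, -1 just after its end
ind = [find(edgeArray > 0), find(edgeArray < 0) - 1];   % [first last] index of each run
```

Each row of ind gives the first and last position of one uninterrupted block of markers, which is what the loop then uses to cut out the trials.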

Outputting the rows from an (ixj) array into individual (5xj/5) arrays in a text file

In a program I'm writing, I've created an allocated, final product array AFT(n,92). In my output I would like to present each row as its own table, 5 columns wide.
So in this case, it would be n individual tables of 19 rows X 5 columns with only 2 values on the final row. I attempted doing this as a do loop as shown in the code snip below, but the output comes out as just one long column. I'm not sure where to go from here.
DO i=1,n
WRITE(4,800) t(i), ' HHMM LDT' !Writes the table header using an array which holds the corresponding time value
800 FORMAT(I4, A9)
DO j=1,92
WRITE(4,900) AFT(i,j)
900 FORMAT(5ES23.14)
END DO
END DO
I believe this is happening because the write command is performed for each j individually due to the use of a loop, but my inexperience with FORTRAN is leading me to a blank when I try to come up with another approach.
Yes, each write statement produces one line of text output. If you want multiple items to be included in the same output record, you have to include them in the write statement. If you want to include portions of an array, you can use techniques such as:
do i=1, N
write (*, *) (array (i,j), j=1, 5)
end do
or
do i=1, N
write (*, *) array (i, 1:5)
end do
The first is using implied do loops, the second array sections.
