Getting back empty arrays/variables when loading a .txt file into MATLAB

I am trying to load a .txt file with data to Matlab to use for some calculations. However, when I run the code the variables/arrays come back empty or blank.
Below I have the code I am using.
%% importing the data
% Open file in the memory
fileID = fopen('rainfall.txt');
% Read the txt file with formats: Integer, Integer, Float
% Treat multiple delimiters, which is "space" in here, as one. Put the data
% in a variable called chunk.
chunk = textscan(fileID,'%d %d %f','Delimiter',' ',...
'MultipleDelimsAsOne',1);
% Close file from the memory.
fclose(fileID);
% date
dt = chunk{:,1};
% hour
hr = chunk{:,2};
% precip
r = chunk{:,3};
% remove extra variables from Matlab workspace
clear fileID ans
In the Workspace tab in Matlab, chunk is shown as an empty 1x3 cell. As a result, dt, hr, and r have no values either; each is listed as []. So my best guess is that something is going wrong when loading the data into Matlab.
Also, here is a small portion of the data I am working with. This is exactly how it is written in the .txt file as well.
STATION DATE HPCP
----------------- -------------- --------
COOP:132367 20040116 22:00 0.01
COOP:132367 20040116 23:00 0.01
COOP:132367 20040117 00:00 0.04
COOP:132367 20040117 01:00 0.02
COOP:132367 20040117 02:00 0.00
In the actual file I have a lot more data than what I have listed here, but this should give an idea of what the data looks like and how it's formatted.

From the textscan help page:
textscan attempts to match the data in the file to the conversion specifier in formatSpec. The textscan function reapplies formatSpec throughout the entire file and stops when it cannot match formatSpec to the data.
So the first problem is the title lines. You should discard them. For example, by manually reading 2 lines (using fgetl).
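Next, you should make sure that the format matches the data. You tried reading two integers and a float, but each row also starts with the station name, and the hour field (e.g. 22:00) contains a colon, so it cannot be read with %d.
I think the following should be OK:
fileID = fopen('rainfall.txt');
% skip the two title lines
l = fgetl(fileID);
l = fgetl(fileID);
% station (string), date (integer), time (string), precipitation (float)
chunk = textscan(fileID,'%s %d %s %f','Delimiter',' ',...
'MultipleDelimsAsOne',1);
fclose(fileID);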
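As an alternative to calling fgetl twice, textscan's 'HeaderLines' option can skip the title lines. The following is only a sketch: it writes two of the sample rows from the question to a temporary file (the file name is made up for illustration) and assumes the time column is kept as text rather than converted to a number.

```matlab
% Write a small copy of the sample data to a temporary file (hypothetical name).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'STATION DATE HPCP\n');
fprintf(fid, '----------------- -------------- --------\n');
fprintf(fid, 'COOP:132367 20040116 22:00 0.01\n');
fprintf(fid, 'COOP:132367 20040117 00:00 0.04\n');
fclose(fid);

% Skip the two title lines, then read: station, date, time, precipitation.
fid = fopen(fname);
chunk = textscan(fid, '%s %d %s %f', 'HeaderLines', 2, ...
    'Delimiter', ' ', 'MultipleDelimsAsOne', 1);
fclose(fid);
delete(fname);

station = chunk{1};   % cell array of strings, e.g. 'COOP:132367'
dt      = chunk{2};   % int32 dates such as 20040116
hr      = chunk{3};   % cell array of strings such as '22:00'
r       = chunk{4};   % precipitation as double
```

Keeping the time as a string avoids having to match the colon in the format; it can be converted later if needed.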

Related

GMT subtraction on MATLAB

I'm currently working on a small project on handling time differences in MATLAB. I have two input files, Time_in and Time_out. Both files contain arrays of times in a format such as 2315 (GMT, hours and minutes).
I've read both Time_in and Time_out into MATLAB, but I don't know how to perform the subtraction. Also, I want the corresponding answers to be in minutes only, e.g. 2 hrs 30 mins = 150 minutes.
This is one of several possible solutions:
First, you should convert your time strings to MATLAB serial date numbers. Once you've done this, you can do your calculation as you want:
% input time as string
time_in = '2115';
time_out = '2345';
% read the input time as datenum
dTime_in = datenum(time_in,'HHMM');
dTime_out = datenum(time_out,'HHMM');
% subtract to get the time difference
timeDiff = abs(dTime_out - dTime_in);
% Get the minutes of the time difference
timeout = timeDiff * 24 * 60;
Furthermore, to get correct differences for intervals that cross midnight, you should also include the date in your time values; with the time of day alone, a Time_out after midnight looks earlier than its Time_in.
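If no date information is available, one possible workaround is to wrap negative differences around 24 hours. This is a sketch under a stated assumption, not a general solution: it assumes time_out always falls less than 24 hours after time_in. The sample times are made up.

```matlab
% Sample times that cross midnight (made-up values)
time_in  = '2315';
time_out = '0145';

dIn  = datenum(time_in,  'HHMM');
dOut = datenum(time_out, 'HHMM');

% Difference in whole minutes; negative when time_out is past midnight
diffMin = round((dOut - dIn) * 24 * 60);

% Assumption: time_out is less than 24 h after time_in, so wrap around one day
elapsedMin = mod(diffMin, 24 * 60);   % 150 for this example
```

Unlike taking abs of the difference, the wrap-around gives 150 minutes here rather than 1290.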
If you need further information about the function datenum you should read the following part of the MATLAB documentation:
https://de.mathworks.com/help/matlab/ref/datenum.html
Any questions?
In a recent version of MATLAB, you could use textscan together with datetime and duration data types to do this.
% read the first file
fh1 = fopen('Time_in');
d1 = textscan(fh1, '%{HHmm}D');
fclose(fh1);
fh2 = fopen('Time_out');
d2 = textscan(fh2, '%{HHmm}D');
fclose(fh2);
Note the format specifier '%{HHmm}D' tells MATLAB to read the 4-digit string into a datetime array.
d1 and d2 are now cell arrays where the only element is a datetime vector. You can subtract these, and then use the minutes function to find the number of minutes.
result = minutes(d2{1} - d1{1})

SAS INPUT DATA WITH SPECIAL CHARACTERS

I'm trying to import a .dat file (comma delimited) into SAS University. However, one variable contains special characters (e.g. French accents). Most are replaced with �, and some observations are also read incorrectly.
Example of a problem:
An original observation in the data looks like this:
Crème Brûlée,105,280
Running the following command:
DATA BenAndJerrys;
INFILE '/folders/myfolders/HW3/BenAndJerrys.dat' DLM = ',' DSD MISSOVER;
INPUT flavor_name :$48. portion_size calories;
RUN;
It has this problem:
flavor_name=Cr�me Br�l�e,105 portion_size=280 calories=
As you can see, the value 105, which belongs to portion_size, is merged into flavor_name, and the value 280, which belongs to calories, is assigned to portion_size.
How can I solve this problem and get SAS to import the data with the special characters?
Try telling SAS what encoding to use when reading the file.
I copied and saved your sample line into a text file using Windows NOTEPAD editor.
%let path=C:\Downloads ;
data _null_;
infile "&path\test.txt" dsd encoding=wlatin1;
length x1-x3 $50 ;
input x1-x3;
put (_all_) (=);
run;
Result in the log.
x1=Crème Brûlée x2=105 x3=280
NOTE: 1 record was read from the infile "C:\Downloads\test.txt".
The minimum record length was 20.
The maximum record length was 20.

How to read file in matlab?

I have a txt file whose content is rows of numbers;
each row has 5 floating-point numbers, separated by commas.
example:
1.1 , 12 , 1.42562, 3.5 , 2.2
2.1 , 3.3 , 3 , 3.333, 6.75
How can I read the file content into matrix in matlab?
So far I have this:
fid = fopen('file.txt');
comma = char(',');
A = fscanf(fid, ['%f', comma]);
fclose(fid);
The problem is that it only gives me the first line, and when I
try to display the content of A I get this: 1.0e+004 * some number
Can someone help me please?
I guess I need to read the file in a loop, but I don't know how.
Edit: One more question: when I display A I get this:
A =
1.0e+004 *
4.8631 0 0 0 0.0001
4.8638 -0.0000 -0.0000 0.0004 0.0114
4.8647 -0.0000 -0.0000 0.0008 0.0109
I want the matrix to hold the same values as the file. How can I make the numbers regular floats rather than formatted like this? Or are the numbers in the matrix actually floats, and only the display looks like this?
MATLAB's built-in dlmread function would be a much easier solution for what you want to accomplish.
A = dlmread('filename.txt',',') % call dlmread and specify a comma as the delimiter
Try using the importdata function:
A = importdata('filename.txt');
It will solve your question.
EDIT
Alternative 1)
A = dlmread('test_so.txt',',');
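If the numbers are separated only by whitespace, the answer is surprisingly simple (note that '%f' on its own stops at the first comma, so this will not read a comma-separated file):
fid = fopen('depthMap.txt');
A = fscanf(fid, '%f');
fclose(fid);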
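On the follow-up question about the 1.0e+004 * prefix: that is only how MATLAB displays a matrix whose entries differ a lot in magnitude; the stored values are ordinary doubles. A small sketch with made-up values on a similar scale:

```matlab
% Made-up values on the same scale as the matrix in the question
A = [48631 0.25; 48638 114];

disp(A)          % the default short format may factor out a common scale
format long g    % change only how numbers are displayed
disp(A)          % same stored values, shown without the scale factor
format           % restore the default display format

% Printing explicitly gives full control over the output, one row per line
fprintf('%g %g\n', A.');
```

The format command affects display only, so calculations with A are the same either way.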

Dates to integer arrays in MATLAB

Before asking my question, here's a little background so you understand what I'm doing. I'm looking to analyze a very large data set (a little less than 2,000,000 rows). I've parsed the data set into Matlab and built a structure array from this data, giving names, dates, returns, etc for each asset i. Now, I would like to restrict my data set to being between two days, and Matlab doesn't seem to be particularly amenable to that kind of approach. One suggestion that was given to me was to take the dates, which are of the form MM/DD/YYYY and use a delimiter '/' to somehow build three integer arrays for my data structure (which I'd call stock(i).month, stock(i).day, and stock(i).year). However, nothing I'm doing seems to be working, and I'm very much stuck.
What I have been trying to do is something like the following:
%% Dates
fid = fopen('52c6d3831952b24a.csv');
C = textscan(fid, [repmat('%*s ',1,0),'%s %*[^\n]'], 'delimiter',',');
date = C{1}(2:end,1);
fclose(fid);
for i=1:numStock
locate = strcmp(uniquePermno{i},permno);
stock(i+1).date = date(locate);
end;
for i = 1:numStock
stock(i+1).date = char(stock(i+1).date);
D = textscan(stock(i+1).date, '%s %s %s', 'delimiter','/');
stock(i+1).month = D{1}(1:end);
stock(i+1).day = D{2}(1:end);
stock(i+1).year = D{3}(1:end);
end
I initially wanted to save them as integers (and was using %u instead), but I was getting a strange situation where most of my entries were just 0 and the non-zero ones were very large (obviously not what I expected). However, the above form returns the following error:
Error using textscan
Buffer overflow (bufsize = 4095) while reading string from
file (row 1 u, field 1 u). Use 'bufsize' option. See HELP TEXTSCAN.
44444444444444444444455555555555555555555566666666666666666666677777777777777777777778888888888888888888889999999999999999999990000000000000000000000011111111111111111112222222222222222222222111111111
Error in makeData_CRSP (line 87)
D = textscan(stock(i+1).date, '%s %s %s', 'delimiter','/');
So I'm honestly at a loss for how to approach this. What am I doing wrong? Seeing how I saved my dates vectors for my data structure, is this the best way to approach this problem?
You can use the datenum function to convert dates into numbers. The syntax is datenum(dateString, format). For example, if your dates are in the format YYYY MM DD then that would be
datenum('2012 12 04', 'yyyy mm dd')
Once you converted all your dates like that you can simply compare the resulting numbers using > and <:
>> datenum('2012 12 04', 'yyyy mm dd') > datenum('2012 12 03', 'yyyy mm dd')
ans =
1
>> datenum('2012 12 04', 'yyyy mm dd') > datenum('2012 12 05', 'yyyy mm dd')
ans =
0
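Applied to the MM/DD/YYYY dates from the question, the same idea can restrict a list of dates to a range. This is a sketch with made-up sample dates; the variable names are hypothetical and not taken from the asker's code.

```matlab
% Made-up sample dates in MM/DD/YYYY form
dates = {'01/15/2012'; '03/02/2012'; '07/30/2012'; '12/04/2012'};

% Convert all dates at once; datenum accepts a cell array of strings
d = datenum(dates, 'mm/dd/yyyy');

% Bounds of the range to keep (inclusive)
firstDay = datenum('02/01/2012', 'mm/dd/yyyy');
lastDay  = datenum('08/31/2012', 'mm/dd/yyyy');

% Logical index of the dates inside the range
inRange  = d >= firstDay & d <= lastDay;
selected = dates(inRange);

% If separate numeric year/month/day arrays are still wanted:
[yr, mo, dy] = datevec(d);
```

The logical index inRange can be applied to any other per-row field as well, which avoids splitting the date strings by hand.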

Reading a text file in MATLAB line by line

I have a CSV file. I want to read this file and do some pre-calculations on each row to decide, for example, whether that row is useful to me, and if so save it to a new CSV file.
Can someone give me an example?
In more detail, this is what my data looks like: (string,float,float); the numbers are coordinates.
ABC,51.9358183333333,4.183255
ABC,51.9353866666667,4.1841
ABC,51.9351716666667,4.184565
ABC,51.9343083333333,4.186425
ABC,51.9343083333333,4.186425
ABC,51.9340916666667,4.18688333333333
Basically, I want to save the rows that have a distance of 50 or more in a new file. The string field should also be copied.
thanks
You could actually use xlsread to accomplish this. After first placing your sample data above in a file 'input_file.csv', here is an example for how you can get the numeric values, text values, and the raw data in the file from the three outputs from xlsread:
>> [numData,textData,rawData] = xlsread('input_file.csv')
numData = % An array of the numeric values from the file
51.9358 4.1833
51.9354 4.1841
51.9352 4.1846
51.9343 4.1864
51.9343 4.1864
51.9341 4.1869
textData = % A cell array of strings for the text values from the file
'ABC'
'ABC'
'ABC'
'ABC'
'ABC'
'ABC'
rawData = % All the data from the file (numeric and text) in a cell array
'ABC' [51.9358] [4.1833]
'ABC' [51.9354] [4.1841]
'ABC' [51.9352] [4.1846]
'ABC' [51.9343] [4.1864]
'ABC' [51.9343] [4.1864]
'ABC' [51.9341] [4.1869]
You can then perform whatever processing you need to on the numeric data, then resave a subset of the rows of data to a new file using xlswrite. Here's an example:
index = sqrt(sum(numData.^2,2)) >= 50; % Find the rows where the point is
% at a distance of 50 or greater
% from the origin
xlswrite('output_file.csv',rawData(index,:)); % Write those rows to a new file
If you really want to process your file line by line, a solution might be to use fgetl:
1. Open the data file with fopen
2. Read the next line into a character array using fgetl
3. Retrieve the data you need using sscanf on the character array you just read
4. Perform any relevant test
5. Output what you want to another file
6. Go back to step 2 if you haven't reached the end of your file.
Unlike the previous answer, this is not very much in the style of Matlab but it might be more efficient on very large files.
Hope this will help.
You cannot read text strings with csvread.
Here is another solution:
fid1 = fopen('test.csv','r'); %# open csv file for reading
fid2 = fopen('new.csv','w'); %# open new csv file
while ~feof(fid1)
line = fgets(fid1); %# read line by line
A = sscanf(line,'%*[^,],%f,%f'); %# sscanf can read only numeric data :(
if A(2)<4.185 %# test the values
fprintf(fid2,'%s',line); %# write the line to the new file
end
end
fclose(fid1);
fclose(fid2);
Just read it in to MATLAB in one block
fid = fopen('file.csv');
data=textscan(fid,'%s %f %f','delimiter',',');
fclose(fid);
You can then process it using logical indexing
ind50 = data{2}>=50 ;
ind50 is then a logical index of the rows where column 2 is greater than or equal to 50. So
data{1}(ind50)
will list all the strings for the rows of interest.
Then just use fprintf to write out your data to the new file
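The fprintf step could look roughly like this. It is a sketch only: the data cell array below is a made-up stand-in for what textscan returns, and the output file name is hypothetical.

```matlab
% Made-up stand-in for the cell array returned by textscan(fid,'%s %f %f',...)
data = {{'ABC'; 'DEF'; 'GHI'}, [51.9; 49.2; 50], [4.18; 4.19; 4.2]};
ind50 = data{2} >= 50;

outName = [tempname '.csv'];   % hypothetical output file
fid = fopen(outName, 'w');
rows = find(ind50);
for k = rows.'                 % loop over the indices of the selected rows
    % %.15g keeps the coordinates' precision without padding zeros
    fprintf(fid, '%s,%.15g,%.15g\n', data{1}{k}, data{2}(k), data{3}(k));
end
fclose(fid);
```

A loop is used here because the string column is a cell array, so the three columns cannot be passed to a single fprintf call as one numeric matrix.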
here is the doc to read a csv : http://www.mathworks.com/access/helpdesk/help/techdoc/ref/csvread.html
and to write : http://www.mathworks.com/access/helpdesk/help/techdoc/ref/csvwrite.html
EDIT
An example that works :
file.csv :
1,50,4.1
2,49,4.2
3,30,4.1
4,71,4.9
5,51,4.5
6,61,4.1
the code :
File = csvread('file.csv');
[m,n] = size(File);
index = 1;
temp = 0;
for i = 1:m
    if (File(i,2)>=50)
        temp = temp + 1;
    end
end
Matrix = zeros(temp, 3);
for j = 1:m
    if (File(j,2)>=50)
        Matrix(index,1) = File(j,1);
        Matrix(index,2) = File(j,2);
        Matrix(index,3) = File(j,3);
        index = index + 1;
    end
end
csvwrite('outputFile.csv',Matrix)
and the output file result :
1,50,4.1
4,71,4.9
5,51,4.5
6,61,4.1
This probably isn't the best solution, but it works! We can read the CSV file, check the distance in each row, and save the matching rows in a new file.
Hope it will help!
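For comparison, the same row selection can be written without loops using logical indexing. This sketch uses the example values from file.csv above as a matrix literal instead of reading the file:

```matlab
% Same values as the example file.csv
File = [1 50 4.1; 2 49 4.2; 3 30 4.1; 4 71 4.9; 5 51 4.5; 6 61 4.1];

% Logical index of the rows whose second column is at least 50
keep = File(:,2) >= 50;

% Keep those rows with all of their columns
Matrix = File(keep, :);
disp(Matrix)
```

Matrix then holds rows 1, 4, 5 and 6, matching the output file shown above, and could be passed to csvwrite in the same way.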
