Concatenating a numeric matrix with a 1xn cell - arrays

When trying the following concatenation:
for i=1:1:length(Open)
data(i,1) = Open(i);
data(i,2) = Close(i);
data(i,3) = High(i);
data(i,4) = Low(i);
data(i,5) = Volume(i);
data(i,6) = Adj_Close(i);
data(i,7) = cell2mat(dates(1,i));
end
Where all matrices but dates contain double values, and dates is a cell array with dates in the format '2001-01-01'. Running the code above, I get the following error:
??? Subscripted assignment dimension mismatch.
Error in ==> Test_Trades_part2 at 81
data(i,7) = cell2mat(dates(1,i));
The code above is tied to a master code which takes data from Yahoo Finance and then puts it in my SQL database.

A convenient way to store dates in completely numeric format is with datenum:
>> data(i,7) = datenum('2001-01-01');
>> disp(data(i,:))
0 0 0 0 0 0 730852
Whether this is useful to you depends on what you intend to do with the SQL database. However, converting back to a string in MATLAB is straightforward with the datestr command:
>> datestr(730852,'yyyy-mm-dd')
ans =
2001-01-01
APPENDIX:
A serial date number represents a calendar date as the number of days that have passed since a fixed base date. In MATLAB, serial date number 1 is January 1, 0000.
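For what it's worth, the original assignment fails because cell2mat(dates(1,i)) returns a 1x10 char row such as '2001-01-01', which cannot fit into the single element data(i,7). Below is a minimal sketch of the datenum route applied to the whole array at once, assuming dates is a 1xn cell array of 'yyyy-mm-dd' strings. The sample values are made up for illustration and only two of the price columns are shown:

```matlab
% Hypothetical sample data standing in for the question's variables.
Open  = [10.0; 10.5; 11.0];
Close = [10.2; 10.4; 11.3];
dates = {'2001-01-01', '2001-01-02', '2001-01-03'};  % 1xn cell, as in the question

% datenum accepts a cell array of strings; (:) turns the 1xn cell into a column.
dateNums = datenum(dates(:), 'yyyy-mm-dd');

% Every column is now double, so ordinary concatenation works without a loop.
data = [Open, Close, dateNums];

% Convert a stored serial number back to text when needed.
firstDate = datestr(data(1, 3), 'yyyy-mm-dd');
```

The remaining columns (High, Low, Volume, Adj_Close) would be concatenated the same way, provided they are column vectors of the same length.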

Thank you all for the help!
I solved this issue using the following methodology (incorporating structs, should have thought of that .. stupid me):
data = [open, close_price, high, low, volume, closeadj];
s = struct('OpenPrice', data(:,1), 'ClosePrice', data(:,2), 'High', data(:,3), 'Low', data(:,4), 'Volume', data(:,5), 'Adj_Close', data(:,6), 'Dates', {dates});
This way, I enter all the values into the struct, circumventing the need to concatenate numeric and string matrices. Odd, though, that it is not allowed to mix such types in one matrix; I suppose that is the reason structs were created.

Related

Median-If With Month Criteria not working in LibreOffice

I have a simple spreadsheet like below, with columns:
A: Timestamp
B: A numerical result
C: Time duration to compute above result
I want to compute the median value for duration for year 2019 March in cell I4. I used the following formula for it:
{=MEDIAN(IF((YEAR(A:A) = G1) * (MONTH(A:A) = 3), C:C))}
I expect the value 48.5 to appear (the median of 41 and 56), but the cell shows a #VALUE! error when the formula is entered using Ctrl-Shift-Enter.
Can someone point out where the problem might be?

How to pull specific indices out of a character array in a loop?

I have an array that contains multiple dates in the format yyyymmdd, stored as a 50x1 double. I am trying to pull out the year, month, and day so I can use datenum to assign each date a serial number.
Indexing an individual date, converting it using str2num, then indexing and pulling the appropriate values works fine, but when I try to loop through the list of dates it doesn't work: only variations of the number 2 are returned.
dates = [20180910; 20180920; 20181012; 20181027; 20181103; 20181130; 20181225];
% version1
datesnums=num2str(dates); % dates is a list of dates stored as integers
for i=1:length(datesnums)
pullyy=str2num(datesnums(1:4));
pullmm=str2num(datesnums(5:6));
pulldd=str2num(datesnums(7:8));
end
As well as
%version2
datesnums=num2str(dates,'%d')
for i = 1:length(datesnums)
dd=datenum(str2num(datesnums(i(1:4))),str2num(datesnums(i(5:6))),
str2num(datesnums(i(7:8))));
end
I'm trying to generate a new array that is just the serial numbers of the input dates. In the examples shown, I am only getting single integer values, which I know is because the loop is incorrect, and I get errors that say "Index exceeds the number of array elements (1)." for version 1. When I've gotten it to successfully loop through everything, the outputs are just '2222', '22', '22' for every single date, which is incorrect. What am I doing wrong? Do I need to incorporate a cell array?
To get all the years, month, and days in a loop:
datesnums=num2str(dates);
for i=1:size(datesnums, 1)
pullyy(i) = str2num(datesnums(i,1:4));
pullmm(i) = str2num(datesnums(i,5:6));
pulldd(i) = str2num(datesnums(i,7:8));
end
Actually, you can do this without a loop:
pullyy = str2num(datesnums(:,1:4));
pullmm = str2num(datesnums(:,5:6));
pulldd = str2num(datesnums(:,7:8));
Explanation:
If for example the dates vector is a [6x1] array:
dates =[...
20190901
20170124
20191215
20130609
20141104
20190328];
Then datesnums=num2str(dates); creates a char matrix of size [6x8] where each row corresponds to one element in dates:
datesnums =
6×8 char array
'20190901'
'20170124'
'20191215'
'20130609'
'20141104'
'20190328'
So in the loop you need to refer to the row index for each date and the column indices to extract the years, months, and days.
The easiest solution I can think of is:
SN = datenum(num2str(dates),'yyyymmdd')
You only have to specify the date format which is 'yyyymmdd'
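Since dates is already numeric, the string round trip can also be skipped entirely. This is a sketch of an alternative, not what the answers above use: it splits each yyyymmdd integer with floor and mod and passes the parts to the three-argument form of datenum. The variable names yy, mm, dd and SN are chosen here for illustration.

```matlab
% Dates stored as yyyymmdd integers, as in the question.
dates = [20180910; 20180920; 20181012; 20181027; 20181103; 20181130; 20181225];

yy = floor(dates / 10000);            % leading four digits: year
mm = floor(mod(dates, 10000) / 100);  % middle two digits: month
dd = mod(dates, 100);                 % last two digits: day

% datenum(Y, M, D) works element-wise on vectors of equal size.
SN = datenum(yy, mm, dd);
```

This also answers the last part of the question: no cell array is needed, because every intermediate value stays a plain numeric column vector.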

MATLAB Extract all rows between two variables with a threshold

I have a cell array called BodyData in MATLAB that has around 139 columns and 3500 odd rows of skeletal tracking data.
I need to extract all rows between two string values (these are timestamps when an event happened) that I have
e.g.
BodyData{}=
Column 1 2 3
'10:15:15.332' 'BASE05' ...
...
'10:17:33.230' 'BASE05' ...
The two timestamps should match a value in the array but might also be within a few ms of those in the array e.g.
TimeStamp1 = '10:15:15.560'
TimeStamp2 = '10:17:33.233'
I have several questions!
How can I return an array for all the data between the two string values plus or minus a small threshold of say .100ms?
Also can I also add another condition to say that all str values in column2 must also be the same, otherwise ignore? For example, only return the timestamps between A and B only if 'BASE02'
Many thanks,
The best approach to the first part of your problem is probably to change from strings to numeric date values. In Matlab this can be done quite painlessly with datenum.
For the second part you can just use logical indexing: this is where you put a condition (i.e. that the second column is BASE02) within the indexing expression.
A self-contained example:
% some example data:
BodyData = {'10:15:15.332', 'BASE05', 'foo';...
'10:15:16.332', 'BASE02', 'bar';...
'10:15:17.332', 'BASE05', 'foo';...
'10:15:18.332', 'BASE02', 'foo';...
'10:15:19.332', 'BASE05', 'bar'};
% create column vector of numeric times, and define start/end times
dateValues = datenum(BodyData(:, 1), 'HH:MM:SS.FFF');
startTime = datenum('10:15:16.100', 'HH:MM:SS.FFF');
endTime = datenum('10:15:18.500', 'HH:MM:SS.FFF');
% select data in range, and where second column is 'BASE02'
BodyData(dateValues > startTime & dateValues < endTime & strcmp(BodyData(:, 2), 'BASE02'), :)
Returns:
ans =
'10:15:16.332' 'BASE02' 'bar'
'10:15:18.332' 'BASE02' 'foo'
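The example above does not include the plus-or-minus threshold from the question. One way to add it is to widen the interval by a tolerance expressed in days, since datenum values count days. The following is only a sketch: the 100 ms tolerance is an assumption about what the question meant, and the names tol, tStart, tEnd and selected are made up here.

```matlab
% Hypothetical example data in the same layout as BodyData above.
BodyData = {'10:15:15.332', 'BASE05', 'foo'; ...
            '10:15:16.332', 'BASE02', 'bar'; ...
            '10:15:17.332', 'BASE05', 'foo'; ...
            '10:15:18.332', 'BASE02', 'foo'; ...
            '10:15:19.332', 'BASE05', 'bar'};

fmt = 'HH:MM:SS.FFF';
tol = 0.100 / 86400;   % assumed tolerance of 100 ms, converted to days

t      = datenum(BodyData(:, 1), fmt);
tStart = datenum('10:15:16.400', fmt);  % 68 ms after the second row
tEnd   = datenum('10:15:18.300', fmt);  % 32 ms before the fourth row

% Widen the interval by the tolerance on both sides, then apply the
% column-2 condition in the same logical index.
inRange  = t >= tStart - tol & t <= tEnd + tol;
isBase02 = strcmp(BodyData(:, 2), 'BASE02');
selected = BodyData(inRange & isBase02, :);
```

With these values, the second and fourth rows are selected even though neither lies strictly between the two timestamps; without the tolerance both would be dropped.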
References: datenum manual page, matlab help page on logical indexing.

Different results from STRREAD() for reading strings

I have a date cell array which is read from a csv file. The format is below:
date =
'2008.12.01'
'2008.12.02'
'2008.12.03'
'2008.12.04'
'2008.12.05'
... ...
And I want to:
turn the cell array into a string array,
use strread() to read its "yyyy", "mm" and "dd" values into three double arrays [year,mm,dd],
use datenummx() to turn [year,mm,dd] into a date serial number.
After I use
date = char(date);
the date array become like this:
date =
2008.12.01
2008.12.02
2008.12.03
2008.12.04
2008.12.05
... ...
which I think is the result I want...
But after I use strread(), it gives me an odd result.
[year,month,day]=strread(date,'%d%d%d','delimiter','.');
year =
-1
0
0
0
0
... ...
BUT if I use the code below, the strread() can give me the right answer:
s = sprintf('2008.12.01')
s =
2008.12.01
[year,month,day]=strread(s,'%d%d%d','delimiter','.')
year =
2008
month =
12
day =
1
I also checked in MATLAB that both "date" and "s" are char arrays (by using the function 'ischar' and simply displaying both)...
But why does strread() give different results?
Can anyone answer?
By the way, I use MATLAB v6.5 (for my own reasons, so please don't comment by asking "why not use a higher version")....
Your problem is this line:
date = char(date);
It does not create an array of strings; there is no array of strings in MATLAB. It creates a 2-D array of chars with one date per row, which strread does not parse row by row. As you already noticed, your strread line is fine if you input a single date, so input each date from your original cell array individually:
for idx=1:numel(date)
[year(idx),month(idx),day(idx)]=strread(date{idx},'%d%d%d','delimiter','.');
end
Preallocation of year, month and day improves the performance.
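A sketch of what the preallocated loop could look like. Two things here differ from the answer above and are assumptions for illustration: the cell array is named dateCell rather than date (to avoid shadowing MATLAB's built-in date function), and sscanf is swapped in for strread to parse each string.

```matlab
% Hypothetical sample of the cell array read from the CSV file.
dateCell = {'2008.12.01'; '2008.12.02'; '2008.12.03'};

n  = numel(dateCell);
yy = zeros(n, 1);   % preallocate the three outputs before the loop
mm = zeros(n, 1);
dd = zeros(n, 1);

for idx = 1:n
    % The literal dots in the format consume the separators;
    % the result is a 3x1 vector [year; month; day].
    parts   = sscanf(dateCell{idx}, '%d.%d.%d');
    yy(idx) = parts(1);
    mm(idx) = parts(2);
    dd(idx) = parts(3);
end

% Serial date numbers from the three numeric columns.
serial = datenum(yy, mm, dd);
```

The same preallocation applies unchanged if the strread call from the answer is kept inside the loop instead of sscanf.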

Dates to integer arrays in MATLAB

Before asking my question, here's a little background so you understand what I'm doing. I'm looking to analyze a very large data set (a little less than 2,000,000 rows). I've parsed the data set into Matlab and built a structure array from this data, giving names, dates, returns, etc for each asset i. Now, I would like to restrict my data set to being between two days, and Matlab doesn't seem to be particularly amenable to that kind of approach. One suggestion that was given to me was to take the dates, which are of the form MM/DD/YYYY and use a delimiter '/' to somehow build three integer arrays for my data structure (which I'd call stock(i).month, stock(i).day, and stock(i).year). However, nothing I'm doing seems to be working, and I'm very much stuck.
What I have been trying to do is something like the following:
%% Dates
fid = fopen('52c6d3831952b24a.csv');
C = textscan(fid, [repmat('%*s ',1,0),'%s %*[^\n]'], 'delimiter',',');
date = C{1}(2:end,1);
fclose(fid);
for i=1:numStock
locate = strcmp(uniquePermno{i},permno);
stock(i+1).date = date(locate);
end;
for i = 1:numStock
stock(i+1).date = char(stock(i+1).date);
D = textscan(stock(i+1).date, '%s %s %s', 'delimiter','/');
stock(i+1).month = D{1}(1:end);
stock(i+1).day = D{2}(1:end);
stock(i+1).year = D{3}(1:end);
end
I initially wanted to save them as integers (and was using %u instead), but I was getting a strange situation where most of my entries were just 0 and the non-zero ones were very large (obviously not what I expected). However, the above form returns the following error:
Error using textscan
Buffer overflow (bufsize = 4095) while reading string from
file (row 1 u, field 1 u). Use 'bufsize' option. See HELP TEXTSCAN.
44444444444444444444455555555555555555555566666666666666666666677777777777777777777778888888888888888888889999999999999999999990000000000000000000000011111111111111111112222222222222222222222111111111
Error in makeData_CRSP (line 87)
D = textscan(stock(i+1).date, '%s %s %s', 'delimiter','/');
So I'm honestly at a loss for how to approach this. What am I doing wrong? Seeing how I saved my dates vectors for my data structure, is this the best way to approach this problem?
You can use the datenum function to convert dates into numbers. The syntax is datenum(dateString, format). For example, if your dates are in the format YYYY MM DD then that would be
datenum('2012 12 04', 'yyyy mm dd')
Once you have converted all your dates like that you can simply compare the resulting numbers using > and <:
>> datenum('2012 12 04', 'yyyy mm dd') > datenum('2012 12 03', 'yyyy mm dd')
ans =
1
>> datenum('2012 12 04', 'yyyy mm dd') > datenum('2012 12 05', 'yyyy mm dd')
ans =
0
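Applied to the question's MM/DD/YYYY strings, restricting a data set to a date window could look like the sketch below. The sample dates, the returns column and the names firstDay, lastDay and keep are all made up for illustration; the same mask would index any other field of the same length.

```matlab
% Hypothetical dates in the question's MM/DD/YYYY layout, with made-up returns.
dateStrings = {'01/15/2010'; '03/02/2010'; '07/19/2010'; '11/30/2010'};
returns     = [0.01; -0.02; 0.03; 0.00];

% Convert every date string to a serial date number in one call.
dn = datenum(dateStrings, 'mm/dd/yyyy');

% Window boundaries, converted with the same format.
firstDay = datenum('02/01/2010', 'mm/dd/yyyy');
lastDay  = datenum('08/01/2010', 'mm/dd/yyyy');

% Logical mask: true for rows whose date falls inside the window.
keep = dn >= firstDay & dn <= lastDay;

selectedDates   = dateStrings(keep);
selectedReturns = returns(keep);
```

For the struct array in the question, the conversion would be done once per stock(i).date, after which the mask can be reused for every other field of that stock.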
