I am trying to get some specific data from a text file. I am getting the data, but the results come out individually, i.e. each result is not saved together with the others.
I am trying to extract a specific set of lines from a large text file: all the lines in which a given word occurs. After getting the lines, I want them together in an array or a table. I got the lines, but they are being extracted into individual columns.
queryline = 'mybeat';
fID = fopen('log1.txt');
result = [];
while ~feof(fID);
tline = fgetl(fID);
if ~isempty(strfind(tline, queryline))
formatspec = sprintf(tline, '%[yyyy-mm-ddTI4:hh:mm]D%C%C%f%f%f%f','\n',queryline, )
result =[sscanf(tline,formatspec)];
end
end
fclose(fID);
But my results come out one after the other; I need all the results in one cell array. This is an example of the results I am getting:
formatspec =
2017-10-02T23:48:51.93Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:48:57.58Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:03.24Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:08.90Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:14.56Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:20.22Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:25.87Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:31.53Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:37.19Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:42.84Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:48.50Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
2017-10-02T23:49:54.15Z 'I|Beat:6548|mybeat:A1201:A'
formatspec =
I even tried using strjoin, so I tried to improve the code by adding this
one_str = strjoin( formatspec, '\n' );
result = textscan( one_str, '%[yyyy-mm-ddTI4:hh:mm]D%C%C%f%f%f%f', 'CollectOutput',true );
% make sure that only results are included in the output
assert( strcmp( unique(result{1}), queryline ) ...
, 'Non-result rows included in result' )
out = result{2};
end
after the loop, but I am still getting an error:
Error using strjoin (line 52)
First input must be a 1xN cell array of strings.
Error in Untitled5 (line 19)
one_str = strjoin( formatspec, '\n' );
Please can someone help? I can attach the file if you want.
This is a sample of the log file:
2017-10-02T14:36:14.01Z 'D|Beat:6528|evtT:361178'
2017-10-02T14:36:14.03Z 'I|Beat:6553|mybeat:81301:P'
2017-10-02T14:36:14.03Z 'I|Beat:6555|MyNodesDump:1'
2017-10-02T14:36:14.03Z 'I|Beat:6555|1301'
2017-10-02T14:36:14.03Z 'I|Beat:6556|MyRtrsDump:0'
2017-10-02T14:36:14.03Z 'D|Beat:4490|nxtIdx:0'
2017-10-02T14:36:14.03Z 'D|Beat:6604|BFlg:0 SFC:0'
2017-10-02T14:36:14.03Z 'I|Beat:6666|ldr:0'
2017-10-02T14:36:14.08Z 'D|Beat:2106|B->'
2017-10-02T14:42:18.70Z 'I|Beat:6553|mybeat:81301:P'
2017-10-02T14:42:18.70Z 'I|Beat:6555|MyNodesDump:3'
2017-10-02T14:42:18.70Z 'I|Beat:6555|1201'
2017-10-02T14:42:18.70Z 'I|Beat:6555|1301'
2017-10-02T14:42:18.70Z 'I|Beat:6555|1302'
2017-10-02T14:42:18.70Z 'I|Beat:6556|MyRtrsDump:5'
2017-10-02T14:42:18.70Z 'I|Beat:6556|b:21103 r:1302 p:1401'
2017-10-02T14:42:18.70Z 'I|Beat:6556|b:61202 r:1301 p:1203'
2017-10-02T14:42:18.70Z 'I|Beat:6556|b:91402 r:1301 p:1402'
2017-10-02T14:42:18.70Z 'I|Beat:6556|b:B1102 r:1201 p:1101'
2017-10-02T14:42:18.70Z 'I|Beat:6556|b:D1602 r:1302 p:1502'
2017-10-02T14:42:18.70Z 'D|Beat:4506|nxtIdx:0'
2017-10-02T14:42:18.70Z 'I|Beat:6666|ldr:21103'
2017-10-02T14:42:18.76Z 'D|Beat:1197|Rcv<-B, s:1201'
I am trying to extract, for example, every line that contains mybeat into an array, and to get the date, time and seconds plus the number beside mybeat into another table, like
2017-10-02T14:42:18.70Z 81301
in a separate array.
The problem I was having is that when I ran the script I wrote, the output came out individually instead of being collected into a single array, so I can't even access the results except by copying them from the command prompt.
First, I couldn't get textscan with that format spec to run on your example. It seems badly formed, so I don't know how you ever got any output. However, here is an option to try: it abandons textscan and uses regexp instead. Also note that I added a line to your example so that there is a letter in the otherNum section, per your comment below.
queryline = 'mybeat';
fID = fopen('log1.txt');
k = 1;
while ~feof(fID)
    tline = fgetl(fID);
    if ~isempty(strfind(tline, queryline))
        % Token 1: timestamp up to the trailing Z; token 2: value after 'mybeat:'
        temp = regexp(tline,['(?<date>^[0-9,-:T]*)Z.*' queryline ':(?<otherNum>[0-9A-Z]*):'] ,'tokens');
        outTime(k) = datetime(temp{1}{1},'InputFormat','yyyy-MM-dd''T''HH:mm:ss.SS');
        % outNum(k) = str2double(temp{1}{2});
        outNum{k} = temp{1}{2};
        matchedLines{k} = tline; %Store the raw line
        k = k+1;
    end
end
fclose(fID);
This gives results that look like this:
outTime =
1×3 datetime array
02-Oct-2017 14:36:14 02-Oct-2017 14:42:18 02-Oct-2017 14:48:14
outNum =
'81301' '81301' 'C81301'
>> char(matchedLines)
ans =
2017-10-02T14:36:14.03Z 'I|Beat:6553|mybeat:81301:P'
2017-10-02T14:42:18.70Z 'I|Beat:6553|mybeat:81301:P'
2017-10-02T14:48:14.03Z 'I|Beat:6553|mybeat:C81301:P'
This appends to outTime and outNum inside the loop, which will get slow if there are a lot of matches. So you have a couple of options: read the file all at once, or loop through once to count how many times your queryline appears and then initialize your outXXXX variables to the correct size ahead of time.
Here is an example of doing it without a while loop assuming you can read the entire file at once without having memory issues (30MB isn't that big).
queryline = 'mybeat';
fID = fopen('log1.txt');
C = textscan(fID,'%s','delimiter','\n');
fclose(fID);
C = C{1};
[temp matchedLines] = regexp(C,['(?<date>^[0-9,-:T]*)Z.*' queryline ':(?<otherNum>[0-9A-Z]*):.*'] ,'tokens','match');
matchedLines = [matchedLines{:}]';
temp = [temp{:}];
temp = reshape([temp{:}],2,[])';
outTime = datetime(temp(:,1),'InputFormat','yyyy-MM-dd''T''HH:mm:ss.SS');
otherNum = temp(:,2);
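For readers who process the same log outside MATLAB, here is a rough Python sketch of the same idea: keep only the lines containing the query word, and pull out the timestamp and the value that follows it. The function name `extract_matches` and the variable names are made up for illustration, and the sample lines are copied from the log excerpt in the question.

```python
import re
from datetime import datetime

# A few lines copied from the log sample in the question.
log_lines = [
    "2017-10-02T14:36:14.01Z 'D|Beat:6528|evtT:361178'",
    "2017-10-02T14:36:14.03Z 'I|Beat:6553|mybeat:81301:P'",
    "2017-10-02T14:36:14.03Z 'I|Beat:6555|MyNodesDump:1'",
    "2017-10-02T14:42:18.70Z 'I|Beat:6553|mybeat:81301:P'",
]


def extract_matches(lines, query="mybeat"):
    """Return (matched lines, timestamps, values) for lines containing query."""
    # Group 'date' is the timestamp before the trailing Z,
    # group 'num' is the value between 'query:' and the next colon.
    pattern = re.compile(
        r"^(?P<date>[0-9:T.\-]+)Z.*" + re.escape(query) + r":(?P<num>[0-9A-Z]*):"
    )
    matched, times, nums = [], [], []
    for line in lines:
        m = pattern.search(line)
        if m is None:
            continue  # line does not contain the query word
        matched.append(line)
        times.append(datetime.strptime(m.group("date"), "%Y-%m-%dT%H:%M:%S.%f"))
        nums.append(m.group("num"))
    return matched, times, nums


matched_lines, out_time, out_num = extract_matches(log_lines)
```

All three results are ordinary lists of the same length, so the i-th timestamp, value and raw line always belong together.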
I'm pretty close on this problem. What I have to do is filter out a cell array. The cell array can have a variety of items in it, but what I want to do is pull out the strings, using recursion. I am pretty close on this one. I just have an issue when the cells have spaces in them. This is what I should get:
Test Cases:
cA1 = {'This' {{{[1:5] true} {' '}} {'is '} false true} 'an example.'};
[filtered1] = stringFilter(cA1)
filtered1 => 'This is an example.'
cA2 = {{{{'I told '} 5:25 'her she'} {} [] [] ' knows'} '/take aim and reload'};
[filtered2] = stringFilter(cA2)
filtered2 => 'I told her she knows/take aim and reload'
Here is what I have:
%find the strings in the cArr and then concatenate them.
function [Str] = stringFilter(in)
Str = [];
for i = 1:length(in)
%The base case is a single cell
if length(in) == 1
Str = ischar(in{:,:});
%if the length>1 than go through each cell and find the strings.
else
str = stringFilter(in(1:end-1));
if ischar(in{i})
Str = [Str in{i}];
elseif iscell(in{i})
str1 = stringFilter(in{i}(1:end-1));
Str = [Str str1];
end
end
end
end
I tried to use 'ismember', but that didn't work. Any suggestions? My code outputs the following:
filtered1 => 'This an example.'
filtered2 => '/take aim and reload'
You can simplify your function considerably:
function [Str] = stringFilter(in)
Str = [];
for i = 1:length(in)
    if ischar(in{i})
        Str = [Str in{i}];
    elseif iscell(in{i})
        str1 = stringFilter(in{i});
        Str = [Str str1];
    end
end
end
Just loop through all the elements of the cell array and test whether each one is a string or a cell. If it is a cell, call the function again on that cell. Output:
>> [filtered1] = stringFilter(cA1)
filtered1 =
This is an example.
>> [filtered2] = stringFilter(cA2)
filtered2 =
I told her she knows/take aim and reload
Here is a different implementation
function str = stringFilter(in)
if ischar(in)
    str = in;
elseif iscell(in) && ~isempty(in)
    str = cell2mat(cellfun(@stringFilter, in(:)', 'uni', 0));
else
    str = '';
end
end
If it's a string, return it. If it is a cell, apply the same function to all of its elements and concatenate the results. Here I use in(:)' to make sure it is a row vector, and then cell2mat concatenates the resulting strings. If the type is anything else, return an empty string. We need to check whether the cell array is empty because cell2mat({}) is of type double.
The line
Str = ischar(in{:,:});
is the problem. It doesn't make any sense to me.
You're close to getting the answer, but you made a few small but significant mistakes.
You need to check for these things:
1. Loop over the cells of the input.
2. For each cell, check whether its content is itself a cell; if so, call stringFilter on the cell's VALUE.
3. If it is not a cell but is a character array, use its VALUE as it is.
4. Otherwise, if the cell's VALUE is not a character array, that cell contributes '' (an empty string) to the output.
I think you made a mistake by not taking advantage of the difference between in(1) and in{1}.
Anyway, here's my version of the function. It works.
function [out] = stringFilter(in)
out = [];
for idx = 1:numel(in)
    if iscell(in{idx})
        % Contents of the cell is another cell array
        tempOut = stringFilter(in{idx});
    elseif ischar(in{idx})
        % Contents are characters
        tempOut = in{idx};
    else
        % Contents are not characters
        tempOut = '';
    end
    % Concatenate the current output to the overall output
    out = [out, tempOut];
end
end
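The recursion described in the numbered steps above is not specific to MATLAB. As an illustration only, here is a Python sketch in which nested lists stand in for nested cell arrays; `string_filter` is a hypothetical name, and the two inputs are transcriptions of the test cases cA1 and cA2 from the question.

```python
def string_filter(item):
    """Concatenate every string found in an arbitrarily nested list."""
    if isinstance(item, str):
        # A string contributes itself.
        return item
    if isinstance(item, (list, tuple)):
        # A container contributes whatever its children contribute, in order.
        return "".join(string_filter(child) for child in item)
    # Numbers, booleans and anything else contribute nothing.
    return ""


# Nested lists standing in for the cell arrays cA1 and cA2.
c_a1 = ["This", [[[[1, 2, 3, 4, 5], True], [" "]], ["is "], False, True], "an example."]
c_a2 = [[[["I told "], list(range(5, 26)), "her she"], [], [], [], " knows"],
        "/take aim and reload"]

filtered1 = string_filter(c_a1)
filtered2 = string_filter(c_a2)
```

The key point is the same as in the MATLAB versions: recurse on the contents of each element, not on a slice of the container.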
It's supposed to read each line of the text file into a list, split it on commas, and sort the fields into several different lists. The error occurs at the while loop, stating that the index is out of range.
lineCounter = 0
j = 0
file = open("savefile.txt","r")
with open('savefile.txt', 'r') as f:
    string = [line.strip() for line in f]
for line in file:
    lineCounter += 1
while(j<lineCounter):
    tempList = string[j].split(',')
    firstName[j] = tempList[0]
    lastName[j] = tempList[1]
    postition[j] = tempList[2]
    department[j] = tempList[3]
    seniority[j] = tempList[4]
    vacationWeeks[j] = tempList[5]
    sickDays[j] = tempList[6]
    iD[j] = tempList[7]
    status[j] = tempList[8]
    j += 1
print firstName
file.close() # close the text file
Never mind, the problem was that the values needed to be appended to the lists rather than assigned by index.
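A minimal sketch of that fix, assuming the lists start out empty. The two rows are invented placeholder data standing in for the contents of savefile.txt, and only three of the nine fields are shown.

```python
# Hypothetical rows standing in for the lines of savefile.txt.
rows = [
    "Ada,Lovelace,Engineer,R&D,5,3,2,1001,active",
    "Alan,Turing,Analyst,Research,7,4,1,1002,active",
]

first_name, last_name, position = [], [], []

for row in rows:
    fields = row.strip().split(",")
    # first_name[j] = ... on an empty list raises IndexError,
    # because index j does not exist yet; append grows the list instead.
    first_name.append(fields[0])
    last_name.append(fields[1])
    position.append(fields[2])

print(first_name)
```

Iterating over the rows directly also removes the need for the separate line counter and the while loop.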
EDIT 3
Hi! I had problems with the matrix dimensions, but I've solved that. Now my problem is that I want to do the same operation on a large series of files in the same folder and write the output values for each file on a separate line of text.txt. It works for the first file, but the rest are not written to 'text'. Is there something wrong?
myPath = 'C:\EX\';
a= dir (fullfile(myPath,'*.DIM'));
fileNames = { a.name };
% Rename files
for k = 1:length(fileNames)
newFileName = [fileNames{k}(1:2) fileNames{k}(4:6) '.txt'];
movefile([myPath fileNames{k}], [myPath newFileName]);
end
filePattern=fullfile( myPath,'*.txt');
txtFiles= dir(filePattern);
for k = 1:length(txtFiles)
baseFileName=txtFiles(k).name;
fullFileName= fullfile(myPath,baseFileName);
fid=fopen(fullFileName, 'r');
for i = 1:18
m{i} = fgetl(fid);
end
result2 = m{18};
result2b= result2([12:19]);
fid=fopen(fullFileName, 'r');
for i = 1:30
m{i} = fgetl(fid);
end
result3 = m{30};
result3b= result3([12:19]);
fid=fopen(fullFileName, 'r');
for i = 1:31
m{i} = fgetl(fid);
end
result4 = m{31};
result4b= result4([12:20]);
fid=fopen(fullFileName, 'r');
for i = 1:19
m{i} = fgetl(fid);
end
result5 = m{19};
result5b= result5([12:20]);
text= {baseFileName, result2b, result3b, result4b, result5b};
final= [Fields'; text];
end
Thanks a lot in advance!
"Index exceeds dimensions" means exactly what it says: you are indexing past the end of an array.
Put a breakpoint on the line where it occurs and check the dimensions of result2. Assuming it is a vector, you will find that its length is less than 19.
My customer is sending TDM/TDX files captured in National Instruments Diadem, which I haven't got. I'm looking for a way to convert the files into .CSV, XLS or .MAT files for analysis in Matlab (without using Diadem or Diadem DLLs!)
The format consists of a well-structured XML file (.TDM) and a binary file (.TDX), with the .TDM defining how fields are packed as bits in the binary .TDX. I'd like to read the files (for use in Matlab and other environments). Does anyone have a general-purpose tool or conversion script in, for instance, Python or Perl (not using the NI DLLs), or directly in Matlab?
I've looked into buying the tool, but didn't like it for anything other than one-time conversion to a compatible file format.
Thanks!
I know this is a little late, but I have a simple library to read TDM/TDX files in Python. It works by parsing the TDM file to figure out the data type, then using NumPy.memmap to open the TDX file. It can then be used like a standard NumPy array. The code is pretty simple, so you could probably implement something similar in Matlab.
Here's the link: https://bitbucket.org/joshayers/tdm_loader
Hope that helps.
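To make the general idea concrete, here is a rough standard-library sketch. It is not the tdm_loader API, and it uses `struct` rather than NumPy's memmap. The XML fragment is a heavily simplified, hypothetical stand-in for a real .TDM header (no namespaces, channels or sequences), and the `byteOrder` attribute is an assumption; only the `byteOffset`, `length` and `valueType` attribute names and the `e...Usi` type names are taken from the MATLAB script further down this thread.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the block list of a .TDM header.
TDM_XML = """
<file byteOrder="littleEndian">
  <block id="inc0" byteOffset="0" length="3" valueType="eFloat64Usi"/>
  <block id="inc1" byteOffset="24" length="4" valueType="eInt16Usi"/>
</file>
"""

# valueType name -> struct format character.
STRUCT_CODES = {
    "eInt8Usi": "b", "eInt16Usi": "h", "eInt32Usi": "i", "eInt64Usi": "q",
    "eUInt8Usi": "B", "eUInt16Usi": "H", "eUInt32Usi": "I", "eUInt64Usi": "Q",
    "eFloat32Usi": "f", "eFloat64Usi": "d",
}


def read_blocks(tdm_xml, tdx_bytes):
    """Decode every <block> described in tdm_xml from the raw tdx_bytes."""
    root = ET.fromstring(tdm_xml.strip())
    # Assumption: byte order is declared on the root element.
    prefix = "<" if root.get("byteOrder") == "littleEndian" else ">"
    blocks = {}
    for block in root.iter("block"):
        count = int(block.get("length"))
        offset = int(block.get("byteOffset"))
        fmt = prefix + str(count) + STRUCT_CODES[block.get("valueType")]
        blocks[block.get("id")] = list(struct.unpack_from(fmt, tdx_bytes, offset))
    return blocks


# Fake binary payload matching the header above: 3 doubles, then 4 int16 values.
tdx_bytes = struct.pack("<3d", 1.0, 2.5, -4.0) + struct.pack("<4h", 1, 2, 3, 4)
blocks = read_blocks(TDM_XML, tdx_bytes)
```

A real reader would additionally have to follow the references from channels to local columns to sequences to blocks, as the MATLAB script in this thread does.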
Maybe a little too late, but I think there is a simple way to get the data from TDM files: NI provides plug-ins for reading TDM files into Excel and OpenOffice Calc. Having the data in one of these programs you could use the CSV export. Search google for "tdm excel" or "tdm openoffice".
Hope this helps...
Gemue
The following script converts all channels into fields of a struct named 'variable'.
CurrDirectory = '...//'; % Path to current directory
fileNametdx = '.../utility/'; % Path to TDX file
%%
% Data type conversion
Dtype.eInt8Usi='int8';
Dtype.eInt16Usi='int16';
Dtype.eInt32Usi='int32';
Dtype.eInt64Usi='int64';
Dtype.eUInt8Usi='uint8';
Dtype.eUInt16Usi='uint16';
Dtype.eUInt32Usi='uint32';
Dtype.eUInt64Usi='uint64';
Dtype.eFloat32Usi='single';
Dtype.eFloat64Usi='double';
%% Read .tdx file Name
wb=waitbar(0,'Reading *.tdx Files');
fileNameTDM = strrep(fileNametdx,'.tdx','.TDM');
%% Read .TDM
tdm=xml2struct(fileNameTDM);
for i=1:numel(tdm.usi_colon_tdm.usi_colon_data.tdm_channel)
waitbar((1/numel(tdm.usi_colon_tdm.usi_colon_data.tdm_channel))*i,wb,['File ' fileNametdx ' conversion started']);
s1=strsplit(string(tdm.usi_colon_tdm.usi_colon_data.tdm_channel{1, i}.local_columns.Text),'"');
usi1=s1(2);
% loop over the local columns until the one matching usi1 is found
for j=1:numel(tdm.usi_colon_tdm.usi_colon_data.localcolumn)
usi2=string(tdm.usi_colon_tdm.usi_colon_data.localcolumn{1, j}.Attributes.id);
if usi1==usi2
%take new usi
s2=strsplit(string(tdm.usi_colon_tdm.usi_colon_data.localcolumn{1, j}.values.Text),'"');
new_usi1=s2(2);
w1=strsplit(string(tdm.usi_colon_tdm.usi_colon_data.tdm_channel{1, i}.datatype.Text),'_');
str_1=char(strcat('tdm.usi_colon_tdm.usi_colon_data.',lower(w1(2)),'_sequence'));
str_2=char(strcat('tdm.usi_colon_tdm.usi_colon_data.',lower(w1(2)),'_sequence{1, k}.Attributes.id'));
str_3=char(strcat('tdm.usi_colon_tdm.usi_colon_data.',lower(w1(2)),'_sequence{1, k}.values.Attributes.external'));
str_4=char(strcat('tdm.usi_colon_tdm.usi_colon_data.',lower(w1(2)),'_sequence{1, k}.values'));
for k=1:numel(eval(str_1))
new_usi2=string(eval(str_2));
if new_usi1==new_usi2
if isfield(eval(str_4), 'Attributes')
inc_value1=string(eval(str_3));
for m=1:numel(tdm.usi_colon_tdm.usi_colon_include.file.block)
inc_value2=string(tdm.usi_colon_tdm.usi_colon_include.file.block{1, m}.Attributes.id);
if inc_value1==inc_value2
% offset=round(str2num(tdm.usi_colon_tdm.usi_colon_include.file.block{1, m}.Attributes.byteOffset)/8);
% Distinct names: 'length' would shadow the built-in and 'm' is the loop index
nValues = round(str2num(tdm.usi_colon_tdm.usi_colon_include.file.block{1, m}.Attributes.length));
offset1=round(str2num(tdm.usi_colon_tdm.usi_colon_include.file.block{1, m}.Attributes.byteOffset));
value_type = tdm.usi_colon_tdm.usi_colon_include.file.block{1, m}.Attributes.valueType;
mmf = memmapfile(fullfile(CurrDirectory,fileNametdx),'Offset',offset1,'Format',{Dtype.(value_type) [nValues 1] 'dat'},'Writable',true,'Repeat',1);
dat=mmf.Data.dat ;
end
end
else
str_5=char(strcat('tdm.usi_colon_tdm.usi_colon_data.',lower(w1(2)),'_sequence{1, k}.values.',char(fieldnames(tdm.usi_colon_tdm.usi_colon_data.string_sequence{1, k}.values))));
dat=eval(str_5)';
end
name_variable = string(tdm.usi_colon_tdm.usi_colon_data.tdm_channel{1, i}.name.Text);
varname = genvarname(char(name_variable));
variable.(varname) = dat;
end
end
end
end
end
waitbar(1,wb,[fileNametdx ' conversion completed']);
pause(1)
close(wb)
% WARNING: this deletes the source .tdx and .TDM files after conversion
delete(fullfile(CurrDirectory,fileNametdx),fullfile(CurrDirectory,fileNameTDM));
%Output Variable is Struct
clearvars -except variable
This script requires the following XML parser:
function [ s ] = xml2struct( file )
%Convert xml file into a MATLAB structure
% [ s ] = xml2struct( file )
%
% A file containing:
% <XMLname attrib1="Some value">
% <Element>Some text</Element>
% <DifferentElement attrib2="2">Some more text</DifferentElement>
% <DifferentElement attrib3="2" attrib4="1">Even more text</DifferentElement>
% </XMLname>
%
% Will produce:
% s.XMLname.Attributes.attrib1 = "Some value";
% s.XMLname.Element.Text = "Some text";
% s.XMLname.DifferentElement{1}.Attributes.attrib2 = "2";
% s.XMLname.DifferentElement{1}.Text = "Some more text";
% s.XMLname.DifferentElement{2}.Attributes.attrib3 = "2";
% s.XMLname.DifferentElement{2}.Attributes.attrib4 = "1";
% s.XMLname.DifferentElement{2}.Text = "Even more text";
%
% Please note that the following characters are substituted
% '-' by '_dash_', ':' by '_colon_' and '.' by '_dot_'
%
% Written by W. Falkena, ASTI, TUDelft, 21-08-2010
% Attribute parsing speed increased by 40% by A. Wanner, 14-6-2011
% Added CDATA support by I. Smirnov, 20-3-2012
%
% Modified by X. Mo, University of Wisconsin, 12-5-2012
if (nargin < 1)
clc;
help xml2struct
return
end
if isa(file, 'org.apache.xerces.dom.DeferredDocumentImpl') || isa(file, 'org.apache.xerces.dom.DeferredElementImpl')
% input is a java xml object
xDoc = file;
else
%check for existence
if (exist(file,'file') == 0)
%Perhaps the xml extension was omitted from the file name. Add the
%extension and try again.
if (isempty(strfind(file,'.xml')))
file = [file '.xml'];
end
if (exist(file,'file') == 0)
error(['The file ' file ' could not be found']);
end
end
%read the xml file
xDoc = xmlread(file);
end
%parse xDoc into a MATLAB structure
s = parseChildNodes(xDoc);
end
% ----- Subfunction parseChildNodes -----
function [children,ptext,textflag] = parseChildNodes(theNode)
% Recurse over node children.
children = struct;
ptext = struct; textflag = 'Text';
if hasChildNodes(theNode)
childNodes = getChildNodes(theNode);
numChildNodes = getLength(childNodes);
for count = 1:numChildNodes
theChild = item(childNodes,count-1);
[text,name,attr,childs,textflag] = getNodeData(theChild);
if (~strcmp(name,'#text') && ~strcmp(name,'#comment') && ~strcmp(name,'#cdata_dash_section'))
%XML allows the same elements to be defined multiple times,
%put each in a different cell
if (isfield(children,name))
if (~iscell(children.(name)))
%put existing element into cell format
children.(name) = {children.(name)};
end
index = length(children.(name))+1;
%add new element
children.(name){index} = childs;
if(~isempty(fieldnames(text)))
children.(name){index} = text;
end
if(~isempty(attr))
children.(name){index}.('Attributes') = attr;
end
else
%add previously unknown (new) element to the structure
children.(name) = childs;
if(~isempty(text) && ~isempty(fieldnames(text)))
children.(name) = text;
end
if(~isempty(attr))
children.(name).('Attributes') = attr;
end
end
else
ptextflag = 'Text';
if (strcmp(name, '#cdata_dash_section'))
ptextflag = 'CDATA';
elseif (strcmp(name, '#comment'))
ptextflag = 'Comment';
end
%this is the text in an element (i.e., the parentNode)
if (~isempty(regexprep(text.(textflag),'[\s]*','')))
if (~isfield(ptext,ptextflag) || isempty(ptext.(ptextflag)))
ptext.(ptextflag) = text.(textflag);
else
%what to do when element data is as follows:
%<element>Text <!--Comment--> More text</element>
%put the text in different cells:
% if (~iscell(ptext)) ptext = {ptext}; end
% ptext{length(ptext)+1} = text;
%just append the text
ptext.(ptextflag) = [ptext.(ptextflag) text.(textflag)];
end
end
end
end
end
end
% ----- Subfunction getNodeData -----
function [text,name,attr,childs,textflag] = getNodeData(theNode)
% Create structure of node info.
%make sure name is allowed as structure name
name = toCharArray(getNodeName(theNode))';
name = strrep(name, '-', '_dash_');
name = strrep(name, ':', '_colon_');
name = strrep(name, '.', '_dot_');
attr = parseAttributes(theNode);
if (isempty(fieldnames(attr)))
attr = [];
end
%parse child nodes
[childs,text,textflag] = parseChildNodes(theNode);
if (isempty(fieldnames(childs)) && isempty(fieldnames(text)))
%get the data of any childless nodes
% faster than if any(strcmp(methods(theNode), 'getData'))
% no need to try-catch (?)
% faster than text = char(getData(theNode));
text.(textflag) = toCharArray(getTextContent(theNode))';
end
end
% ----- Subfunction parseAttributes -----
function attributes = parseAttributes(theNode)
% Create attributes structure.
attributes = struct;
if hasAttributes(theNode)
theAttributes = getAttributes(theNode);
numAttributes = getLength(theAttributes);
for count = 1:numAttributes
%attrib = item(theAttributes,count-1);
%attr_name = regexprep(char(getName(attrib)),'[-:.]','_');
%attributes.(attr_name) = char(getValue(attrib));
%Suggestion of Adrian Wanner
str = toCharArray(toString(item(theAttributes,count-1)))';
k = strfind(str,'=');
attr_name = str(1:(k(1)-1));
attr_name = strrep(attr_name, '-', '_dash_');
attr_name = strrep(attr_name, ':', '_colon_');
attr_name = strrep(attr_name, '.', '_dot_');
attributes.(attr_name) = str((k(1)+2):(end-1));
end
end
end