How to implement lookahead in Ragel - c

I have two states; one is a specific instance of the other, more general, state.
I believe that the right way to avoid entering both states simultaneously is to implement lookahead with k>1, but I can't find any examples of how to do this.
The Ragel user's guide says:
In both the use of fhold and fexec the user must be cautious of combining the resulting machine with another in such a way that the transition on which the current position is adjusted is not combined with a transition from the other machine.
I'm not entirely sure what this means, except perhaps "don't try to read past the end of the current expression".
My machine looks like this:
seglen16 = any{2} >{ swab(p, &len, 2); len = len - 2; };
action check {len--}
buffer = (any when check)* %when !check #{ printf("[%d]:%d\n", len, *p); };
# JPEG Markers
mk_app0 = 0xFF 0xE0;
mk_appx = 0xFF (0xE0..0xEF);
marker = 0xFF ^0x00;
nonmarker = !marker - zlen;
# JPEG APP Segments
seg_app0_jfif = mk_app0 seglen16 "JFIF" 0x00 buffer #{ printf("jfif app0\n"); };
seg_appx_unk = mk_appx nonmarker* #{ printf("unknown app content\n"); };
seg_app = (seg_app0_jfif | seg_app1_exif | seg_appx_unk);
# Main Machine
expr = (mk_soi #lerr(bad) nonmarker* seg_app* nonmarker* mk_eoi);
I want to tokenize a JPEG header, skipping unknown segments and handling well-known segments like JFIF. The JPEG application segment app0 starts with 0xFFE0. If app0 contains JFIF data, the app0 marker will be followed by a two-byte length and the string "JFIF\0". This means I need 7 bytes of lookahead when identifying application segments.

I want to tokenize a JPEG header, skipping unknown segments and handling well-known segments like JFIF. The JPEG application segment app0 starts with 0xFFE0. If app0 contains JFIF data, the app0 marker will be followed by a two-byte length and the string "JFIF\0".
All right.
This means I need 7 bytes of lookahead when identifying application segments.
Why? You can make the "unknown" pattern apply to all segments except the ones that are known using the general pattern:
seg_app0_jfif = mk_app0 seglen16 "JFIF" 0x00 buffer #{ printf("jfif app0\n"); };
known_segment = (seg_app0_jfif | seg_app1_exif);
unknown_segment = ((mk_appx nonmarker*) - known_segment) #{ printf("unknown app content\n"); };
seg_app = (known_segment | unknown_segment);
Doing it this way doesn't require lookahead. Ragel generates the appropriate states and transitions, handling both patterns simultaneously until enough of the input has been processed to disambiguate them. The finishing action on unknown_segment will occur only if the input is not a known_segment, which seems to be the behavior you're trying to achieve.
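To illustrate what "handling both patterns simultaneously" means, here is a hand-written C sketch. It is not Ragel output, and the names app_scan_t, app_scan_feed and app0_is_jfif are made up for this illustration. It consumes the bytes that follow the 0xFF 0xE0 marker one at a time and keeps the JFIF alternative alive until a byte rules it out, so no lookahead buffer is needed:

```c
#include <stddef.h>

/* Hypothetical scanner state for the bytes that follow an APP0 marker. */
typedef struct {
    size_t pos;      /* number of bytes consumed after 0xFF 0xE0 */
    int jfif_alive;  /* 1 while the input can still be a JFIF segment */
} app_scan_t;

void app_scan_init(app_scan_t *s)
{
    s->pos = 0;
    s->jfif_alive = 1;
}

/* Feed one byte. Bytes 0-1 are the segment length and are accepted as-is;
 * bytes 2-6 must spell "JFIF\0" for the JFIF alternative to survive. */
void app_scan_feed(app_scan_t *s, unsigned char byte)
{
    static const unsigned char tag[5] = { 'J', 'F', 'I', 'F', 0x00 };

    if (s->pos >= 2 && s->pos < 7 && byte != tag[s->pos - 2])
        s->jfif_alive = 0;  /* only the "unknown" alternative remains */
    s->pos++;
}

/* Convenience wrapper: feed n bytes and report whether the segment
 * was identified as JFIF (which needs all 7 bytes to have matched). */
int app0_is_jfif(const unsigned char *bytes, size_t n)
{
    app_scan_t s;
    size_t i;

    app_scan_init(&s);
    for (i = 0; i < n; i++)
        app_scan_feed(&s, bytes[i]);
    return s.jfif_alive && s.pos >= 7;
}
```

The decision only becomes available after the seventh byte, but nothing is ever read ahead of the current position. That is the same effect the subtraction in unknown_segment has inside the generated machine.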


supercollider - access buffer information inside a `Pbind` that uses a buffer array

in brief
i have an array of buffers; those are passed to a synth at random using a Pbind ; i need to access info on the current buffer from within the Pbind but I need help doing that !
explanation of the problem
i have loaded an array of buffers containing samples. those samples must be played in a random order (and at random intervals, but that's for later). to do so, i pass those buffers to a synth inside a Pbind. i want to set the \dur key to be the length of the current buffer being played. the thing is, that i can't find a way to access info on the current buffer from within the Pbind. i have tried using Pkey, Pfset and Plambda, but to no success.
does somebody know how to do this ???
code
the sounds are played using:
SynthDef(\player, {
/*
play a file from a buffer
out: the output channel
bufnum: the buffer to play
*/
arg out=0, bufnum;
Out.ar(
out,
PlayBuf.ar(1, bufnum, BufRateScale.kr(bufnum), doneAction: Done.freeSelf) ! 2
)
}).add;
the buffers are loaded in an array:
path = PathName.new("/path/to/files");
bufferArray = Array.new(100);
path.filesDo({
arg file;
bufferArray.add( Buffer.read(s, file.fullPath) );
});
my Pbind pattern works like this:
i define a \buffer value which is a single buffer from the array
i pass this \buffer to my synth
i then try to calculate its duration (\dur) by dividing the number of frames of the buffer by its sample rate. this is what i can't seem to get right
p = Pbind(
\buffer, Prand(bufferArray, inf),
\instrument, \player,
\bufnum, Pkey(\buffer),
\dur, (Pkey(\buffer.numFrames) / Pkey(\buffer.sampleRate))
)
thanks in advance for your help !!
solution to the problem: how to access buffer information inside a Pbind pattern
after hours of searching, i've found a solution to this problem on the supercollider forum, and i'm posting my own solution in case others are looking on here, like i was !
define a global array of buffers
this isn't compulsory, but it means the buffer array is only created once; the array is filled asynchronously using the action parameter of Buffer.read(), which triggers a function once the buffer is loaded:
var path;
Buffer.freeAll; // avoid using all buffers in server
path = PathName.new("/path/to/sound/files");
~bufferArray = Array.new(100);
path.filesDo({
// add the buffer to `~bufferArray` asynchronously
arg file;
b = Buffer.read(s, file.fullPath, action: {
arg buffer;
~bufferArray.add( buffer );
})
});
play the synth and use Pfunc to access buffer information inside of the Pbind
this is the solution per se:
define a Pbind pattern which activates a synth to play the buffer.
inside that, define a \buffer variable to hold the current buffer.
then, access data on that buffer inside of a Pfunc. the function given to Pfunc receives the event currently being built as its argument, so keys set earlier in the Pbind (here \buffer) can be read from it
p = Pbind(
\buffer, Prand(~bufferArray, inf), // randomly access one buffer inside of the array
\instrument, \player,
\bufnum, Pfunc { arg event; event[\buffer] }, // define a `Pfunc` function that reads the `\buffer` key from the event being built
\dur, Pfunc { arg event; event[\buffer].numFrames / event[\buffer].sampleRate } // duration
);
p.play;
see the original answer on the supercollider forum for more details !

How do I read OpenVINO IR models from memory with the OpenVINO C API

I am having trouble reading OpenVINO IR networks (XML and bin) from memory using ie_core_read_network_from_memory() in the OpenVINO 2021.4 C API ie_c_api.h.
I suspect that I am creating the network weight blob wrong, but I cannot find any information on how to create weight blobs correctly for networks.
I have read the OpenVINO C API docs but cannot deduce from docs what I am doing wrong. The OpenVINO code repo contains some C code samples, but none of the samples seem to use ie_core_read_network_from_memory().
Below is a cut out of the code I am having trouble with.
// void* dmem->data - network memory buffer (float32)
// size_t dmem->size - size of network memory buffer (bytes)
ie_core_t* ov_core = NULL;
IEStatusCode status = ie_core_create("", &ov_core);
if (status != OK)
{
// error handling
}
const dimensions_t weights_tensor_dims =
{ 4, { 1, 1, 1, dmem->size/sizeof(float) } };
tensor_desc_t weights_tensor_desc = { OIHW, weights_tensor_dims, FP32 };
ie_blob_t* ov_model_weight_blob = NULL;
status = ie_blob_make_memory_from_preallocated(
&weights_tensor_desc, dmem->data, dmem->size, &ov_model_weight_blob);
if (status != OK)
{
// error handling
}
// char* model_xml_desc - the model's XML string
uint8_t* ov_model_xml_content = (uint8_t*)model_xml_desc;
ie_network_t* ov_network = NULL;
size_t xml_sz = strlen(model_xml_desc); // strlen() expects a char*, not a uint8_t*
status = ie_core_read_network_from_memory(
ov_core, ov_model_xml_content, xml_sz, ov_model_weight_blob, &ov_network);
if (status != OK)
{
// Always get "GENERAL_ERROR (-1)"
}
The code works fine down to the ie_core_read_network_from_memory() call which results in "GENERAL_ERROR".
I have tried two models that were converted from Tensorflow. One is a simple [X] -> [Y] regression model (single input value, single output value). The other is also a regression model [X_1, X_2, ..., X_9] -> [Y] (nine input values, single output value). They work fine when reading them from file with ie_core_read_network(), but for my use case I must provide the network as a binary memory buffer and XML string.
I would appreciate any help, either by pointing out what I am getting wrong or directing me to some code samples that use ie_core_read_network_from_memory().
System information:
Windows 10
OpenVINO v2021.4.689
Microsoft Visual Studio 2019
UPDATE: An Intel employee reached out to me in another forum and pointed out that there is a unit test for ie_core_read_network_from_memory(). The unit test successfully reads a network from memory and made clear that I was in fact using a faulty tensor description to produce the weight blob, just as I suspected. Apparently the weight blob descriptor should be one dimensional, have memory layout ANY and datatype U8 even though the model weights are fp32.
From the unit test:
std::string bin_std = TestDataHelpers::generate_model_path("test_model", "test_model_fp32.bin");
const char* bin = bin_std.c_str();
//...
std::vector<uint8_t> weights_content(content_from_file(bin, true));
tensor_desc_t weights_desc { ANY, { 1, { weights_content.size() } }, U8 };
However, simply changing the tensor descriptor was not enough to get my code to work, so I still need to properly translate the C++ code from the unit test to my C environment before the issue can be considered solved.
Thanks
Refer to tensor_desc struct and standard layout format.
Apart from that, it is recommended to use the Benchmark_app tool to test the inference performance.

Parsing the payload of the AT commands from the full response string

I want to parse the actual payload from the output of AT commands.
For instance: in the example below, I'd want to read only "2021/11/16,11:12:14-32,0"
AT+QLTS=1 // command
+QLTS: "2021/11/16,11:12:14-32,0" // response
OK
In the following case, I'd need to only read 12345678.
AT+CIMI // command
12345678 // example response
So the point is: not all commands have the same format for the output. We can assume the response is stored in a string array.
I have GetAtCmdRsp() already implemented which stores the response in a char array.
void GetPayload()
{
char rsp[100] = {0};
GetAtCmdRsp("AT+QLTS=1", rsp);
// rsp now contains +QLTS: "2021/11/16,11:12:14-32,0"
// now, I need to parse "2021/11/16,11:12:14-32,0" out of the response
memset(rsp, 0, sizeof(rsp));
GetAtCmdRsp("AT+CIMI", rsp);
// rsp now contains 12345678
// no need to do additional parsing since the output already contains the value I need
}
I was thinking of doing char *start = strstr(rsp, ":") + 1; to get the start of the payload but some responses may only contain the payload as it's the case with AT+CIMI
Could a regex be a good way to detect the pattern +<COMMAND>: in a string?
In order to parse AT command responses a good starting point is understanding all the possible formats they can have. So, rather than implementing a command specific routine, I would discriminate commands by "type of response":
Commands with no payload in their answers, for example
AT
OK
Commands with no header in their answers, such as
AT+CIMI
12345678
OK
Commands with a single header in their answers
AT+QLTS=1
+QLTS: "2021/11/16,11:12:14-32,0"
OK
Commands with multi-line responses. Every line could be of the "single header" type, as in +CGDCONT:
AT+CGDCONT?
+CGDCONT: 1,"IP","epc.tmobile.com","0.0.0.0",0,0
+CGDCONT: 2,"IP","isp.cingular","0.0.0.0",0,0
+CGDCONT: 3,"IP","","0.0.0.0",0,0
OK
Or we could even have mixed types, as in +CMGL:
AT+CMGL="ALL"
+CMGL: 1,"REC READ","+XXXXXXXXXX","","21/11/25,10:20:00+00"
Good morning! How are you?
+CMGL: 2,"REC READ","+XXXXXXXXXX","","21/11/25,10:33:33+00"
I'll come a little late. See you. Bruce Wayne
OK
(Please note that the response may also contain "empty" lines, that is, a bare \r\n.)
At the moment I cannot think of any other scenario. This way you'll be able to define an enum like
typedef enum
{
AT_RESPONSE_TYPE_NO_RESPONSE,
AT_RESPONSE_TYPE_NO_HEADER,
AT_RESPONSE_TYPE_SINGLE_HEADER,
AT_RESPONSE_TYPE_MULTILINE,
AT_RESPONSE_TYPE_MAX
} at_response_type_t;
and pass it to your GetAtCmdRsp( ) function in order to parse the response accordingly. Whether you implement the differentiation in that function, after it, or in an external function is your choice.
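One way to decide which enum value to pass is a small lookup table keyed by command name. The sketch below is only an illustration: the table, the at_cmd_info_t struct and the at_lookup_type( ) function are hypothetical names, and the categories assigned to each command are the ones used in the examples above.

```c
#include <stddef.h>
#include <string.h>

typedef enum
{
    AT_RESPONSE_TYPE_NO_RESPONSE,
    AT_RESPONSE_TYPE_NO_HEADER,
    AT_RESPONSE_TYPE_SINGLE_HEADER,
    AT_RESPONSE_TYPE_MULTILINE,
    AT_RESPONSE_TYPE_MAX
} at_response_type_t;

/* Hypothetical table entry: command name (without "=..." or "?") and type */
typedef struct
{
    const char *name;
    at_response_type_t type;
} at_cmd_info_t;

static const at_cmd_info_t at_cmd_table[] =
{
    { "AT",         AT_RESPONSE_TYPE_NO_RESPONSE   },
    { "AT+CIMI",    AT_RESPONSE_TYPE_NO_HEADER     },
    { "AT+QLTS",    AT_RESPONSE_TYPE_SINGLE_HEADER },
    { "AT+CGDCONT", AT_RESPONSE_TYPE_MULTILINE     },
    { "AT+CMGL",    AT_RESPONSE_TYPE_MULTILINE     },
};

/* Look up the response type for a full command string such as "AT+QLTS=1".
 * Returns AT_RESPONSE_TYPE_MAX when the command is not in the table. */
at_response_type_t at_lookup_type( const char *cmd )
{
    /* The command name ends at the first '=' or '?' (or at the string end) */
    size_t name_len = strcspn( cmd, "=?" );
    size_t i;

    for( i = 0; i < sizeof at_cmd_table / sizeof at_cmd_table[0]; i++ )
    {
        if( strlen( at_cmd_table[i].name ) == name_len &&
            strncmp( cmd, at_cmd_table[i].name, name_len ) == 0 )
        {
            return at_cmd_table[i].type;
        }
    }
    return AT_RESPONSE_TYPE_MAX;
}
```

With this, at_lookup_type( "AT+QLTS=1" ) yields AT_RESPONSE_TYPE_SINGLE_HEADER, and an unlisted command yields AT_RESPONSE_TYPE_MAX, which the caller can treat as "unknown".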
A solution without explicit categorization
Once all the scenarios that might occur are clear, you can think about a general algorithm that works for all of them:
Get the full response resp after the command echo and before the closing OK or ERROR. Make sure that the trailing \r\n\r\nOK is removed (or \r\nERROR, or \r\nNO CARRIER, or whatever the terminating message of the response might be). Also make sure to remove the command echo
If strlen( resp ) == 0 we belong to the NO_RESPONSE category, and the job is done
If the response contains \r\n sequences, we have a MULTILINE answer. So, tokenize it and place every line into an array element resp_arr[i]. Make sure to remove the trailing \r\n
For every line in the response (for every resp_arr[i] element), search for the +<CMD>: pattern (not only for :, which might be contained in the payload as well!). Something like this:
const char *header = "+YOURCMD: ";
size_t header_len = strlen( header );
char *payload;
// The header, when present, sits at the very start of the line
if( strncmp( resp_cur_line, header, header_len ) != 0 )
{
// We are in "NO_HEADER" case
payload = resp_cur_line;
}
else
{
// We are in "HEADER" case: skip the header
payload = resp_cur_line + header_len;
}
Now payload pointer points to the actual payload.
Please note that, in the case of a MULTILINE answer, once the lines have been split into array elements, each loop iteration also handles mixed scenarios like the one in +CMGL correctly, as you'll be able to distinguish the lines containing the header from those containing data (and from the empty lines, of course). For a deeper analysis of +CMGL response parsing have a look at this answer.
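The splitting and header-stripping steps can be sketched together as below. This is a minimal sketch, not a complete parser: at_split_lines( ) and at_line_payload( ) are hypothetical helper names, and the response is assumed to have already had the command echo and the final OK/ERROR removed.

```c
#include <stddef.h>
#include <string.h>

/* Return the payload of one response line. If the line starts with
 * `header` (e.g. "+QLTS: "), the header is skipped; otherwise the whole
 * line is the payload. */
const char *at_line_payload( const char *line, const char *header )
{
    size_t header_len = strlen( header );

    if( strncmp( line, header, header_len ) == 0 )
    {
        return line + header_len;
    }
    return line;
}

/* Split `resp` in place on CR/LF and store pointers to its lines.
 * strtok() skips runs of delimiters, so "empty" lines are dropped.
 * Returns the number of lines stored (at most max_lines). */
size_t at_split_lines( char *resp, char *lines[], size_t max_lines )
{
    size_t count = 0;
    char *tok = strtok( resp, "\r\n" );

    while( tok != NULL && count < max_lines )
    {
        lines[count++] = tok;
        tok = strtok( NULL, "\r\n" );
    }
    return count;
}
```

Note that strtok( ) modifies its input and keeps internal state, so this sketch is not reentrant. Also, the payload keeps any surrounding double quotes (as in the +QLTS response); stripping them is a separate step if you need it.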

Need a little help to fix an Arduino RFID program

I have extracted the problematic part of my program. I use RFID.h and SPI.h.
I just want to know how to read an RFID card (written with an Android phone).
I only write one letter (R, G, B, Y, ..., representing a color). In an Android tool I can see ?TenR? at sector 04, where the "R" after "Ten" is the character I want to read:
char buffer_data[8];
rfid.read(0x04,buffer_data);
String myString = String(buffer_data);
Serial.println(myString);
I only want to know how to output "R" (the text on the RFID card at sector 04). Instead, it outputs something like this:
22:05:15.885 ->
22:05:15.885 -> &⸮
22:05:15.885 -> ⸮⸮
With other cards (containing a Y or B character) the output is the same...
Screenshot with card data (Mifare classic 1k (716B writable)):
The lib RFID.h with rfid.read does not work...
https://github.com/song940/RFID-RC522
Don't use this lib!
The lib https://github.com/miguelbalboa/rfid is better, up to date, and can read most tag types!
This is the fixed code to read the first text char on NTAG215 :
if (rfid.PICC_IsNewCardPresent()) {
if ( ! rfid.PICC_ReadCardSerial()) {
return;
}
Serial.println("");
String str;
byte buffer_data[18];
byte size_data = sizeof(buffer_data);
rfid.MIFARE_Read(4,buffer_data,&size_data);
str=String((char *)buffer_data);
Serial.println(str.charAt(9));
}
This outputs the first letter on the tag (if you write text data with the Android NFC Tools app), but only on NTAG215 (other tags use different addresses/positions)!
I assume that the "square" refers to the ASCII number printed to stdout.
I would want to find out, what read_char is in HEX, so instead of printing it as a character to stdout, print the hex representation of it and see what value you get. It's difficult to give you more accurate troubleshooting steps with the limited system information available.
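The hex-printing suggestion can be sketched in plain C as below. This is only an illustration: format_hex is a made-up helper name, and on an Arduino you would more likely print each byte with Serial.print(value, HEX) instead of building a string.

```c
#include <stddef.h>
#include <stdio.h>

/* Format n bytes of buf as space-separated two-digit hex values.
 * out must have room for at least 3 * n characters (1 when n == 0). */
void format_hex(const unsigned char *buf, size_t n, char *out)
{
    size_t i;

    out[0] = '\0';
    for (i = 0; i < n; i++) {
        /* Every byte but the last is followed by a single space */
        sprintf(out + 3 * i, (i + 1 < n) ? "%02X " : "%02X",
                (unsigned int)buf[i]);
    }
}
```

Dumping the whole buffer this way shows where the readable characters actually sit (for instance, 'R' is 0x52) and which bytes are non-printable, which is what produces the garbled output on the serial monitor.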

C example of using AntLR

I am wondering where I can find C tutorial/example of using AntLR. All I found is using Java language.
I am focusing to find a main function which use the parser and lexer generated by AntLR.
Take a look at this document
And here is an example:
// Example of a grammar for parsing C sources,
// Adapted from Java equivalent example, by Terence Parr
// Author: Jim Idle - April 2007
// Permission is granted to use this example code in any way you want, so long as
// all the original authors are cited.
//
// set ts=4,sw=4
// Tab size is 4 chars, indent is 4 chars
// Notes: Although all the examples provided are configured to be built
// by Visual Studio 2005, based on the custom build rules
// provided in $(ANTLRSRC)/code/antlr/main/runtime/C/vs2005/rulefiles/antlr3.rules
// there is no reason that this MUST be the case. Provided that you know how
// to run the antlr tool, then just compile the resulting .c files and this
// file together, using say gcc or whatever: gcc *.c -I. -o XXX
// The C code is generic and will compile and run on all platforms (please
// report any warnings or errors to the antlr-interest newsgroup (see www.antlr.org)
// so that they may be corrected for any platform that I have not specifically tested.
//
// The project settings such as additional library paths and include paths have been set
// relative to the place where this source code sits on the ANTLR perforce system. You
// may well need to change the settings to locate the includes and the lib files. UNIX
// people need -L path/to/antlr/libs -lantlr3c (release mode) or -lantlr3cd (debug)
//
// Jim Idle (jimi cut-this at idle ws)
//
// You may adopt your own practices by all means, but in general it is best
// to create a single include for your project, that will include the ANTLR3 C
// runtime header files, the generated header files (all of which are safe to include
// multiple times) and your own project related header files. Use <> to include and
// -I on the compile line (which vs2005 now handles, where vs2003 did not).
//
#include <C.h>
// Main entry point for this example
//
int ANTLR3_CDECL
main (int argc, char *argv[])
{
// Now we declare the ANTLR related local variables we need.
// Note that unless you are convinced you will never need thread safe
// versions for your project, then you should always create such things
// as instance variables for each invocation.
// -------------------
// Name of the input file. Note that we always use the abstract type pANTLR3_UINT8
// for ASCII/8 bit strings - the runtime library guarantees that this will be
// good on all platforms. This is a general rule - always use the ANTLR3 supplied
// typedefs for pointers/types/etc.
//
pANTLR3_UINT8 fName;
// The ANTLR3 character input stream, which abstracts the input source such that
// it is easy to provide input from different sources such as files, or
// memory strings.
//
// For an ASCII/latin-1 memory string use:
// input = antlr3NewAsciiStringInPlaceStream (stringtouse, (ANTLR3_UINT64) length, NULL);
//
// For a UCS2 (16 bit) memory string use:
// input = antlr3NewUCS2StringInPlaceStream (stringtouse, (ANTLR3_UINT64) length, NULL);
//
// For input from a file, see code below
//
// Note that this is essentially a pointer to a structure containing pointers to functions.
// You can create your own input stream type (copy one of the existing ones) and override any
// individual function by installing your own pointer after you have created the standard
// version.
//
pANTLR3_INPUT_STREAM input;
// The lexer is of course generated by ANTLR, and so the lexer type is not upper case.
// The lexer is supplied with a pANTLR3_INPUT_STREAM from whence it consumes its
// input and generates a token stream as output.
//
pCLexer lxr;
// The token stream is produced by the ANTLR3 generated lexer. Again it is a structure based
// API/Object, which you can customise and override methods of as you wish. a Token stream is
// supplied to the generated parser, and you can write your own token stream and pass this in
// if you wish.
//
pANTLR3_COMMON_TOKEN_STREAM tstream;
// The C parser is also generated by ANTLR and accepts a token stream as explained
// above. The token stream can be any source in fact, so long as it implements the
// ANTLR3_TOKEN_SOURCE interface. In this case the parser does not return anything
// but it can of course specify any kind of return type from the rule you invoke
// when calling it.
//
pCParser psr;
// Create the input stream based upon the argument supplied to us on the command line
// for this example, the input will always default to ./input if there is no explicit
// argument.
//
if (argc < 2 || argv[1] == NULL)
{
fName =(pANTLR3_UINT8)"./input"; // Note in VS2005 debug, working directory must be configured
}
else
{
fName = (pANTLR3_UINT8)argv[1];
}
// Create the input stream using the supplied file name
// (Use antlr3AsciiFileStreamNew for UCS2/16bit input).
//
input = antlr3AsciiFileStreamNew(fName);
// The input will be created successfully, providing that there is enough
// memory and the file exists etc
//
if ( input == NULL)
{
fprintf(stderr, "Failed to open file %s\n", (char *)fName);
exit(1);
}
// Our input stream is now open and all set to go, so we can create a new instance of our
// lexer and set the lexer input to our input stream:
// (file | memory | ?) --> inputstream -> lexer --> tokenstream --> parser ( --> treeparser )?
//
lxr = CLexerNew(input); // CLexerNew is generated by ANTLR
// Need to check for errors
//
if ( lxr == NULL )
{
fprintf(stderr, "Unable to create the lexer due to malloc() failure1\n");
exit(1);
}
// Our lexer is in place, so we can create the token stream from it
// NB: Nothing happens yet other than the file has been read. We are just
// connecting all these things together and they will be invoked when we
// call the parser rule. ANTLR3_SIZE_HINT can be left at the default usually
// unless you have a very large token stream/input. Each generated lexer
// provides a token source interface, which is the second argument to the
// token stream creator.
// Note that even if you implement your own token structure, it will always
// contain a standard common token within it and this is the pointer that
// you pass around to everything else. A common token has a pointer within
// it that should point to your own outer token structure.
//
tstream = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT, TOKENSOURCE(lxr));
if (tstream == NULL)
{
fprintf(stderr, "Out of memory trying to allocate token stream\n");
exit(1);
}
// Finally, now that we have our lexer constructed, we can create the parser
//
psr = CParserNew(tstream); // CParserNew is generated by ANTLR3
if (psr == NULL)
{
fprintf(stderr, "Out of memory trying to allocate parser\n");
exit(ANTLR3_ERR_NOMEM);
}
// We are all ready to go. Though that looked complicated at first glance,
// I am sure, you will see that in fact most of the code above is dealing
// with errors and there isn't really that much to do (isn't this always the
// case in C? ;-).
//
// So, we now invoke the parser. All elements of ANTLR3 generated C components
// as well as the ANTLR C runtime library itself are pseudo objects. This means
// that they are represented as pointers to structures, which contain any
// instance data they need, and a set of pointers to other interfaces or
// 'methods'. Note that in general, these few pointers we have created here are
// the only things you will ever explicitly free() as everything else is created
// via factories, that allocated memory efficiently and free() everything they use
// automatically when you close the parser/lexer/etc.
//
// Note that this means only that the methods are always called via the object
// pointer and the first argument to any method, is a pointer to the structure itself.
// It also has the side advantage, if you are using an IDE such as VS2005 that can do it
// that when you type ->, you will see a list of all the methods the object supports.
//
psr->translation_unit(psr);
// We did not return anything from this parser rule, so we can finish. It only remains
// to close down our open objects, in the reverse order we created them
//
psr ->free (psr); psr = NULL;
tstream ->free (tstream); tstream = NULL;
lxr ->free (lxr); lxr = NULL;
input ->close (input); input = NULL;
return 0;
}
contrapunctus.net/blog/2012/antlr-c a simple google would suffice. Note however, the example is C++ I don't think ANTLR supports PURE C – Aniket Jan 1 at 1:56
