Parsing the payload of the AT commands from the full response string - c

I want to parse the actual payload from the output of AT commands.
For instance: in the example below, I'd want to read only "2021/11/16,11:12:14-32,0"
AT+QLTS=1 // command
+QLTS: "2021/11/16,11:12:14-32,0" // response
OK
In the following case, I'd need to only read 12345678.
AT+CIMI // command
12345678 // example response
So the point is: not all commands have the same format for the output. We can assume the response is stored in a string array.
I have GetAtCmdRsp() already implemented which stores the response in a char array.
void GetPayload()
{
char rsp[100] = {0};
GetAtCmdRsp("AT+QLTS=1", rsp);
// rsp now contains +QLTS: "2021/11/16,11:12:14-32,0"
// now, I need to parse "2021/11/16,11:12:14-32,0" out of the response
memset(rsp, 0, sizeof(rsp));
GetAtCmdRsp("AT+CIMI", rsp);
// rsp now contains 12345678
// no need to do additional parsing since the output already contains the value I need
}
I was thinking of doing char *start = strstr(rsp, ":") + 1; to get the start of the payload but some responses may only contain the payload as it's the case with AT+CIMI
Perhaps could regex be a good idea to determine the pattern +<COMMAND>: in a string?

In order to parse AT command responses a good starting point is understanding all the possible formats they can have. So, rather than implementing a command specific routine, I would discriminate commands by "type of response":
Commands with no payload in their answers, for example
AT
OK
Commands with no header in their answers, such as
AT+CIMI
12345678
OK
Commands with a single header in their answers
AT+QLTS=1
+QLTS: "2021/11/16,11:12:14-32,0"
OK
Command with multi-line responses.Every line could of "single header" type, like in +CGDCONT:
AT+CDGCONT?
+CGDCONT: 1,"IP","epc.tmobile.com","0.0.0.0",0,0
+CGDCONT: 2,"IP","isp.cingular","0.0.0.0",0,0
+CGDCONT: 3,"IP","","0.0.0.0",0,0
OK
Or we could even have mixed types, like in +CGML:
AT+CMGL="ALL"
+CMGL: 1,"REC READ","+XXXXXXXXXX","","21/11/25,10:20:00+00"
Good morning! How are you?
+CMGL: 2,"REC READ","+XXXXXXXXXX","","21/11/25,10:33:33+00"
I'll come a little late. See you. Bruce Wayne
OK
(please note how it could have also "empty" lines, that is \r\n).
At the moment I cannot think about any other scenario.In this way you'll be able to define an enum like
typedef enum
{
AT_RESPONSE_TYPE_NO_RESPONSE,
AT_RESPONSE_TYPE_NO_HEADER,
AT_RESPONSE_TYPE_SINGLE_HEADER,
AT_RESPONSE_TYPE_MULTILINE,
AT_RESPONSE_TYPE_MAX
}
and pass it to your GetAtCmdRsp( ) function in order to parser the response accordingly. If implement the differentiation in that function, or after it (or in an external function is your choice.
A solution without explicit categorization
Once you have clear all the scenarios that might ever occur, you can think about a general algorithm working for all of them:
Get the full response resp after the command echo and before the closing OK or ERROR. Make sure that the trailing \r\n\r\nOK is removed (or \r\nERROR. Or \r\nNO CARRIER. Or whatever the terminating message of the response might be).Make also sure to remove the command echo
If strlen( resp ) == 0 we belong to the NO_RESPONSE category, and the job is done
If the response contains \r\ns in it, we have a MULTILINE answer. So, tokenize it and place every line into an array element resp_arr[i]. Make sure to remove trailing \r\n
For every line in the response (for every resp_arr[i] element), search for <CMD> : pattern (not only :, that might be contained in the payload as well!). Something like that:
size_t len = strlen( resp_cur_line );
char *payload;
if( strstr( "+YOURCMD: ", resp_cur_line) == NULL )
{
// We are in "NO_HEADER" case
payload = resp_cur_line;
}
else
{
// We are in "HEADER" case
payload = resp_cur_line + strlen( "+YOURCMD: " );
}
Now payload pointer points to the actual payload.
Please note how, in case of MULTILINE answer, after splitting the lines into array elements every loop will handle correctly also the mixed scenarios like the one in +CMGL, as you'll be able to distinguish the lines containing the header from those containing data (and from the empty lines, of course). For a deeper analysis about +CMGL response parsing have a look to this answer.

Related

Bus Error on void function return

I'm learning to use libcurl in C. To start, I'm using a randomized list of accession names to search for protein sequence files that may be found hosted here. These follow a set format where the first line is a variable length (but which contains no information I'm trying to query) then a series of capitalized letters with a new line every sixty (60) characters (what I want to pull down, but reformat to eighty (80) characters per line).
I have the call itself in a single function:
//finds and saves the fastas for each protein (assuming on exists)
void pullFasta (proteinEntry *entry, char matchType, FILE *outFile) {
//Local variables
URL_FILE *handle;
char buffer[2] = "", url[32] = "http://www.uniprot.org/uniprot/", sequence[2] = "";
//Build full URL
/*printf ("u:%s\nt:%s\n", url, entry->title); /*This line was used for debugging.*/
strcat (url, entry->title);
strcat (url, ".fasta");
//Open URL
/*printf ("u:%s\n", url); /*This line was used for debugging.*/
handle = url_fopen (url, "r");
//If there is data there
if (handle != NULL) {
//Skip the first line as it's got useless info
do {
url_fread(buffer, 1, 1, handle);
} while (buffer[0] != '\n');
//Grab the fasta data, skipping newline characters
while (!url_feof (handle)) {
url_fread(buffer, 1, 1, handle);
if (buffer[0] != '\n') {
strcat (sequence, buffer);
}
}
//Print it
printFastaEntry (entry->title, sequence, matchType, outFile);
}
url_fclose (handle);
return;
}
With proteinEntry being defined as:
//Entry for fasta formatable data
typedef struct proteinEntry {
char title[7];
struct proteinEntry *next;
} proteinEntry;
And the url_fopen, url_fclose, url_feof, url_read, and URL_FILE code found here, they mimic the file functions for which they are named.
As you can see I've been doing some debugging with the URL generator (uniprot URLs follow the same format for different proteins), I got it working properly and can pull down the data from the site and save it to file in the proper format that I want. I set the read buffer to 1 because I wanted to get a program that was very simplistic but functional (if inelegant) before I start playing with things, so I would have a base to return to as I learned.
I've tested the url_<function> calls and they are giving no errors. So I added incremental printf calls after each line to identify exactly where the bus error is occurring and it is happening at return;.
My understanding of bus errors is that it's a memory access issue wherein I'm trying to get at memory that my program doesn't have control over. My confusion comes from the fact that this is happening at the return of a void function. There's nothing being read, written, or passed to trigger the memory error (as far as I understand it, at least).
Can anyone point me in the right direction to fix my mistake please?
EDIT: As #BLUEPIXY pointed out I had a potential url_fclose (NULL). As #deltheil pointed out I had sequence as a static array. This also made me notice I'm repeating my bad memory allocation for url, so I updated it and it now works. Thanks for your help!
If we look at e.g http://www.uniprot.org/uniprot/Q6GZX1.fasta and skip the first line (as you do) we have:
MNAKYDTDQGVGRMLFLGTIGLAVVVGGLMAYGYYYDGKTPSSGTSFHTASPSFSSRYRY
Which is a 60 characters string.
When you try to read this sequence with:
//Grab the fasta data, skipping newline characters
while (!url_feof (handle)) {
url_fread(buffer, 1, 1, handle);
if (buffer[0] != '\n') {
strcat (sequence, buffer);
}
}
The problem is sequence is not expandable and not large enough (it is a fixed length array of size 2).
So make sure to choose a large enough size to hold any sequence, or implement the ability to expand it on-the-fly.

Checking for a blank line in C - Regex

Goal:
Find if a string contains a blank line. Whether it be '\n\n',
'\r\n\r\n', '\r\n\n', '\n\r\n'
Issues:
I don't think my current regex for finding '\n\n' is right. This is my first time really using regex outside of simple use of * when removing files in command line.
Is it possible to check for all of these cases (listed above) in one regex? or do I have to do 4 seperate calls to compile_regex?
Code:
int checkForBlankLine(char *reader) {
regex_t r;
compile_regex(&r, "*\n\n");
match_regex(&r, reader);
return 0;
}
void compile_regex(regex_t *r, char *matchText) {
int status;
regcomp(r, matchText, 0);
}
int match_regex(regex_t *r, char *reader) {
regmatch_t match[1];
int nomatch = regexec(r, reader, 1, match, 0);
if (nomatch) {
printf("No matches.\n");
} else {
printf("MATCH!\n");
}
return 0;
}
Notes:
I only need to worry about finding one blank line, that's why my regmatch_t match[1] is only one item long
reader is the char array containing the text I am checking for a blank line.
I have seen other examples and tried to base the code off of those examples, but I still seem to be missing something.
Thank you kindly for the help/advice.
If anything needs to be clarified please let me know.
It seems that you have to compile the regex as extended:
regcomp(&re, "\r?\n\r?\n", REG_EXTENDED);
The first atom, \r? is probably unnecessary, because it doesn't add to the blank-line condition if you don't capture the result.
In the above, blank line really means empty line. If you want blank line to mean a line that has no characters except for white space, you can use:
regcomp(&re, "\r?\n[ \t]*\r?\n", REG_EXTENDED);
(I don't think you can use the space character pattern, \s here instead of [ \t], because that would include carriage return and new-line.)
As others have already hinted at, the "simple use of * in the command line` is not a regular expression. This wildcard-matching is called file globbing and has different semantics.
Check what the * in a regex means. It's not like the wildcard "anything" in the command line. The * means that the previous component can appear any amount of times. The wildcard in regex is the .. So if you want to say match anything you can do .*, which would be anything, any amount of times.
So in your case you can do .*\n\n.* which would match anything that has \n\n.
Finally, you can use or in a regex and ( ) to group stuff. So you can do something like .*(\n\n|\r\n\r\n).* And that would match anything that has a \n\n or a \r\n\r\n.
Hope that helps.
Rather than looking for only \r or \n, look for not \r or \n?
Your regex would simply be
'[^\r\n]'
and a match result of false indicates a blank line to your specification.

How to get metadata from Libextractor into a struct

I want to use Libextractor to get keywords/metadata for files.
The basic example for it is -
struct EXTRACTOR_PluginList *plugins
= EXTRACTOR_plugin_add_defaults (EXTRACTOR_OPTION_DEFAULT_POLICY);
EXTRACTOR_extract (plugins, argv[1],
NULL, 0,
&EXTRACTOR_meta_data_print, stdout);
EXTRACTOR_plugin_remove_all (plugins);
However, this calls the function EXTRACTOR_meta_data_print which "prints" it to "stdout"
I'm looking at a way to get this information to another function - i.e. pass or store this in memory for further working. The documentation was not clear to me. Any help or experience regarding this?
I've tried to install libextractor and failed to get it working (it always returns a NULL plugin pointer upon call to EXTRACTOR_plugin_add_defaults()), so what I will write next is NOT TESTED:
from : http://www.gnu.org/software/libextractor/manual/libextractor.html#Extracting
Function Pointer: int
(*EXTRACTOR_MetaDataProcessor)(void *cls,
const char *plugin_name,
enum EXTRACTOR_MetaType type,
enum EXTRACTOR_MetaFormat format,
const char *data_mime_type,
const char *data,
size_t data_len)
and
Type of a function that libextractor calls for each meta data item found.
cls
closure (user-defined)
plugin_name
name of the plugin that produced this value;
special values can be used (i.e. '<zlib>' for
zlib being used in the main libextractor library
and yielding meta data);
type
libextractor-type describing the meta data;
format basic
format information about data
data_mime_type
mime-type of data (not of the original file);
can be NULL (if mime-type is not known);
data
actual meta-data found
data_len
number of bytes in data
Return 0 to continue extracting, 1 to abort.
So you would just have to write your own function called whatever you want, and have this declaration be like:
int whateveryouwant(void *cls,
const char *plugin_name,
enum EXTRACTOR_MetaType type,
enum EXTRACTOR_MetaFormat format,
const char *data_mime_type,
const char *data,
size_t data_len)
{
// Do your stuff here
if(stop)
return 1; // Stops
else
return 0; // Continues
}
and call it via:
EXTRACTOR_extract (plugins, argv[1],
NULL, 0,
&whateveryouwant,
NULL/* here be dragons */);
Like described in http://www.gnu.org/software/libextractor/manual/libextractor.html#Generalities "3.3 Introduction to the libextractor library"
[here be dragons]: That is a parameter left for the user's use (even if it's redundant to say so). As defined in the doc: "For each meta data item found, GNU libextractor will call the ‘proc’ function, passing ‘proc_cls’ as the first argument to ‘proc’."
Where "the proc function" being the function you added (whateveryouwant() here) and proc_cls being an arbitrary pointer (can be anything) for you to pass data to the function. Like a pointer to stdout in the example, in order to print to stdout. That being said, I suspect that the function writes to a FILE* and not inevitably to stdout; so if you open a file for writing, and pass its "file decriptor" as last EXTRACTOR_extract()'s parameter you would probably end with a file filled with the information you can currently read on your screen. That wouldn't be a proper way to access the information, but if you're looking into a quick and dirty way to test some behavior or some feature; that could do it, until you write a proper function.
Good luck with your code!

Trying to make match on a rule that uses "recursive" identifier in flex

I have this line:
0, 6 -> W(1) L(#);
or
\# -> #shift_right R W(1) L
I have to parse this line with flex, and take every element from every part of the arrow and put it in a list. I know how to match simple things, but I don't know how to match multiple things with the same rule. I'm not allowed to increase the limit for rules. I have a hint: parse the pieces, pieces will then combine, and I can use states, but I don't know how to do that, and I can't find examples on the net. Can someone help me?
So, here an example:
{
a -> W(b) #invert_loop;
b -> W(a) #invert_loop;
-> L(#)
}
When this section begins I have to create a structure for each line, where I put what is on the left of -> in a vector, those are some parameters, and the right side in a list, where each term is kinda another structure. For what is on the right side I wrote rules:
writex W([a-zA-Z0-9.#]) for W(anything).
So I need to parse these lines, so I can put the parameters and the structures int the big structure. Something like this(for the first line):
new bigStruc with param = a and list of struct = W(anything), #invert(it is a notation for a reference to another structure)
So what I need is to know how to parse these line so that I can create and create and fill these bigStruct, also using to rules for simple structure(i have all I need for these structures, but I don't how to parse so that I can use these methods).
Sorry for my English and I hope this time I was more clear on what I want.
Last-minute editing: I have matched the whole line with a rule, and then work on it with strtok. There is a way to use previous rules to see what type of structure i have to create? I mean not to stay and put a lots of if, but to use writex W([a-zA-Z0-9.#]) to know that i have to create that kind of structure?
Ok, lets see how this snippet works for you:
// these are exclusive rules, so they do not overlap, for inclusive rules, use %s
%x dataStructure
%x addRules
%%
<dataStructure>-> { BEGIN addRules; }
\{ { BEGIN dataStructure; }
<addRules>; { BEGIN dataStructure; }
<dataStructure>\} { BEGIN INITIAL; }
<dataStructure>[^,]+ { ECHO; } //this will output each comma separated token
<dataStructure>. { } //ignore anything else
<dataStructure>\n { } //ignore anything else
<addRules>[^ ]+ { ECHO; } //this will output each space separated rule
<addRules>. { } //ignore anything else
<addRules>\n { } //ignore anything else
%%
I'm not entirely sure what it it you want. Edit your original post to include the contents of your comments, with examples, and please structure your English better. If you can't explain what you want without contradicting yourself, I can't help you.

SetProp problem

Can anybody tell me why the following code doesn't work? I don't get any compiler errors.
short value = 10;
SetProp(hCtl, "value", (short*) value);
The third parameter is typed as a HANDLE, so IMO to meet the explicit contract of the function you should save the property as a HANDLE by allocating a HGLOBAL memory block. However, as noted in the comments below, MSDN states that any value can be specified, and indeed when I try it on Windows 7 using...
SetProp(hWnd, _T("TestProp"), (HANDLE)(10)); // or (HANDLE)(short*)(10)
...
(short)GetProp(hWnd, _T("TestProp"));
... I get back 10 from GetProp. I suspect somewhere between your SetProp and GetProp one of two things happens: (1) the value of hWnd is different -- you're checking a different window or (2) a timing issue -- the property hasn't been set yet or had been removed.
If you wanted to use an HGLOBAL instead to follow the specific types of the function signature, you can follow this example in MSDN.
Even though a HANDLE is just a pointer, it's a specific data type that is allocated by calls into the Windows API. Lots of things have handles: icons, cursors, files, ... Unless the documentation explicitly states otherwise, to use a blob of data such as a short when the function calls for a HANDLE, you need a memory handle (an HGLOBAL).
The sample code linked above copies data as a string, but you can instead set it as another data type:
// TODO: Add error handling
hMem = GlobalAlloc(GPTR, sizeof(short));
lpMem = GlobalLock(hMem);
if (lpMem != NULL)
{
*((short*)lpMem) = 10;
GlobalUnlock(hMem);
}
To read it back, when you GetProp to get the HANDLE you must lock it to read the memory:
// TODO: Add error handling
short val;
hMem = (HGLOBAL)GetProp(hwnd, ...);
if (hMem)
{
lpMem = GlobalLock(hMem);
if (lpMem)
{
val = *((short*)lpMem);
}
}
I would create the short on the heap, so that it continues to exist, or perhaps make it global, which is perhaps what you did. Also the cast for the short address needs to be void *, or HANDLE.

Resources