I'm trying to store API request query parameters in JSON format, in a way that preserves the inferred original types of the parameters' values. I do this without knowing what these APIs look like beforehand.
The code below deals with each query argument (delimited by &) one by one.
for (int i = 0; i < url_arg_cnt; i++) {
    const http_arg_t *arg = http_get_arg(http_info, i);
    if (cJSON_GetObjectItem(query, arg->name.p) == NULL) {
        // Currently just treating as a string.
        cJSON_AddItemToObject(query, arg->name.p, cJSON_CreateString(arg->value.p));
        SLOG_INFO("name:value is %s:%s\n", arg->name.p, arg->value.p);
    } else {
        // Duplicate key.
    }
}
With the above code, for input
?start=0&count=2&format=policyid|second&id%5Bkey1%5D=1&id[key2]=2&object=%7Bone:1,two:2%7D&nested[][foo]=1&nested[][bar]=2
I get these prints:
name:value is start:0
name:value is count:2
name:value is format:policyid|second
name:value is id[key1]:1
name:value is id[key2]:2
name:value is object:{one:1, two:2}
name:value is nested[][foo]:1
name:value is nested[][bar]:2
According to this document and other places I've researched,
https://swagger.io/docs/specification/serialization/
There is no consensus on how the query parameters are passed, therefore no guarantee what I could encounter here. So my goal is to support as many variations as possible.
These possibilities seem to be the most common:
Arrays:
?x=1,2,3
?x=1&x=2&x=3
?x=1%202%203
?x=1|2|3
?x[]=1&x[]=2
String:
?x=1
Object, could be nested:
?x[key1]=1&x[key2]=2
?x=%7Bkey1:1,key2:2%7D
?x[][foo]=1&x[][bar]=2
?fields[articles]=title,body&fields[people]=name
?x[0][foo]=bar&x[1][bar]=baz
Any ideas on how best to go about this? Basically, for these query parameters I want to aggregate the ('exploded') arguments that belong together and save them to query as properly typed JSON objects. The line in question:
cJSON_AddItemToObject(query, arg->name.p, cJSON_CreateString(arg->value.p));
Converting the URI query to JSON
This post provides a more generic (canonical) approach to the problem of extracting variables from the URI string.
The query is defined across several descriptive standards (RFCs and specifications), so to have a canonical approach we need to use those specifications to create a normalized form of the query before we can build the object.
TL;DR
To make sure we can implement the specifications and still cater for future extensions, the algorithm that converts the query to JSON should be separated into steps, each one gradually building the normalized form of the query before it is converted to a JSON object. The steps are:
Extract the query from the URI
Split to key=value
Normalize the key (build the object hierarchy)
Normalize the value (populate the object attributes and build the attribute arrays)
Build JSON object based on the normalized key=value
Such separation of the steps will allow much easier adoption of future changes in the specifications. The parsing of the values can be done with RegEx or with a parser (BNF, PEG, etc.).
Conversion steps
The first thing to do is to extract the query string from the URI. This is described in RFC3986 and will be explained in its own section, Extracting the query string. The extraction of the query segment, as we will see later, can easily be done with RegEx.
After the query string is extracted from the URI, one needs to interpret the information conveyed by the query. As we will see below, the query has a very loose definition in RFC3986, and the case where the query conveys variables is further elaborated in RFC6570. During the extraction, the algorithm should extract the values (which are in the form key=value) and store them in a map structure (one approach would be to use a struct, as described in the following SO post). The section Extracting variables from the query string provides an overview of the process.
After the variables are separated and placed in the form key=value, the next stage is to normalize the key. Proper interpretation of the key will allow us to build the hierarchical structure of the JSON object from the key=value structure. RFC6570 does not provide much information on how the key can be normalized; however, the OpenAPI specification provides good information on how to handle different types of keys. The normalization will be further elaborated in the section Normalizing the key.
Next we need to normalize the variables by continuing to build on the RFC6570 which defines the types of the variables in several levels. This will be further elaborated in section Normalizing the value
Final stage is to build the JSON object with cJSON_AddItemToObject(query, name, cJSON_CreateString(value));. More details will be discussed in the Building the JSON Object section.
During implementation, some of the steps can be merged to a single step to optimize the implementation.
Extracting the query string
RFC3986, the main descriptive standard governing the URI, defines the URI as:
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
The query part is defined in section 3.4 of the RFC as the following segment of the URI:
... The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI. ...
The formal syntax of the query segment is defined as:
query = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
/ "*" / "+" / "," / ";" / "="
This means that the query can contain more instances of ? and / before the # is met. In fact, everything after the first occurrence of ? up to the first # (or the end of the URI) is the query, as long as it only uses the characters allowed above.
At the same time, this also implies that the sub-delimiter &, as well as ?, has no special meaning according to this RFC when it is encountered inside the query string. This implies that each implementation can define its own structure. The language of the RFC in section 3.4 confirms this by leaving space for other interpretations, using often instead of always:
... However, as query components
are often used to carry identifying information in the form of
"key=value" pairs ...
In addition, the RFC also provides the following RegEx that can be used to extract the query part from the URI:
regex   : ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
segments:  12            3  4          5       6  7        8 9
Where the capture #7 is the query from the URI.
The easiest approach for extracting the query, provided that we are not interested in the remaining parts of the URI, is to use the RegEx to split the URI and extract the query string, which will contain neither the leading ? nor the terminating #.
RFC3986 is further extended by RFC3987 in order to cover international characters; however, the RegEx defined by RFC3986 remains valid.
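As a minimal sketch of this step (in Python, purely for illustration; the function name extract_query is hypothetical and not part of any library), the RegEx from RFC3986 can be applied unchanged and capture group 7 read back:

```python
import re

# RegEx from RFC 3986, Appendix B; capture group 7 is the query component.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

def extract_query(uri):
    """Return the query without the leading '?' or the '#fragment', or None if absent."""
    # Every part of the pattern is optional, so match() always succeeds.
    match = URI_RE.match(uri)
    return match.group(7)

print(extract_query("http://example.com/api?start=0&count=2#top"))
```

Because the pattern stops the query at the first "#", any later "?" or "/" stays inside the returned string.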
Extracting variables from the query string
To decompose the query string into key=value pairs, we need to reverse engineer RFC6570, which establishes a descriptive standard for expanding variables and constructing a valid query. As the RFC states:
... A URI Template provides both a structural description of a URI space
and, when variable values are provided, machine-readable instructions
on how to construct a URI corresponding to those values. ...
From the RFC, we can extract the following syntax for a variable in the query:
query = variable *( "&" variable )
variable = varname "=" varvalue
varvalue = *( valchar / "[" / "]" / "{" / "}" / "?" )
varname = varchar *( ["."] varchar )
varchar = ALPHA / DIGIT / "_" / pct-encoded
pct-encoded = "%" HEXDIG HEXDIG
valchar = unreserved / pct-encoded / vsub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
vsub-delims = "!" / "$" / "'" / "(" / ")"
/ "*" / "+" / ","
The extraction can be performed with a parser that implements the above grammar, or by iterating over the query with the following RegEx and extracting the (key, value) pairs.
(?:^|&)([^&=]*)=([^&]*)
In case we use RegEx, note that in the previous section we omitted the "?" at the start of the query and the "#" at the end, so we don't need to handle these characters when separating the variables. For the same reason the first pair is not preceded by "&", which is why the pattern also accepts the start of the string.
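A minimal sketch of the splitting step in Python follows. The function name split_pairs is hypothetical, and the pattern used here is one that also matches the first pair (which has no leading "&"). Percent-decoding is done only after splitting, so that an encoded "&" or "=" inside a value cannot be mistaken for a delimiter.

```python
import re
from urllib.parse import unquote

# One key=value pair, either at the start of the query or after an "&".
PAIR_RE = re.compile(r"(?:^|&)([^&=]*)=([^&]*)")

def split_pairs(query):
    """Return a list of (key, value) tuples, percent-decoded after splitting."""
    return [(unquote(key), unquote(value)) for key, value in PAIR_RE.findall(query)]

for key, value in split_pairs("start=0&id%5Bkey1%5D=1&object=%7Bone:1,two:2%7D"):
    print(key, value)
```

Tokens without an "=" are skipped by this pattern; whether they should become flags or be rejected is a rule each implementation has to pick.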
Normalizing the key
The descriptive standard RFC6570 provides generic rules for the format of the key, but it does not help much with the rules for interpreting the key when an object is constructed. Some specifications, such as the OpenAPI specification and the JSON API specification, can help with the interpretation, but they provide only a subset of the rules rather than the full set. To make things worse, some SDKs (e.g. the PHP SDK) have their own rules for building keys.
In such a situation, the best approach is to create hierarchical rules for key normalization that convert the key to a unified format, similar to JSON Path dot notation. The hierarchical rules allow us to control how ambiguous situations (collisions between specifications) are resolved, by controlling the order of the rules. The JSON Path notation allows us to build the object in the final step without requiring the key=value pairs to arrive in any particular order.
Following is the grammar of the normalized format:
key = sub-key *("." sub-key )
sub-key = name [ ("[" index "]") ]
name = *( varchar )
index = "0" / ( NONZERO-DIGIT *( DIGIT ) )
This grammar will allow for keys such as foo, foo.baz, foo[0].baz, foo.baz[0], foo.bar.baz etc.
The following rules and transformations are a good starting point:
Flat key (key -> key)
Attribute key (key.atr -> key.atr)
Array key (key[] -> key[0])
Object Array key (key[attribute] -> key.attribute), (key[][attribute] -> key[0].attribute), (key[attribute][] -> key.attribute[0])
More rules can be added to address special cases. During the transformation, the algorithm should pass from the most specific rules (the bottom rules) to the most generic rules and try to find a full match. If a full match is found, the key is overwritten with the normal form and the remaining rules are skipped.
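The ordered-rule idea can be sketched in Python as a list of (pattern, template) pairs tried from most specific to most generic. Everything named here (NAME, RULES, normalize_key) is hypothetical. The sketch also assumes names start with a letter or underscore, so that a numeric index is never mistaken for an attribute, and it adds one extra rule for indexed keys such as x[0][foo] that is not in the list above.

```python
import re

NAME = r"[A-Za-z_][A-Za-z0-9_]*"  # assumption: names never start with a digit

# Most specific rules first; the first full match wins.
RULES = [
    (re.compile(rf"({NAME})\[(\d+)\]\[({NAME})\]"), r"\1[\2].\3"),  # extra rule: key[0][attr]
    (re.compile(rf"({NAME})\[\]\[({NAME})\]"), r"\1[0].\2"),        # key[][attr]
    (re.compile(rf"({NAME})\[({NAME})\]\[\]"), r"\1.\2[0]"),        # key[attr][]
    (re.compile(rf"({NAME})\[({NAME})\]"), r"\1.\2"),               # key[attr]
    (re.compile(rf"({NAME})\[\]"), r"\1[0]"),                       # array key
    (re.compile(rf"({NAME}(?:\.{NAME})+)"), r"\1"),                 # attribute key
    (re.compile(rf"({NAME})"), r"\1"),                              # flat key
]

def normalize_key(key):
    """Rewrite key into the dotted normal form; unknown shapes are returned unchanged."""
    for pattern, template in RULES:
        match = pattern.fullmatch(key)
        if match:
            return match.expand(template)
    return key

for raw in ("start", "id[key1]", "x[]", "nested[][foo]", "x[0][foo]"):
    print(raw, "->", normalize_key(raw))
```

Reordering or extending RULES is how collisions between specifications would be settled in this design; the matching loop itself never changes.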
Normalizing the value
Similar to the normalization of the key, the value should also be normalized in cases where it represents a list. We need to convert the value from the arbitrary list format to the form format (comma-separated list), which is defined by the following grammar:
value = single-value *( "," single-value )
single-value = *( unreserved / pct-encoded )
This grammar allows the value to take the forms a, a,b, a,b,c, etc.
Extracting the list of the values from the value string can be done with splitting the string by the valid delimiters (",",";","|", etc.) and producing the list in a normalized form.
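A minimal Python sketch of this normalization, assuming the delimiters appear literally in the value and that ",", ";" and "|" are the only ones supported (the names DELIMITERS and normalize_value are hypothetical):

```python
import re

# Assumed delimiter set; extend the character class to support more.
DELIMITERS = re.compile(r"[,;|]")

def normalize_value(value):
    """Rewrite any supported list form into a comma-separated list."""
    return ",".join(DELIMITERS.split(value))

print(normalize_value("1|2|3"))
print(normalize_value("policyid|second"))
```

Note that a value such as {one:1,two:2} also contains a comma, so object-like values would need to be recognized before this step if they are to be kept intact.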
Building the JSON Object
Once the keys and the values are normalized, converting the flat list (the map structure) to a JSON object can be done in a single pass through all of the keys in the list. The normalized format of the key helps us here, since the key conveys the whole information about its hierarchy in the object, so even if we have not yet encountered some of the intermediate attributes, we are able to build the object.
Similarly, we can recognize from the value itself whether the attribute should be a flat string or an array, so here as well no additional information is required to create the proper representation.
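The single pass can be sketched in Python with plain dicts and lists standing in for the cJSON objects and arrays. The helper names (parse_path, child, build_object) are hypothetical. The sketch assumes the keys and values are already in the normalized forms described in the text, that no key is used both as a leaf and as a container, and that a repeated key simply overwrites the earlier value.

```python
import json
import re

# A normalized key is a sequence of names and "[n]" indexes, e.g. "nested[0].foo".
TOKEN_RE = re.compile(r"([A-Za-z0-9_]+)|\[(\d+)\]")

def parse_path(key):
    """'nested[0].foo' -> ['nested', 0, 'foo']"""
    return [name if name else int(index) for name, index in TOKEN_RE.findall(key)]

def child(node, step, factory):
    """Return node[step], creating it with factory() when it does not exist yet."""
    if isinstance(step, int):
        while len(node) <= step:      # pad the list up to the requested index
            node.append(None)
        if node[step] is None:
            node[step] = factory()
        return node[step]
    return node.setdefault(step, factory())

def build_object(pairs):
    root = {}
    for key, value in pairs:
        path = parse_path(key)
        node = root
        # Intermediate containers: a list when the next step is an index, else a dict.
        for step, nxt in zip(path, path[1:]):
            node = child(node, step, list if isinstance(nxt, int) else dict)
        # A comma in the normalized value marks an array; otherwise keep a string.
        leaf = value.split(",") if "," in value else value
        last = path[-1]
        if isinstance(last, int):
            while len(node) <= last:
                node.append(None)
        node[last] = leaf
    return root

pairs = [("start", "0"), ("id.key1", "1"), ("id.key2", "2"),
         ("nested[0].foo", "1"), ("nested[0].bar", "2")]
print(json.dumps(build_object(pairs)))
```

Since every key carries its full path, the pairs can arrive in any order; the intermediate containers are created on demand.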
Alternative approach
As an alternative approach, we can construct a full grammar that produces an AST (abstract syntax tree) and use the tree to produce the JSON object. However, given the many variations of the formats and the likelihood of future extensions, this approach will be less flexible.
Useful links
The grammar in the text is following ABNF grammar rules
JSON Path
GNU Bison is an example of a BNF parser
C PEG parser library is an example of a PEG parser
I recently ran into the same issue and will share some wisdom gained from the episode.
I'm assuming you are implementing this on a MITM device (web firewall, etc.).
As noted in the question, there is no consensus on how query parameters are passed. There is no single standard or set of rules that governs this; in fact, any server may implement its own syntax, as long as the server code supports it. The best one can do is to 1) decide which query parameter forms to support (do the best you can, maybe as many as possible) and 2) support only those forms and treat the rest (the unsupported ones) as String values, like your current code does.
It's not worth fretting too much about the accuracy of the type preservation/inference in question, or formalizing/generalizing it into a heavyweight solution, because 1) the syntax you may encounter is arbitrary (web servers can really do whatever they want, so the query parameters often don't conform to, say, the Swagger document referenced) and 2) looking at the query parameters alone only gives you so much information, so the benefit of implementing anything more than vague approximations (per rules you define yourself, as stated before) is hard to see. Think about how vague even the simplest cases can be: in the exploded x=something&x=something case, you sorta have to pretend that arrays have at least two elements. If there is only one element (x=something) you treat it as a string, for how else would you know whether it's an array or a string? How about the x=1 case: is the original / intended type of 1 a string or a number? Also, how about x=foo&y=1 | 2 | 3, or "1, 2, 3" with spaces? Are the spaces supposed to be ignored, are they array delimiters themselves, or are they actually part of the array elements? Finally, how do you even know the intended string is not "1 | 2 | 3" itself, meaning it's not an array at all!
So the best one can do in parsing these strings and trying to support/infer all these variations (different rules) is to define one's own rules (what one is okay/happy with) and support only those.
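As one example of such a self-defined rule, here is a small Python sketch (the name aggregate is hypothetical) of the "exploded" convention described above: a key seen once stays a string, and a key seen two or more times becomes an array. In the question's C code this logic would live in the currently empty duplicate-key branch.

```python
def aggregate(pairs):
    """Fold repeated keys into lists; keys seen once keep their string value."""
    result = {}
    for key, value in pairs:
        if key not in result:
            result[key] = value                  # first sighting: plain string
        elif isinstance(result[key], list):
            result[key].append(value)            # third and later sightings
        else:
            result[key] = [result[key], value]   # second sighting: promote to array
    return result

print(aggregate([("x", "1"), ("x", "2"), ("x", "3"), ("y", "foo")]))
```

The one-element ambiguity remains by design: x=1 on its own is stored as the string "1", because nothing in the query says otherwise.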
I have this SQL statement for retrieving a specific role (by the column RoleID) from the table Roles:
DBBroker.getInstance.read("SELECT * FROM Roles WHERE RoleID='" & role.roleID & "';")
The thing is that when my program runs the statement, I obtain this error:
I work with a Microsoft Access Database in which RoleID is defined as Autonumber while in my program it is defined as String. Anyway I tried changing Types but it still fails.
I have too much code in the program so I cannot include it here, but I'm open for any requests regarding a specific part of the program.
Thanks
By the way I worked before on another similar database and the exact same clause did work indeed.
Though you solved your problem, I am answering this to describe the issue and help people who visit this page in the future.
Here's a one-line explanation of the issue:
The data type you set for the column in the database table is different from the type of the value you are passing.
For example: in simple words, you have a word and a number. Can you add them, in a mathematical way? The answer is NO.
Now, assuming that the data type of the RoleID column is Integer but role.RoleId returns a value of type String, there will be a data-type mismatch, as one is an Integer and the other is a String.
Going through the comments, I see that you've solved your issue. Now let me explain how you solved it.
Your sql query looks like this :
"SELECT * FROM Roles WHERE RoleID='" & role.roleID & "';"
Let's point out the main relevant part :
RoleID='" & role.roleID & "'
Here, before you close the string RoleID= with double quotes, you use ' (a single quote). In SQL queries, single quotes are used to give a value of type String to the given parameter of the query.
To pass an Integer value, you pass it without the single quotes, like this:
RoleID=12345
So the answer is simple: the column expects an Integer, and even if the value you passed was an integer, wrapping it in ' single quotes turned it into a String. All you have to do (had to do, in your case) is remove the two single quotes (').
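The difference can be sketched in a few lines of Python (the original code is VB; the function name role_query is hypothetical). The value is converted with int() first, so only an integer literal ever reaches the SQL text, and it is then appended without quotes:

```python
def role_query(role_id):
    """Build the query with RoleID as an unquoted integer literal."""
    # int() raises ValueError for non-numeric input instead of producing bad SQL.
    return "SELECT * FROM Roles WHERE RoleID=" + str(int(role_id)) + ";"

print(role_query("12345"))
```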
Hope this helps to enrich your knowledge :)
I've got an array of filepaths and I've got a NSPredicateEditor setup in my UI where the user can combine a NSPredicate to find a file. He should be able to filter by name, type, size and date.
There are several problems I have now:
I can only get one predicate object from the editor. When I use "predicateForRow:" it returns (null)
If the user wants to filter the file by name AND size or date, I can't just use this predicate on my array anymore because that information is not contained in it
Can I split up a predicate into different predicates without converting it into a NSString object, then searching for every @" OR " | @" AND ", separating the components into an array and then converting every NSString into a new predicate?
In the NSPredicateEditor settings I've some options for the "left Expression":
Keypaths, Constant Values, Strings, Integer Numbers, Floating Point Numbers and Dates. I want to display a dropdown menu to the user with "name", "type", "date", "size". But then the generated predicate automatically looks like this:
"name" MATCHES[c] "nameTest" OR "type" MATCHES[c] "jpg" OR size == 100
Because the array is filled with strings, and those strings do not respond to @"myString".name, a search for "name", "type" etc. always returns 0 objects. Is there a way to show Name, Type, Size and Date in the menu, but write "self" into the predicate without doing it by hand?
I've already searched the official Apple tutorials, Stack Overflow, Google, and even YouTube to find a clue. This problem has been troubling me for almost a week now. Thanks for your time! If you need more information, please let me know!
You have come to the right place! :)
I can only get one predicate object from the editor.
Correct. It is an NSPredicateEditor, not an NSPredicatesEditor. ;)
When I use "predicateForRow:" it returns (null)
I'm not sure I would use that method. My general rule of thumb is to largely ignore that NSPredicateEditor is a subclass of NSRuleEditor, mainly because it's such a highly specialized subclass that many of the superclass methods don't make that much sense on a predicate editor (like all the stuff about criteria, row selection, etc). It's possible that they're somehow relevant, but if they are, I haven't figured out how yet.
To get the predicate from the editor, you do:
NSPredicate *predicate = [myPredicateEditor objectValue];
If the user wants to filter the file by name AND size or date
You mean (name = [something]) AND (size = [something] OR date = [something])?
If so, NSPredicateEditor can do that if you've set the nesting mode to "Compound".
I can't just use this predicate on my array anymore because that information is not contained in it
What information do you need?
Can I split up a predicate into different predicates without converting it into a NSString object, then searching for every @" OR " | @" AND ", separating the components into an array and then converting every NSString into a new predicate?
Yes, but that is a BAD idea. It's bad because NSPredicate already contains all the information you need, and converting it to a different format and doing string manipulations just isn't necessary and can potentially lead to complications (like if someone can type in a value for "name", what happens if they type in " OR "?).
I'm having a hard time trying to figure out what it is you're trying to do. It sounds like you have an array of NSString objects that you want to filter based on a predicate that the user creates? If so, then what do these name, date, and size key paths mean? What are you trying to do?
I'm working on a servlet in Google App Engine. This servlet retrieves data from GAE's datastore; everything works fine when querying like "SELECT * FROM...". But when I want to filter by a certain column, it does not work, since the name of the column has a hyphen. It is like the following:
Query query = new Query("tableName");
query.addFilter("col-name", Query.FilterOperator.EQUAL, filterValue);
How do I pass the propertyName with a hyphen?
Java only accepts letters, digits, the dollar sign "$" and the underscore character "_" in legal identifiers.
So I believe that's not possible. It also didn't work in Python.
http://java.sun.com/docs/books/tutorial/java/nutsandbolts/variables.html#naming
The AppEngine datastore doesn't have rows or columns; it has models and properties.
The Defining Data Classes page talks about defining your models; the important thing to note is that the Java rules for identifier names matter, because each property of a model will at some point be turned into a Java object with the same name.
You've described this yourself:
if I filter by a column called "field-1", it is kind of I was trying to subtract 1 from every returned value of the column called field
Is the addFilter method correctly enclosing the column name in single quotes? You might want to try adding them yourself. One can filter by things which are not keys in the database in GQL so this might be something that is expected of you.
What you want is for users to just type in their search criteria just like they would in Google. Some words, maybe some quoted phrases, maybe a few operators, and have it just work.
A .Net solution is available here:
http://ewbi.blogs.com/develops/2007/05/normalizing_sql.html
I am looking for a pure T-SQL version with where support also. (Or VbScript/javascript)
Example: "dog" food price:20..45
should look like this (for mssql):
select * from table t join containstable(desc, '"dog" and food*') k on k.key=t.id
where t.price between 20 and 45
Operators: and, or, near, "", not, * , etc.
I don't see how you could have this functionality short of writing a complete parser that is programmed with the table relationships and column datatypes that exist on your database.
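For the narrow subset shown in the question's example (quoted phrases, bare words and field:low..high ranges, all joined with "and"), a small tokenizer may be enough; the fuller operator set (or, near, not) and knowledge of the table layout would still need the kind of parser described above. The following Python sketch is only an illustration of that subset. The function name translate, the t. alias and the rule that bare words become prefix terms are assumptions taken from the example strings, and tokens containing anything other than word characters are dropped rather than passed into the SQL text.

```python
import re

TOKEN_RE = re.compile(r'"[^"]*"|\S+')            # a quoted phrase or a run of non-spaces
RANGE_RE = re.compile(r"(\w+):(\d+)\.\.(\d+)")   # field:low..high
PHRASE_RE = re.compile(r'"[\w ]+"')              # phrase made of word characters and spaces
WORD_RE = re.compile(r"\w+")

def translate(search):
    """Return (full-text search condition, where clause) for the supported subset."""
    terms, filters = [], []
    for token in TOKEN_RE.findall(search):
        rng = RANGE_RE.fullmatch(token)
        if rng:
            filters.append(f"t.{rng[1]} between {rng[2]} and {rng[3]}")
        elif PHRASE_RE.fullmatch(token):
            terms.append(token)
        elif WORD_RE.fullmatch(token):
            terms.append(token + "*")
        # Any other token is ignored in this sketch.
    return " and ".join(terms), " and ".join(filters)

condition, where = translate('"dog" food price:20..45')
print(condition)
print(where)
```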