I am trying to write a PostgreSQL (11.2) server side function to read the key-value pairs of an input JSONB object. I did this (in print_kv_pair below) by trying to
extract the JsonbPairs from the input jsonb object and
iterate through the keys and values and print them.
For example, for '{"a":1, "b": 2}', I expect it to print
k = "a", v = 1
k = "b", v = 2
However, the code outputs strange characters for the key, and the values (1 and 2) are not of numeric type as I expect. Please see the sample output at the end of the question.
Can someone explain how to fix the code and correctly iterate through the key-value pairs?
PG_FUNCTION_INFO_V1(print_kv_pair);
Datum
print_kv_pair(PG_FUNCTION_ARGS)
{
//1. extracting JsonbValue
Jsonb *jb1 = PG_GETARG_JSONB_P(0);
JsonbIterator *it1;
JsonbValue v1;
JsonbIteratorToken r1;
JsonbParseState *state = NULL;
if (jb1 == NULL)
PG_RETURN_JSONB_P(jb1);
if (!JB_ROOT_IS_OBJECT(jb1))
ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("Can only take objects")));
it1 = JsonbIteratorInit(&jb1->root);
r1 = JsonbIteratorNext(&it1, &v1, false);
if (r1 != WJB_BEGIN_OBJECT)
ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("Iterator was not an object")));
JsonbValue *object = &v1;
Assert(object->type == jbvObject);
//2. iterating through key-value pairs
JsonbPair *ptr;
for (ptr = object->val.object.pairs;
ptr - object->val.object.pairs < object->val.object.nPairs; ptr++)
{
//problem lines!!!
char *buf = pnstrdup(ptr->key.val.string.val, ptr->key.val.string.len);
elog(NOTICE, "print_kv_pair(): k = %s", buf); //debug
if (ptr->value.type != jbvNumeric) {
ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("value must be numeric")));
}
elog(NOTICE, "print_kv_pair(): v = %s", DatumGetCString(DirectFunctionCall1(numeric_out,
NumericGetDatum(ptr->value.val.numeric))) ); //debug
}
elog(NOTICE, "print_kv_pair(): ok4");
PG_RETURN_BOOL(true);
}
Sample output with problem line disabled:
=> select print_kv_pair('{"a":1.0, "b": 2.0}'::jsonb);
NOTICE: print_kv_pair(): k = $�K
ERROR: value must be numeric
It seems that part 1 (extracting the JsonbValue) isn't working properly, and the extracted value points to invalid memory.
(I'm not very familiar with JSONB or the server side PostgreSQL programming.) Any suggestion is appreciated.
I'm trying to add a string to the next index of my array, but I'm hitting an odd issue whenever I add to any index other than 1 (yes, I'm indexing my array from one, just to match the actual number of the item for simplicity, I'm sorry!). Essentially, if I add to index 1, everything is fine. However, when I add to index 2, the value at index 1 changes to the value I'm about to put into index 2, even before the line of code that adds the value to index 2 has run. For example:
Say I add 'test1' to my array queuedHashes[100][101] at index 1 and then print every element of the array next to its index number. I get:
1: test1
However when I go to add 'test2' at index 2 to the array, I would get:
1: test2
2: test2
I've been pulling my hair out for hours trying to fix this, and can't see where I'm going wrong. I'm currently passing my variables around using structs (due to GTK limitations), however this issue also persisted when the variables were global variables, before I changed them to local ones.
Here is my code:
queue_hash function:
static void queue_hash (GtkButton *button, gpointer user_data) {
struct data *dataStruct = user_data;
GtkWidget *hashWid = dataStruct->hash;
GtkWidget *hashTypeWid = dataStruct->hashType;
GtkWidget *hashEntryLabel = dataStruct->hashEntryLabel;
GtkListStore *store;
GtkTreeIter iter;
const char* hash = gtk_entry_get_text(GTK_ENTRY(hashWid));
int hashLen = gtk_entry_get_text_length(GTK_ENTRY(hashWid));
int hashTypeIndex = gtk_combo_box_get_active(GTK_COMBO_BOX(hashTypeWid));
store = GTK_LIST_STORE(gtk_tree_view_get_model(GTK_TREE_VIEW(list)));
// TODO: Update max length to 128
if (hashLen > 100) {
gtk_widget_set_name(hashWid, "fieldsError");
gtk_label_set_text(GTK_LABEL(hashEntryLabel), "Hash exceeds max length of 100");
g_print("Hash length exceeds 100: Exiting.");
return;
}
gtk_widget_set_name(hashWid, "");
gtk_widget_set_name(hashTypeWid, "");
gtk_widget_set_name(hashEntryLabel, "");
gtk_label_set_text(GTK_LABEL(hashEntryLabel), "Hash to be cracked:");
if ((strcmp(hash, "") == 0) || (hashTypeIndex == -1)) {
if (strcmp(hash, "") == 0) {
gtk_widget_set_name(hashWid, "fieldsError");
}
if (hashTypeIndex == -1) {
gtk_widget_set_name(hashTypeWid, "fieldsError");
}
g_print("Invalid Entry \n");
} else {
// Check for spaces in hash - return if found
// TODO: Check for other non-alphabetical chars/symbols
for (int i = 0; i < hashLen; i++) {
if (hash[i] == ' ') {
gtk_widget_set_name(hashWid, "fieldsError");
gtk_widget_set_name(hashEntryLabel, "errorLabel");
gtk_label_set_text(GTK_LABEL(hashEntryLabel), "Please remove all spaces");
g_print("Space found in hash: Exiting\n");
return;
}
}
g_print("//////////////////////////////\n");
g_print("Before: (HashCount: %i)\n", dataStruct->hashCount);
//test_queue(dataStruct->queuedHashes, dataStruct->hashCount);
for (int i = 1; i <= dataStruct->hashCount; i++) {
g_print("%i: %s\n", i, dataStruct->queuedHashes[i][0]);
}
sleep(1);
// Save hash to array
++dataStruct->hashCount;
g_print("After Increment: %i\n", dataStruct->hashCount);
g_print("Hash: %s\n", hash);
dataStruct->queuedHashes[dataStruct->hashCount][0] = hash; // Line to actually add new string to array
dataStruct->queuedHashTypes[dataStruct->hashCount] = hashTypeIndex;
g_print ("Queue Hash: %s %i\n", dataStruct->queuedHashes[dataStruct->hashCount][0], dataStruct->queuedHashTypes[dataStruct->hashCount]);
sleep(1);
g_print("After: (HashCount: %i)\n", dataStruct->hashCount);
//test_queue(dataStruct->queuedHashes, dataStruct->hashCount);
g_print("Manual 1: %s, 2: %s\n", dataStruct->queuedHashes[1][0], dataStruct->queuedHashes[2][0]);
for (int i = 1; i <= dataStruct->hashCount; i++) {
g_print("%i: %s\n", i, dataStruct->queuedHashes[i][0]);
}
}
Part of calling function that calls the above function:
struct data *hash_data = g_new0(struct data, 1);
hash_data->hash = hashEntry;
hash_data->hashType = hashSelect;
hash_data->hashEntryLabel = hashEntryLabel;
g_signal_connect(queueButton, "clicked", G_CALLBACK (queue_hash), hash_data);
Global struct definition:
struct data {
char* queuedHashes[100][101];
int queuedHashTypes[100];
int hashCount;
GtkWidget *hash;
GtkWidget *hashType;
GtkWidget *hashEntryLabel;
GtkTreeSelection *selectedHash;
};
I have a lot of print statements in there to help illustrate where things seem to be changing unexpectedly. Here is the output of the program when it is run and two values are entered:
//////////////////////////////
Before: (HashCount: 0)
After Increment: 1
Hash: 12357890
Queue Hash: 12357890 1
After: (HashCount: 1)
Manual 1: 12357890, 2: (null)
1: 12357890
//////////////////////////////
Before: (HashCount: 1)
1: asdfghjkl <----- This should be "1: 12357890", as the array has not yet been changed
After Increment: 2
Hash: asdfghjkl
Queue Hash: asdfghjkl 2
After: (HashCount: 2)
Manual 1: asdfghjkl, 2: asdfghjkl
1: asdfghjkl <----- This should be "1: 12357890"
2: asdfghjkl
Here is my full code for all relevant functions: https://pastebin.com/41W3n5W2
Any help would be greatly appreciated, thanks!
The symptom described in the question usually indicates that the code isn't making a copy of the string. As a result, every "string" stored in the array is just a pointer to the same input buffer, and therefore every entry in the array appears to be the last string received from the user.
The fix is to make a copy of the string. On a POSIX system, you can use the strdup function to make a copy. The strdup function is essentially a call to malloc followed by a call to strcpy: it makes a copy of the string and returns a pointer to the copy. So if your implementation doesn't support strdup, you can easily write your own function to allocate memory and copy the string.
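As a rough illustration of the hand-written replacement described above, here is a minimal sketch. The name copy_string is hypothetical (it is not part of GTK or the question's code), and the caller is assumed to free() the result once the entry is no longer needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, similar in spirit to POSIX strdup():
 * returns a heap-allocated copy of s, or NULL if allocation fails.
 * The caller owns the returned memory and must free() it. */
char *copy_string(const char *s)
{
    assert(s != NULL);              /* precondition: a valid C string */

    size_t len = strlen(s) + 1;     /* +1 for the terminating NUL */
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);       /* copies the terminator as well */
    return copy;
}
```

In the question's queue_hash function, the assignment would then store copy_string(hash) (or strdup(hash)) instead of hash itself, so each array slot points at its own buffer rather than at the entry widget's internal text.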
I am using the C-API for SQLite3 and the json1 extension. Within the database, a list of integers is stored as a json_array. I want to create a C integer array from the json_array using the json_extract function. I am looping over each value in the json array by incrementing the index in the SQL statement. As an example, consider:
CREATE TABLE mytable ( label INTEGER PRIMARY KEY, list TEXT);
INSERT INTO mytable VALUES ( 1, json(json_array(1,2,3)) );
SELECT json_extract( list, '$[index]' ) FROM mytable WHERE label == 1;
---Example: the result for index=0 is the integer: 1
In the C program, I am currently creating a character string to represent the single-quoted portion of the command, '$[index]', as a bound parameter, as shown in the snippet below.
Can or should I avoid using sprintf to set the index? Or, is this an acceptable solution?
char *sql = "select json_extract(list, ?) from mytable where label == 1";
char *index_param = (char *)malloc(80);
// OTHER STUFF: prepare sql stmt, etc, etc...
for (int i=0; i<n; i++) { /* n is the number of values in the json list */
/* Is sprintf the best thing to do here? */
index_length = sprintf(index_param, "$[%d]", i);
sqlite3_bind_text(stmt, 1, index_param, index_length+1, SQLITE_STATIC);
result = sqlite3_step(stmt);
values[i] = sqlite3_column_int(stmt, 0);
sqlite3_reset(stmt);
}
You could construct the path in SQL so that you have only an integer parameter:
SELECT json_extract(list, '$[' || ? || ']') FROM ...
But it would be a better idea to read the array values directly with the json_each() function:
const char *sql = "SELECT value FROM MyTable, json_each(MyTable.list) WHERE ...";
// prepare ...
int i = 0;   // next free slot in values[]
int rc;
for (;;) {
    rc = sqlite3_step(stmt);
    if (rc != SQLITE_ROW)
        break;   // SQLITE_DONE or an error code
    values[i++] = sqlite3_column_int(stmt, 0);
}
In libconfig, is it possible to dynamically enumerate keys?
As an example, in this example config file from their repo - if someone invented more days in the hours section, could the code dynamically enumerate them and print them out?
Looking at the docs, I see lots of code to get a specific string, or list out an array, but I can't find an example where it enumerates the keys of a config section.
Edit
Received some downvotes, so thought I'd have another crack at being more specific.
I'd like to use libconfig to track some state in my application, read in the last known state when the app starts, and write it out again when it exits. My app stores things in a tree (of depth 2), so this could be nicely represented as an associative array in a libconfig-compatible file as below. The point is that the list of IDs (1234/4567) can change. I could track them in another array, but if I could just enumerate the 'keys' in the ids array below, that would be neater.
so
ids = {
"1234" = [1,2,3]
"4567" = [9,10,11,23]
}
e.g. (pseudocode)
foreach $key(config_get_keys_under(&configroot)){
config_get_String($key)
}
I can't see anything obvious in the header file.
You can use the config_setting_get_elem function to get the n-th element of a group, array or list, and then (if the parent is a group) use config_setting_name to get its name. But AFAIK a setting name can't begin with a digit, so keys like 1234 won't work as names. So consider the following config structure:
ids = (
{
key = "1234";
value = [1, 2, 3];
},
{
key = "4567";
value = [9, 10, 11, 23];
}
);
Then you can easily enumerate through all members of the ids getting the values you want using the following code:
#include <stdio.h>
#include <libconfig.h>

int main(int argc, char **argv) {
    struct config_t cfg;
    char *file = "config.cfg";

    config_init(&cfg);

    /* Load the file */
    printf("loading [%s]...\n", file);
    if (!config_read_file(&cfg, file)) {
        printf("failed\n");
        return 1;
    }

    config_setting_t *setting, *member, *array;

    setting = config_lookup(&cfg, "ids");
    if (setting == NULL) {
        printf("no ids\n");
        return 2;
    }

    int n = 0, k, v;
    char const *str;

    while (1) {
        member = config_setting_get_elem(setting, n);
        if (member == NULL) {
            break;
        }
        printf("element %d\n", n);
        if (config_setting_lookup_string(member, "key", &str)) {
            printf(" key = %s\n", str);
        }
        array = config_setting_get_member(member, "value");
        k = 0;
        if (array) {
            printf(" values = [ ");
            while (1) {
                if (config_setting_get_elem(array, k) == NULL) {
                    break;
                }
                v = config_setting_get_int_elem(array, k);
                printf("%s%d", k == 0 ? "" : ", ", v);
                ++k;
            }
            printf(" ]\n");
        }
        ++n;
    }
    printf("done\n");

    /* Free the configuration */
    config_destroy(&cfg);
    return 0;
}
I am writing to an array from a loop within a loop. The values are writing over themselves.
A couple of background notes: keyname = GET_STRING_VALUE(Ds, Os, Fs, Rs, Is); pulls string values from a database. An example would be 71001093. Those keys will be different for each Rs (the record number for this database). Fk and Fs relate to different columns in that database. The code is supposed to loop through the first few records of the database and find the name relating to each key. For the names that match keycmp, it should add them to the array.
The Issue
The ArrayCheck printouts at the bottom all display the last key entered into the array. The key and counter printouts inside the loop display the correct iteration number and associated key.
Code
char* status_keys [ 2 ][ 200 ];
int Ds, Os, Fs, Rs, Is, Fk, a, c;
char* keyname;
char* keycmp;
char* stationlookup[4];
char* key;
keycmp = "STRING";
for ( Rs = 1; Rs < 5; Rs++ ) {
keyname = GET_STRING_VALUE(Ds, Os, Fs, Rs, Is);
printf("keyname: %s\n", keyname);
do {
strncpy(stationlookup, keyname, 4);
stationlookup[4] = '\0';
key = GET_STRING_VALUE(Ds, Os, Fk, Rs, Is);
printf("key : %s\n",key);
printf("counter : %d\n",a);
status_keys[0][a] = key;
status_keys[1][a] = stationlookup;
a++;
} while (strstr(keyname,keycmp) != NULL);
}
printf("ArrayCheck 0: %s\n", status_keys[0][0]);
printf("ArrayCheck 1: %s\n", status_keys[0][1]);
printf("ArrayCheck 2: %s\n", status_keys[0][2]);
printf("ArrayCheck 3: %s\n", status_keys[0][3]);
Example Output:
Appreciate the stationlookup help, but this code still provides the writing over issue.
for ( Rs = 1; Rs < 5 ; Rs++ ) {
keyname = GET_STRING_VALUE(Ds, Os, Fs, Rs, Is);
printf("keyname: %s\n", keyname);
do {
status_keys[0][a] = GET_STRING_VALUE(Ds, Os, Fk, Rs, Is);
printf("key : %s\n",status_keys[0][a]);
printf("counter : %d\n",a);
a++;
} while (strstr(keyname,keycmp) != NULL);
}
You declared:
char* stationlookup[4];
Which means that valid indices are [0], [1], [2] and [3].
So a line of code like this:
stationlookup[4] = '\0';
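writes past the end of the array. (Note also that this declares an array of four char pointers, not a character buffer. To hold a 4-character string plus its terminator you would want char stationlookup[5];.)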
I suspect your problems come from these lines:
status_keys[0][a] = key;
status_keys[1][a] = stationlookup;
assuming that GET_STRING_VALUE() returns the same string pointer on each call. The above lines don't copy the strings but instead copy the string pointers which results in what looks like your array elements being overwritten. You can check this by printing the pointer value with:
printf("keyptr : %p\n", key);
To correct this you'll need to change your code to something like:
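char status_keys [ 2 ][ 200 ][64];
...
strncpy(status_keys[0][a], key, 63); // or strlcpy() if you have it
status_keys[0][a][63] = '\0'; // strncpy() doesn't terminate on truncation
strncpy(status_keys[1][a], stationlookup, 63);
status_keys[1][a][63] = '\0';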
or something like:
char* status_keys [ 2 ][ 200 ];
...
status_keys[0][a] = malloc(strlen(key)+1);
strcpy(status_keys[0][a], key);
// Same for stationlookup
// Make sure to free() the strings at some point
if you prefer dynamic memory allocation.
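For the fixed-size variant, a bounded copy can also be sketched with snprintf, which always NUL-terminates when the size is non-zero. The names store_key and KEY_SIZE below are hypothetical, chosen only for this illustration; the slot size of 64 matches the array shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define KEY_SIZE 64   /* size of one slot, terminator included */

/* Hypothetical helper: copies src into a fixed-size slot, truncating
 * if it does not fit. Returns 1 if the whole string fit, 0 if it was
 * truncated. */
int store_key(char dst[KEY_SIZE], const char *src)
{
    assert(dst != NULL && src != NULL);

    /* snprintf writes at most KEY_SIZE - 1 characters plus a NUL */
    snprintf(dst, KEY_SIZE, "%s", src);
    return strlen(src) < KEY_SIZE;
}
```

With char status_keys[2][200][64], a call such as store_key(status_keys[0][a], key) gives each element its own copy of the characters, so a later call that reuses the same source buffer no longer changes earlier entries.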
I have a Perl script with 2 arrays: one with keys and one with strings.
I need to check whether the strings in one array have matches in the keys array.
The number of records is huge (it can be counted in millions), so I use Inline::C to speed up the search. However, it still takes hours to process the records.
--Perl part
# %h contains {"AAAAA1" => 1, "BBBBBB" => 1, "BB1234" => 1, "C12345" => 1, ...}
my @k = sort keys %h;
# @k contains ["AAAAA1", "BBBBBB", "BB1234", "C12345", ...]
my @nn;
# @n contains ["AAAAA1999", "AAAAABBB134", "D123edae", "C12345CCSAER"]
# "AAAAA1" (from @k) can be found in "AAAAA1999" (in @n) = OK
foreach (@n) {
    my $res = array_search(\@k, $_);
    if ($res) {
        $y++;
    } else {
        $z++;
        push @nn, $_;
    }
}
--C part
int fastcmp ( char *p1, char *p2 ) {
while( *p1 ){
char *a = p1, *b = p2;
if (*b != *a) return 0;
++p1; ++b;
}
return 1;
}
int array_search(AV *a1, SV *s1){
STRLEN bytes1;
char *p1,*p2,*n;
long a1_size,i,c;
a1_size = av_len(a1);
p1 = SvPV(s1,bytes1);
for(i=start;i<=a1_size;++i){
SV** elem = av_fetch(a1, i, 0);
SV** elem_next = (i<a1_size-1)?av_fetch(a1, i+1, 0):elem;
p2 = SvPV_nolen (*elem);
n = SvPV_nolen (*elem_next);
if (p1[0] == p2[0]) {
if (fastcmp(p1,p2)>0) {
return i;
}
}
if ((p1[0] == p2[0]) && (p2[0] != n[0])) { return -1; }
}
return -1;
}
If somebody could help to optimize the search, that could be nice.
Thanks.
Note: I added comments to show what each variable contains.
The implementation you have fails in many ways:
Fails for @a=chr(0xE9); utf8::upgrade($x=$a[0]); array_search(\@a, $x);
Fails for "abc"=~/(.*)/; array_search(["abc"], $1);
Fails for array_search(["a\0b"], "a\0c");
It also incorrectly assumes the strings are NUL-terminated, which can lead to a SEGFAULT when they aren't.
Your approach scans @k for each element of @n, but if you build a trie (as the following code does), it can be scanned once.
my $alt = join '|', map quotemeta, keys %h;
my $re = qr/^(?:$alt)/;
my @nn = sort grep !/$re/, @n;
my $z = @nn;
my $y = @n - @nn;
For example, if there are 1,000 Ns and 1,000 Hs, your solution does up to 1,000,000 comparisons and mine does 1,000.
Note that 5.10+ is needed for the regex optimisation of alternations into a trie. Regexp::List can be used on older versions.
A proper C implementation will be a little faster because you can do a trie search using a function that does just that rather than using the regex engine.
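To give an idea of what such a trie search might look like, here is a minimal sketch in plain C. It is not wired into Inline::C or the Perl API, and the names TrieNode, trie_insert and trie_has_prefix_of are hypothetical. Strings are passed with explicit lengths so embedded NUL bytes are handled. It answers the same question as the regex above: does any key form a prefix of the given string?

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One trie node: a child slot per byte value plus an end-of-key flag. */
typedef struct TrieNode {
    struct TrieNode *child[256];
    int is_key;
} TrieNode;

TrieNode *trie_new(void)
{
    TrieNode *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    for (int i = 0; i < 256; i++)
        node->child[i] = NULL;
    node->is_key = 0;
    return node;
}

/* Inserts key[0..len). Returns 1 on success, 0 if allocation fails. */
int trie_insert(TrieNode *root, const char *key, size_t len)
{
    assert(root != NULL && key != NULL);
    TrieNode *node = root;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)key[i];
        if (node->child[c] == NULL) {
            node->child[c] = trie_new();
            if (node->child[c] == NULL)
                return 0;
        }
        node = node->child[c];
    }
    node->is_key = 1;
    return 1;
}

/* Returns 1 if some inserted key is a prefix of s[0..len), else 0.
 * The cost depends on len only, not on the number of keys. */
int trie_has_prefix_of(const TrieNode *root, const char *s, size_t len)
{
    assert(root != NULL && s != NULL);
    const TrieNode *node = root;
    if (node->is_key)
        return 1;                   /* the empty key prefixes everything */
    for (size_t i = 0; i < len; i++) {
        node = node->child[(unsigned char)s[i]];
        if (node == NULL)
            return 0;               /* no key continues with this byte */
        if (node->is_key)
            return 1;               /* a complete key ends here */
    }
    return 0;
}

/* Frees the whole trie; recursion depth equals the longest key length. */
void trie_free(TrieNode *node)
{
    if (node == NULL)
        return;
    for (int i = 0; i < 256; i++)
        trie_free(node->child[i]);
    free(node);
}
```

The 256-pointer node layout keeps the lookup simple but costs about 2 KB per node on a 64-bit system, so with millions of keys a more compact child representation would probably be needed.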