Sort by field from jsonb column - npgsql

I'm using npgsql to store data about shipments and part of table is jsonb column storing some details about shipment including details about customer who made shipment.
Table for displaying data about shipments is displaying only Customer Name and if get that record via
CustomerName = shipment.Metadata.RootElement.GetProperty("customer").GetProperty("customerName").ToString(),
Request is that I make this column sortable so I would need to sort by this property while accessing database.
Is it even possible to do it in NpgSql?

You can easily sort by a property inside a JSON document - just specify that property in your OrderBy clause:
_ = await ctx.Shipments
.OrderBy(s => s.Metadata.RootElement.GetProperty("customer").GetProperty("customerName").GetString())
.ToListAsync();
This produces the following:
SELECT s."Id", s."Metadata"
FROM "Shipments" AS s
ORDER BY s."Metadata"#>>'{customer,customerName}'
You should probably be able to make this use an index as well.

Related

extracting name/value pairs into columns and values - Shopify Data In Bigquery

We use FiveTran to extract our data from shopify and store it in BigQuery. The field "properties" within the order_line table contains what looks like an array of key/value pairs. In this case name/value. The field type is string here is an example of the contents
order_line_id properties
9956058529877 [{"name":"_order_bump_rule_id","value":"4afx7cbw6"},{"name":"_order_bump_bump_id","value":"769d1996-b6fb-4bc3-8d41-c4d7125768c5"},{"name":"_source","value":"order-bump"}]
4467731660885 [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_delay","value":null},{"name":"charge_on_day_of_week","value":null},{"name":"charge_interval_frequency","value":null},{"name":"charge_on_day_of_month","value":null},{"name":"shipping_interval_frequency","value":null},{"name":"number_charges_until_expiration","value":null}]
4467738738773 [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_delay","value":null},{"name":"charge_on_day_of_week","value":null},{"name":"charge_interval_frequency","value":null},{"name":"charge_on_day_of_month","value":null},{"name":"shipping_interval_frequency","value":null},{"name":"number_charges_until_expiration","value":null}]
4578798600277 [{"name":"shipping_interval_unit_type","value":null},{"name":"charge_interval_frequency","value":null},{"name":"shipping_interval_frequency","value":null}]
I am trying to write a query that generate one row per record with a column for each of these name values:
shipping_interval_unit_type
charge_on_day_of_week
charge_interval_frequency
charge_on_day_of_month
subscription_id
number_charges_until_expiration
shipping_interval_frequency
and the corresponding "value". This field "properties" can contain many different "name" values and they can be in different order each time. The "name" values noted above are not always present in the "properties" field.
I've tried json functions but it doesn't seem to be properly formatted for json. I've tried unnesting it but that fails since it is a string.
Consider below approach
select * from (
select order_line_id,
json_extract_scalar(property, '$.name') name,
json_extract_scalar(property, '$.value') value
from your_table, unnest(json_extract_array(properties)) property
)
pivot (min(value) for name in (
'shipping_interval_unit_type',
'charge_on_day_of_week',
'charge_interval_frequency',
'charge_on_day_of_month',
'subscription_id',
'number_charges_until_expiration',
'shipping_interval_frequency'
))

Query of records based on relationships

I have an object Contract that has a look-up to another object Indexationtype. I have another object IndexationEntry that has master-detail to Indexationtype. Now I would like to get the value of the percentage field in the IndexationEntry onto Contract based on the yer fields. The year in the IndexationEntry matches Year in Contract. How should I achieve this?
From Contract "up" to IndexationType__c, then "down" to IndexationEntry__c?
If there's no direct link between them it's not going to be pretty. One way would be something like this
SELECT Id, Name,
(SELECT Id, ContractNumber FROM Contracts__r WHERE Year__c = '2021'),
(SELECT Id, Percent__c FROM IndexationEntries__r WHERE Year__c = '2021')
FROM IndexationType__c
You'd have to run it once for each year. Or (since you tagged it Apex) maybe you can prepare the reference data a bit, query Indexation Types + Entries and build something like Map<Id, Map<Integer, IndexationEntry__c>> (1st key is by Indexation Type Id, then by year). Query them, populate the Map, then loop through your contracts and use map.get() to fetch your values.

How to apply condition inside column. If column having array data?

Suppose I have category table and having name field.
Inside name column I have array data like ["News","Entairntment","Food"].
Here how will I get data which category.name = Food.
Here I can't make where or where_in condition.
Any help.
thanks!!

Is there an algorithm or pattern to merge several rows of the same record into one row?

Due to some unknown fault, every time I have sync'ed my Nokia's Contacts with my Outlook Contacts, via Nokia Suite, each contact on the phone gets added to Outlook again. I now have up to four copies of some contacts in Outlook. Some have different fields populated in different duplicates.
What I want to do is import my contacts into an a database table or in-memory object collection, from CSV, and then merge the properties of all copies of each 'unique' record into one record, and import back into an empty Contacts folder in outlook. Is there any elegant way to do this, either in plain C#, LINQ, or T-SQL?
Or do I just loop through all copies (rows) of the first column, and copy any values found into versions of that column that are blank or less up to date, then carry on iterating onto the second column through to the last?
My strategy would be to first group all rows on some key like new { FirstName, LastName } or EMail (I don't know what your data looks like). Now you have groups of rows that all belong to the same person. You now need to merge them (using any algorithm you like). You could either choose the newest one, or merge individual attributes like this:
from r in rows
group r by r.EMail into g
select new {
EMail = g.Key,
DateOfBirth = g.Select(x => x.DateOfBirth).Where(x => x != null).First(),
...
}
In this example I'm picking the first non-null value for DateOfBirth non-deterministically.

Best database design (model) for user tables

I'm developping a web application using google appengine and django, but I think my problem is more general.
The users have the possibility to create tables, look: tables are not represented as TABLES in the database. I give you an example:
First form:
Name of the the table: __________
First column name: __________
Second column name: _________
...
The number of columns is not fixed, but there is a maximum (100 for example). The type in every columns is the same.
Second form (after choosing a particular table the user can fill the table):
column_name1: _____________
column_name2: _____________
....
I'm using this solution, but it's wrong:
class Table(db.Model):
name = db.StringProperty(required = True)
class Column(db.Model):
name = db.StringProperty(required = True)
number = db.IntegerProperty()
table = db.ReferenceProperty(table, collection_name="columns")
class Value(db.Model):
time = db.TimeProperty()
column = db.ReferenceProperty(Column, collection_name="values")
when I want to list a table I take its columns and from every columns I take their values:
data = []
for column in data.columns:
column_data = []
for value in column.values:
column_data.append(value.time)
data.append(column_data)
data = zip(*data)
I think that the problem is the order of the values, because it is not true that the order for one column is the same for the others. I'm waiting for this bug (but until now I never seen it):
Table as I want: as I will got:
a z c a e c
d e f d h f
g h i g z i
Better solutions? Maybe using ListProperty?
Here's a data model that might do the trick for you:
class Table(db.Model):
name = db.StringProperty(required=True)
owner = db.UserProperty()
column_names = db.StringListProperty()
class Row(db.Model):
values = db.ListProperty(yourtype)
table = db.ReferenceProperty(Table, collection_name='rows')
My reasoning:
You don't really need a separate entity to store column names. Since all columns are of the same data type, you only need to store the name, and the fact that they are stored in a list gives you an implicit order number.
By storing the values in a list in the Row entity, you can use an index into the column_names property to find the matching value in the values property.
By storing all of the values for a row together in a single entity, there is no possibility of values appearing out of their correct order.
Caveat emptor:
This model will not work well if the table can have columns added to it after it has been populated with data. To make that possible, every time that a column is added, every existing row belonging to that table would have to have a value appended to its values list. If it were possible to efficiently store dictionaries in the datastore, this would not be a problem, but list can really only be appended to.
Alternatively, you could use Expando...
Another possibility is that you could define the Row model as an Expando, which allows you to dynamically create properties on an entity. You could set column values only for the columns that have values in them, and that you could also add columns to the table after it has data in it and not break anything:
class Row(db.Expando):
table = db.ReferenceProperty(Table, collection_name='rows')
#staticmethod
def __name_for_column_index(index):
return "column_%d" % index
def __getitem__(self, key):
# Allows one to get at the columns of Row entities with
# subscript syntax:
# first_row = Row.get()
# col1 = first_row[1]
# col12 = first_row[12]
value = None
try:
value = self.__dict__[Row.__name_for_column_index]
catch KeyError:
# The given column is not defined for this Row
pass
return value
def __setitem__(self, key, value):
# Allows one to set the columns of Row entities with
# subscript syntax:
# first_row = Row.get()
# first_row[5] = "New values for column 5"
self.__dict__[Row.__name_for_column_index] = value
# In order to allow efficient multiple column changes,
# the put() can go somewhere else.
self.put()
Why don't you add an IntegerProperty to Value for rowNumber and increment it every time you add a new row of values and then you can reconstruct the table by sorting by rowNumber.
You're going to make life very hard for yourself unless your user's 'tables' are actually stored as real tables in a relational database. Find some way of actually creating tables and use the power of an RDBMS, or you're reinventing a very complex and sophisticated wheel.
This is the conceptual idea I would use:
I would create two classes for the data-store:
table this would serve as a
dictionary, storing the structure of
the pseudo-tables your app would
create. it would have two fields :
table_name, column_name,
column_order . where column_order
would give the position of the
column within the table
data
this would store the actual data in
the pseudo-tables. it would have
four fields : row_id, table_name,
column_name , column_data. row_id
would be the same for data
pertaining to the same row and would
be unique for data across the
various pseudo-tables.
Put the data in a LongBlob.
The power of a database is to be able to search and organise data so that you are able to get only the part you want for performances and simplicity issues : you don't want the whole database, you just want a part of it and want it fast. But from what I understand, when you retrieve a user's data, you retrieve it all and display it. So you don't need to sotre the data in a normal "database" way.
What I would suggest is to simply format and store the whole data from a single user in a single column with a suitable type (LongBlob for example). The format would be an object with a list of columns and rows of type. And you define the object in whatever language you use to communicate with the database.
The columns in your (real) database would be : User int, TableNo int, Table Longblob.
If user8 has 3 tables, you will have the following rows :
8, 1, objectcontaintingtable1;
8, 2, objectcontaintingtable2;
8, 3, objectcontaintingtable3;

Resources