DynamoDB - which is better, 2 queries or 1 scan?

What is the better way to get data from DynamoDB (assuming that the table is huge) in the following scenario?
OPTION1 - 1 scan
Get all items, then filter by userId in canAccess (I am afraid that the scan will be time-consuming).
partitionkey | sortkey     | attribute1                      | attribute2
mainId1      | subId1#meta | canAccess: set[uesrId1,uesrId2] | info: string
mainId1      | subId2#meta | canAccess: set[uesrId1]         | info: string
OPTION2 - 2 queries
1. res = query GSI: get all items where canAccess == userId
2. batch query: get all data by pk in res[pk], filtered by sort key starting with [extracted subId]
2. (alternative) get all data by pk,sk in res[pk,sk]
partitionkey | sortkey         | attribute1             | attribute2
mainId1      | subId1# meta    | info: string           | ...
mainId1      | subId1# uesrId1 | canAccess-GSI: uesrId1 | ...
mainId1      | subId1# uesrId2 | canAccess-GSI: uesrId2 | ...
mainId1      | subId2# meta    | info: string           | ...
mainId1      | subId2# uesrId1 | canAccess-GSI: uesrId1 | ...
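To make the two options concrete, here is a minimal Python sketch that only builds the request parameters for the low-level DynamoDB `Scan`, `Query`, and `BatchGetItem` operations; it does not call AWS. The table name `my-table`, the index name `canAccess-index`, and the helper function names are assumptions for illustration, and the sort key is assumed to have the form `<subId>#<suffix>`.

```python
# Sketch only: builds request parameters, never talks to AWS.
# TABLE and GSI_NAME are hypothetical names, not taken from the question.
TABLE = "my-table"
GSI_NAME = "canAccess-index"


def scan_params(user_id):
    """Option 1: a Scan reads every item; the filter is applied afterwards."""
    return {
        "TableName": TABLE,
        "FilterExpression": "contains(#ca, :uid)",
        "ExpressionAttributeNames": {"#ca": "canAccess"},
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }


def gsi_query_params(user_id):
    """Option 2, step 1: Query the GSI keyed on the canAccess-GSI attribute."""
    return {
        "TableName": TABLE,
        "IndexName": GSI_NAME,
        "KeyConditionExpression": "#ca = :uid",
        "ExpressionAttributeNames": {"#ca": "canAccess-GSI"},
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
    }


def batch_get_requests(gsi_items):
    """Option 2, step 2 (alternative): fetch the meta items by exact pk, sk."""
    keys = []
    for item in gsi_items:
        # Assumed sort key layout: "<subId>#<userId>" -> "<subId>#meta"
        sub_id = item["sortkey"]["S"].split("#", 1)[0]
        key = {
            "partitionkey": item["partitionkey"],
            "sortkey": {"S": sub_id + "#meta"},
        }
        if key not in keys:  # BatchGetItem rejects duplicate keys
            keys.append(key)
    # BatchGetItem accepts at most 100 keys per request, so chunk the list
    return [
        {"RequestItems": {TABLE: {"Keys": keys[i:i + 100]}}}
        for i in range(0, len(keys), 100)
    ]
```

The relevant difference is that the Scan filter is evaluated after every item has been read, while the GSI Query only reads the items whose index key matches the user.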

Related

How can I make search and sort logic between different data in MicroServices?

We have two data tables, item and itemMeta. Each table has its own CRUD APIs, and the two are related one to one.
<item table in A server>
id name created_at
------------------------
1 a_text 2022-08-23
2 b_text 2022-08-23
3 c_text 2022-08-23
4 d_text 2022-08-23
5 e_text 2022-08-23
...
xxxx hello_text 2022-08-23
...
<itemMeta table in B server>
id itemId price created_at
--------------------------------
1 1 10 2022-08-23
1 11 110 2022-08-23
1 24 420 2022-08-23
1 4 130 2022-08-23
1 5 1340 2022-08-23
....
yyyy xxxx 500 2022-08-23
....
I want to build an endpoint like
/search-with-item-meta?search=o_text&page=4&sort=highprice-to-lowprice
I should query item with the search text, query itemMeta with the price sort information, and then match the two result sets by their unique id.
But the item table has no price, the itemMeta table has no title, and both are paginated. Unfortunately, the two tables live in different databases in separate places, so they have to be called through APIs.
The simple fix would be to add a price field to item and a title field to itemMeta. But that is not clean, and I am worried about keeping the two tables in sync and about pagination.
How can I solve this?
We use PostgreSQL with TypeORM and NestJS.
I am writing this answer based on MySQL.
Make sure the relationship between the entities is defined, then create a query builder in your items repository and join the two tables, like below:
relationalQuery() {
  return itemRepository.createQueryBuilder("items")
    .leftJoinAndSelect("items.itemMeta", "itemMeta")
}
Now we need a function that takes all the parameters, applies them as filters, and returns a paginated response:
async findAll<T>(page: number, limit: number, search: string, sort: string, queryBuilder: SelectQueryBuilder<T>) {
  // Pagination: skip the rows of the previous pages, take one page
  queryBuilder.take(limit).skip((page - 1) * limit)
  if (sort) {
    // Ascending by default; only "highprice-to-lowprice" sorts descending
    const order: "ASC" | "DESC" = sort === "highprice-to-lowprice" ? "DESC" : "ASC"
    queryBuilder.addOrderBy("itemMeta.price", order)
  }
  if (search) {
    // Match the search text anywhere in the item name
    queryBuilder.andWhere(new Brackets((qb) => {
      qb.where("items.name LIKE :name", { name: `%${search}%` })
    }))
  }
  const [items, totalItems] = await queryBuilder.getManyAndCount()
  const totalPages = Math.ceil(totalItems / limit)
  return {
    items,
    totalItems,
    totalPages,
  }
}
Calling this function
const query = relationalQuery()
findAll(1,15,"o_text","highprice-to-lowprice",query)
This is a demo of how to implement it. You can extend it with column-wise filter, sort, and search. For that you have to pass an object for the sort, like
sort_by:{
name:"ASC",
id: "DESC",
"itemMeta.price":"DESC"
}
You have to write a custom pagination function like the one above, in which you handle all the columns and operations yourself.
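The pagination arithmetic and the multi-column sort object can be illustrated independently of TypeORM. The following Python sketch works on an in-memory list of already-joined rows; the function name `paginate` and the sample rows are made up for illustration and are not part of the answer's code.

```python
import math


def paginate(rows, page, limit, search=None, sort_by=None):
    """Filter, sort, and slice already-joined rows held in memory."""
    if search:
        rows = [r for r in rows if search in r["name"]]
    # Apply the sort keys from last to first. Python's sort is stable,
    # so the first key in sort_by ends up with the highest priority.
    for column, direction in reversed(list((sort_by or {}).items())):
        rows = sorted(rows, key=lambda r: r[column],
                      reverse=(direction == "DESC"))
    total_items = len(rows)
    start = (page - 1) * limit  # same offset as skip((page - 1) * limit)
    return {
        "items": rows[start:start + limit],
        "totalItems": total_items,
        "totalPages": math.ceil(total_items / limit),
    }


rows = [
    {"id": 1, "name": "a_text", "itemMeta.price": 10},
    {"id": 4, "name": "d_text", "itemMeta.price": 130},
    {"id": 5, "name": "e_text", "itemMeta.price": 1340},
]
result = paginate(rows, page=1, limit=2, search="_text",
                  sort_by={"itemMeta.price": "DESC"})
```

Note that this only works when both data sets are reachable in one place; it does not address the case where the two tables sit behind separate services.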

Salesforce(apex) Query return values with toLowerCase

I am using Salesforce (Apex) and need a query that selects values from a table and returns them in lower case.
Something like this:
//only example (not working code)
for(Users user:[select Name.toLowerCase(),LastName.toLowerCase() from Users ] )
{
//code....
}
For example if i have table Users with
Name | LastName
Boby | Testovich1
Dany | Testovich2
Ron | Testovich3
The query needs to return all values in lower case:
boby testovich1,dany testovich2,ron testovich3
I can do it like this:
for(Users user:[select Name,LastName from Users ] )
{
string UserName=user.Name.toLowerCase();
}
But is there a way to do this in the Salesforce (Apex) query itself?
No, you can't transform the return value to lower case in the query; you'll need to do it after you've gotten the query results.
One alternative is to add a formula field that returns the lower case value and query that instead.

Hive query, better option to self join

So I am working with a hive table that is set up as so:
id (Int), mapper (String), mapperId (Int)
Basically a single Id can have multiple mapperIds, one per mapper such as an example below:
ID (1) mapper(MAP1) mapperId(123)
ID (1) mapper(MAP2) mapperId(1234)
ID (1) mapper(MAP3) mapperId(12345)
ID (2) mapper(MAP2) mapperId(10)
ID (2) mapper(MAP3) mapperId(12)
I want to return the list of mapperIds associated to each unique ID. So for the above example I would want the below returned as a single row.
1, 123, 1234, 12345
2, null, 10, 12
The mapper Strings are known, so I was thinking of doing a self join for every mapper string I am interested in, but I was wondering if there was a more optimal solution?
If the assumption that the mapper column is distinct with respect to a given ID is correct, you could collect the mapper column and the mapperid column to a Map using brickhouse collect. You can clone the repo from that link and build the jar with Maven.
Query:
add jar /complete/path/to/jar/brickhouse-0.7.0-SNAPSHOT.jar;
create temporary function collect as 'brickhouse.udf.collect.CollectUDAF';
select id
,id_map['MAP1'] as mapper1
,id_map['MAP2'] as mapper2
,id_map['MAP3'] as mapper3
from (
select id
,collect(mapper, mapperid) as id_map
from some_table
group by id
) x
Output:
| id | mapper1 | mapper2 | mapper3 |
------------------------------------
  1    123       1234      12345
  2    NULL      10        12
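For readers who want to see what the collect-then-lookup query does, here is a small Python sketch of the same idea on in-memory rows: group by id into a mapper -> mapperId map, then read the known mapper names out as columns. It is an illustration of the logic, not Hive code, and the function name `pivot` is made up.

```python
from collections import defaultdict

# Sample rows from the question: (id, mapper, mapperId)
rows = [
    (1, "MAP1", 123),
    (1, "MAP2", 1234),
    (1, "MAP3", 12345),
    (2, "MAP2", 10),
    (2, "MAP3", 12),
]


def pivot(rows, mappers):
    """Group rows by id into a map, then read the fixed mapper keys."""
    id_map = defaultdict(dict)
    for id_, mapper, mapper_id in rows:
        # Plays the role of collect(mapper, mapperid) ... group by id
        id_map[id_][mapper] = mapper_id
    # A missing mapper yields None, as a missing map key yields NULL in Hive
    return [
        (id_, *(m.get(name) for name in mappers))
        for id_, m in sorted(id_map.items())
    ]


result = pivot(rows, ["MAP1", "MAP2", "MAP3"])
```

As in the query, this relies on each mapper appearing at most once per id; a repeated mapper would silently overwrite the earlier value.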

GQL Query (python) to retrieve data using timestamp

I have a table in Google Datastore that holds n values in n columns, and one of them is a timestamp.
The timestamp property is defined like this, inside the table class (Java):
@Persistent
private Date timestamp;
The table is like this:
id | value | timestamp
----------------------------------------------------------
1 | ABC | 2014-02-02 21:07:40.822000
2 | CDE | 2014-02-02 22:07:40.000000
3 | EFG |
4 | GHI | 2014-02-02 21:07:40.822000
5 | IJK |
6 | KLM | 2014-01-02 21:07:40.822000
The timestamp column was added to the table later, so some rows have no timestamp value.
Using Python on Google App Engine, I'm trying to build an API that returns the total number of rows that have a timestamp >= some value.
For example:
-- This is just an example
SELECT * FROM myTable WHERE timestamp >= '2014-02-02 21:07:40.822000'
I've made this class, in python:
import sys
...
import webapp2
from google.appengine.ext import db

class myTable(db.Model):
    value = db.StringProperty()
    timestamp = datetime.datetime

class countHandler(webapp2.RequestHandler):
    def get(self, tablename, timestamp):
        table = db.GqlQuery("SELECT __key__ FROM " + tablename + " WHERE timestamp >= :1", timestamp )
        recordsCount = 0
        for p in table:
            recordsCount += 1
        self.response.out.write("Records count for table " + tablename + ": " + str(recordsCount))

app = webapp2.WSGIApplication([
    ('/count/(.*)/(.*)', countHandler)
], debug=True)
I've successfully deployed it and I'm able to call it, but for some reason I don't understand it's always saying
Records count for table myTable: 0
I'm struggling with the data type for the timestamp.. I think the issue is there.. any idea? which type should it be declared?
Thank you!
Your problem (as discussed in the comments as well) seems to be that you are (probably) passing a string to the GqlQuery parameters.
In order to filter your query by datetime, you need to pass a datetime object in the query params. For that, take a look here on how to convert it.
Small example:
# not sure how your timestamps are formatted but supposing they are strings
# of eg 2014-02-02 21:07:40.822000
timestamp = datetime.datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f" )
table = db.GqlQuery("SELECT __key__ FROM " + tablename + " WHERE timestamp >= :1", timestamp)
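The conversion step can be checked on its own, without App Engine. This sketch parses the timestamp strings from the question's table with the same format string and compares them as datetime objects; the helper name `parse_timestamp` and the in-memory `rows` dictionary are assumptions for illustration (rows 3 and 5, which have no timestamp, are simply left out).

```python
import datetime

FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_timestamp(text):
    """Turn a string such as '2014-02-02 21:07:40.822000' into a datetime."""
    return datetime.datetime.strptime(text, FORMAT)


# id -> timestamp string, copied from the table in the question
rows = {
    1: "2014-02-02 21:07:40.822000",
    2: "2014-02-02 22:07:40.000000",
    4: "2014-02-02 21:07:40.822000",
    6: "2014-01-02 21:07:40.822000",
}

threshold = parse_timestamp("2014-02-02 21:07:40.822000")
# Compare datetime objects, not strings
matching = [id_ for id_, ts in rows.items() if parse_timestamp(ts) >= threshold]
```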

How to create And / Or relations in a database?

I have a Coupon table. A Coupon can be applicable to certain items only or to a whole category of items.
For example: a $5 coupon for a Pizza 12" AND (1L Pepsi OR French fries)
The best I could come up with is to make a CouponMenuItems table containing a coupon_id and bit fields such as IsOr and IsAnd. It doesn't work because I have 2 groups of items in this example, the second one being an OR relation between 2 items.
Any idea of how I could do it so the logic to implement is as simple as possible?
Any help or cue appreciated!
Thanks,
Teebot
Often, you can simplify this kind of thing by using Disjunctive Normal Form.
You normalize your logic into a series of disjunctions -- "or clauses". Each disjunct is a set of "and clauses".
So your rules become the following long disjunction.
Pizza AND Pepsi
OR
Pizza AND french fries
(You can always do this, BTW, with any logic. The problem is that some things can be really complicated. The good news is that no marketing person will try bafflingly hard logic on you. Further, the rewrite from any-old-form to disjunctive normal form is an easy piece of algebra.)
This, you'll note, is always two layers deep: always a top-level list of the disjunctions (any one of which could be true) and a lower-level list of conjuncts (all of which must be true).
So, you have a "Conditions" table with columns like id and product name. This defines a simple comparison between line item and product.
You have a Conjuncts ("mid-level and clause") table with columns like conjunct ID and condition ID. A join between conjunct and condition will produce all conditions for the conjunct. If all of these conditions are true, the conjunct is true.
Then you have a Disjuncts ("top-level or clause") table with columns like disjunct ID and conjunct ID. If any one of these conjuncts is true, the disjunction is true.
A join between disjuncts, conjuncts and conditions produces the complete set of conditions you need to test.
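The two-layer structure maps directly onto a list of sets. In this Python sketch each inner set is one conjunct (all items required) and the outer list is the disjunction (any one conjunct suffices); the names `coupon_rule` and `coupon_applies` are made up for illustration, and a cart is assumed to be a plain list of product names.

```python
# "Pizza AND Pepsi" OR "Pizza AND French fries", in disjunctive normal form.
coupon_rule = [
    {"Pizza", "Pepsi"},
    {"Pizza", "French fries"},
]


def coupon_applies(rule, cart):
    """True if at least one conjunct is fully contained in the cart."""
    cart_items = set(cart)
    # conjunct <= cart_items is the subset test: every condition is met
    return any(conjunct <= cart_items for conjunct in rule)
```

For example, `coupon_applies(coupon_rule, ["Pizza", "Pepsi"])` is true, while a cart with only Pepsi and French fries does not satisfy either conjunct.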
One possible approach to consider. Assuming you created the following classes:
+----------+ 1        +---------------+ *
|  Coupon  |<#>------>| <<interface>> |<--------------+
+----------+          |  CouponItem   |               |
| +value   |          +---------------+               |
+----------+          | +cost()       |               |
                      +---------------+               |
                             /|\                      |
                              |                       |
             +----------------+---------------+       |
             |                |               |       |
      LeafCouponItem    AndCouponItem   OrCouponItem  |
                             <#>             <#>      |
                              |               |       |
                              +---------------+-------+
And:
import java.util.List;

// Money is assumed to be a user-defined class with add() and a static max().
class Coupon {
    Money value;
    CouponItem item;
}

interface CouponItem {
    Money cost();
}

class AndCouponItem implements CouponItem {
    List<CouponItem> items;

    // An AND group costs the sum of all its items.
    public Money cost() {
        Money cost = new Money(0);
        for (CouponItem item : items) {
            cost = cost.add(item.cost());
        }
        return cost;
    }
}

class OrCouponItem implements CouponItem {
    List<CouponItem> items;

    // An OR group costs as much as its most expensive alternative.
    public Money cost() {
        Money max = new Money(0);
        for (CouponItem item : items) {
            max = Money.max(max, item.cost());
        }
        return max;
    }
}

class LeafCouponItem implements CouponItem {
    Money cost;

    public Money cost() {
        return cost;
    }
}
And map to 2 tables:
COUPON    COUPON_ITEM
------    -----------
ID        ID
VALUE     COUPON_ID (FK to COUPON.ID)
          DISCRIMINATOR (AND, OR, or LEAF)
          COUPON_ITEM_ID (FK to COUPON_ITEM.ID)
          DESCRIPTION
          COST
So for your example you would have:
> SELECT * FROM COUPON
ID 100
VALUE 5
And
> SELECT * FROM COUPON_ITEM
ID   COUPON_ID  DISCRIMINATOR  COUPON_ITEM_ID  DESCRIPTION  COST
200  100        AND            NULL            NULL         NULL
201  100        LEAF           200             PIZZA        10
202  100        OR             200             NULL         NULL
203  100        LEAF           202             PEPSI        2
204  100        LEAF           202             FRIES        3
This single-table approach is highly denormalised, and some would prefer to have separate tables for each CouponItem implementation.
Most ORM frameworks will be able to take care of the persistence of such a domain of classes.
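As a rough check of how the COUPON_ITEM rows form a tree, the following Python sketch walks the example rows and applies the same rules as the classes: an AND node sums its children, an OR node takes the most expensive child, and a LEAF returns its own cost. The tuple layout and the function name `cost` are assumptions for illustration, with plain integers standing in for Money.

```python
# (id, discriminator, parent COUPON_ITEM_ID, description, cost)
rows = [
    (200, "AND", None, None, None),
    (201, "LEAF", 200, "PIZZA", 10),
    (202, "OR", 200, None, None),
    (203, "LEAF", 202, "PEPSI", 2),
    (204, "LEAF", 202, "FRIES", 3),
]


def cost(item_id, rows):
    """Recursively compute the cost of the coupon item with the given id."""
    by_id = {r[0]: r for r in rows}
    _, kind, _, _, leaf_cost = by_id[item_id]
    if kind == "LEAF":
        return leaf_cost
    # Children are the rows whose COUPON_ITEM_ID points at this item
    child_costs = [cost(r[0], rows) for r in rows if r[2] == item_id]
    if kind == "AND":
        return sum(child_costs)
    return max(child_costs, default=0)  # OR: most expensive alternative


total = cost(200, rows)  # 10 + max(2, 3)
```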
You will need to group all your relationships together, define how they are grouped, and then assign coupons to those relationships. Essentially you need database entities to represent the parentheses in your example, though you will need one more outer pair of parentheses:
(Pizza 12" AND (1L Pepsi OR French fries))
Coupon
    CouponId
    Name
    ...
Item
    ItemId
    Name
    ...
Group
    GroupId
GroupMembership
    GroupMembershipId
    GroupId
    ItemId
ItemAssociation
    ItemAssociationId
    Item1Id
    Item2Id
    IsOr : bit -- (default 0 means and)
GroupAssociation
    GroupAssociationId
    Group1Id
    Group2Id
    IsOr : bit -- (default 0 means and)
After brainstorming that structure out it looks like something that could be solved with a nodal parent/child relationship hierarchy. The ItemAssociation/GroupAssociation tables smell to me, I think a general Association table that could handle either may be desirable so you could write general purpose code to handle all relationships (though you'd lose referential integrity unless you also generalize Item and Group into a single entity).
Note: also naming an entity Group can create problems. :)
You could treat individual items as their own group (1 member) and just implement pure logic to map coupons to groups.
My suggestion:
Table                primary key
= = = = =
COUPONS              coupon_id
PRODUCT_GROUPS       group_id
ITEM_LIST            item_id
ITEM_GROUP_ASSOC     item_id, group_id
COUPON_GROUP_ASSOC   coupon_id, group_id
COUPON_ITEM_ASSOC    coupon_id, item_id
In the COUPON_ITEM_ASSOC table, have a field indicating how many items the coupon may apply to at once, with some special value indicating "infinite".
