Postgres partitioning use case for business to be implemented - database

Please find my use case below and suggest an appropriate solution:
We have declaratively partitioned our table A on a key, and further range sub-partitioned each partition on createddate in 90-day ranges.
In the search query that displays the last 90 days of data, I filter on both column B and createddate, so that only the required partitions are picked.
But how can I display the last 90 days of data so that the most recently modified record is shown at the top, even when its created date does not fall within the last 90 days?
Basically, I need to partition my huge table on a key and also on date, to split the data into smaller datasets for faster queries. The UI needs to display the most recently modified records first, but the created date also has to be given in the query so that the right partition is picked.
Can we partition differently to achieve this?
Should I find the max modified date and use it in the created date range?
Yes, I currently have a btree index on the modified_date column.
SELECT * FROM partitioned_table
WHERE A = 'Value'
AND created_date >= '2021-03-01 08:16:13.589' AND created_date <= '2021-04-02 08:16:13.589'
ORDER BY viewpriority DESC
OFFSET 0 ROWS FETCH NEXT 200 ROWS ONLY;
Here viewpriority is a long value containing created_date in milliseconds.
The issue is this: I want to include modified_date in this query so that the records are sorted by modified_date. But that sort would apply only within the specified created_date range. I also want the most recently modified records whose created_date might not lie in this range.

Related

Specific time range between dates

I want to write a query in SQL Server 2014 that will show me all the rows in specified date range between specific times. I have column DateCreated which contains date and time together.
I can easily filter date but I need all rows from these days in specific time range.
Thank you
Adding to @Larnu's comment, you can add time range criteria alongside the desired date range so that results are filtered by the time period as well. Below is an example using inclusive start and exclusive end range criteria:
WHERE
(DateCreated >= '20200801' AND DateCreated < '20200831')
AND (CAST(DateCreated AS time) >= '08:00:00' AND CAST(DateCreated AS time) < '16:00:00')
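To make the inclusive-start, exclusive-end semantics of that filter concrete, here is a minimal Python sketch of the same predicate. The helper name `in_window` and the sample timestamps are illustrative assumptions, not part of the original answer.

```python
from datetime import datetime, date, time

def in_window(ts, start_date, end_date, start_time, end_time):
    """Hypothetical helper: inclusive start, exclusive end, for date and time of day."""
    date_ok = start_date <= ts.date() < end_date
    time_ok = start_time <= ts.time() < end_time
    return date_ok and time_ok

# Made-up sample rows standing in for the DateCreated column.
rows = [
    datetime(2020, 8, 3, 7, 59, 59),    # before 08:00, excluded
    datetime(2020, 8, 3, 8, 0, 0),      # exactly 08:00, included
    datetime(2020, 8, 15, 15, 59, 59),  # just before 16:00, included
    datetime(2020, 8, 15, 16, 0, 0),    # exactly 16:00, excluded
    datetime(2020, 8, 31, 9, 0, 0),     # on the exclusive end date, excluded
]

matches = [
    ts for ts in rows
    if in_window(ts, date(2020, 8, 1), date(2020, 8, 31), time(8), time(16))
]
```

Note that with an exclusive end of 31 August, rows created on 31 August itself are not returned; if the whole month is wanted, the end boundary would presumably be 1 September.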

SQL Server full text search max performance limits within time window

Let's say we have a table with 100 million records, which are sales transactions from multiple sellers. Each record has around 14 columns:
TABLE SellerTransactions
string SellerId,
string ProductId,
DateTime CreateDate,
string BankNumber,
string Name (name+' '+surname+' '+alias),
string Comments,
decimal Amount
etc...
Each year we will add around 60 million new records, and the record count will increase by roughly 10% yearly.
Searches will be done by seller ID, then by product ID or product IDs (for multiple products in a time period for that seller).
Every search will be filtered by time, usually a one-week period (mostly the last week), but worst-case scenarios are also possible where we search across all the data we have, spanning years.
A search should also be possible by bank number, or as a full-text search by name or part of it.
SELECT *
FROM SellerTransactions
WHERE SellerId = 'Seller1Guid'
AND ProductId IN ('ProductGuid1', 'ProductGuid2' /* .. */)
AND (CreateDate <= GETDATE() AND CreateDate >= DATEADD(day, -7, GETDATE()))
AND CONTAINS(Name, '"BoB Skynet"')
ORDER BY CreateDate
OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY
So what search times should we expect in these scenarios?
After filtering by IDs and time range, searching in up to 10k records
After filtering by IDs and time range, searching in up to 100k records
After filtering by IDs and time range, searching in up to 1 million records
After filtering by IDs and time range, searching in 1 to 10 million records
In these cases, would it be 0.5s, 1s, or up to 20s?
Also, how would the search performance change if we also searched the Comments column as well as Name?
CONTAINS(Name, '"BoB Skynet"') OR CONTAINS(Comments, 'online')
In most cases the search would run over small record counts; only in very rare cases would we go through millions of rows. But how much time would that take?
When would it be a good idea to move to Elasticsearch, for example?
We are storing a fair amount of data, but the request count for this data is usually small.

Extract data by day from SQL Server

I need to get all the values from a SQL Server database by day (24 hours). I have a timestamp column in the TestAllData table and I want to select only the data that corresponds to a specific day.
For instance, there are timestamps of DateTime type like '2019-03-19 12:26:03.002', '2019-03-19 17:31:09.024' and '2019-04-10 14:45:12.015', so I want to load the data for the day 2019-03-19 and separately for the day 2019-04-10. Basically, I need to get the DateTime values that share the same date.
Is it possible to use functions like DatePart or DateDiff for that?
And how can I solve such a problem overall?
In this case, I do not know the exact difference in hours between a timestamp and the end of the day (because there are various timestamps for one day), so I need to extract the day itself from the timestamp. After that, I need to group the data by day and get it block by block. For example:
'2019-03-19' - 1200 records
'2019-04-10' - 3500 records
'2019-05-12' - 10000 records and so on
I'm looking for a more generic solution that does not supply a timestamp (like '2019-03-19') as a boundary or in a WHERE clause, because the problem is not about simply filtering the data by some date.
UPDATE: In my dataset, I have about 1,000,000 records and more than 100 unique dates. I was thinking about extracting the set of unique dates and then running a query in a loop where the data would be filtered by the provided day. It would look like this:
select * from TestAllData where dayColumn = '2019-03-19'
select * from TestAllData where dayColumn = '2019-04-10'
select * from TestAllData where dayColumn = '2019-05-12'
...
I might use this query in my code, so I could run it in a loop from a Scala function. However, I am not sure whether running a separate query to extract the unique dates would be acceptable in terms of performance.
Depending on whether you want to be able to work with all the dates (rather than just a subset), one of the easiest ways to achieve this is with a cast:
;with cte as (SELECT cast(my_datetime as date) as my_date, * from TestAllData)
SELECT * FROM cte where my_date = '2019-02-14'
Note that when casting datetime to date, times are truncated, i.e. just the date part is extracted.
As I say, though, whether this is efficient depends on your needs, as all datetime values from all records will be cast to date before the data is filtered. If you want to select several dates (as opposed to just one or two), however, it may prove quicker overall, as it reads the whole table once and then gives you a column upon which you can filter much more efficiently.
If this is a permanent requirement, though, I would probably use a persisted computed column, which effectively would mean that the casting is done once initially and then only again if the corresponding value changed. For a large table I would also strongly consider an index on the computed column.
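For the "block by block" part of the question, the same idea (truncate each timestamp to its date, then group) can also be applied on the client in a single pass instead of one query per day. The following is only a sketch under that assumption; `group_by_day` and the sample values are hypothetical, and in practice the rows would come from a single query over TestAllData.

```python
from collections import defaultdict
from datetime import datetime

def group_by_day(timestamps):
    """Hypothetical helper: bucket datetimes by their calendar date in one pass."""
    groups = defaultdict(list)
    for ts in timestamps:
        # ts.date() drops the time part, like CAST(... AS date) in the answer.
        groups[ts.date()].append(ts)
    return dict(groups)

# Sample values taken from the question's example timestamps.
sample = [
    datetime(2019, 3, 19, 12, 26, 3, 2000),
    datetime(2019, 3, 19, 17, 31, 9, 24000),
    datetime(2019, 4, 10, 14, 45, 12, 15000),
]

by_day = group_by_day(sample)
counts = {day.isoformat(): len(rows) for day, rows in by_day.items()}
```

Here `counts` holds the number of records per day, which corresponds to the per-day blocks described in the question.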

DAX Date Range Syntax - Count one Row More Than Once?

I'm working with a SQL 2014 tabular model and I want to create a measure for count based on a date range.
My fact table will have a start and ending date range which can span multiple months. I want the user to be able to select a date range to get a count of records. The catch is that each month a record spans needs to be captured as separate in the count.
For example: Record 1 - 1/1/2014 - 8/21/2014. If the user selects a date range of 3/1 to 5/1, I want the count to return as 3 (March, April and May). If the user selects 6/4 - 6/4, I want the count to return as 1.
Is there a way I can do this with DAX or should I go the route of creating a record for each month?
Your model is not exactly clear, but assuming that your StartDate and EndDate are in the same row, you are in effect looking to count the number of months between these two dates, and that can be achieved as follows:
=(YEAR([EndDate])-YEAR([StartDate]))*12+MONTH([EndDate])-MONTH([StartDate])
Your measure would then SUM() the results of this calculated column.
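As a cross-check of the arithmetic, here is a small Python sketch of the count the question describes: the number of calendar months touched by the overlap of the record's range and the user's selected range. This is an illustration under stated assumptions, not DAX; `month_index` and `months_spanned` are hypothetical names. Unlike the formula above, it adds 1 so that both end months are counted, and it clips the record's range to the selection first.

```python
from datetime import date

def month_index(d):
    """Hypothetical helper: months since year 0, so differences count month steps."""
    return d.year * 12 + d.month

def months_spanned(rec_start, rec_end, sel_start, sel_end):
    """Count calendar months touched by the overlap of record and selection."""
    start = max(rec_start, sel_start)
    end = min(rec_end, sel_end)
    if start > end:
        return 0  # the ranges do not overlap
    return month_index(end) - month_index(start) + 1

# Record 1 from the question: 1/1/2014 - 8/21/2014.
rec = (date(2014, 1, 1), date(2014, 8, 21))

march_to_may = months_spanned(*rec, date(2014, 3, 1), date(2014, 5, 1))
single_day = months_spanned(*rec, date(2014, 6, 4), date(2014, 6, 4))
no_overlap = months_spanned(*rec, date(2014, 9, 1), date(2014, 10, 1))
```

With the question's example, `march_to_may` is 3 and `single_day` is 1, matching the expected counts.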

Criteria to get last not null record present

I have a daily record table where records are stored date-wise. I am using Hibernate Criteria to access the data. Given a date range, how do I get the last date up to which records are present continuously (date-wise continuity)? For example, say records exist from 21-09-2012 to 25-09-2012, and again from 27-09-2012 to 30-09-2012. I want to form a query using Criteria that returns the record for 25-09-2012, because there are no records for 26-09-2012 (passing date ge 21-09-2012 and date le 30-09-2012). I want to know the last date up to which records are present continuously. Say the table has three fields: 1. recordId (AI) 2. date 3. Integer record.
This is not a proper solution to your question, but it may suit your scenario.
How about getting the data for a date range and showing it on a calendar? Change the color of a date if the corresponding value is null.
I think HQL will be a better way to do this in Hibernate:
http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/queryhql.html
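If the dates in the range are fetched first (with Criteria or HQL), the continuity check itself is simple to do in application code. The sketch below shows that logic in Python purely as an illustration; `last_continuous_date` is a hypothetical helper, and it assumes the run starts at the earliest date found in the range. The sample dates mirror the question's example, with 30 September as the last day of the month.

```python
from datetime import date, timedelta

def last_continuous_date(dates, range_start, range_end):
    """Hypothetical helper: last date of the unbroken daily run starting at the
    earliest date present within [range_start, range_end]."""
    present = {d for d in dates if range_start <= d <= range_end}
    if not present:
        return None  # no records in the range at all
    current = min(present)
    one_day = timedelta(days=1)
    # Walk forward one day at a time until a day is missing.
    while current + one_day in present:
        current += one_day
    return current

# Records on 21-25 September and 27-30 September 2012; 26 September is missing.
record_dates = (
    [date(2012, 9, d) for d in range(21, 26)]
    + [date(2012, 9, d) for d in range(27, 31)]
)

result = last_continuous_date(record_dates, date(2012, 9, 21), date(2012, 9, 30))
```

For these sample dates, `result` is 25 September 2012, the day before the gap.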

Resources