Assistance with a 4 table Join operation - sql-server

To be as clear as possible: I have 4 tables in my database, as follows:
Join_Contrato_Medidor
ID_Union (identity)
ID_Contrato
ID_Medidor
Omitido (filter ?)
Promedios
ID_Contrato
ID_Medidor
ID_Marchamo
{Info I want}
Medidores
ID_Medidor
ID_Dispensario (filter ?)
Marchamo
ID_Marchamo
My current SQL Statement...
SELECT {Promedios.LI_1, Promedios.LF_1, Promedios.Total_1, Promedios.Qva_1, ...}
FROM (((
Join_Contrato_Medidor LEFT OUTER JOIN
Promedios ON Join_Contrato_Medidor.ID_Contrato = Promedios.ID_Contrato)
LEFT OUTER JOIN
Medidores ON Join_Contrato_Medidor.ID_Medidor = Medidores.ID_Medidor)
LEFT OUTER JOIN
Marchamo ON Promedios.ID_Marchamo = Marchamo.ID_Marchamo)
WHERE (Join_Contrato_Medidor.ID_Contrato = ?) AND (Medidores.ID_Dispensario = ?) AND (Join_Contrato_Medidor.Omitido <> TRUE)
The output I'm obtaining:
Information Columns | Omitido | ID_Union
Info | False | 806
Info | False | 806
Info | False | 806
Info | False | 806
*I wanted to include an image but I cannot do so until I have more reputation :( *
I have those 4 tables that I am joining right now. I am currently getting all the columns I want in the query output, but I would only like to get those records in which Join_Contrato_Medidor.Omitido <> true, instead of getting ALL records that match the ID_Contrato and ID_Dispensario conditions.
As a sample, I am outputting ID_Union, which is the identity field for Join_Contrato_Medidor. The query marks all the records with a single ID_Union, which happens to belong to the only record out of the 4 that has Omitido <> true. Also, the last 3 records have their Omitido field set to true in the database; nevertheless, the query result shows false.
If the question is unclear, please ask for clarification.
Thanks in advance

After working on other things until I had to face this issue again, I am back checking it. Your comment led me to try switching the order of the tables to see if that would do the job, and it did! Thank you very much.
I started by querying the Promedios table first and then performed the rest of the query. This gave me access to the exact information that I wanted. Moreover, I wrote all the following queries in this order, which led to better, shorter queries.
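The post does not show the final query, so the following is only a sketch of what "Promedios first" might look like. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, with made-up sample rows, and it assumes (this is not stated in the question) that a Promedios row belongs to a Join_Contrato_Medidor row when both ID_Contrato and ID_Medidor match. The first query mirrors the shape of the original statement and reproduces the symptom of four rows that all carry the same ID_Union.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Join_Contrato_Medidor (ID_Union INTEGER PRIMARY KEY, ID_Contrato INT, ID_Medidor INT, Omitido INT);
CREATE TABLE Promedios (ID_Contrato INT, ID_Medidor INT, ID_Marchamo INT, Total_1 REAL);
CREATE TABLE Medidores (ID_Medidor INT PRIMARY KEY, ID_Dispensario INT);
CREATE TABLE Marchamo (ID_Marchamo INT PRIMARY KEY);
-- Hypothetical sample data: one contract, four meters, three of them omitted.
INSERT INTO Join_Contrato_Medidor VALUES (806, 1, 10, 0), (807, 1, 11, 1), (808, 1, 12, 1), (809, 1, 13, 1);
INSERT INTO Promedios VALUES (1, 10, 100, 5.0), (1, 11, 101, 6.0), (1, 12, 102, 7.0), (1, 13, 103, 8.0);
INSERT INTO Medidores VALUES (10, 7), (11, 7), (12, 7), (13, 7);
INSERT INTO Marchamo VALUES (100), (101), (102), (103);
""")

# Same shape as the question's query: Promedios is matched on ID_Contrato only,
# so the single non-omitted join row is paired with every Promedios row.
original_sql = """
SELECT p.Total_1, j.Omitido, j.ID_Union
FROM Join_Contrato_Medidor AS j
LEFT JOIN Promedios AS p ON j.ID_Contrato = p.ID_Contrato
LEFT JOIN Medidores AS m ON j.ID_Medidor = m.ID_Medidor
LEFT JOIN Marchamo AS mr ON p.ID_Marchamo = mr.ID_Marchamo
WHERE j.ID_Contrato = ? AND m.ID_Dispensario = ? AND j.Omitido <> 1
"""

# Promedios first, matched to the join table on BOTH keys (assumption).
fixed_sql = """
SELECT p.Total_1, j.Omitido, j.ID_Union
FROM Promedios AS p
JOIN Join_Contrato_Medidor AS j
  ON j.ID_Contrato = p.ID_Contrato AND j.ID_Medidor = p.ID_Medidor
JOIN Medidores AS m ON m.ID_Medidor = p.ID_Medidor
LEFT JOIN Marchamo AS mr ON mr.ID_Marchamo = p.ID_Marchamo
WHERE p.ID_Contrato = ? AND m.ID_Dispensario = ? AND j.Omitido = 0
"""

original_rows = conn.execute(original_sql, (1, 7)).fetchall()
fixed_rows = conn.execute(fixed_sql, (1, 7)).fetchall()
print(len(original_rows), "rows from the original shape")
print(fixed_rows)
```

If the real schema relates Promedios to the join table differently, the ON clause would need to change accordingly.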

Related

SQL recursive get BOM from PSP

I have an MS SQL Server (2016) database which contains, among others, a table like this (it's a view created in an Autodesk PSP database - please don't ask why ... :-) ):
CHILD_AIMKEY | QUANTITY | PARENT_AIMKEY | StatusOfParent | StatusOfChild
5706657 | 1 | 5664344 | 100 | 103
5706745 | 1 | 5664344 | 100 | 103
5707104 | 1 | 5664344 | 100 | 103
5707109 | 1 | 5664344 | 100 | 100
5801062 | 1 | 5664344 | 100 | 103
The "children" can contain other "children", and in that case they are the "parents" of those children.
So it's a standard structured BOM table from a CAD PDM system.
If I do the following "Select Statement" I get all the children of the top level parent:
SELECT [CHILD_AIMKEY] , [POSITION], [QUANTITY] ,[PARENT_AIMKEY],[StatusOfParent],[StatusChild] FROM database_table where Parent_aimkey = '5664344'
(as shown in the table above)
My first question is: how can I recursively process all children of each parent from that table? (The result could be another table or direct output.)
The format should be: Parent_Aimkey, Child_Aimkey, Quantity
The second question is a bit more complicated:
I try it with some "pseudo code":
If Tree_Level_of_DIRECT_Parent < 3 then show CHILD_AIMKEY,QUANTITY in queryresult_above
If Tree_Level_of_DIRECT_Parent > 2 and StatusOf_DIRECT_Parent = 103 and StatusOf_DIRECT_Child = 103 then show CHILD_AIMKEY,QUANTITY in queryresult_above
Is that possible in some way? (If the database view needs to be extended with another field or another table, that's no problem.)
I know this looks a bit confusing, but what I need is the Autodesk Inventor structured BOM in an SQL statement or stored procedure.
Any help would be much appreciated.
Thanks
Alex.
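One possible way to approach both questions is a recursive common table expression that walks the parent/child pairs and carries the tree level along. The sketch below runs against SQLite through Python's sqlite3 module so that it is self-contained. The table name bom, the rows with keys starting at 9000001, and the convention that the top-level parent sits at level 1 are all assumptions made for illustration. SQL Server also supports recursive CTEs, but there the keyword RECURSIVE is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bom (CHILD_AIMKEY INT, QUANTITY INT, PARENT_AIMKEY INT,
                  StatusOfParent INT, StatusOfChild INT);
INSERT INTO bom VALUES
  (5706657, 1, 5664344, 100, 103),
  (5707109, 1, 5664344, 100, 100),
  -- Hypothetical deeper levels, not part of the original sample table.
  (9000001, 2, 5706657, 103, 103),
  (9000002, 4, 9000001, 103, 103),
  (9000003, 1, 9000002, 103, 103),
  (9000004, 1, 9000002, 103, 100);
""")

sql = """
WITH RECURSIVE tree (PARENT_AIMKEY, CHILD_AIMKEY, QUANTITY,
                     StatusOfParent, StatusOfChild, parent_level) AS (
    -- Anchor: direct children of the top-level parent (parent is level 1).
    SELECT PARENT_AIMKEY, CHILD_AIMKEY, QUANTITY, StatusOfParent, StatusOfChild, 1
    FROM bom
    WHERE PARENT_AIMKEY = ?
    UNION ALL
    -- Recursive step: rows whose parent is a child found so far.
    SELECT b.PARENT_AIMKEY, b.CHILD_AIMKEY, b.QUANTITY,
           b.StatusOfParent, b.StatusOfChild, t.parent_level + 1
    FROM bom AS b
    JOIN tree AS t ON b.PARENT_AIMKEY = t.CHILD_AIMKEY
)
SELECT PARENT_AIMKEY, CHILD_AIMKEY, QUANTITY
FROM tree
-- Second question: always show rows whose direct parent is above level 3;
-- deeper rows only when both statuses are 103.
WHERE parent_level < 3
   OR (StatusOfParent = 103 AND StatusOfChild = 103)
ORDER BY parent_level, CHILD_AIMKEY
"""

rows = conn.execute(sql, (5664344,)).fetchall()
for row in rows:
    print(row)
```

Dropping the WHERE clause of the final SELECT gives the plain recursive listing asked for in the first question. Note that this sketch only hides rows that fail the status test; it still descends into their children, which may or may not be the intended behaviour.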

Decrease execution time of SQL query

I've got a question in terms of processing and making a query more efficient whilst maintaining its accuracy. Before I display the query I'd like to point out some basics of it.
I've got a case that manipulates the where-clause to get all children of the parent. Basically I've got two types of data that I need to display: a red and a green type. The red type has a column (TRK_TrackerGroup_LKID2) set to NULL by default, whereas the green data has a value in said column (ranging from 5-7).
My problem is that I need to extract both types of data to accurately get a count of outstanding issues in a view, but doing so (by adding the case) the execution time goes from < 1 second to well over 15 seconds.
This is the query (with the mentioned case):
SELECT TS.id AS TrackerStartDateID,
       TSM.mappingtypeid,
       TSM.maptoid,
       TFLK.trk_trackergroup_lkid,
       Count(TF.id) AS Cnt
FROM [dbo].[trk_startdate] TS
INNER JOIN [dbo].[trk_startdatemap] TSM
        ON TS.id = TSM.trk_startdateid
       AND TSM.deletedflag = 0
INNER JOIN [dbo].[trk_trackerfeatures] TF
        ON TF.trk_startdateid = TS.id
       AND TF.deletedflag = 0
INNER JOIN [dbo].[trk_trackerfeatures_lk] TFLK
        ON TFLK.id = TF.trk_feature_lkid
WHERE TS.deletedflag = 0
  AND TF.applicabletoproject = 1
  AND TF.readyforwork = CASE -- HERE IS THE PROBLEM
                          WHEN TF.trk_trackerstatus_lkid2 IS NULL THEN 0
                          ELSE 1
                        END
  AND TF.datestamp = (SELECT Max(TF2.datestamp)
                      FROM [dbo].[trk_trackerfeatures] TF2
                      INNER JOIN [dbo].[trk_trackerfeatures_lk] TFLK2
                              ON TFLK2.id = TF2.trk_feature_lkid
                      WHERE TF.trk_startdateid = TF2.trk_startdateid
                        AND TFLK2.trk_trackergroup_lkid = TFLK.trk_trackergroup_lkid)
GROUP BY TS.id,
         TSM.mappingtypeid,
         TSM.maptoid,
         TFLK.trk_trackergroup_lkid,
         TF.datestamp
It functions as a 'parent' in the sense that it grabs the latest inserted data-set (using DateStamp) from every single child-group. This is necessary to produce a parent-report in SSRS report at a later time, but at the moment my problem (as mentioned above) is the execution time.
I'd like to hear if there are any suggestions on how to decrease the execution time whilst maintaining the accuracy of the query.
Expected output:
Without the case I get this:
Your problem is that this condition can't use an index:
AND TF.readyforwork = CASE -- HERE IS THE PROBLEM
WHEN TF.trk_trackerstatus_lkid2 IS NULL THEN 0
ELSE 1
END
Try to change it to
AND ( TF.readyforwork = 0 and TF.trk_trackerstatus_lkid2 IS NULL
OR TF.readyforwork = 1 and TF.trk_trackerstatus_lkid2 IS NOT NULL
)
But again, you should check the query plan to see whether your query uses an index or not (EXPLAIN ANALYZE in PostgreSQL; in SQL Server, look at the actual execution plan).
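Before swapping the predicate it is worth confirming that both forms select the same rows, including the NULL cases. The following is a small check under stated assumptions: it uses SQLite via Python's sqlite3 module rather than SQL Server, a cut-down hypothetical table tf with only the two columns involved, and invented sample rows. It says nothing about index usage, only about equivalence of the two conditions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tf (id INTEGER PRIMARY KEY, readyforwork INT, trk_trackerstatus_lkid2 INT)"
)
# Every combination of 0 / 1 / NULL against NULL / non-NULL status.
conn.executemany(
    "INSERT INTO tf (readyforwork, trk_trackerstatus_lkid2) VALUES (?, ?)",
    [(0, None), (1, None), (0, 5), (1, 6), (None, None), (None, 7)],
)

case_sql = """
SELECT id FROM tf
WHERE readyforwork = CASE WHEN trk_trackerstatus_lkid2 IS NULL THEN 0 ELSE 1 END
ORDER BY id
"""
or_sql = """
SELECT id FROM tf
WHERE (readyforwork = 0 AND trk_trackerstatus_lkid2 IS NULL)
   OR (readyforwork = 1 AND trk_trackerstatus_lkid2 IS NOT NULL)
ORDER BY id
"""

case_ids = [row[0] for row in conn.execute(case_sql)]
or_ids = [row[0] for row in conn.execute(or_sql)]
print(case_ids, or_ids)
```

Both lists contain the same ids, so the rewrite keeps the result set intact on this sample; whether it is faster still has to be checked on the real data.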
The most problematic bit of your query seems to be the correlated subquery, because it has to be evaluated for every candidate row.
You should optimize this first. To do so, you can add indexes that the engine can use to quickly calculate that value for each row.
Based on your query, I would add these two multi-column indexes:
On table trk_trackerfeatures, index fields: trk_startdateid, datestamp
On table trk_trackerfeatures_lk, index fields: id, trk_trackergroup_lkid
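As a rough illustration of the two suggested indexes, the sketch below creates them on empty stand-in tables in SQLite (through Python's sqlite3 module) and asks the engine how it would evaluate the MAX(datestamp) lookup for one trk_startdateid. The index names are hypothetical, the tables contain only the columns needed here, and SQLite's planner is not SQL Server's, so this only shows the general idea; the CREATE INDEX statements themselves have the same shape in SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trk_trackerfeatures (
    id INTEGER PRIMARY KEY, trk_startdateid INT, trk_feature_lkid INT, datestamp TEXT);
CREATE TABLE trk_trackerfeatures_lk (
    id INTEGER PRIMARY KEY, trk_trackergroup_lkid INT);

-- Hypothetical index names; the column lists follow the suggestion above.
CREATE INDEX ix_tf_startdate_datestamp
    ON trk_trackerfeatures (trk_startdateid, datestamp);
CREATE INDEX ix_tflk_id_group
    ON trk_trackerfeatures_lk (id, trk_trackergroup_lkid);
""")

# Simplified version of the correlated subquery for a single start date.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT MAX(datestamp) FROM trk_trackerfeatures WHERE trk_startdateid = ?",
    (1,),
).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(row[3] for row in plan)
print(plan_text)
```

On the real system the equivalent check is to compare the execution plan before and after creating the indexes.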

Update Access database (multiple rows)

I am a total newbie to Access; I have used Excel to handle my needs for a while.
But Excel has now become too slow to handle such a big set of data, so I decided to migrate to Access.
Here is my problem.
My columns are:
Number | Link | Name | Status
1899 | htto://example.com/code1 | code1 | Done
2 | htto://example.com/code23455 | code23455 | Done
3 | htto://example.com/code2343 | code2343 | Done
13500 | htto://example.com/code234cv | code234cv | Deleted
220 | htto://example.com/code234cv | code234cv | Null
400 | htto://example.com/code234cv | code234cv | Null
I want a way to update the Status of my rows according to a list of numbers.
For example, I want to set the Status column to Done for multiple numbers at once.
Put simply, I want to change "Null" statuses to "Done" for the numbers in this list:
13544
17
13546
12
13548
13549
16000
13551
13552
13553
13554
13555
12500
13557
13558
13559
13560
30
13562
13563
Something like this
I tried "update query" but I don't know how to use criteria to solve this problem
In Excel I did that by "conditional formatting duplicates" -with my number list which I wanted to update-
Then "sort by highlighted color" then "fill copy" the status with the value
I know that Access is different but I hope that there is a way to do this task as Excel did.
Thanks in advance
From my understanding, you can try:
Update TblA
Set TblA.Status="Done"
where Number in (13544,17,13546,....)
Alternatively, an easy method is to put the numbers from the IN clause into their own table and use it like this:
Update TblA
Set TblA.Status="Done" where Number in (select NumCol from NumTable )
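The lookup-table variant can be tried out end to end with the sketch below. It uses SQLite through Python's sqlite3 module instead of Access, reuses the TblA, NumTable and NumCol names from the answer, and adds a Status IS NULL condition on the assumption that "Null" in the question means a real NULL rather than the text "Null". The sample rows and the short number list are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TblA (Number INT, Name TEXT, Status TEXT);
CREATE TABLE NumTable (NumCol INT);
INSERT INTO TblA VALUES
  (1899,  'code1',     'Done'),
  (13500, 'code234cv', 'Deleted'),
  (220,   'code234cv', NULL),
  (400,   'code234cv', NULL);
""")

# The numbers to mark as Done; in practice this would be the pasted list.
numbers = [220, 13544, 17]
conn.executemany("INSERT INTO NumTable (NumCol) VALUES (?)", [(n,) for n in numbers])

# Only rows that are still NULL and whose Number is in the lookup table change.
cursor = conn.execute(
    "UPDATE TblA SET Status = 'Done' "
    "WHERE Status IS NULL AND Number IN (SELECT NumCol FROM NumTable)"
)
conn.commit()

updated = cursor.rowcount
statuses = dict(conn.execute("SELECT Number, Status FROM TblA"))
print(updated, statuses)
```

Keeping the list in its own table means it can be pasted or imported once and reused, instead of being typed into the query text.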

Postgresql - Clean way to insert records if they don't exist, update if they do

Here's my situation. I have a table with a bunch of URLs and crawl
dates associated with them. When my program processes a URL, I want
to INSERT a new row with a crawl date. If the URL already exists, I
want to update the crawl date to the current datetime. With MS SQL or
Oracle I'd probably use a MERGE command for this. With mySQL I'd
probably use the ON DUPLICATE KEY UPDATE syntax.
I could do multiple queries in my program, which may or may not be
thread safe. I could write a SQL function which has various IF...ELSE
logic. However, for the sake of trying out Postgres features I've
never used before, I'm thinking about creating an INSERT rule -
something like this:
CREATE RULE Pages_Upsert AS ON INSERT TO Pages
WHERE EXISTS (SELECT 1 from Pages P where NEW.Url = P.Url)
DO INSTEAD
UPDATE Pages SET LastCrawled = NOW(), Html = NEW.Html WHERE Url = NEW.Url;
This seems to actually work great. It probably loses some points on
the "code readability" standpoint, as someone looking at my code for
the first time would have to magically know about this rule, but I
guess that could be solved with good code commenting and
documentation.
Are there any other drawbacks to this idea, or maybe a "your idea
sucks, you should do it /this/ way instead" comment? I'm on PG 9.0 if
that matters.
UPDATE: Query plan since someone wanted it :)
"Insert (cost=2.79..2.81 rows=1 width=0)"
" InitPlan 1 (returns $0)"
" -> Seq Scan on pages p (cost=0.00..2.79 rows=1 width=0)"
" Filter: ('http://www.foo.com'::text = lower((url)::text))"
" -> Result (cost=0.00..0.01 rows=1 width=0)"
" One-Time Filter: ($0 IS NOT TRUE)"
""
"Update (cost=2.79..5.46 rows=1 width=111)"
" InitPlan 1 (returns $0)"
" -> Seq Scan on pages p (cost=0.00..2.79 rows=1 width=0)"
" Filter: ('http://www.foo.com'::text = lower((url)::text))"
" -> Result (cost=0.00..2.67 rows=1 width=111)"
" One-Time Filter: $0"
" -> Seq Scan on pages (cost=0.00..2.66 rows=1 width=111)"
" Filter: ((url)::text = 'http://www.foo.com'::text)"
Ok, I managed to create a testcase. The result is that the update part is always executed, even on a fresh insert. COPY seems to bypass the rule system.
[For clarity I have put this into a separate reply]
DROP TABLE pages CASCADE;
CREATE TABLE pages
( url VARCHAR NOT NULL PRIMARY KEY
, html VARCHAR
, last TIMESTAMP
);
INSERT INTO pages(url,html,last) VALUES ('www.example.com://page1' , 'meuk1' , '2001-09-18 23:30:00'::timestamp );
CREATE RULE Pages_Upsert AS ON INSERT TO pages
WHERE EXISTS (SELECT 1 from pages P where NEW.url = P.url)
DO INSTEAD (
UPDATE pages SET html=new.html , last = NOW() WHERE url = NEW.url
);
INSERT INTO pages(url,html,last) VALUES ('www.example.com://page2' , 'meuk2' , '2002-09-18 23:30:00':: timestamp );
INSERT INTO pages(url,html,last) VALUES ('www.example.com://page3' , 'meuk3' , '2003-09-18 23:30:00':: timestamp );
INSERT INTO pages(url,html,last) SELECT pp.url || '/added'::text, pp.html || '.html'::text , pp.last + interval '20 years' FROM pages pp;
COPY pages(url,html,last) FROM STDIN;
www.example.com://pageX stdin 2000-09-18 23:30:00
\.
SELECT * FROM pages;
The result:
url | html | last
-------------------------------+------------+----------------------------
www.example.com://page1 | meuk1 | 2001-09-18 23:30:00
www.example.com://page2 | meuk2 | 2011-09-18 23:48:30.775373
www.example.com://page3 | meuk3 | 2011-09-18 23:48:30.783758
www.example.com://page1/added | meuk1.html | 2011-09-18 23:48:30.792097
www.example.com://page2/added | meuk2.html | 2011-09-18 23:48:30.792097
www.example.com://page3/added | meuk3.html | 2011-09-18 23:48:30.792097
www.example.com://pageX | stdin | 2000-09-18 23:30:00
(7 rows)
UPDATE: Just to prove it can be done:
INSERT INTO pages(url,html,last) VALUES ('www.example.com://page1' , 'meuk1' , '2001-09-18 23:30:00'::timestamp );
CREATE VIEW vpages AS (SELECT * from pages);
CREATE RULE Pages_Upsert AS ON INSERT TO vpages
DO INSTEAD (
UPDATE pages p0
SET html=NEW.html , last = NOW() WHERE p0.url = NEW.url
;
INSERT INTO pages (url,html,last)
SELECT NEW.url, NEW.html, NEW.last
WHERE NOT EXISTS ( SELECT * FROM pages p1 WHERE p1.url = NEW.url)
);
CREATE RULE Pages_Indate AS ON UPDATE TO vpages
DO INSTEAD (
INSERT INTO pages (url,html,last)
SELECT NEW.url, NEW.html, NEW.last
WHERE NOT EXISTS ( SELECT * FROM pages p1 WHERE p1.url = OLD.url)
;
UPDATE pages p0
SET html=NEW.html , last = NEW.last WHERE p0.url = NEW.url
;
);
INSERT INTO vpages(url,html,last) VALUES ('www.example.com://page2' , 'meuk2' , '2002-09-18 23:30:00':: timestamp );
INSERT INTO vpages(url,html,last) VALUES ('www.example.com://page3' , 'meuk3' , '2003-09-18 23:30:00':: timestamp );
INSERT INTO vpages(url,html,last) SELECT pp.url || '/added'::text, pp.html || '.html'::text , pp.last + interval '20 years' FROM vpages pp;
UPDATE vpages SET last = last + interval '-10 years' WHERE url = 'www.example.com://page1' ;
-- Copy does NOT work on views
-- COPY vpages(url,html,last) FROM STDIN;
-- www.example.com://pageX stdin 2000-09-18 23:30:00
-- \.
SELECT * FROM vpages;
Result:
INSERT 0 1
INSERT 0 1
INSERT 0 3
UPDATE 1
url | html | last
-------------------------------+------------+---------------------
www.example.com://page2 | meuk2 | 2002-09-18 23:30:00
www.example.com://page3 | meuk3 | 2003-09-18 23:30:00
www.example.com://page1/added | meuk1.html | 2021-09-18 23:30:00
www.example.com://page2/added | meuk2.html | 2022-09-18 23:30:00
www.example.com://page3/added | meuk3.html | 2023-09-18 23:30:00
www.example.com://page1 | meuk1 | 1991-09-18 23:30:00
(6 rows)
The view is necessary to prevent the rewrite system from going into recursion.
Construction of a DELETE rule is left as an exercise to the reader.
Some good points from someone who should know it or be very near to someone like that ;-)
What are PostgreSQL RULEs good for?
Short story:
Do the rules work well with SERIAL and BIGSERIAL ?
Do the rules work well with the RETURNING clauses of INSERT and UPDATE ?
Do the rules work well with stuff like random()?
All these things boil down to the fact that the rule system is not row driven but transforms your statements in ways you would never imagine.
Do yourself and your team mates a favour and stop using rules for things like that.
Edit: Your problem is well discussed in the PostgreSQL community. Search keywords are: MERGE, UPSERT.
I don't know if this gets too subjective but what I think about your solution is: It's all about semantics. When I do an insert, I expect an insert and not some fancy logic that maybe does an insert but maybe not. Indeed that's what functions are for.
At first I'd try checking for the URL in your program and then choosing whether to insert or update. If that turned out to be too slow, I'd use a function. If you name it like insert_or_update_url, you automatically get some documentation for free. The rewrite rule requires you to have some implicit knowledge and I generally try to avoid that.
On the plus side: If someone copies the data but forgets rules and functions, your solution might break silently (but that may depend on other constraints), but a missing function goes down screaming. Don't get me wrong, I think your solution is very creative and smart. Just a bit too obscure for my taste.
There's an example of implementing upsert / merge using a simple function in the Postgres documentation.
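The function-based approach can be sketched roughly as "try the UPDATE first, INSERT only when no row was touched". The version below is written in Python against SQLite purely so that it runs on its own; the function name insert_or_update_url is taken from the suggestion above, the pages table mirrors the test case, and everything else is an assumption. The documentation's PL/pgSQL example additionally retries when a concurrent insert raises a unique-violation error, which this single-connection sketch leaves out.

```python
import sqlite3
from datetime import datetime, timezone


def insert_or_update_url(conn, url, html):
    """Update the row for url if it exists, otherwise insert it."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # run both statements in one transaction
        cursor = conn.execute(
            "UPDATE pages SET html = ?, last = ? WHERE url = ?", (html, now, url)
        )
        if cursor.rowcount == 0:
            # No existing row was updated, so this URL is new.
            conn.execute(
                "INSERT INTO pages (url, html, last) VALUES (?, ?, ?)",
                (url, html, now),
            )
            return "inserted"
    return "updated"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (url TEXT NOT NULL PRIMARY KEY, html TEXT, last TEXT)"
)

first = insert_or_update_url(conn, "www.example.com://page1", "meuk1")
second = insert_or_update_url(conn, "www.example.com://page1", "meuk1-changed")
stored = conn.execute("SELECT url, html FROM pages").fetchall()
print(first, second, stored)
```

Because the caller invokes a function with an explicit name, nothing depends on a hidden rewrite of plain INSERT statements.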
Never use rules — they're evil.
You cannot refer to tables other than OLD and NEW in the rule qualification.
You should instead do this in the rule body.
This is all because the rule is just a way to inform the rewrite system about what transformations it should and should not perform. Rules are not triggers, executing for every row, but they give the query planner a fine massage and ask it nicely to rewrite the plan.
From the docs:
What is a rule qualification? It is a restriction that tells when the actions of the rule should be done and when not. This qualification can only reference the pseudorelations NEW and/or OLD, which basically represent the relation that was given as object (but with a special meaning).

How to get last access/modification date of a PostgreSQL database?

On development server I'd like to remove unused databases. To realize that I need to know if database is still used by someone or not.
Is there a way to get last access or modification date of given database, schema or table?
You can do it by checking the last modification time of the table's file.
In PostgreSQL, every table corresponds to one or more OS files. For example:
select relfilenode from pg_class where relname = 'test';
The relfilenode is the file name of the table "test". You can then find the file in the database's directory.
In my test environment:
cd /data/pgdata/base/18976
ls -l -t | head
The last command lists all files ordered by last modification time.
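For anyone who prefers a script over the shell, the `ls -l -t | head` step can be approximated as below. This is only a sketch: the helper name newest_files is made up, the directory is the one from the test environment above, and the modification time only tells you when the file itself was last written.

```python
import os
from datetime import datetime


def newest_files(directory, limit=10):
    """Return (name, mtime) pairs for the most recently modified files."""
    entries = [entry for entry in os.scandir(directory) if entry.is_file()]
    # Newest first, like `ls -t`.
    entries.sort(key=lambda entry: entry.stat().st_mtime, reverse=True)
    return [
        (entry.name, datetime.fromtimestamp(entry.stat().st_mtime))
        for entry in entries[:limit]
    ]


if __name__ == "__main__":
    data_dir = "/data/pgdata/base/18976"  # path from the example above
    if os.path.isdir(data_dir):
        for name, modified in newest_files(data_dir):
            print(modified, name)
```

The file names printed this way are relfilenode values, which can be matched back to tables through pg_class.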
There is no built-in way to do this - and all the approaches that check the file mtime described in other answers here are wrong. The only reliable option is to add triggers to every table that record a change to a single change-history table, which is horribly inefficient and can't be done retroactively.
If you only care about "database used" vs "database not used" you can potentially collect this information from the CSV-format database log files. Detecting "modified" vs "not modified" is a lot harder; consider SELECT writes_to_some_table(...).
If you don't need to detect old activity, you can use pg_stat_database, which records activity since the last stats reset. e.g.:
-[ RECORD 6 ]--+------------------------------
datid | 51160
datname | regress
numbackends | 0
xact_commit | 54224
xact_rollback | 157
blks_read | 2591
blks_hit | 1592931
tup_returned | 26658392
tup_fetched | 327541
tup_inserted | 1664
tup_updated | 1371
tup_deleted | 246
conflicts | 0
temp_files | 0
temp_bytes | 0
deadlocks | 0
blk_read_time | 0
blk_write_time | 0
stats_reset | 2013-12-13 18:51:26.650521+08
so I can see that there has been activity on this DB since the last stats reset. However, I don't know anything about what happened before the stats reset, so if I had a DB showing zero activity since a stats reset half an hour ago, I'd know nothing useful.
PostgreSQL 9.5 and later can track the commit time of the last modification.
1. Check whether commit tracking is on or off using the following query:
show track_commit_timestamp;
2. If it returns "on", go to step 3; otherwise modify postgresql.conf:
cd /etc/postgresql/9.5/main/
vi postgresql.conf
Change
track_commit_timestamp = off
to
track_commit_timestamp = on
Restart PostgreSQL (or the system), then repeat step 1.
3. Use the following queries to get the last commit timestamp:
SELECT pg_xact_commit_timestamp(xmin), * FROM YOUR_TABLE_NAME;
SELECT pg_xact_commit_timestamp(xmin), * FROM YOUR_TABLE_NAME where COLUMN_NAME=VALUE;
My way to get the modification date of my tables:
Python Function
CREATE OR REPLACE FUNCTION py_get_file_modification_timestamp(afilename text)
RETURNS timestamp without time zone AS
$BODY$
import os
import datetime
return datetime.datetime.fromtimestamp(os.path.getmtime(afilename))
$BODY$
LANGUAGE plpythonu VOLATILE
COST 100;
SQL Query
SELECT
schemaname,
tablename,
py_get_file_modification_timestamp('*postgresql_data_dir*/*tablespace_folder*/'||relfilenode)
FROM
pg_class
INNER JOIN
pg_catalog.pg_tables ON (tablename = relname)
WHERE
schemaname = 'public'
I'm not sure whether things like vacuum can interfere with this approach, but in my tests it's a pretty accurate way to find tables that are no longer used, at least for INSERT/UPDATE operations.
I guess you should enable some logging options. The PostgreSQL documentation describes the available logging settings.

Resources