create sql hierarchyid siblings - sql-server

I have a database table to which I have just added a hierarchy column. The only other relevant column is the ID column (primary key). The entry with ID = 1 is my root (set to HierarchyID::GetRoot()). I can create a child in the hierarchy just fine, however I cannot seem to figure out a way to iterate through my existing data to make all of the remaining entries children of the root. All of my attempts end up with all of the other rows having the same Hierarchy value.
IE - the hierarchy should look like this:
ID | Hierarchy
-------------
1 | /
2 | /1
3 | /2
etc
My attempts all look like
ID | Hierarchy
-------------
1 | /
2 | /1
3 | /1
etc
Is there some form of simple update statement or cursor loop I can use to populate my table?
Even better, is there a way to populate it so that Hierarchy.ToString() makes the # in /# equal to the ID? (This would be nice but is far from needed.)
Thanks in advance.

You can build a string with ID and use it as a parameter to hierarchyid::Parse
update T
set Hierarchy = case when ID = 1
                     then hierarchyid::GetRoot()
                     else hierarchyid::Parse('/' + cast(ID as varchar(10)) + '/')
                end

Related

How to update data in a Many-to-Many linking table?

For simplicity's sake let's assume there's a Post table and a Tags table (not the actual use case, but this will keep it simple)
posts Table
id | title
--------------------------------
1 | Random Text Here
2 | Another Post About Stuff
tags Table
id | tag
--------------------------------
1 | javascript
2 | node
3 | unrelated-thing
posts_tags table
id| post_id | tag_id
--------------------------------
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 2 | 2
A Post can have many Tags and a single Tag could be associated with many Posts.
Web App Assumptions: Let's pretend that adding or removing a Tag doesn't trigger its own asynchronous action within the web app against the linking table.
Instead the user would edit the Post (adding or removing any tags already created) then hit Save. The web app would submit JSON including an array of Tag ids associated with the Post to the server, which would then process the update request in the code.
For example, post_id=1 is submitted with only tag_id=[1,2], so tag_id=3 needs to be removed as an association in the linking table.
If a Post or a Tag is deleted, I'd have an ON DELETE CASCADE set on
posts_tags.post_id
posts_tags.tag_id
But what is the best way to update the linking table data in the instance of updating the tags associated with a post?

Option 1:
Get all the Post-Tags for the edited Post 
SELECT * FROM posts_tags WHERE post_id = 1
Determine which tags have been added (and INSERT into linking table)
Determine which tags have been removed (and DELETE from linking table)
Option 2:
Delete ALL tags with the post_id in the linking table
Insert all submitted tags into the linking table
Option 3:
Something I'm not thinking about :)
Would Option 2 have a bigger performance impact on indexes as the table grows?
EDIT:
For clarity, the actual Post and Tag data isn't changed or removed. This is purely about Updating a post's associated tags
The database I'm using is PostgreSQL 9.6
Option 2 would be fine from a performance point of view - much better than option 1, because you have a single operation to delete the old associations, and then a bunch of insert statements. In option 1, you have more queries (your first query to retrieve the associations, then the inserts, and the deletes if applicable).
As long as your table has an index on post_id, then delete from posts_tags where post_id = ? will be very fast, even on a huge table.
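A minimal sketch of Option 2, assuming the delete and the inserts should succeed or fail together. It uses Python's built-in sqlite3 module as a stand-in so it runs without a server; the question targets PostgreSQL 9.6, where the same two statements would go through a PostgreSQL driver. The function name replace_post_tags and the unique constraint on (post_id, tag_id) are assumptions, not part of the original schema.

```python
import sqlite3


def replace_post_tags(conn, post_id, tag_ids):
    """Option 2: drop every association for the post, then insert the submitted set."""
    with conn:  # one transaction: commits on success, rolls back on any error
        conn.execute("DELETE FROM posts_tags WHERE post_id = ?", (post_id,))
        conn.executemany(
            "INSERT INTO posts_tags (post_id, tag_id) VALUES (?, ?)",
            # De-duplicate the submitted ids so the unique constraint is not hit.
            [(post_id, tag_id) for tag_id in sorted(set(tag_ids))],
        )


conn = sqlite3.connect(":memory:")  # stand-in database for the sketch
conn.execute(
    "CREATE TABLE posts_tags ("
    " id INTEGER PRIMARY KEY,"
    " post_id INTEGER NOT NULL,"
    " tag_id INTEGER NOT NULL,"
    # Assumed constraint; its index leads with post_id, so the DELETE can use it.
    " UNIQUE (post_id, tag_id))"
)
# Sample rows from the question's posts_tags table.
conn.executemany(
    "INSERT INTO posts_tags (post_id, tag_id) VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 2)],
)
conn.commit()

# post_id=1 is submitted with tag ids [1, 2], so tag 3 is dropped.
replace_post_tags(conn, 1, [1, 2])
rows = conn.execute(
    "SELECT post_id, tag_id FROM posts_tags ORDER BY post_id, tag_id"
).fetchall()
print(rows)
```

Wrapping both steps in one transaction means a failed insert does not leave the post with no tags at all.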
There is an alternative...
posts_tags table
id| post_id | tag_id | version_id
--------------------------------
1 | 1 | 1 | 0
2 | 1 | 2 | 0
3 | 1 | 3 | 1
4 | 2 | 2 | 0
5 | 1 | 1 | 2
6 | 1 | 3 | 2
In this case, you use a versioning mechanism to determine the "current" associations (max(version_id)), so you never have to delete anything - you just insert new rows.
In practice, this is probably no faster, but it does save you that "delete" query.
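As a rough illustration of the versioning idea, the sketch below loads the sample rows above and reads the "current" tags for a post as the rows carrying that post's highest version_id. It again uses Python's sqlite3 module as a stand-in database; the helper names current_tags and save_tags are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database for the sketch
conn.execute(
    "CREATE TABLE posts_tags ("
    " id INTEGER PRIMARY KEY, post_id INTEGER, tag_id INTEGER, version_id INTEGER)"
)
# The six sample rows from the versioned table above.
conn.executemany(
    "INSERT INTO posts_tags VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 0), (2, 1, 2, 0), (3, 1, 3, 1), (4, 2, 2, 0), (5, 1, 1, 2), (6, 1, 3, 2)],
)
conn.commit()


def current_tags(conn, post_id):
    """Return the tag ids stored under the post's highest version_id."""
    rows = conn.execute(
        "SELECT tag_id FROM posts_tags"
        " WHERE post_id = ?"
        "   AND version_id = (SELECT MAX(version_id) FROM posts_tags WHERE post_id = ?)"
        " ORDER BY tag_id",
        (post_id, post_id),
    ).fetchall()
    return [tag_id for (tag_id,) in rows]


def save_tags(conn, post_id, tag_ids):
    """Insert the submitted tags under a new version; nothing is deleted."""
    with conn:  # single transaction
        (latest,) = conn.execute(
            "SELECT COALESCE(MAX(version_id), -1) FROM posts_tags WHERE post_id = ?",
            (post_id,),
        ).fetchone()
        conn.executemany(
            "INSERT INTO posts_tags (post_id, tag_id, version_id) VALUES (?, ?, ?)",
            [(post_id, tag_id, latest + 1) for tag_id in tag_ids],
        )


before = current_tags(conn, 1)
save_tags(conn, 1, [2])
after = current_tags(conn, 1)
print(before, after)
```

One limitation of this sketch: saving an empty tag list inserts no rows, so the previous version would still look current. Clearing all tags would need some separate marker.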

Subquery for calculated field giving invalid argument to function error

I have a table with a list of stores and attributes that dictate the age of the store in weeks and the order volume of the store. The second table lists the UPLH goals based on age and volume. I want to return the stores listed in the first table along with its associated UPLH goal. The following works correctly:
SELECT store, weeksOpen, totalItems,
(
SELECT max(UPLH)
FROM uplhGoals as b
WHERE b.weeks <= a.weeksOpen AND 17000 between b.vMIn and b.vmax
) as UPLHGoal
FROM weekSpecificUPLH as a
But this query, which is replacing the hard coded value of totalItems with the field from the first table, gives me the "Invalid argument to function" error.
SELECT store, weeksOpen, totalItems,
(
SELECT max(UPLH)
FROM uplhGoals as b
WHERE b.weeks <= a.weeksOpen AND a.totalItems between b.vMIn and b.vmax
) as UPLHGoal
FROM weekSpecificUPLH as a
Any ideas why this doesn't work? Are there any other options? I can easily use a dmax() and cycle through every record to create a new table, but that seems the long way around something that a query should be able to produce.
SQLFiddle: http://sqlfiddle.com/#!9/e123a8/1
It appears that the SQLFiddle output (below) is what I was looking for, even though Access gives the error.
| store | weeksOpen | totalItems | UPLHGoal |
|-------|-----------|------------|----------|
| 1 | 15 | 13000 | 30 |
| 2 | 37 | 4000 | 20 |
| 3 | 60 | 10000 | 30 |
EDIT:
weekSpecificUPLH is a query, not a table. If I create a new test table in Access with identical fields, it works. This would indicate to me that it has something to do with the [totalItems] field, which is actually a calculated result. So instead I replaced that field with [a.IPO * a.OPW]. Same error. It's as if it's not treating it as the correct type of number.
I've tried:
SELECT store, weeksOpen, (opw * ipo) as totalItems,
(
SELECT max(UPLH)
FROM uplhGoals as b
WHERE 17000 between b.vMIn and b.vmax AND b.weeks <= a.weeksOpen
) as UPLHGoal
FROM weekSpecificUPLH as a
which works. But replace the '17000' with 'totalItems' and I get the same error. I even tried using val(totalItems), to no avail.
Try rewriting the BETWEEN as two explicit comparisons (use <= and >= if you need to keep BETWEEN's inclusive bounds):
b.vmin < a.totalItems AND b.vmax > a.totalItems
That said, there are questions about your DB design.
In future, it would be very helpful to show your DB structure.
For example, it seems the records in weekSpecificUPLH are not related to the records in uplhGoals, are they?
Or, more generally: these tables are not related in any way except through rules described by the data itself in the goals table (which is "external" to the DB model).
So calling the goal "associated" confuses you and others, I presume, because everyone immediately starts thinking of a classical relation in the sense of the relational model.
Something was changing the type of the totalItems value. To solve it, I:
Copied the weekSpecificUPLH query results to a new table 'tempUPLH'
Used that table in place of the query, which correctly pulled the UPLHGoal from the 'uplhGoals' table

How to label result tables in multiple SELECT output

I wrote a simple dummy procedure to check the data that was saved in the database. When I run my procedure it outputs the data as below.
I want to label the tables so that even a QA person can identify which data each result shows. How can I do it?
Update: This procedure is run manually through Management Studio. It has nothing to do with my application, because all I want to check is whether the data has been inserted/updated properly.
For better clarity, I want to show the table names above the table as a label.
Add another column to the table, and name it so it will be distinguished by who reads them :)
Select 'Employee' as TABLE_NAME, * from Employee
Output will look like this:
| TABLE_NAME | ID | Number | ...
------------------------------
| Employee | 1 | 123 | ...
Or you can call the column 'Employee'
SELECT 'Employee' AS 'Employee', * FROM employee
The output will look like this:
| Employee | ID | Number | ...
------------------------------
| Employee | 1 | 123 | ...
Add an extra column whose name (not value!) is the label.
SELECT 'Employee' AS "Employee", e.* FROM employee e
The output will look like this:
| Employee | ID | Number | ...
------------------------------
| Employee | 1 | 123 | ...
By doing so, you will see the label, even if the result does not contain rows.
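The point about empty results can be checked with a small sketch. It assumes Python's sqlite3 module as a stand-in for SQL Server and a hypothetical two-column employee table; the column metadata still carries the label when the query returns no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database for the sketch
# Hypothetical columns; the real Employee table is not shown in the question.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, number INTEGER)")

# The label lives in the column *name*, so it is part of the result metadata.
cur = conn.execute("SELECT 'Employee' AS \"Employee\", e.* FROM employee e")
column_names = [col[0] for col in cur.description]
rows = cur.fetchall()

print(column_names)  # the label column comes first
print(rows)          # no rows, because the table is empty
```

A grid-style client such as SSMS renders those column headers even for an empty result set, which is what makes the label visible.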
I like to stick a whole nother result set that looks like a label or title between the result sets with real data.
SELECT 0 AS [Our Employees:]
WHERE 1 = 0
-- Your first "Employees" query goes here
SELECT 0 AS [Our Departments:]
WHERE 1 = 0
-- Now your second real "Departments" query goes here
-- ...and so on...
Ends up looking like this:
It's a bit looser-formatted with more whitespace than I like, but is the best I've come up with so far.
Unfortunately there is no way of labeling SELECT query output in SQL Server or SSMS. I needed something very similar a few years ago, and we settled on a workaround:
Adding another table which contains the list of table aliases.
Here is what we did:
We prepended a table listing the table names to the data set, so the first table looks as follows:
Name
Employee
Department
Courses
Class
Attendance
In C#, while reading the tables, you can iterate through the first table and assign a TableName to each of the remaining tables in the DataSet.
This is best done using Reporting Services and creating a simple report. You can then email this report daily if you wish.

Enforce uniqueness on column based on contents from another column

I have the typical Invoice/InvoiceItems master/detail tables (the ones every book and tutorial out there uses as examples). I also have a "Proforma" table which holds data similar to invoices that are sometimes linked to invoices. Both are linked to each item in the invoice, with a column optionally referencing a proforma, something like this:
id | id_invoice | id_proforma | amount ....... and a bunch of irrelevant stuff
-----------------------------------------------
1 | 1 | null | 100
2 | 1 | null | 40
3 | 2 | 3 | 1000
4 | 3 | 4 | 473
5 | 3 | 4 | 139
Basically, each item in an invoice can be linked to a proforma. There is also a business rule that says that each proforma can be used in only one invoice (it's OK to use it in many items within the same invoice).
Currently that rule is enforced on the application side, but this has problems with concurrency, as 2 users could take the same proforma at the same time and the system would let it pass. My intention is to have the DB validate this in addition to some front-end visual cues, but so far I've failed to come up with an approach for this particular case.
Filtered unique indexes could serve well, except that the same proforma can be used twice if it's for the same invoice. So my question is: how can I make the DB server enforce that rule?
The database engine can be SQL Server 2012 or later, in any edition from Express to Enterprise.
You can create a user-defined scalar function that returns TRUE if the proforma id and invoice id combination are valid. Then put a check constraint on the table requiring the function to return true. Like this (tweak to fit your table name/needs):
-- Here's the function:
create function dbo.svfIsCombinationValid (
    @id_invoice int
  , @id_proforma int
)
returns bit
as
begin
    declare @return bit = 1;
    -- Invalid when the same proforma already appears on a different invoice.
    if exists (
        select 1
        from dbo.YourInvoiceProformaXRefTable
        where id_proforma = @id_proforma
          and id_invoice <> @id_invoice
    )
    begin
        set @return = 0;
    end;
    return @return;
end;
After that, you can alter the table and add the check constraint:
alter table dbo.YourInvoiceProformaXRefTable
add constraint CK_YourInvoiceProformaXRefTable_UniqueInvoiceProforma
check (dbo.svfIsCombinationValid(id_invoice,id_proforma)=1);
This is OK with nulls (multiple id_invoice values can have a NULL id_proforma), but if both values are not null, then the combination must either be new or the same as existing rows.

References in a table

I have a table like this, that contains items that are added to the database.
Catalog table example
id | element | catalog
0 | mazda | car
1 | penguin | animal
2 | zebra | animal
etc....
And then I have a table where the user selects items from that table, and I keep a reference of what has been selected like this
User table example
id | name | age | itemsSelected
0 | john | 18 | 2;3;7;9
So what I am trying to say is that I keep a reference to what the user has selected as a string of IDs, but this seems a tad troublesome.
When I run a query to get information about a user, all I get is the string 2;3;7;9, when what I really want is an array of the items corresponding to those IDs.
Right now I get the IDs, split the string, and then run another query to find the elements the IDs correspond to.
Is there an easier way to do this, if my question is understandable?
Yes, there is a way to do this. You create a third table which contains a map of A/B. This is called a many-to-many relationship, implemented with foreign keys.
You have your Catalogue table (int, varchar(MAX), varchar(MAX)) or similar.
You have your User table (int, varchar(MAX), varchar(MAX), varchar(MAX)) or similar, essentially, remove the last column and then create another table:
You create a UserCatalogue table: (int UserId, int CatalogueId) with a Primary Key on both columns. Then the UserId column gets a Foreign-Key to User.Id, and the CatalogueId column gets a Foreign-Key to Catalogue.Id. This preserves the relationship and eases queries. It also means that if Catalogue.Id number 22 does not exist, you cannot accidentally insert it as a relation between the two. This is called referential integrity: once you declare that a column must reference another table, SQL Server enforces that relationship.
After you create this, for each itemsSelected you add an entry: I.e.
UserId | CatalogueId
0 | 2
0 | 3
0 | 7
0 | 9
This also allows you to use JOINs on the tables, so a single query returns the selected items.
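A small sketch of the mapping table in use, assuming Python's sqlite3 module as a stand-in for SQL Server. The table and column names follow the answer above; the sample selection (catalogue ids 1 and 2 for user 0) is hypothetical, chosen because those ids exist in the example catalogue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database for the sketch
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Catalogue (Id INTEGER PRIMARY KEY, Element TEXT, Catalog TEXT);
CREATE TABLE "User" (Id INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
CREATE TABLE UserCatalogue (
    UserId INTEGER NOT NULL REFERENCES "User" (Id),
    CatalogueId INTEGER NOT NULL REFERENCES Catalogue (Id),
    PRIMARY KEY (UserId, CatalogueId)
);
INSERT INTO Catalogue VALUES (0, 'mazda', 'car'), (1, 'penguin', 'animal'), (2, 'zebra', 'animal');
INSERT INTO "User" VALUES (0, 'john', 18);
-- Hypothetical selection: john picked the penguin and the zebra.
INSERT INTO UserCatalogue VALUES (0, 1), (0, 2);
""")

# One JOIN replaces "split the string, then run a second query".
selected = [
    element
    for (element,) in conn.execute(
        "SELECT c.Element"
        " FROM UserCatalogue uc"
        " JOIN Catalogue c ON c.Id = uc.CatalogueId"
        " WHERE uc.UserId = ?"
        " ORDER BY c.Id",
        (0,),
    )
]
print(selected)

# Referential integrity: catalogue id 22 does not exist, so the insert is refused.
try:
    conn.execute("INSERT INTO UserCatalogue VALUES (0, 22)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```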
Additionally, and unrelated to the question, you can also optimize the Catalogue table you have a bit, and create another table for CatalogueGroup, which contains your last column there (catalog: car, animal) which is referenced via a Foreign-Key Relationship in the current Catalogue table definition you have. This will also save storage space and speed up SQL Server work, as it no longer has to read a string column if you only want the element value.
