Designing a Stored procedure to create XML tree - sql-server

I need to write a stored procedure in SQL Server whose returned data will be used to generate an XML file.
My XML file needs to have the following structure:
<root>
<ANode></ANode>
<BNode></BNode>
<CNode>
<C1Node>
<C11Node></C11Node>
<C12Node></C12Node>
</C1Node>
<C2Node>
<C21Node></C21Node>
<C22Node></C22Node>
</C2Node>
<C3Node>
<C31Node></C31Node>
<C32Node></C32Node>
</C3Node>
</CNode>
</root>
My question is, in the stored procedure we can select values for ANode and BNode as a simple SELECT statement like
Select ANodeVal,BNodeVal from Table
But how do I design the stored procedure to return records for CNode, which is a subtree containing three or more (dynamic) separate nodes for each record, in addition to the normal ANode and BNode?

See
Nesting XML-returning scalar valued functions
Once you get the hang of the nesting, and are willing to write the number of scalar-valued functions necessary to construct the node segments from the bottom up (I wouldn't want lots of these lying around), then it's not so hard.

I wouldn't recommend doing this in a stored procedure. Writing it in a language such as C#, Python, or Java will make the code unit-testable and more maintainable.

If you are able to modify the database design, consider keeping each node as a record, instead of as a column (as the sample select statement would indicate).
For example, each row might include the following fields:
RowId
ParentRowId
Name
RowData
I'm assuming that you are passing the data to an application, because you indicated that the returned data will be used to generate the XML. In that case, the stored procedure would simply be a SELECT statement, leaving the formatting to the application.
Most implementations of XML engines should allow you to add child nodes to existing parent nodes. The XML is built in memory and then "exported" by whatever method necessary to get the desired final result.
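As a rough sketch of that idea, the following Python snippet builds the tree in memory from parent/child rows and then serializes it. The sample rows and the build_tree helper are hypothetical, made up for illustration; a real application would read the rows from the stored procedure's result set.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows as (RowId, ParentRowId, Name, RowData); a real
# application would read these from the stored procedure's result set.
rows = [
    (1, None, "root", None),
    (2, 1, "ANode", "a-value"),
    (3, 1, "BNode", "b-value"),
    (4, 1, "CNode", None),
    (5, 4, "C1Node", None),
    (6, 5, "C11Node", "c11-value"),
    (7, 5, "C12Node", "c12-value"),
]

def build_tree(rows):
    """Build an element tree from parent/child rows and return the root element."""
    elements = {}
    root = None
    # First pass: create one element per row, keyed by RowId.
    for row_id, _parent_id, name, data in rows:
        element = ET.Element(name)
        element.text = data
        elements[row_id] = element
    # Second pass: attach each element to its parent (the root has none).
    for row_id, parent_id, _name, _data in rows:
        if parent_id is None:
            root = elements[row_id]
        else:
            elements[parent_id].append(elements[row_id])
    return root

# Serialize the in-memory tree to a string ("export" step).
xml_text = ET.tostring(build_tree(rows), encoding="unicode")
print(xml_text)
```

Because the tree is assembled in two passes, the rows do not have to arrive parent-first; sibling order follows row order.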

Related

Find Out List Of Objects Referenced In Snowflake Procedure

I want to get a list of the objects referenced in a Snowflake procedure. Say it uses tables and views; I want to find those objects from the procedure definition, as there is currently no function in Snowflake that provides this information.
GET_OBJECT_REFERENCES (https://docs.snowflake.com/en/sql-reference/functions/get_object_references.html) is currently available only for views, not for procedures.
Any pointers on scanning the definition of a procedure to figure out the objects in it?
As Felipe pointed out, you can pass the name of a table or view as a parameter into the stored procedure. In that case there's no way to know what objects the SP will reference.
If your organization tends not to do that, and the SQL in your stored procedures tends to be more along the lines of "select * from my_table", you can simply search for those references in the stored procedure code.
The following statement is crude, but effective. It could be developed and polished a lot, and it could miss references. It also only finds the first match, while a more useful query would return all and flatten out the array. I may have time to work on that a bit. It did find a lot in my test. It simply looks for the following pattern:
SQL Command ... Matching Clause for that Command ... Semicolon
The reason this works is that even if you don't terminate the SQL in the stored procedure, the JavaScript line should be terminated with a semicolon. JavaScript is comparatively forgiving of missing semicolons, but it should hit one eventually and match the SQL statement.
select PROCEDURE_CATALOG
,PROCEDURE_SCHEMA
,PROCEDURE_NAME
,ARGUMENT_SIGNATURE
,regexp_substr(PROCEDURE_DEFINITION, 'SELECT\\s.*FROM.*;|INSERT\\s.*INTO.*;|UPDATE\\s.*SET.*;|MERGE\\s.*INTO.*;|DELETE\\s.*FROM.*;|MERGE\\s.*USING.*;',1, 1, 'ims') STATEMENT
from MY_DATABASE.INFORMATION_SCHEMA.PROCEDURES
where STATEMENT is not null;
I write a lot of stored procedures, and to Felipe's point this returns a lot of rows like this for me:
select ${params.leftColumnList} from ${params.leftObject} order by ${leftTimestamp};`);
In those cases, you'd need to have someone who can read code figure out what it's referencing. In this case, the SP accepts parameters for those fields, so they could be any tables.
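If the procedure definitions are pulled into a client program, the same "command ... clause ... semicolon" pattern can return every match rather than only the first. The snippet below is a minimal Python sketch of that, not a Snowflake feature: the non-greedy quantifiers, the find_statements helper, and the sample definition are assumptions for illustration, and it can miss or over-match references just like the SQL version.

```python
import re

# Non-greedy variant of the "command ... clause ... semicolon" idea, so each
# statement is returned separately instead of one long greedy match.
STATEMENT_PATTERN = re.compile(
    r"SELECT\s.*?FROM.*?;"
    r"|INSERT\s.*?INTO.*?;"
    r"|UPDATE\s.*?SET.*?;"
    r"|DELETE\s.*?FROM.*?;"
    r"|MERGE\s.*?(?:INTO|USING).*?;",
    re.IGNORECASE | re.DOTALL,
)

def find_statements(procedure_definition):
    """Return every substring that looks like a SQL statement."""
    return [m.group(0) for m in STATEMENT_PATTERN.finditer(procedure_definition)]

# Hypothetical JavaScript procedure body, as it might appear in
# PROCEDURE_DEFINITION.
definition = """
var rs = snowflake.execute({sqlText: `select id from my_table`});
snowflake.execute({sqlText: "delete from old_rows where id < 10"});
"""

for statement in find_statements(definition):
    print(statement)
```

Each match still includes the trailing JavaScript up to the semicolon, so the output needs a human or a further parsing step to extract the object names.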

SQL Server - compare the results of two stored procedures that output multiple tables

So, similar to "SQL Server compare results of two queries that should be identical", I need to compare the output of two stored procedures to ensure the new version is generating equivalent output to the old version. The tricky part is that my SP outputs six tables of differing widths.
I started writing a hybrid version of them that would compare each of the tables individually, but it's a pretty complex SP, so I was hoping there was an easier way.
I tried using EXCEPT as in the linked question, but it looks like that will only compare one table to one other table.
Easy option 1: Output the stored procedure results to a text file (one per procedure version) and use a diff tool/editor to make sure they are the same.
Easy option 2: Write the stored procedure results to a table/temp table (one per returned table per procedure) and write SQL to compare the results. Just count the rows in each result table and then do a count of the union (not union all) of both tables. Repeat for each result table.
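The count-and-union check from option 2 can be sketched as follows. This is an illustration only: SQLite stands in for SQL Server, and the table names and the tables_match helper are hypothetical. It also assumes the result tables contain no duplicate rows, since UNION removes duplicates and would make two identical tables with repeated rows look different.

```python
import sqlite3

def tables_match(conn, old_table, new_table):
    """Compare two tables using their row counts and the count of their UNION."""
    # Table names are interpolated directly, so they must be trusted identifiers.
    count_old = conn.execute(f"SELECT COUNT(*) FROM {old_table}").fetchone()[0]
    count_new = conn.execute(f"SELECT COUNT(*) FROM {new_table}").fetchone()[0]
    count_union = conn.execute(
        f"SELECT COUNT(*) FROM "
        f"(SELECT * FROM {old_table} UNION SELECT * FROM {new_table})"
    ).fetchone()[0]
    # Equal tables (without duplicates) give three identical counts.
    return count_old == count_new == count_union

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_result (id INTEGER, name TEXT);
    CREATE TABLE new_result (id INTEGER, name TEXT);
    CREATE TABLE bad_result (id INTEGER, name TEXT);
    INSERT INTO old_result VALUES (1, 'a'), (2, 'b');
    INSERT INTO new_result VALUES (2, 'b'), (1, 'a');
    INSERT INTO bad_result VALUES (1, 'a'), (2, 'x');
""")

print(tables_match(conn, "old_result", "new_result"))
print(tables_match(conn, "old_result", "bad_result"))
```

Row order does not matter for this check; only the set of rows does.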
You can capture multiple result sets in .NET (C# or VB) quite easily. You can create a DataAdapter and DataSet, and use the DataAdapter.Fill() method to populate the DataSet. Each result set will be stored as a DataTable within that DataSet. Then you just need to loop through the DataTables collection in each DataSet and compare them. You can find more info on this MSDN page: Populating a DataSet from a DataAdapter
This can be done either in SQLCLR, if you want to run it as a stored procedure or user-defined function, or as a stand-alone console application. Running it as a SQLCLR stored procedure is quite convenient, but given that you will be storing all results for all 6 result sets, for both stored procedures that you are testing, that might require too much memory. In that case, the console app is the way to go.
The only thing I can think of is to add an additional parameter to both (new and old) stored procedures to control which result set is returned, like this:
EXEC usp_proc @var1, @var2, @ResultSet = 1
The above execution should return the first result set; if you pass @ResultSet = 2 it should return the second result set, and so on.
Do this with both stored procedures and then compare the result sets one by one (using EXCEPT will do the trick).

combine several queries into one

I would like to be able to query multiple of the same type of argument (for example, several IDs, just to keep the example simple) so I only have to execute a procedure once instead of one time for each individual ID. Where my single-instance proc returns, say, a name, my get-all proc would return a single-column table of names.
What I have now:
EXEC MyProc(123);
EXEC MyProc(456);
EXEC MyProc(789);
What I would like:
// Square brackets aren't correct syntax,
// they just represent a list that contains x number of IDs
EXEC MyProc([123, 456, 789]);
Can I do this, and if so, is there an easy mechanism for handling such a thing that doesn't involve cursors and various over-complicated things? Would this even be considered a good idea?
To execute the proc only once, you'll have to refactor your proc to work with multiple IDs, as there is no T-SQL function or syntactic sugar to do this for you.
If this is to be variadic, in that there may be one or many IDs, you'll have to pass multiple IDs to your proc in one parameter. Passing an array of sorts like this is easier in more recent versions of SQL Server.
For example, you can try passing:
TVPs in SQL Server 2008+
delimited strings that are then split in the proc
xml which is then parsed in the proc
a table name which is then read by the proc dynamically
use a table name which is known by both the proc and the caller beforehand
A quick search for passing arrays in SQL Server will yield more results; among the best of them is Arrays and Lists in SQL Server, as mentioned by @Andomar.
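To illustrate the delimited-string option, here is a minimal sketch in Python rather than T-SQL: a function receives one comma-delimited string, splits it, and runs a single parameterized query. SQLite stands in for SQL Server, and the people table and get_names function are hypothetical names used only for this example.

```python
import sqlite3

def get_names(conn, id_list):
    """Return names for a comma-delimited string of IDs, e.g. '123,456,789'."""
    # Split the single parameter into individual integer IDs.
    ids = [int(part) for part in id_list.split(",") if part.strip()]
    if not ids:
        return []
    # One placeholder per ID keeps the query parameterized.
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT name FROM people WHERE id IN ({placeholders}) ORDER BY id", ids
    ).fetchall()
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO people VALUES (123, 'Ann'), (456, 'Bob'), (789, 'Cy');
""")

print(get_names(conn, "123, 789"))
```

The result is the single-column list of names the question asks for, produced with one call instead of one call per ID.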

How can I call a stored procedure without table names in HQL?

I am trying to fetch the current timestamp through a stored procedure in HQL. This means my code looks something like the following:
var currentTimestamp =
session.CreateQuery("SELECT CURRENT_TIMESTAMP()")
.UniqueResult<DateTime>();
This doesn't work. Specifically, it throws a System.NullReferenceException deep inside of the NHibernate HqlParser.cs file. I played around with this a bit, and got the following to work instead:
var currentTimestamp =
session.CreateQuery("SELECT CURRENT_TIMESTAMP() FROM Contact")
.SetMaxResults(1)
.UniqueResult<DateTime>();
Now I have the data I want, but not the HQL query I want. I want the query to represent the question I'm asking -- like my original format.
An obvious question here is "Why are you using HQL?" I know I can easily do this with session.CreateSQLQuery(...), hitting our MySQL 5.1 database directly. This is simply an example of my core problem, with the root being that I'm using custom parameter-less HQL functions throughout my code base, and I want to have integration tests that run these HQL parameter-less functions in as much isolation as possible.
My hack also has some serious assumptions baked in. It will not return a result, for example, if there are no records in the Contact table, or if the Contact table ceases to exist.
The method to retrieve CURRENT_TIMESTAMP() (or any other database function) outside of the context of a database table varies from database to database - some allow it, some do not. Oracle, for example, does not allow a select without a table, so they provide a system table with a single row which is called DUAL.
I suspect that NHibernate is implementing in HQL primarily features which are common across all database implementations, and thus does not implement a table-less select.
I can suggest three approaches:
Create a view that hides the table-less select such as 'create view dtm as select current_timestamp() as datetime'
Follow the Oracle approach and create a utility table with a single row in it that you can use as the table in a select
Create a simple stored procedure which only executes 'select current_timestamp()'

Using CONTENT keyword while creating a table with an XML column from XML Schema Collection

While creating a table that has an XML type column, I am referring to a complex XML Schema Collection. When I specify the XML Schema, I have the option of mentioning either CONTENT or DOCUMENT keyword. The latter will ensure that the XML data is stored as a document in a single column.
According to a video tutorial the CONTENT will store the XML data in fragments.
Besides the above statement, I can't find a reference anywhere else regarding the usage of the CONTENT keyword and its implications for schema and data.
I would like to know how the fragments are created and managed, and whether and how they can be queried individually. Further, how are the fragments correlated? Finally, what is the impact when I amend the XML Schema Collection?
Actually, I think SQL Server 2005 XML is quite well documented.
CONTENT is the default and allows any valid XML. DOCUMENT is more specific and means that the XML data you want to store is only allowed to have a single top-level node.
Create:
CREATE TABLE XmlCatalog (
ID INT PRIMARY KEY,
Document XML(CONTENT myCollection))
Insert:
INSERT INTO XmlCatalog VALUES (2,
'<doc id="123">
<sections>
<section num="1"><title>XML Schema</title></section>
<section num="3"><title>Benefits</title></section>
<section num="4"><title>Features</title></section>
</sections>
</doc>')
Select:
SELECT Document.query('/doc[@id = 123]//section')
FROM XmlCatalog
WHERE Document.exist('/doc[@id = 123]') = 1
...and so on. The query language is more or less a subset of XPath 1.0.
If you amend an XSD, it is checked on inserts and updates and stored within the XML of each element. As far as I understand the documentation, it is also possible to add multiple schemas for one column so that entries can reference different schemas.
EDIT:
OK, after reading the relevant parts of the documentation, I think I understand what your problem is. The reference isn't very clear on that point, but as far as I understand it, only entries with one top-level node can be bound to XSD schemas.
Because XSD schemas require a single top-level node that defines the XSD file used, it won't be possible to validate fragments containing more than one top-level element. I haven't tried it, but I think it can't be done.
However, it seems to be valid to define a CONTENT column, amend an XSD, and store both XML with one top-level node referencing the XSD and XML fragments, which will only be checked for well-formedness. The fragments can be accessed using the XPath query language shown in the SELECT statement above.
I can't tell you much about performance implications. The reference mentions that XSDs are stored inline, so this will need some extra space within the database. The XPath queries need to be executed too. Although XPath is usually quite fast, I guess it could decrease performance because it needs to be evaluated on each row to get the result. To be sure, I think you have to check the execution plan for your specific query, depending on the size and complexity of the stored XML as well as the XPath expression.
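To make the single-top-level-node distinction concrete, here is a small Python sketch that classifies a string as a "document" (exactly one top-level element and no top-level text) or as "content" (anything else that is still well-formed as a fragment). It only approximates the rule described above and does not call SQL Server; the classify_xml helper is hypothetical, and it assumes the input carries no XML declaration.

```python
import xml.etree.ElementTree as ET

def classify_xml(text):
    """Return 'document' for exactly one top-level element, else 'content'.

    Raises ET.ParseError if the text is not well-formed even as a fragment.
    Assumes the input has no <?xml ...?> declaration.
    """
    # Wrap the input so fragments with several top-level nodes still parse.
    wrapper = ET.fromstring(f"<wrapper>{text}</wrapper>")
    # Text directly under the wrapper means there is top-level text.
    has_top_level_text = bool((wrapper.text or "").strip()) or any(
        (child.tail or "").strip() for child in wrapper
    )
    if len(wrapper) == 1 and not has_top_level_text:
        return "document"
    return "content"

print(classify_xml('<doc id="123"><sections/></doc>'))
print(classify_xml('<section num="1"/><section num="3"/>'))
```

A check like this could be run in an application before inserting, to see which rows would be rejected by a DOCUMENT-constrained column.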
