Creating SQL Server JSON Parsing/Query UDF

Before I get to the question, I should say that I know this is a "bad" idea. For business reasons, though, I have to come up with a solution, and I'm hoping someone has ideas on how to go about it.
I have a SQL Server 2008 R2 table that has an "OtherProperties" column. This column contains various somewhat arbitrary additional pieces of information, stored as JSON, that relate to the records. There is a business need to create a UDF that we can use to query these values, for example:
SELECT *
FROM MyTable
WHERE dbo.MyUDFGetValue(myTable.OtherProperties, 'LinkedOrder[0]') IS NOT NULL
This would find records where there is an array of LinkedOrder entries that contains a value at index 0.
SELECT *
FROM MyTable
WHERE dbo.MyUDFGetValue(myTable.OtherProperties, 'SubOrder.OrderId') = 25
This would find the property "OrderId" inside "SubOrder" and use its value in a comparison.
Has anyone seen an implementation of this? I've seen implementations of functions, like this JSONParser, that shred the values into a table, which will not give us what we need query-wise. I would rather not write a full-fledged JSON parser, but I can if I need to.
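To pin down what the two example queries expect from the UDF, here is a minimal Python sketch of the path semantics: dotted property names plus [n] array indexes, with None standing in for SQL NULL. The function name get_value and the sample document are hypothetical, and this only illustrates the lookup logic; on SQL Server 2008 R2 the same logic would still have to be hosted in a CLR function or hand-written T-SQL.

```python
import json
import re

# One path step: a property name optionally followed by [index] parts,
# e.g. "SubOrder", "LinkedOrder[0]", "Items[2][0]".
_STEP = re.compile(r"^([^\[\]]*)((?:\[\d+\])*)$")


def get_value(json_text, path):
    """Return the value at `path` inside `json_text`, or None if absent."""
    try:
        node = json.loads(json_text)
    except (TypeError, ValueError):
        return None  # missing or invalid JSON behaves like SQL NULL

    for step in path.split("."):
        match = _STEP.match(step)
        if match is None:
            return None
        name, indexes = match.groups()
        if name:
            if not isinstance(node, dict) or name not in node:
                return None
            node = node[name]
        for index in re.findall(r"\[(\d+)\]", indexes):
            if not isinstance(node, list) or int(index) >= len(node):
                return None
            node = node[int(index)]
    return node


doc = '{"SubOrder": {"OrderId": 25}, "LinkedOrder": [{"OrderId": 7}]}'
print(get_value(doc, "SubOrder.OrderId"))   # 25
print(get_value(doc, "LinkedOrder[0]"))     # {'OrderId': 7}
print(get_value(doc, "LinkedOrder[3]"))     # None
```

Property names that themselves contain dots or brackets are not handled by this sketch.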

Not sure if this will suit your needs, but I read about a CLR JSON serializer/deserializer. You can find it here: http://www.sqlservercentral.com/articles/CLR/74160/

It's been a long time since you asked your question, but there is now a solution you can use: JSON Select, which provides various functions for different datatypes, for example the JsonInt() function. Using one of your examples (assuming OrderId is an int; if not, you could use a different function):
SELECT *
FROM MyTable
WHERE dbo.JsonInt(myTable.OtherProperties, 'SubOrder.OrderId') = 25
DISCLOSURE:
I am the author of JSON Select, and as such have an interest in you using it :)

If you cannot use SQL Server 2016 with built-in JSON support, you would need to use CLR e.g. JSONselect, json4sql, or custom code such as http://www.codeproject.com/Articles/1000953/JSON-for-SQL-Server-Part, etc.

Related

User Defined Table Function with Procedural Logic

Our company is setting up a new Snowflake instance, and we are attempting to migrate some processing currently being done in MS SQL Server. I need to migrate a table-valued SQL function into Snowflake. The source function has procedural logic in it, which to my knowledge is not allowed in Snowflake SQL UDTFs. I have been searching for a workaround with no success.
To be as specific as I can, I need a function that will take a string for input, decode that string, and return a table with the keys and their corresponding values. I cannot condense all of the logic to split the string and decode the keys into one SQL statement, so Snowflake SQL UDTFs will not work.
I looked into whether a UDTF can call a procedure and somehow I could simply return a result, but it does not look like that will work. Please let me know if there is any way to work around this.
I think a JavaScript UDTF is what you're looking for in Snowflake:
https://docs.snowflake.com/en/sql-reference/udf-js-table-functions.html
Funny, I just stumbled onto this as I'm running into the same thing myself. I found there is a SPLIT_TO_TABLE function that may be able to accomplish this. As Greg suggested, nesting this in a CTE combined with a JOIN may allow you to accomplish what you need to do.

How to avoid SQL Server error on ORDER BY with duplicate columns

Although this question references PHP, it is not actually PHP-specific, so I have not flagged it as such.
We have a PHP framework which supports multiple DB back-ends.
There is a generic function in our data object class which allows you to get records from the underlying table, with specified criteria and sort order.
It looks something like this:
function GetAll($Criteria, $OrderBy = "") {
    ...
    // Add primary key (column 1) to end of order by list,
    // so that returned order is predictable.
    if ($OrderBy != "") {
        $OrderBy .= ", ";
    }
    $OrderBy .= "1";
    ...
    // Build and run query, returning the result as an array.
}
If you specify an $OrderBy argument of StaffID on a Staff object, the resulting SQL looks something like the following:
SELECT * FROM adminStaff ORDER BY StaffID, 1;
This works fine on a MySQL back-end, and from my searching of the web it should also be fine on most other DB back-ends. However, when using SQL Server, we get the following error message:
A column has been specified more than once in the order by list.
Columns in the order by list must be unique.
This arises because SQL Server disallows the same column appearing multiple times in the ORDER BY clause. In this case StaffID is column 1 and therefore we have multiple instances of the same column.
Is there a way to disable this check in SQL Server? MySQL provides a lot of options to enable/disable strictness checks and incompatible features - does SQL Server provide anything of that nature that would allow the above query to run without errors?
If not, do you have any suggestions for how we could resolve this in our data-object layer? Bear in mind we need to maintain compatibility with existing projects which expect this behaviour, so it is not sufficient to only include the first column when $OrderBy is blank.
The situation is also slightly complicated in the fact that the field list is customisable elsewhere in the data object configuration, so we can't rely on * being used as the field list - it could contain pretty much anything that is valid in a normal SQL field list. However, if that is asking too much, a solution to the simpler case (as outlined above) would be a good start!
In SQL Server you can sort either by column name or by the ordinal position of the column in the SELECT list.
In your case the column StaffID is at ordinal position 1, so the ORDER BY list names the same column twice, which SQL Server does not allow.
If you remove the 1 from your query, the problem will be solved.
Avoid using the ordinal position of the column for sorting.
The basic question - is it possible to suppress this SQL Server restriction on ORDER BY column duplication - was answered by Venu: No, it is not.
There are various suggestions (mostly from me) about how you could possibly code around this limitation in a generic manner. For any future readers, those answers are probably the most helpful if you are adapting an existing system. (If you are starting from scratch, just try and avoid this situation altogether.)
However, the actual solution that I came to was to add versioning to our internal API for our DBAL. The API version is now 2 but you can call setApiVersion(1) to instruct the back-end to use the old version of the API instead.
v2 is identical to v1* except it no longer automatically adds column 1 to the ORDER BY unless it is completely blank. Therefore, the SQL Server issue is resolved for new (v2) projects, whilst existing projects can be set to use the v1 API and therefore continue to work correctly (but without SQL Server compatibility).
(* Actually, I've taken this opportunity to make some other breaking changes in v2, but that is not relevant to this answer.)
I've come up with a couple of potential solutions at the framework level. All of them have performance implications which would need to be profiled, and in practice that may rule some or all of them out. However, in theory at least, these are ways that a generic solution could be implemented.
Omit the ORDER BY altogether, and do the sorting in code. Would involve parsing the provided ORDER BY string. Would be problematic if ORDER BY contained expressions, but I can't remember ever seeing that in our projects, so can probably be ignored. Probably the slowest solution.
Perform the query without the ORDER BY, limiting the results set to a single row. Use resulting column list to work out whether column 1 is already in the ORDER BY clause, and therefore whether to add it. Then run the full query. Would require parsing the provided ORDER BY string. Query caching may mean this won't add as much overhead as it appears.
Parse the field list to get the first column name and see if this appears in the ORDER BY clause. If field list contains * or table.* would require a schema lookup. May be too difficult if we need to deal with table aliases and wildcards in combination.
Parse ORDER BY string and see if it contains any primary key. If so it is already uniquely ordered and doesn't require the addition of an extra field. Would require a schema look-up.
Use a sub-select to give us a new instance of the column that we can sort on instead. Not sure whether SQL Server would still complain that this is the 'same' column, though.
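The idea of checking whether the first column already appears in the ORDER BY list before appending the ordinal could look roughly like this. It is a Python illustration rather than the framework's PHP, the function name add_tiebreaker is hypothetical, and it assumes the ORDER BY string is a plain comma-separated list of column names (optionally qualified, quoted, or followed by ASC/DESC) and that the name of the first column is already known.

```python
def add_tiebreaker(order_by, first_column):
    """Append ordinal 1 to an ORDER BY list unless column 1 is already there.

    Assumes `order_by` is a simple comma-separated list of column names,
    each optionally qualified (table.column), quoted, or followed by
    ASC/DESC. Expressions are not handled.
    """
    terms = [term.strip() for term in order_by.split(",") if term.strip()]

    def column_of(term):
        # Drop a trailing ASC/DESC, any table qualifier, and quoting.
        name = term.split()[0].split(".")[-1]
        return name.strip('[]`"').lower()

    already_sorted = any(
        column_of(term) in (first_column.lower(), "1") for term in terms
    )
    if not already_sorted:
        terms.append("1")
    return ", ".join(terms)


print(add_tiebreaker("StaffID", "StaffID"))        # StaffID
print(add_tiebreaker("Surname DESC", "StaffID"))   # Surname DESC, 1
print(add_tiebreaker("", "StaffID"))               # 1
```

When the field list is * or table.*, finding the first column's name still needs the schema lookup mentioned above.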
Could you just append '--' to your OrderBy parameter when working with SQL Server and just explicitly define the Order By fields where necessary?

Define a String constant in SQL Server?

Is it possible in SQL Server to define a String constant? I am rewriting some queries to use stored procedures and each has the same long string as part of an IN statement [a], [b], [c] etc.
It isn't expected to change, but could at some point in future. It is also a very long string (a few hundred characters) so if there is a way to define a global constant for this that would be much easier to work with.
If this is possible I would also be interested to know if it works in this scenario. I had tried to pass this String as a parameter, so I could control it from a single point within my application but the Stored Procedure didn't like it.
You can create a table with a single column and row and disallow writes on it.
Use that as your global string constant (or additional constants, if you wish).
You are asking for one thing (a string constant in MS SQL), but appear to maybe need something else. The reason I say this is because you have given a few hints at your ultimate objective, which appears to be using the same IN clause in multiple stored procedures.
The biggest clue is in the last sentence:
"I had tried to pass this String as a parameter, so I could control it from a single point within my application but the Stored Procedure didn't like it."
Without details of your SQL scripts, I am going to attempt to use some psychic debugging techniques to see if I can get you to what I believe is your actual goal, and not necessarily your stated goal.
Given your Stored Procedure "didn't like that" when you tried to pass in a string as a parameter, I am guessing the composition of the string was simply a delimited list of values, something like "10293, 105968, 501940" or "Juice, Milk, Donuts" (pay no attention to the actual list values - the important part is the delimited list itself). And your SQL may have looked something like this (again, ignore the specific names and focus on the general concept):
SELECT Column1, Column2, Column3
FROM UnknownTable
WHERE Column1 IN (@parameterString);
If this approximately describes the path you tried to take, then you will need to reconsider your approach. Using a regular T-SQL statement, you will not be able to pass a string of parameter values to an IN clause - it just doesn't know what to do with them.
There are alternatives, however:
Dynamic SQL - you can build up the whole SQL statement, parameters and all, then execute that in the SQL database. This probably is not what you are trying to achieve, since you are moving script to stored procedures. But it is listed here for completeness.
Table of values - you can create a single-column table that holds the specific values you are interested in. Then your Stored Procedure can simply use the column from this table for the IN clause. This way, there is no Dynamic SQL required. Since you indicate that the values are not likely to change, you may just need to populate the table once, and use it wherever appropriate.
String Parsing to derive the list of values - you can pass the list of values as a string, then implement code to parse the list into a table structure on the fly. An alternative form of this technique is to pass an XML structure containing the values, and use MS SQL Server's XML functionality to derive the table.
Define a table-valued function that returns the values to use - I have not tried this one, so I may be missing something, but you should be able to define the values in a table-valued function (possibly using a bunch of UNION statements or something), and call that function in the IN clause. Again - this is an untested suggestion and would need to be worked through to determine its feasibility.
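To make the difference concrete, here is a small runnable sketch in Python, with sqlite3 standing in for SQL Server. The table and column names (products, allowed_names) are made up for the illustration. It shows a delimited string bound to a single parameter matching nothing, and the "table of values" alternative returning the expected rows.

```python
import sqlite3

# sqlite3 stands in for SQL Server here; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (name TEXT);
    INSERT INTO products VALUES ('Juice'), ('Milk'), ('Bread'), ('Donuts');

    -- The "constant": one row per allowed value, maintained in one place.
    CREATE TABLE allowed_names (name TEXT PRIMARY KEY);
    INSERT INTO allowed_names VALUES ('Juice'), ('Milk'), ('Donuts');
""")

# A delimited string bound to a single parameter is compared as ONE value,
# so nothing matches.
as_string = conn.execute(
    "SELECT name FROM products WHERE name IN (?)", ("Juice, Milk, Donuts",)
).fetchall()

# Reading the values from the lookup table needs no dynamic SQL.
from_table = conn.execute(
    "SELECT name FROM products "
    "WHERE name IN (SELECT name FROM allowed_names) ORDER BY name"
).fetchall()

print(as_string)    # []
print(from_table)   # [('Donuts',), ('Juice',), ('Milk',)]
```

If the list ever changes, only the rows in the lookup table need updating; the procedures that read it stay as they are.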
I hope that helps (assuming I have guessed your underlying quandary).
For future reference, it would be extremely helpful if you could include SQL script showing your table structure and stored procedure logic so we can see what you have actually attempted. This will considerably improve the effectiveness of the answers you receive. Thanks.
P.S. The link for String Parsing actually includes a large variety of techniques for passing arrays (i.e. lists) of information to Stored Procedures - it is a very good resource for this kind of thing.
In addition to string-constants tables as Oded suggests, I have used scalar functions to encapsulate some constants. That would be better for fewer constants, of course, but their use is simple.
Perhaps a combination - string constants table with a function that takes a key and returns the string. You could even use that for localization by having the function take a 'region' and combine that with a key to return a different string!

Is it possible to write a database view that encompasses one-to-many relationships?

So I'm not necessarily saying this is even a good idea if it were possible, since the schema of the view would be extremely volatile, but is there any way to represent a has-many relationship in a single view?
For example, let's say I have a customer that can have any number of addresses in the database. Is there any way to list out each column of each address with perhaps a number as a part of the alias (e.g., columns like Customer Id, Name, Address_Street_1, Address_Street_2, etc)?
Thanks!
Not really - you really are doing a dynamic pivot. It's possible to use OPENROWSET to get to a dynamically generated query, but whether that's advisable, it's hard to say without seeing more about the business case.
First make a stored proc which does the dynamic pivot like I did on the StackExchange Data Explorer.
Basically, you generate dynamic SQL which builds the column list. This can only really be done in a stored proc, which is fine for application calls.
But what about if you want to re-use that in a lot of different joins or ad hoc queries?
Then, have a look at this article: "Using SQL Servers OPENROWSET to break the rules"
You can now call your stored proc by looping back into the server and then getting the results into a rowset - this can be in a view!
The late Ken Henderson has some good examples of this in his excellent book: "The Guru's Guide to SQL Server Stored Procedures, XML, and HTML" (you got to love the little "Covers .NET!" on the cover which captures well the zeitgeist for 2002!).
He only covers the loopback part (with views and user-defined functions); the less verbose PIVOT syntax was not available until SQL Server 2005, but pivots can also be generated using a CASE expression as a characteristic function.
Obviously, this technique has caveats (I can't even do this on our production server).
Yes - use:
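CREATE VIEW customer_addresses AS
-- Number each customer's addresses so each join picks a different one.
-- Assumes ADDRESS has an address_id column to order by.
WITH numbered AS (
    SELECT a.customer_id,
           a.street,
           ROW_NUMBER() OVER (PARTITION BY a.customer_id ORDER BY a.address_id) AS rn
    FROM ADDRESS a
)
SELECT t.customer_id,
       t.customer_name,
       a1.street AS address_street_1,
       a2.street AS address_street_2
FROM CUSTOMER t
LEFT JOIN numbered a1 ON a1.customer_id = t.customer_id AND a1.rn = 1
LEFT JOIN numbered a2 ON a2.customer_id = t.customer_id AND a2.rn = 2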
If you provided more info, it'd be easier to give you a better answer. It's possible you're looking to pivot data (turn rows into columns).
Simply put, no. Not without dynamically recreating the view every time you want to use it at least, that is.
But, what you can do is predefine, say, 4 address columns in your view, then populate the first four results of your one-to-many relation into those columns. It's not quite the dynamic view you want, but it's also much more stable/usable in my opinion.
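As a rough illustration of the fixed-column approach, the sketch below numbers each customer's addresses and spreads the first N across columns with CASE expressions. It is written in Python with sqlite3 standing in for SQL Server; the table layout, the address_id column and the helper name build_view_sql are assumptions made for the example, not part of the question.

```python
import sqlite3


def build_view_sql(max_addresses):
    """Build CREATE VIEW text with one street column per address slot."""
    columns = ",\n       ".join(
        f"MAX(CASE WHEN n.rn = {i} THEN n.street END) AS address_street_{i}"
        for i in range(1, max_addresses + 1)
    )
    return f"""
CREATE VIEW customer_addresses AS
SELECT c.customer_id, c.customer_name,
       {columns}
FROM customer c
LEFT JOIN (
    -- Number each customer's addresses 1, 2, 3, ... by address_id.
    SELECT a.customer_id, a.street,
           (SELECT COUNT(*) FROM address b
            WHERE b.customer_id = a.customer_id
              AND b.address_id <= a.address_id) AS rn
    FROM address a
) n ON n.customer_id = c.customer_id
GROUP BY c.customer_id, c.customer_name
"""


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, customer_id INTEGER, street TEXT);
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO address VALUES (10, 1, '1 Main St'), (11, 1, '2 Oak Ave'), (12, 2, '3 Elm Rd');
""")
conn.execute(build_view_sql(2))
rows = conn.execute("SELECT * FROM customer_addresses ORDER BY customer_id").fetchall()
print(rows)
# [(1, 'Ann', '1 Main St', '2 Oak Ave'), (2, 'Bob', '3 Elm Rd', None)]
```

Changing the number of address columns means dropping and recreating the view, which is the volatility the question worries about.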

SSRS Multi value parameters - appropriate layer for implementation of the filter

When using multi-value parameters in SQL Server Reporting Services, is it more appropriate to implement the list filter using a filter on the dataset itself, on the data region control, or by changing the actual query that drives the dataset?
SSRS will support any scenario, so then I ask, is there a reason beyond the obvious of why this should be done at one level over another?
It makes sense to me that modifying the query itself and asking the RDBMS to handle the filtering would be most efficient but maybe I am missing something with respect to how the SSRS Data Processing Extension may handle this scenario?
You are correct. The way to go is to pass the parameters through to the database engine.
Ideally, Reporting Services should only be used to render content. The less data that you need to pass back to the client web browser, the faster the report will render.
You may find my answer to a similar post regarding using multi-value parameters to be of use.
Passing multiple values for a single parameter in Reporting Services
Hope this helps but please feel free to pose any further questions you may have.
Cheers,
John
Using a table-valued UDF is a good approach, but there is still one issue: if the function is called in many places in the query, or even inside an inner select, there can be a performance problem. You can resolve this by using a table variable (or a temp table):
DECLARE @Param TABLE (Value INT)
INSERT INTO @Param (Value)
SELECT Param FROM dbo.fn_MVParam(@sParameterString, ',')
...
WHERE someColumn IN (SELECT Value FROM @Param)
so the function is called only once.
Another thing: if you don't use a stored procedure but an embedded SQL query instead, you can just put the multi-value parameter (MVP) into the query:
...
WHERE someColumn IN (@Param)
...
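As far as I understand it, SSRS expands a multi-value parameter used this way into a list of values before the query is sent. Client code outside SSRS has to do that expansion itself; a minimal Python sketch, with sqlite3 standing in for the database and a hypothetical orders table, that builds one placeholder per selected value so the values are still passed as parameters rather than concatenated into the SQL text:

```python
import sqlite3

# sqlite3 and the orders table are stand-ins for the real report data source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 'North'), (2, 'South'), (3, 'East'), (4, 'North');
""")


def orders_in_regions(connection, regions):
    """Return order ids whose region is one of `regions`."""
    if not regions:
        return []  # an empty IN list is rejected by most databases
    placeholders = ", ".join("?" for _ in regions)  # e.g. "?, ?"
    sql = (
        "SELECT order_id FROM orders "
        f"WHERE region IN ({placeholders}) ORDER BY order_id"
    )
    return [row[0] for row in connection.execute(sql, list(regions))]


print(orders_in_regions(conn, ["North", "East"]))   # [1, 3, 4]
```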
Use the RDBMS to do the main filtering.
SSRS provides filtering for the purposes of data-driven display and/or dynamic display. This is especially useful for subreports etc.
