Issue where the FOR JSON AUTO clause generates an incomplete result [duplicate] - sql-server

This question already has answers here:
FOR JSON PATH results in SSMS truncated to 2033 characters
(10 answers)
SQL Server json truncated (even when using NVARCHAR(max) )
(10 answers)
Closed 9 months ago.
Getting JSON from SQL Server is great, but I ran into a problem.
Example. I have a LithologySamples table with a very basic structure:
[Id] [uniqueidentifier],
[Depth1] [real],
[Depth2] [real],
[RockId] [nvarchar](8),
The table has roughly 600 records. I want to generate JSON to transport the data to another database, so I use FOR JSON AUTO, which has worked perfectly with other tables that have fewer records. But in this case the response is generated incomplete, which has me baffled. I noticed it when examining the output:
[{
"Id": "77769039-B2B7-E511-8279-DC85DEFBF2B6",
"Depth1": 4.2000000e+001,
"Depth2": 5.8000000e+001,
"RockId": "MIC SST"
}, {
"Id": "78769039-B2B7-E511-8279-DC85DEFBF2B6",
"Depth1": 5.8000000e+001,
"Depth2": 6.3000000e+001,
"RockId": "CGL"
}, {
"Id": "79769039-B2B7-E511-8279-DC85DEFBF2B6",
"Depth1": 6.3000000e+001,
"Depth2": 8.3000000e+001,
"RockId": "MIC SST"
}, {
// ... OK, it continues fine, but it breaks off towards the end:
}, {
"Id": "85769039-B2B7-E511-8279-DC85DEFBF2B6",
"Depth1": 2.0500000e+002,
"Depth2": 2.1500000e+002,
"RockId": "MIC SST"
}, {
"Id": "86769039-
// inexplicably it cuts here !?
I've searched and I can't find any option to make the response come out complete.
The SQL query is as follows:
SELECT * FROM LithologySamples FOR JSON AUTO;
FOR JSON AUTO and FOR JSON PATH give the same result.
Does anyone know what I should do so that the statement generates the JSON of the entire table?

But in this case I see that the response is generated incomplete.
If you are checking this in SSMS, note that it truncates text in various ways depending on the output method you're using (PRINT, SELECT, results to text/grid). The string is complete; it's just the output that has been mangled.
One way to validate that the string is in fact complete is to:
SELECT * INTO #foo FROM
(SELECT * FROM LithologySamples FOR JSON AUTO) x(y);
Then check LEN(y), DATALENGTH(y), RIGHT(y, 50) (see example db<>fiddle), or select from that table using CONVERT(xml, ...) (see this article for more info).
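For example, a minimal sketch of those checks (assuming the #foo table created above):
SELECT LEN(y) AS characters,
       DATALENGTH(y) AS bytes,
       RIGHT(y, 50) AS tail -- a complete JSON array should end with }]
FROM #foo;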
In your case it seems the problem is coming from how C# is consuming the output. If the consumer is treating the JSON as multiple rows, then assigning a variable there will ultimately assign one arbitrary row of <= 2033 characters, not the whole value. I talked about this briefly back in 2015. Let's say you are using reader[0] or similar to test:
CREATE TABLE dbo.Samples
(
[Id] [uniqueidentifier] NOT NULL DEFAULT NEWID(),
[Depth1] [real] NOT NULL DEFAULT 5,
[Depth2] [real] NOT NULL DEFAULT 5,
[RockId] [nvarchar](8)
);
INSERT dbo.Samples(RockId) SELECT TOP (100) LEFT(name, 8) FROM sys.all_columns;
-- pretend this is your C# reader:
SELECT * FROM dbo.Samples FOR JSON AUTO;
-- reader[0] here would be something like this:
-- [{"Id":"054EC9A2-760B-4EBA-BF06-...,"RockId":"ser
-- which is the first 2,033 characters
SELECT LEN('[{"Id":"054EC9A2-760B-4EBA-BF06-..."RockId":"ser')
-- instead, since you want C# to assign a scalar,
-- assign output to a scalar first:
DECLARE @json nvarchar(max) = (SELECT * FROM dbo.Samples FOR JSON AUTO);
SELECT json = @json;
-- now reader[0] will be the whole thing
Example db<>fiddle
The 2033 comes from the same place it comes from for XML (since SQL Server's JSON implementation is just a pretty wrapper over existing XML functionality); as Charlie points out, Martin explained it here:
SELECT FOR XML AUTO and return datatypes

Related

Delete an object from nested array in openjson SQL Server 2016

I want to delete the "AttributeName" : "Manufacturer" from the below json in SQL Server 2016:
declare @json nvarchar(max) = '[{"Type":"G","GroupBy":[],
"Attributes":[{"AttributeName":"Class Designation / Compressive Strength"},{"AttributeName":"Size"},{"AttributeName":"Manufacturer"}]}]'
This is the query I tried, which is not working:
select JSON_MODIFY((
select JSON_Query(@json, '$[0].Attributes') as res),'$.AttributeName.Manufacturer', null)
Here is a working solution using FOR JSON and OPENJSON. The point is to:
1. Identify the item you wish to delete and replace it with NULL. This is done by JSON_MODIFY(@json,'$[0].Attributes[2]', null). We're simply saying: take the element at index 2 of Attributes (the third one, "Manufacturer") and replace it with null.
2. Convert this array to a row set. We need to somehow get rid of the null element, and that's something we can filter easily in SQL with where [value] is not null.
3. Assemble it all back into the original JSON. That's done by FOR JSON AUTO.
Please bear in mind one important aspect of such JSON data transformations:
JSON is designed for information exchange (or, at a stretch, for storing information), but you should avoid more complicated data manipulation at the SQL level.
Anyway, solution here:
declare @json nvarchar(max) = '[{"Type": "G","GroupBy": [],"Attributes": [{"AttributeName": "Class Designation / Compressive Strength"}, {"AttributeName": "Size"}, {"AttributeName": "Manufacturer"}]}]';
with src as
(
SELECT * FROM OPENJSON(
JSON_Query(
JSON_MODIFY(@json,'$[0].Attributes[2]', null) , '$[0].Attributes'))
)
select JSON_MODIFY(@json,'$[0].Attributes', (
select JSON_VALUE([value], '$.AttributeName') as [AttributeName] from src
where [value] is not null
FOR JSON AUTO
))
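For reference, with the sample @json above, the result should be the original document minus the Manufacturer entry, along the lines of:
[{"Type": "G","GroupBy": [],"Attributes": [{"AttributeName":"Class Designation \/ Compressive Strength"},{"AttributeName":"Size"}]}]
(FOR JSON escapes forward slashes, hence the \/.)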

MSSQL Data type conversion

I have a pair of databases (one MSSQL and one Oracle), run by different teams. Some data are now being synchronized regularly by a stored procedure in the MSSQL database. This stored procedure calls a very large MERGE statement:
MERGE [mssqltable].[Mytable] as s
USING THEORACLETABLE.BLA as t
ON t.[R_ID] = s.[R_ID]
WHEN MATCHED THEN UPDATE SET [Field1] = s.[Field1], ..., [Brokenfield] = s.[BrokenField]
WHEN NOT MATCHED BY TARGET THEN
... another big statement
The field Brokenfield was numeric until today and could take the values NULL, 0, 1, ..., 24.
Now the Oracle team introduced a breaking change today for some reason and changed the type of the column to string, so it now has the values NULL, "", "ALFA", "BRAVO", ... in the column. Of course, the sync broke.
What is the easiest way to fix the sync here? I (MSSQL team lead, frontend expert but not so much in databases) would usually bring in one of our database experts, but all of them are ill right now, and the fix must go online today...
I thought of a stored procedure like CONVERT_BROKENFIELD_INT_TO_STRING or so, based on some switch-case, which could be called in that merge statement, but not sure how to do that.
Edit/Clarification:
What I need is a chunk of SQL code (a stored procedure) that takes an input of "ALFA" and returns 1, "BRAVO" -> 2, etc., and that can be reused, to avoid writing huge IFs in more than one place.
If you can not simplify the logic for correct values the way @RichardHansell described, you can create a crosswalk table mapping BrokenField to the correct values. Then you can use a common table expression or subquery with a left join to that crosswalk in the merge.
create table dbo.BrokenField_Crosswalk (
BrokenField varchar(32) not null primary key
, CorrectedValue int
);
insert into dbo.BrokenField_Crosswalk (BrokenField,CorrectedValue) values
('ALFA', 1)
, ('ALPHA', 1)
, ('BRAVO', 2)
...
go
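If you also want the reusable "ALFA" -> 1 helper from the question, the crosswalk lookup can be wrapped in a scalar function; a minimal sketch (the function name is illustrative):
create function dbo.ConvertBrokenField (@BrokenField varchar(32))
returns int
as
begin
    -- look the string up in the crosswalk; returns NULL when there is no match
    return (select CorrectedValue
            from dbo.BrokenField_Crosswalk
            where BrokenField = @BrokenField);
end;
go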
And your code for the merge would look something like this:
;with cte as (
select o.R_ID
, o.Field1
, BrokenField = cast(isnull(c.CorrectedValue,o.BrokenField) as int)
....
from oracle_table.bla as o
left join dbo.BrokenField_Crosswalk as c
on c.BrokenField = o.BrokenField
)
merge into [mssqltable].[Mytable] t
using cte as s
on t.[R_ID] = s.[R_ID]
when matched
then update set
[Field1] = s.[Field1]
, ...
, [Brokenfield] = s.[BrokenField]
when not matched by target
then
If they are using names with a letter at the start that goes in a sequence:
A = 1
B = 2
C = 3
etc.
Then you could do something like this:
MERGE [mssqltable].[Mytable] as s
USING THEORACLETABLE.BLA as t
ON ASCII(LEFT(t.[R_ID], 1)) - ASCII('A') + 1 = s.[R_ID]
WHEN MATCHED THEN UPDATE SET [Field1] = s.[Field1], ..., [Brokenfield] = s.[BrokenField]
WHEN NOT MATCHED BY TARGET THEN
... another big statement
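For what it's worth, the letter-to-number arithmetic in that ON clause can be sanity-checked on its own:
SELECT ASCII(LEFT('ALFA', 1)) - ASCII('A') + 1;  -- 1
SELECT ASCII(LEFT('BRAVO', 1)) - ASCII('A') + 1; -- 2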
Edit: but actually I re-read your question and you are talking about [Brokenfield] being the problem column, so my solution wouldn't work.
I don't really understand now, as it seems as though the MERGE statement is updating the oracle table with numbers, so surely you need the mapping to work the other way, i.e. 1 -> ALFA, 2 -> BETA, etc.?

SQL Server 2014 - XQuery - get comma-separated List

I have a database table in SQL Server 2014 with only an ID column (int) and a column xmldata of type XML.
This xmldata column contains for example:
<book>
<title>a nice Novel</title>
<author>Maria</author>
<author>Peter</author>
</book>
As expected, I have multiple books, therefore multiple rows with xmldata.
I now want to execute a query for all books where Peter is an author. I tried this in some XPath 2.0 testers and came to the conclusion that:
/book/author/concat(text(), if(position() != last())then ',' else '')
works.
If you try to port this success into SQL Server 2014 Express it looks like this, which is correctly escaped syntax etc.:
SELECT id
FROM books
WHERE 'Peter' IN (xmldata.query('/book/author/concat(text(), if(position() != last())then '','' else '''')'))
SQL Server however does not seem to support a construction like /concat(...) because of:
The XQuery syntax '/function()' is not supported.
I am at a loss, however, as to why /text() does work in:
SELECT id, xmldata.query('/book/author/text()')
FROM books
which it does.
My constraints:
I am bound to use SQL Server
I am bound to XPath or something else that can be "injected" like the statement above (if the structure of the XML or the database changes, the XPath above can be changed in isolation and the application logic that constructs the WHERE clause will not be touched). SEE EDIT
Is there a way to make this work?
regards,
BillDoor
EDIT:
My second constraint boils down to this:
An application constructs the WHERE clause as
expression <operator> value(s)
The expression is stored in a database and is mapped by the XML tag, e.g.:
| tokenname | querystring
| "author"  | "xmldata.query(/book/author/text())"
The values are supplied by the requesting user, so if the user asks for the author "Peter" with the operator "EQUALS", the application constructs:
xmldata.query(/book/author/text()) = "Peter"
as the WHERE clause.
If the customer now decides that author needs to be nested in an <authors> element, I can simply change the expression in the construction database and the whole machine keeps running without any changes to the code; easily manageable.
So I need a way such that
<xPath> <operator> "Peter"
or any other combination of these three isolated components (see above: "Peter" IN <xPath>...) gets me all of Peter's books, even if there are multiple unsorted authors.
This would not suffice either (it's not SQL Server syntax, but you get the idea):
WHERE xmldata.exist('/dossier/client[text() = "$1"]', "Peter") = 1;
because the operator is still nested inside the expression; I could not request <> "Peter".
I know this is strange, please don't question the concept as a whole - it has a history :/
EDIT: further clarification:
The filter-rules come into the app in an XML structure basically:
Operator: "EQ"
field: "name"
value "Peter"
evaluates to:
expression = lookupExpressionForField("name") --> "table2.xmldata.value('book/author/name[1]', 'varchar')"
operator = lookUpOperatorMapping("EQ") --> "="
value = FormatValues("Peter") --> "Peter" (if multiple values are passed, FormatValues constructs a comma-separated list)
the application then builds:
- constructClause(String expression,String operator,String value)
"table2.xmldata.value('book/author/name[1]', 'varchar')" + "=" + "Peter"
then constructs a Select statement with the result as WHERE clause.
It does not build it exactly like this, unescaped and unfiltered for injection etc., but this is the basic idea.
I can influence how the input is translated, meaning I can implement the methods:
lookupExpressionForField(String field)
lookUpOperatorMapping(String operator)
Formatvalues(List<String> values) | Formatvalues(String value)
constructClause(String expression,String operator,String value)
however I choose; I can change the parameter types and implement them freely. The fewer changes, the better, of course. So simply constructing a comma-separated list with XPath would be optimal (as if I could just tick "enable /function() syntax in XPath" somewhere in SQL Server and the /concat(if...) would work).
How about something like this:
SET NOCOUNT ON;
DECLARE @Books TABLE (ID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY, BookInfo XML);
INSERT INTO @Books (BookInfo)
VALUES (N'<book>
<title>a nice Novel</title>
<author>Maria</author>
<author>Peter</author>
</book>');
INSERT INTO @Books (BookInfo)
VALUES (N'<book>
<title>another one</title>
<author>Bob</author>
</book>');
SELECT *
FROM @Books bk
WHERE bk.BookInfo.exist('/book/author[text() = "Peter"]') = 1;
This returns only the first "book" entry. From there you can extract any portion of the XML field using the "value" function.
The "exist" function returns a boolean / BIT. This will scan through all "author" nodes within "book", so there is no need to concat into a comma-separated list only for use in an IN list, which wouldn't work anyway ;-).
For more info on the "value" and "exist" functions, as well as the other functions for use with XML data, please see:
xml Data Type Methods
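And if you do still want the comma-separated author list from the question's title: SQL Server 2014 predates STRING_AGG, so the usual workaround is nodes() plus FOR XML PATH. A minimal sketch against the books table from the question:
SELECT b.id,
       STUFF((SELECT ',' + a.author.value('.', 'nvarchar(100)')
              FROM b.xmldata.nodes('/book/author') AS a(author)
              FOR XML PATH('')), 1, 1, '') AS authors
FROM books b;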

Parse json arrays using HIVE

I have many json arrays stored in a table (jt) that looks like this:
[{"ts":1403781896,"id":14,"log":"show"},{"ts":1403781896,"id":14,"log":"start"}]
[{"ts":1403781911,"id":14,"log":"press"},{"ts":1403781911,"id":14,"log":"press"}]
Each array is a record.
I would like to parse this table in order to get a new table (logs) with 3 fields: ts, id, log.
I tried to use the get_json_object method, but it seems that method is not compatible with json arrays because I only get null values.
This is the code I have tested:
CREATE TABLE logs AS
SELECT get_json_object(jt.value, '$.ts') AS ts,
get_json_object(jt.value, '$.id') AS id,
get_json_object(jt.value, '$.log') AS log
FROM jt;
I tried to use other functions but they seem really complicated.
Thank you! :)
Update!
I solved my issue by performing a regexp:
CREATE TABLE jt_reg AS
select regexp_replace(regexp_replace(value,'\\}\\,\\{','\\}\\\n\\{'),'\\[|\\]','') as valuereg from jt;
CREATE TABLE logs AS
SELECT get_json_object(jt_reg.valuereg, '$.ts') AS ts,
get_json_object(jt_reg.valuereg, '$.id') AS id,
get_json_object(jt_reg.valuereg, '$.log') AS log
FROM jt_reg;
I just ran into this problem, with the JSON array stored as a string in the hive table.
The solution is a bit hacky and ugly, but it works and doesn't require serdes or external UDFs.
SELECT
get_json_object(single_json_table.single_json, '$.ts') AS ts,
get_json_object(single_json_table.single_json, '$.id') AS id,
get_json_object(single_json_table.single_json, '$.log') AS log
FROM ( SELECT explode (
split(regexp_replace(substr(json_array_col, 2, length(json_array_col)-2),
'"}","', '"}",,,,"'), ',,,,')
) AS single_json FROM src_table) single_json_table;
I broke the lines up so that it would be a little easier to read.
I'm using substr() to strip the first and last characters, removing [ and ]. I'm then using regexp_replace to match the separator between records in the JSON array and change it to something unique that can easily be used with split() to turn the string into a Hive array of JSON objects, which can then be used with explode() as described in the previous solution.
Note: the separator regex used here ( "}"," ) wouldn't work with the original data set; there the regex would have to be ( "},\{" ) and the replacement would then need to be "},,,,{", e.g.:
split(regexp_replace(substr(json_array_col, 2, length(json_array_col)-2),
'"},\\{"', '"},,,,{"'), ',,,,')
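Putting that together for the original sample data, a hedged end-to-end sketch (table and column names taken from the question):
SELECT get_json_object(t.single_json, '$.ts')  AS ts,
       get_json_object(t.single_json, '$.id')  AS id,
       get_json_object(t.single_json, '$.log') AS log
FROM (
  SELECT explode(
           split(regexp_replace(substr(value, 2, length(value) - 2),
                                '"},\\{"', '"},,,,{"'),
                 ',,,,')
         ) AS single_json
  FROM jt
) t;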
Use explode() function
hive (default)> CREATE TABLE logs AS
> SELECT get_json_object(single_json_table.single_json, '$.ts') AS ts,
> get_json_object(single_json_table.single_json, '$.id') AS id,
> get_json_object(single_json_table.single_json, '$.log') AS log
> FROM
> (SELECT explode(json_array_col) as single_json FROM jt) single_json_table ;
Automatically selecting local only mode for query
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
hive (default)> select * from logs;
OK
ts id log
1403781896 14 show
1403781896 14 start
1403781911 14 press
1403781911 14 press
Time taken: 0.118 seconds, Fetched: 4 row(s)
hive (default)>
where json_array_col is the column in jt that holds your array of JSONs.
hive (default)> select json_array_col from jt;
json_array_col
["{"ts":1403781896,"id":14,"log":"show"}","{"ts":1403781896,"id":14,"log":"start"}"]
["{"ts":1403781911,"id":14,"log":"press"}","{"ts":1403781911,"id":14,"log":"press"}"]
Because get_json_object doesn't support a JSON array string, you can concat it into a JSON object, like this:
SELECT
get_json_object(concat(concat('{"root":', jt.value), '}'), '$.root')
FROM jt;
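Building on that, you can also index into the wrapped array directly, e.g. (a sketch against the question's jt table):
SELECT get_json_object(concat('{"root":', jt.value, '}'), '$.root[0].ts')  AS first_ts,
       get_json_object(concat('{"root":', jt.value, '}'), '$.root[0].log') AS first_log
FROM jt;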

h2: "data conversion error" on array returned from stored procedure

This is a followup post to this post. I am writing an accounting system backed by an h2 database. The tree of accounts is stored in the ACCOUNTS table, with the PARENT_ID column storing the links in the tree.
To get the path to a given node in the tree, I have the following stored procedure:
public static Long[] getAncestorPKs(Long id)
whose job is to produce an array of Longs: the PARENT_ID values between the given node and the root of the tree. Let's imagine it is defined like this (I have tried exactly this and I get the same error):
public static Long[] getAncestorPKs(Long id)
{
return new Long[]{new Long(1), new Long(2), new Long(3)};
}
It is properly registered in the database and I can call it from within a SQL query. My problem is that h2 seems to be unable to deal with the return value: if I use it like this:
SELECT ID FROM ACCOUNTS WHERE ID IN (ANCESTOR_PKS(5))
then I get the following error:
Data conversion error converting "(1, 2, 3)"; SQL statement:
SELECT ID FROM ACCOUNTS WHERE ID IN (ANCESTOR_PKS(5)) [22018-167]
If, instead, I send the following to the database:
SELECT ID FROM ACCOUNTS WHERE ID IN (1, 2, 3)
I get back a result set with 3 rows, containing the three integers (exactly what I expect).
I really can't see what the problem is here! I am returning an array of Longs, which are to be compared against a column containing BIGINTs. Why is h2 refusing to convert this array? I have tried making the return value Object[], because the h2 documentation is not entirely clear on whether this is required on the return side as well as on the call side, but that makes no difference at all. I'm just banging my head against a brick wall here. This ain't rocket science! Surely someone has written similar code before?
Many thanks in advance, before I go mad!
If the method returns an array of objects, then for the database this is one value of data type ARRAY, not a table with 3 rows. But of course you don't use the data type ARRAY in your table; you use INT or BIGINT. So your query is incorrect.
Either the method needs to return a ResultSet, or you need to convert the array value to a table. To do that, you could use the function TABLE(..) as follows:
select x from table(x bigint = getAncestorPKs(1));
So what you could do is:
drop table accounts;
create table accounts(id int);
insert into accounts values(1), (2), (10), (20);
drop alias getAncestorPKs;
create alias getAncestorPKs as 'Long[] getAncestorPKs(Long id) {
return new Long[]{new Long(1), new Long(2), new Long(3)};
}';
select * from accounts where id in
(select x from table(x bigint = getAncestorPKs(1)));
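With the sample data above, the final select returns ids 1 and 2: getAncestorPKs yields (1, 2, 3), and of those only 1 and 2 exist in the accounts table.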
