Is it possible in SQL Server to convert the type of a field based on the content of another field? - sql-server

I have a table, DD, which is a data dictionary, with fields (say):
ColumnID (longint PK), ColumnName (varchar), Datatype (varchar)
I have another table, V, where I have sets of records in the form:
ColumnID (longint FK), ColumnValue (varchar)
I want to be able to convert sets of records from V into another table, Results, where each field will be translated based on the value of DD.Datatype, so that the destination table might be (say):
ColumnID (longint FK), ColumnValue (datetime)
To be able to do this, ISTM that I need to be able to do something like
CONVERT(value of DD.Datatype, V.ColumnValue)
Can anyone give me any clues on whether this is even possible, and if so what the syntax would be? My google-fu has proved inadequate to find anything relevant

You could do something like this with dynamic SQL, certainly, as long as you are aware of the limitation that the datatype is a property of the COLUMN in the result set, not of each individual cell. All the rows in a given column must therefore have the same datatype.

The only way to accomplish something like CONVERT(value of DD.Datatype, V.ColumnValue) in SQL is with dynamic SQL. That has its own problems, such as basically needing to use stored procedures to keep queries efficient.
Alternatively, you could fetch the datatype metadata with one query, construct a new query in your application, and then query the database again. Assuming you're using SQL Server 2012+, you could also try using TRY_CAST() or TRY_CONVERT() and write your query like:
SELECT TRY_CAST(value as VARCHAR(2)) FieldName
FROM table
WHERE datatype = 'VARCHAR' AND datalength = 2
But, again, you've got to know what the valid types are; you can't determine them at runtime without dynamic SQL, because variables and parameters are not allowed to be used for object or type names. However, no matter what you do, you need to remember that all data in a given column of a result set must be of the same datatype.
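A minimal sketch of the dynamic SQL route, assuming the DD and V tables described in the question; it generates one SELECT per distinct datatype so each result column keeps a single type, and only concatenates type names from a whitelist rather than raw dictionary values:

-- Sketch only: table/column names follow the question, the type whitelist is an assumption.
DECLARE @sql nvarchar(max) = N'';

SELECT @sql = @sql + N'
SELECT v.ColumnID, CONVERT(' + d.Datatype + N', v.ColumnValue) AS ColumnValue
FROM V AS v
JOIN DD AS dd ON dd.ColumnID = v.ColumnID
WHERE dd.Datatype = ''' + d.Datatype + N''';'
FROM (SELECT DISTINCT Datatype
      FROM DD
      WHERE Datatype IN ('datetime', 'int', 'decimal(18,2)')) AS d;

EXEC sys.sp_executesql @sql;

Each generated statement returns its own result set; if you need everything in one table, a sql_variant column or one results table per type are the usual workarounds.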
Most Entity-Attribute-Value tables like this sacrifice the data integrity that strong typing brings by accepting that the data type is determined by the application and not the RDBMS. EAV does not allow you to have your cake (store data without a fixed schema) and eat it, too (enjoy DB-enforced strong data typing, not having to typecast strings in the application, etc.).
EAV breaks data normalization pretty badly. It breaks First Normal Form, the most basic rule, and this is just one of the consequences. EAV tables will make querying the data anywhere from awkward to extremely difficult, and you're almost always going to sacrifice performance doing it because the RDBMS is built around the relational model.
That doesn't mean you shouldn't ever use EAV tables. They're relatively great for user-defined fields. However, it does mean that they're always going to suck to query and manage. That's just the tradeoff. You broke First Normal Form, and querying and performance are going to suffer the consequences of that choice.
If you really want to store all your data like this, you should look at either storing data as blobs of XML or JSON (SQL Server 2016) -- but that's a general pain to query -- or use a NoSQL data store like MongoDB or Cassandra instead of an SQL RDBMS.
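For comparison, a minimal sketch of the JSON route on SQL Server 2016+ (the table, column, and attribute names here are invented for illustration):

-- Hypothetical example: one JSON document per entity, typed values pulled out on read.
CREATE TABLE dbo.EntityDoc
(
    EntityID int NOT NULL PRIMARY KEY,
    Doc nvarchar(max) NOT NULL CHECK (ISJSON(Doc) = 1)
);

SELECT EntityID,
       TRY_CONVERT(datetime2, JSON_VALUE(Doc, '$.dueDate')) AS DueDate,
       TRY_CONVERT(decimal(18,2), JSON_VALUE(Doc, '$.amount')) AS Amount
FROM dbo.EntityDoc;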

Related

Storing Serialized Information In SQL Server using F#

I am currently working on a project in F# that takes in data from Excel spreadsheets, determines if it is compatible with an existing table in SQL Server, and then adds the relevant rows to the existing table.
Some of the data I am working with is more specific than the types provided by T-SQL. That is, T-SQL has a type "date", but I need to distinguish between sets of dates that are at the beginning of each month or the end of each month. This same logic applies to many other types as well. If I have types:
Date(Beginning)
Date(End)
they will both be converted to the T-SQL type "date" before being added to the table, therefore erasing some of the more specific information.
In order to solve this problem, I am keeping a log of the serialized types in F#, along with which column number in the SQL Server table they apply to. My question is: is there any way to store this log somewhere internally in SQL Server so that I can access it and compare the serialized types of the incoming data to the serialized types of the data that already exists in the table before making new inserts?
Keeping metadata outside of the DB and maintaining it manually makes your DB "expensive" to manage and increases the risk of errors that you might not even detect until something bad happens.
If you have control over the table schema, there are at least a couple of simple options. For something simple with just a couple of possible values, as you described, just add a new column that stores the actual type value: update the F# code to de-serialize the source into separate DATE and type (BEGINNING/END) values, which are then inserted into the table. Simple, easy to maintain, and easily consumed.
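A rough sketch of that table shape, with invented table and column names:

-- Hypothetical target table: the plain DATE plus a column recording which
-- subtype (beginning-of-month vs end-of-month) the value carried in F#.
CREATE TABLE dbo.ImportedDates
(
    RowID int IDENTITY(1,1) PRIMARY KEY,
    DateValue date NOT NULL,
    DateKind varchar(10) NOT NULL
        CONSTRAINT CK_ImportedDates_DateKind CHECK (DateKind IN ('BEGINNING', 'END'))
);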
You could also create a user-defined type for each date subtype, but that can be confusing to another DBA/dev and makes it more complicated to retrieve data from your application. This is generally not a good approach.
Yes, you can do that if you want to.

SQL Server XML Values Integrity, Consistency, Accuracy

EDIT: The XML value is saved in an XML column in SQL Server with the entire transaction.
I have a general question I suppose regarding the integrity of XML values stored in a SQL Server database.
We are working with very important data elements in regard to healthcare. We currently utilize a BizTalk server that handles very complex looped and segmented eligibility files; BizTalk parses the file, pushes out an XML "value", does some validation, and then pushes it to the data tables.
I have a request from a Director of mine to create a report off of those XML values.
So I have trouble doing this for a couple reasons:
1) I would like to understand what exactly the XML has; does this data retain its integrity regardless of whether we store the value in a table or store it in the XML?
2) Consistency - Will this data be consistent? Or does the fact that we are looking at XML values over and over using XML values to join the existing table to the XML "table" make the consistency an issue?
3) Accuracy - I would like this data to be accurate and consistent. I guess I'm having a hard time trusting that this data is available in the same form the data in a table is...
Am I being too overcautious here? Or are there valid reasons why this would not be a good idea to create reports for external clients?
Let me know if I can provide anything else, I'm looking for high-level comments, code should be somewhat irrelevant other than we have to use a value in the XML to render other values in the XML for linking purposes.
Off the bat I can think that this may not be consistent in that it's not set up like a DB table: no primary key, no duplicate checks, no indexing, etc. Is this true also?
Thanks in advance!
I think this article will answer your concerns: http://msdn.microsoft.com/en-us/library/hh403385.aspx
If you are treating a row with an xml column as your grain, the database will keep it transactionally consistent. With the XML type, you can use XML indexes to speed up your queries, which would be an advantage over storing this as varchar(max). Does this answer your question?
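For illustration, an XML index and a typed extraction would look roughly like this (the table name, element names, and XPath are assumptions, not your actual schema):

-- Hypothetical table holding the BizTalk output; the clustered primary key is
-- required before a primary XML index can be created.
CREATE TABLE dbo.EligibilityMessages
(
    MessageID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Payload xml NOT NULL
);

CREATE PRIMARY XML INDEX PXML_EligibilityMessages_Payload
    ON dbo.EligibilityMessages (Payload);

-- Pull a typed value out of the XML for reporting; the path is made up.
SELECT MessageID,
       Payload.value('(/Eligibility/MemberID)[1]', 'varchar(50)') AS MemberID
FROM dbo.EligibilityMessages;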

ADO - Can I edit results of a complex query with multiple join statements?

I'm working on a data conversion utility which can push data from one master database out to a number of different databases. The utility itself will have no knowledge of how data is kept in the destination (table structure), but I would like to provide the ability to write a SQL statement to return data from the destination using a complex SQL query with multiple join statements, as long as the data is in a standardized format that the utility can recognize (field names) in an ADO query.
What I would like to do is then modify the live data in this ADO Query. However, since there are multiple join statements, I'm not sure if it's possible to do this. I know at least with BDE (I've never used BDE), it was very strict and you had to return all fields (*) and such. ADO I know is more flexible, but I don't know quite how flexible in this case.
Is it supposed to be possible to modify data in a TADOQuery in this manner, when the results include fields from different tables? And even if so, suppose I want to append a new record to the end (TADOQuery.Append). Would it append to two different tables?
The actual primary table I'm selecting from has a complementary table which is joined by the same primary key field; one is a "Small" table (brief info) and the other is a "Detail" table (more info for each record in the Small table). So, a typical statement would include something like this:
select ts.record_uid, ts.SomeField, td.SomeOtherField from table_small ts
join table_detail td on td.record_uid = ts.record_uid
There are also a number of other joins to records in other tables, but I'm not worried about appending to those ones. I'm only worried about appending to the "Small" and "Detail" tables - at the same time.
Is such a thing possible in an ADO Query? I'm willing to tweak and modify the SQL statement in any way necessary to make this possible. I have a bad feeling though that it's not possible.
Compatibility:
SQL Server 2000 through 2008 R2
Delphi XE2
Editing fields which have no influence on the joins is usually no problem.
Appending is trickier; you can limit the append to one of the tables by
procedure TForm.ADSBeforePost(DataSet: TDataSet);
begin
  inherited;
  TCustomADODataSet(DataSet).Properties['Unique Table'].Value := 'table_small';
end;
but without a Requery you won't get much further.
The better way would be to set the values by procedure (e.g. in BeforePost), then Requery and Abort.
If your view were persistent (defined on the server rather than built ad hoc in the query), you would be able to use INSTEAD OF triggers.
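A rough T-SQL sketch of the INSTEAD OF trigger approach, reusing the table and column names from the question; this assumes the join is defined as a server-side view that the TADOQuery selects from:

CREATE VIEW dbo.v_small_detail
AS
SELECT ts.record_uid, ts.SomeField, td.SomeOtherField
FROM table_small ts
JOIN table_detail td ON td.record_uid = ts.record_uid;
GO

-- Inserts against the view are split into one insert per underlying table.
CREATE TRIGGER dbo.trg_v_small_detail_insert
ON dbo.v_small_detail
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO table_small (record_uid, SomeField)
    SELECT record_uid, SomeField FROM inserted;

    INSERT INTO table_detail (record_uid, SomeOtherField)
    SELECT record_uid, SomeOtherField FROM inserted;
END;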
Jerry,
I encountered the same problem on Firebird, and from experience I can tell you that it can be done (with a little added complexity) by using CachedUpdates. A very good resource is this one - http://podgoretsky.com/ftp/Docs/Delphi/D5/dg/11_cache.html. This article has the answers to all your questions.
I have abandoned the original idea of live ADO query updates, as it has become more complex than I can wrap my head around. The scope of the data push project has changed, and therefore this is no longer an issue for me, however still an interesting subject to know.
The new structure of the application consists of attaching multiple "Field Links" on various fields from the original set of data. Each of these links references the original field name and a SQL Statement which is to be executed when that field is being imported. Multiple field links can be on one single field, therefore can execute multiple statements, placing the value in various tables, etc. The end goal was an app which I can easily and repeatedly export a common dataset from an original source to any outside source with different data structures, without having to recompile the app.
However, the concept of cached updates was not appealing to me, simply for the fact pointed out in the link in RBA's answer that data can be changed in the database in the meantime. So I will instead integrate my own method of customizable data pushes.

How to store XML result of WebService into SQL Server database?

We have got a .Net Client that calls a Webservice. We want to store the result in a SQL Server database.
I think we have two options for how to store the data, and I am a bit undecided as I can't see the pros and cons clearly: one would be to map the results into database fields. That would require us to have database fields corresponding to each possible result type, e.g. for each "normal" result type as well as those for faults.
On the other hand, we could store the resulting XML and query that via the SQL Server built in XML functions.
Personally, I am comfortable with dealing with both SQL and XML, so both look fine to me.
Are there any big pros and cons and what would I need to consider in terms of database design when trying to store the resulting XML for quite a few different possible Webservice operations? I was thinking about a result table for each operation that we call with different entries for the different possible outcomes / types and then store the XML in the right field, e.g. a fault in the fault field, a "normal" return type in the appropriate field etc.
We use a combination of both. XML for reference and detailed data, and text columns for fields you might search on. Searchable columns include order number, customer reference, ticket number. We just add them when we need them since you can extract them from the XML column.
I wouldn't recommend just the XML. If you store 10,000 messages a day, a query like:
select * from XmlLogging with (nolock) where Response like '%Order12%'
can become slow and interfere with other queries. You also can't display the logging in a GUI because retrieval is too slow.
I wouldn't recommend just the text columns either. If the XML format changes, you'd get an empty column. That's hard to troubleshoot without the XML message. In addition, if you need to "replay" the message stream, that's a lot easier with the XML messages. Few requirements demand replay, but it's really helpful when repairing the fallout of production problems.
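A minimal sketch of the combined approach (the table, column, and XML names here are invented); the searchable value is extracted from the XML once, at insert time, so reporting queries never have to run LIKE over the blob:

CREATE TABLE dbo.MessageLog
(
    MessageID int IDENTITY(1,1) PRIMARY KEY,
    OrderNumber varchar(50) NULL,   -- extracted, searchable copy
    Response xml NOT NULL,          -- full payload kept for reference and replay
    LoggedAt datetime NOT NULL DEFAULT GETDATE()
);

DECLARE @msg xml = N'<Order><Number>Order12</Number></Order>';

INSERT INTO dbo.MessageLog (OrderNumber, Response)
SELECT @msg.value('(/Order/Number)[1]', 'varchar(50)'), @msg;

-- Search the extracted column instead of LIKE over the XML text:
SELECT MessageID, LoggedAt
FROM dbo.MessageLog
WHERE OrderNumber = 'Order12';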

SQLBulkCopy and Dates (1/1/1753)

I've got an application which has been working fine for quite a while, but there is an annoying item that continues to get in the way on occasion.
Let's say that I use an object such as OracleDataReader or MySQLDataReader to pass the data to the SqlBulkCopy object for insert. Let's assume that all the columns map just fine and, for the most part, it all works well.
Granted, I don't have control over the source application or database (which is either MySQL or Oracle). So some goof goes into a different application and puts in a date on the invoice table of 5/31/0210. He really meant to put in 5/31/2010, but the application he's using is not validating the data very tightly and the Oracle database accepts it. For all intents and purposes, the date 5/31/0210 is valid for the Oracle db. It might be stupid in terms of data entry, but it is what it is at this point.
Now our OracleDataReader comes along and is transferring this invoice table over to SQL Server via SqlBulkCopy. It is passing the data to a perfectly matched table with the right column names and data types. You can see what is going to happen. This date of 05/31/0210 from Oracle is not accepted by the SQL Server db engine, as the DATETIME field only allows dates from 1/1/1753 to 12/31/9999.
When it encounters this record, it simply fails and gives an overflow error. It doesn't skip the record, it kills the feed. So if it happens a thousand records in on a million record table, you don't get the remaining 999,000 records.
Is there any way to get around this issue so that the feed will continue?
Ideally, I'd like to move the receiving SQL Server DB to 2008 and use DATETIME2, which would allow for these goofy dates, but unfortunately not all my clients are ready to move to this version yet, so I'm stuck with DATETIME in SQL 2000/2005/2008.
Any ideas on how to get around this without changing the SQL? Ideally, I wouldn't mind if it just skipped the record. I know that I could do this in the SQL for the datareader, but this would be extremely complicated when you have twenty date fields in a single query. It would be a maintenance nightmare.
Any thoughts would be appreciated.
One option would be to change the datetime column type to varchar, then add a derived column that converts the string to datetime. The trick would be to use a function in the derived column to validate the date and put in an arbitrary datetime if the conversion would fail. If you do heavy date comparisons, persist the computed column and/or index it.
I say all of this under the impression that sqlbulkcopy is not able to do transforms. Maybe you can. Hopefully, someone will chime in with a way to.
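A rough sketch of that idea (table and column names are invented); here the computed column simply yields NULL for anything ISDATE() rejects, which includes dates outside the 1753-9999 datetime range:

-- Staging column accepts whatever string the bulk copy sends; the computed
-- column only converts values SQL Server considers valid datetimes.
-- Note: to PERSIST or index the computed column you'd need a deterministic
-- expression (e.g. CONVERT with an explicit style on a fixed source format).
CREATE TABLE dbo.InvoiceStage
(
    InvoiceID int NOT NULL,
    InvoiceDateRaw varchar(30) NULL,
    InvoiceDate AS (CASE WHEN ISDATE(InvoiceDateRaw) = 1
                         THEN CONVERT(datetime, InvoiceDateRaw)
                    END)
);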
SSIS would be great in this situation, as you could do the transform and also get the performance benefits of the bulk update lock.
