Is it possible to do XQuery through linq? - sql-server

I have an XML field that holds about 5 MB of data per record, but sometimes I only need to read a small part of it. As you can imagine, reading the whole XML field and then parsing it with LINQ to XML to extract the value would be slow and expensive. So I want to know: is it possible to get a value directly using LINQ instead of reading the whole XML field?
My DB is SQL Server 2008.

With the current information provided I think the best solution is to use an XML index in SQL Server.
There are four types of XML indexes:
Primary
Secondary for PATH
Secondary for PROPERTY
Secondary for VALUE
In your case it appears you know the path to the data you want, so a secondary PATH index seems to be the best fit.
Follow these steps to create this index:
Create primary index
create primary xml index XIX_XmlColumnName on XmlTable(XmlColumnName)
go
This will create the "base" index for your xml column. Basically, the xml is shredded into a hidden table and stored with its values, with every element turned into one row.
Create secondary path index
create xml index XIX_XmlColumnName_Path on XmlTable(XmlColumnName)
using xml index XIX_XmlColumnName for path
go
This will create a secondary index using the path-column in the primary index (which we now know is a table).
Finally, run a (sql) query such as this in a procedure and call that from your application:
select XmlColumnName.query('/path/to/element')
from XmlTable
Granted, this is not a linq-query/solution, but imo it's always best to use a tool that fits, and not try to force it.
For more in-depth information about xml indexes, see this msdn-article.
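As a rough sketch of the "run it in a procedure" step above: the procedure name dbo.GetXmlElementValue, the key column Id, the nvarchar(200) target type, and the /path/to/element path are all placeholders, not details from the question. It uses value() rather than query() so the caller receives a scalar instead of an XML fragment.

```sql
-- Hypothetical procedure: returns one scalar value from the XML column
-- for a single row, so the 5 MB document never leaves the server.
CREATE PROCEDURE dbo.GetXmlElementValue
    @Id int                      -- assumed key column on XmlTable
AS
BEGIN
    SET NOCOUNT ON;

    -- value() requires a singleton, hence the (...)[1] wrapper
    SELECT XmlColumnName.value('(/path/to/element)[1]', 'nvarchar(200)') AS ElementValue
    FROM XmlTable
    WHERE Id = @Id;
END
GO
```

A procedure like this can then be called from the application (for example, mapped as a stored procedure in a LINQ to SQL data context), which keeps the XQuery work on the server side.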

Related

Why can't columnar databases like Snowflake and Redshift change the column order?

I have been working with Redshift and am now testing Snowflake. Both are columnar databases. Everything I have read about this type of database says that they store the information by column rather than by row, which helps with massively parallel processing (MPP).
But I have also seen that they are not able to change the order of a column or add a column in between existing columns (I don't know about other columnar databases). The only way to add a new column is to append it at the end. If you want to change the order, you need to recreate the table with the new order, drop the old one, and rename the new one (this is called a deep copy). But this isn't always possible because of dependencies or even memory utilization.
I'm more surprised by the fact that this can be done in row databases and not in columnar ones. Of course, there must be a reason why it's not a feature yet, but I clearly don't have enough information about it. I thought it was going to be just a matter of changing the ordinal position of the columns in the information_schema, but clearly it is not that simple.
Does anyone know the reason for this?
Generally, column ordering within the table is not considered to be a first class attribute. Columns can be retrieved in whatever order you require by listing the names in that order.
Emphasis on column order within a table suggests frequent use of SELECT *. I'd strongly recommend not using SELECT * in columnar databases without an explicit LIMIT clause to minimize the impact.
If column order must be changed you do that in Redshift by creating a new empty version of the table with the columns in the desired order and then using ALTER TABLE APPEND to move the data into the new table very quickly.
https://docs.aws.amazon.com/redshift/latest/dg/r_ALTER_TABLE_APPEND.html
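A minimal sketch of that Redshift approach, assuming a hypothetical table named sales with three columns. The table and column names are made up, and the sketch assumes both tables have identical column names and attributes, that no views or other objects depend on the old table, and that the statements run outside a transaction block (ALTER TABLE APPEND cannot run inside one).

```sql
-- New, empty table with the columns in the desired order (hypothetical names)
CREATE TABLE sales_reordered (
    sale_date   DATE,
    customer_id INTEGER,
    amount      DECIMAL(12,2)
);

-- Moves (not copies) the data blocks from sales into sales_reordered
ALTER TABLE sales_reordered APPEND FROM sales;

-- The source table is now empty; swap the names
DROP TABLE sales;
ALTER TABLE sales_reordered RENAME TO sales;
```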
The order in which the columns are stored internally cannot be changed without dropping and recreating them.
Your SQL can retrieve the columns in any order you want.
A requirement to have columns listed in some particular order is generally for viewing purposes.
You could define a view to be in the desired column order and use the view in the required operation.
CREATE OR REPLACE TABLE CO_TEST(B NUMBER,A NUMBER);
INSERT INTO CO_TEST VALUES (1,2),(3,4),(5,6);
SELECT * FROM CO_TEST;
SELECT A,B FROM CO_TEST;
CREATE OR REPLACE VIEW CO_VIEW AS SELECT A,B FROM CO_TEST;
SELECT * FROM CO_VIEW;
Creating a view to list the columns in the required order will not disturb the actual table underneath the view, and the resources associated with recreating the table are not wasted.
In some databases (Oracle especially), the order of columns in a table can make a difference in performance, with NULLable columns best stored at the end of the list. This has to do with how storage is utilized within the data block.

Behind the scene operations for ALTER COLUMN statement in SQL Server

I am altering the column datatype for a table with around 100 Million records using the below query:
ALTER TABLE dbo.TARGETTABLE
ALTER COLUMN XXX_DATE DATE
The column values are in the right date format as I inserted original date from a valid data source.
However, the query has been running for a long time, and even when I attempt to cancel the query it seems to take forever.
Can anyone explain what is happening behind the scenes in SQL Server when an ALTER TABLE statement is executed and why it requires such resources?
There are a lot of variables that can make these ALTER statements
make multiple passes through your table and make heavy use of TempDB,
and depending on the efficiency of TempDB it could be very slow.
Examples include whether or not the column you are changing is in an
index (especially the clustered index, since every nonclustered index
carries the clustering key).
Instead of altering the table, here is a simple example you can try.
Suppose your table name is tblTarget1.
Create another table (tblTarget2) with the same structure.
Change the data type of the column in tblTarget2.
Copy the data from tblTarget1 to tblTarget2 using an INSERT INTO query.
Drop the original table (tblTarget1).
Rename tblTarget2 to tblTarget1.
The main reason is that changing the data type involves a lot of data transfer and data page realignment.
For more information you can follow this link.
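The copy-and-swap steps above might look roughly like this in T-SQL. The column list (Id, XXX_DATE) is an assumption, since the real table structure is not shown, and any indexes, constraints, or permissions on the original table would have to be recreated on the new one.

```sql
-- New table with the target data type (hypothetical column list)
CREATE TABLE dbo.tblTarget2 (
    Id       int  NOT NULL PRIMARY KEY,
    XXX_DATE date NULL
);

-- Copy the data, converting the column on the way
INSERT INTO dbo.tblTarget2 WITH (TABLOCK) (Id, XXX_DATE)
SELECT Id, CONVERT(date, XXX_DATE)
FROM dbo.tblTarget1;

-- Swap the tables once the copy has been verified
DROP TABLE dbo.tblTarget1;
EXEC sp_rename 'dbo.tblTarget2', 'tblTarget1';
```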
Another approach to do this is the following:
Add a new column to the table - [_date] date
Using batched updates, you can transfer the values from the old column to the new one without blocking the table for other users.
Then in one transaction do the following:
update any rows that were inserted or changed after the batched update finished
drop the old column
rename the new column
Note: if you have an index on this field, you need to drop it before deleting the old column and create it again after renaming the new one.
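A sketch of the add-column approach above, using the table and column names from the question. The batch size of 10000 is an arbitrary assumption, and the catch-up step here only picks up rows whose new column is still NULL; rows modified in the old column after they were copied would need extra tracking.

```sql
-- Step 1: add the new column
ALTER TABLE dbo.TARGETTABLE ADD [_date] date NULL;
GO

-- Step 2: copy values in small batches to keep locks short
DECLARE @BatchSize int = 10000;   -- assumed batch size, tune as needed
DECLARE @Rows int = 1;

WHILE @Rows > 0
BEGIN
    UPDATE TOP (@BatchSize) dbo.TARGETTABLE
    SET [_date] = CONVERT(date, XXX_DATE)
    WHERE [_date] IS NULL
      AND XXX_DATE IS NOT NULL;

    SET @Rows = @@ROWCOUNT;
END
GO

-- Step 3: catch up, drop the old column, rename the new one
BEGIN TRANSACTION;
    UPDATE dbo.TARGETTABLE
    SET [_date] = CONVERT(date, XXX_DATE)
    WHERE [_date] IS NULL
      AND XXX_DATE IS NOT NULL;

    ALTER TABLE dbo.TARGETTABLE DROP COLUMN XXX_DATE;
    EXEC sp_rename 'dbo.TARGETTABLE._date', 'XXX_DATE', 'COLUMN';
COMMIT TRANSACTION;
```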

Creating Full Text index on temporary table

When I create a full-text index on a temporary table #Table in a query, I get "Invalid name object #Table".
Is creating a full-text index on a temporary table possible in SQL Server?
According to the documentation, no it is not possible:
A full-text index must be defined on a base table; it cannot be
defined on a view, system table, or temporary table.
This should be clarified to point out that since the version that documentation was written for, indexed views were added to SQL Server, and documentation there states that:
one full-text index is allowed per table or indexed view

Cannot use a CONTAINS or FREETEXT predicate on table or indexed view

I am trying to modify a stored procedure (adding a new column in the select statement) but I am getting this error:
Cannot use a CONTAINS or FREETEXT predicate on table or indexed view 'vwPersonSearch' because it is not full-text indexed.
When I try to create a full-text index on the view 'vwPersonSearch' using SQL Server 2008 R2 Management Studio, I am getting this error:
A unique column must be defined on this table/view.
Please suggest a solution.
To create a full text index, you must specify a key index, which must be a unique, single-key, non-nullable column. An integer column type is recommended for best performance.
See http://technet.microsoft.com/en-us/library/ms187317.aspx for more details.
You can either make an existing column unique, if one qualifies, or add an ID column of some sort to serve as the key.
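To illustrate the key-index requirement, here is a sketch of full-text indexing a view. The view body, the dbo.Person table, the PersonId and FullName columns, and the catalog name are all assumptions, since the real definition of vwPersonSearch is not shown. The view has to be schema-bound and given a unique clustered index on a single non-nullable column before a full-text index can use it as the key.

```sql
-- Hypothetical definition; an existing view would use ALTER VIEW instead
CREATE VIEW dbo.vwPersonSearch
WITH SCHEMABINDING
AS
SELECT p.PersonId, p.FullName
FROM dbo.Person AS p;
GO

-- Unique, single-column, non-nullable key (this also makes it an indexed view)
CREATE UNIQUE CLUSTERED INDEX UX_vwPersonSearch_PersonId
    ON dbo.vwPersonSearch (PersonId);
GO

CREATE FULLTEXT CATALOG PersonSearchCatalog;
GO

-- The KEY INDEX clause names the unique index created above
CREATE FULLTEXT INDEX ON dbo.vwPersonSearch (FullName)
    KEY INDEX UX_vwPersonSearch_PersonId
    ON PersonSearchCatalog;
GO
```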

How to convert a T-SQL column type and data in-place?

I have a SQL Server 2005 database for a web site that stores user passwords in plaintext, and I would like to hash and salt them. I know how to use the HashBytes function to get and compare hashes, but I don't know the best way to go about converting the existing password column data. It's currently stored as a varchar(50) column and I would like to use binary(20) since I'm planning on using SHA-1.
I was thinking about doing a SELECT INTO a temporary table, ALTERing the existing column type, then INSERTing the hashed and salted passwords back where the user IDs match. Is this a valid approach? Is there a way to do it in-place without a temp table?
Thanks!
You could just store the binary info as a hex string. This has some benefits:
Allows easy in-place UPDATE query
No need for temp tables
No need to change the structure of the table (your end result is a string)
I suggest writing functions to help you with the salting/hashing/hex-conversion (and vice versa.)
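A rough sketch of the hex-string idea, under several assumptions: the table and password column are called MyTable and VarCharPW, a new Salt column is added to hold a per-user salt (not something the question mentions), and the hex conversion uses master.dbo.fn_varbintohexstr, an undocumented system function that prefixes its result with '0x'. A SHA-1 hash is 20 bytes, so its 40 hex characters fit in the existing varchar(50).

```sql
-- Assumed new column holding a random per-user salt
ALTER TABLE MyTable ADD Salt varchar(36) NULL;
GO

UPDATE MyTable
SET Salt = CONVERT(varchar(36), NEWID());
GO

-- Replace each plaintext password with the hex form of SHA1(salt + password).
-- Run this exactly once: a second run would hash the hashes.
UPDATE MyTable
SET VarCharPW = SUBSTRING(
        master.dbo.fn_varbintohexstr(HashBytes('SHA1', Salt + VarCharPW)),
        3, 40);   -- skip the leading '0x', keep the 40 hex characters
```

At login time, the same expression would be applied to the stored salt plus the submitted password and compared against the stored string.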
Yes.
Bear in mind adding a column will very likely cause page splits, so consider running maintenance soon to rebuild/defrag indexes and data pages.
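-- Column names are not quoted; GO ends the batch so the new column is visible below
ALTER TABLE MyTable ADD BinaryPW binary(20)
GO
-- dbo.MyHashFunction is a placeholder for your own scalar function (schema prefix is required)
UPDATE MyTable
SET BinaryPW = dbo.MyHashFunction(VarCharPW)
-- Clear the plaintext passwords once the hashes are stored
UPDATE MyTable
SET VarCharPW = ''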

Resources