MSSQL - Select NULL values on varbinary column - sql-server

I am trying to run a SELECT on a table with around 35,000 rows. One of the columns is a varbinary that may contain NULL values, and I need to exclude those NULLs from my results.
The problem is that selecting a varbinary column takes a long time to complete. I've found an approach, but I'm not sure whether I can rely on it or whether it is the best way to do this, so I would like some opinions.
Here it is:
SELECT REQUEST,REQLOCATION,DESCRIPT,BLOBNAME,BLOBSIZE,substring(BLOBVALUE,0,1) AS BLOBVALUE,BLOBMODE,BLOBPATH,BLOBID,
REDIRID,ANALYST,CLIENT,SEVENT,PACKAGE,INSERTDATE
FROM REQBLOB WHERE substring(BLOBVALUE,0,1) IS NOT NULL
The varbinary column is BLOBVALUE, which I wrap in SUBSTRING. This query returned 20,000 rows instantly, and I think it returned only valid data with no NULLs. What do you think about that?

Get rid of the SUBSTRING in the WHERE, it's making your query non-SARGable; that's why it's slow.
SELECT REQUEST,
REQLOCATION,
DESCRIPT,
BLOBNAME,
BLOBSIZE,
SUBSTRING(BLOBVALUE, 0, 1) AS BLOBVALUE,
BLOBMODE,
BLOBPATH,
BLOBID,
REDIRID,
ANALYST,
CLIENT,
SEVENT,
PACKAGE,
INSERTDATE
FROM REQBLOB
WHERE BLOBVALUE IS NOT NULL;
Why are you using substring on a varbinary anyway though..?

Related

How can I check and remove duplicate rows?

I have a problem with quite a big table that has some NULL values in 3 columns: one datetime2 column and 2 float columns.
A nice, simple query from a similar question finds only the 2 rows where the datetime2 column is NULL, but nothing else (the same goes for a lot of other queries):
DELETE MyTable
FROM MyTable
LEFT OUTER JOIN (
SELECT MIN(RowId) as RowId, allRemainingCols
FROM MyTable
GROUP BY allRemainingCols
) as KeepRows ON
MyTable.RowId = KeepRows.RowId
WHERE
KeepRows.RowId IS NULL
It seems to work when the datetime2 column has no NULLs?
There is a manual workaround, but is there any way to write a query or procedure using T-SQL only?
SELECT id,remainingColumns
FROM table
order BY remainingColumns
Then compare all columns in Excel (15 in my case; I placed =ROW() in the first column as a check, and a formula next to the last column with an auto filter for TRUE values): =AND(B1=B2;C1=C2;D1=D2;E1=E2;F1=F2;G1=G2;H1=H2;I1=I2;J1=J2;K1=K2;L1=L2;M1=M2;N1=N2;O1=O2;P1=P2)
Or compare 3 rows like this and select all non-unique rows:
=OR(
AND(B1=B2;C1=C2;D1=D2;E1=E2;F1=F2;G1=G2;H1=H2;I1=I2;J1=J2;K1=K2;L1=L2;M1=M2;N1=N2;O1=O2;P1=P2);
AND(B2=B3;C2=C3;D2=D3;E2=E3;F2=F3;G2=G3;H2=H3;I2=I3;J2=J3;K2=K3;L2=L3;M2=M3;N2=N3;O2=O3;P2=P3)
)
It was quite a lot of work to find the answer for my particular data...
Most of the float numbers were slightly different.
The differences are hard to find, but a simple CAST(column AS binary) can show these invisible differences...
For example, 96,6666666666667 vs 0x0000000000000000000000000000000000000000000040582AAAAAAAAAAD vs 0x0000000000000000000000000000000000000000000040582AAAAAAAAAAB etc.
And a visible 96.6666666666667 can come back in yet another form:
0x0000000000000000000000000000000000000F0D0001AB6A489F2D6F0300
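The effect described above can be reproduced with a small sketch. The two variables below are made-up values for illustration and are not taken from the table in the question. The idea is that two floats can be displayed identically while their stored bytes differ, so an exact comparison (and therefore GROUP BY) treats them as different. Comparing rounded values is one possible workaround, assuming a known precision is acceptable for the data.

```sql
-- Hypothetical values for illustration only.
DECLARE @a FLOAT, @b FLOAT;
SET @a = CAST(290 AS FLOAT) / 3;   -- a computed value
SET @b = 96.6666666666667;         -- a value typed in the way it is displayed

SELECT @a AS a,
       @b AS b,
       CAST(@a AS BINARY(8)) AS a_bits,   -- raw 8-byte representation
       CAST(@b AS BINARY(8)) AS b_bits,
       -- Exact comparison, as GROUP BY would do it.
       CASE WHEN @a = @b THEN 'equal' ELSE 'different' END AS exact_compare,
       -- Comparison after rounding to 6 decimal places.
       CASE WHEN ROUND(@a, 6) = ROUND(@b, 6) THEN 'equal' ELSE 'different' END AS rounded_compare;
```

In a duplicate-removal query, the same idea would mean grouping by ROUND(floatColumn, n) instead of the raw float column, with n chosen to suit the data.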

How to merge columns in SQL of the same table

I have two date columns.
Sometimes both contain a date (which is always the same in both columns), and sometimes one is empty and the other has a date value.
So, instead of two columns, I am trying to get one column.
If one is empty, it should take the date value from the other column; if both have values (which are always the same), it can take the value from either column.
I have tried UNION, but it's not giving me the desired result.
SQL Server has a couple different options for this scenario. You can use COALESCE, ISNULL, or a CASE statement.
Based on the information you provided I would use COALESCE. It offers several benefits over ISNULL and is very simple to implement. A CASE statement seems like overkill for what you are trying to do. Check out the link above for more info on each solution.
Welcome to Stack Overflow!
You need COALESCE.
Also, in the future, you should put sample data and metadata in text in your question, rather than as attachments.
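As a minimal sketch of the COALESCE approach: the table variable, column names, and sample rows below are hypothetical, and the syntax assumes SQL Server 2008 or later (for the DATE type and multi-row VALUES).

```sql
-- Hypothetical sample data: two date columns that are either equal or half-empty.
DECLARE @Reservation TABLE (Id INT, Date1 DATE NULL, Date2 DATE NULL);

INSERT INTO @Reservation (Id, Date1, Date2)
VALUES (1, '2020-01-05', '2020-01-05'),   -- both set (and equal)
       (2, NULL,         '2020-02-10'),   -- only the second set
       (3, '2020-03-15', NULL);           -- only the first set

-- COALESCE returns the first non-NULL argument, so each row yields its one known date.
SELECT Id,
       COALESCE(Date1, Date2) AS MergedDate
FROM @Reservation;
```

Each of the three rows comes back with its single known date in MergedDate.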
You could use the ISNULL function if it is SQL Server:
SELECT ISNULL(ReturnDate,RepartureDate) as dateAct FROM AviationReservation_dev
UPDATE tableName
SET Date1Column = ISNULL(Date1Column, Date2Column);
Context: ISNULL ( check_expression , replacement_value ) returns the first argument if it is not NULL; otherwise it returns the replacement value.
After the update, delete the other column.
It seems there is no case where both columns are empty; in that situation, you can do something like this:
SELECT
CASE
WHEN column1 IS NULL THEN column2
WHEN column2 IS NULL THEN column1
ELSE column1 -- both are set and equal, so either column works
END AS mergedDate
FROM tableName;

Convert Date Stored as VARCHAR into INT to compare to Date Stored as INT

I'm using SQL Server 2014. My request I believe is rather simple. I have one table containing a field holding a date value that is stored as VARCHAR, and another table containing a field holding a date value that is stored as INT.
The date value in the VARCHAR field is stored like this: 2015M01
The date value in the INT field is stored like this: 201501
I need to compare these tables against each other using EXCEPT. My thought process was to somehow extract or TRIM the "M" out of the VARCHAR value and see if it would let me compare the two. If anyone has a better idea such as using CAST to change the date formats or something feel free to suggest that as well.
I am also concerned that even extracting the "M" out of the VARCHAR may still prevent the comparison since one will still remain VARCHAR and the other is INT. If possible through a T-SQL query to convert on the fly that would be great advice as well. :)
REPLACE the string and then CONVERT to integer
SELECT A.*, B.*
FROM TableA A
INNER JOIN
(SELECT intField
FROM TableB
) as B
ON CONVERT(INT, REPLACE(A.varcharField, 'M', '')) = B.intField
Since you say you already have the query and are using EXCEPT, you can simply change the definition of that one "date" field in the query containing the VARCHAR value so that it matches the INT format of the other query. For example:
SELECT Field1, CONVERT(INT, REPLACE(VarcharDateField, 'M', '')) AS [DateField], Field3
FROM TableA
EXCEPT
SELECT Field1, IntDateField, Field3
FROM TableB
HOWEVER, while I realize that this might not be feasible, your best option, if you can make this happen, would be to change how the data in the table with the VARCHAR field is stored so that it is actually an INT in the same format as the table with the data already stored as an INT. Then you wouldn't have to worry about situations like this one.
Meaning:
Add an INT field to the table with the VARCHAR field.
Do an UPDATE of that table, setting the INT field to the string value with the M removed.
Update any INSERT and/or UPDATE stored procedures used by external services (app, ETL, etc) to do that same M removal logic on the way in. Then you don't have to change any app code that does INSERTs and UPDATEs. You don't even need to tell anyone you did this.
Update any "get" / SELECT stored procedures used by external services (app, ETL, etc) to do the opposite logic: convert the INT to VARCHAR and add the M on the way out. Then you don't have to change any app code that gets data from the DB. You don't even need to tell anyone you did this.
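The first two steps might look like the following sketch. It reuses TableA and VarcharDateField from the example above; the new column name DateAsInt is an assumption made for illustration. GO is the batch separator used by tools such as SSMS and sqlcmd, and it is needed here because a newly added column cannot be referenced in the same batch that adds it.

```sql
-- Step 1: add the new INT column (DateAsInt is a hypothetical name).
ALTER TABLE TableA ADD DateAsInt INT NULL;
GO

-- Step 2: populate it from the existing VARCHAR value, e.g. '2015M01' becomes 201501.
UPDATE TableA
SET DateAsInt = CONVERT(INT, REPLACE(VarcharDateField, 'M', ''));
GO
```

The stored procedure changes in the last two steps depend on the existing procedures, so they are not sketched here.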
This is one of many reasons that having a Stored Procedure API to your DB is quite handy. I suppose an ORM can just be rebuilt, but you still need to recompile, even if all of the code references are automatically updated. But making a datatype change (or even moving a field to a different table, or even replacing a field with a simple CASE statement) "behind the scenes" and masking it so that any code outside of your control doesn't know that a change happened is not nearly as difficult as most people might think. I have done all of these operations (datatype change, move a field to a different table, replace a field with simple logic, etc.) and it buys you a lot of time until the app code can be updated.
That might be another team who handles that. Maybe their schedule won't allow for making any changes in that area (plus testing) for 3 months. OK. It will be there waiting for them when they are ready. And if there are several areas to update, then they can be done one at a time. You can even create new stored procedures to run in parallel for any updated app code to have the proper INT datatype as the input parameter. And once all references to the VARCHAR value are gone, you can delete the original versions of those stored procedures.
If you want everything in the first table that is not in the second, you might consider something like this:
select t1.*
from t1
where not exists (select 1
from t2
where cast(replace(t1.varcharfield, 'M', '') as int) = t2.intfield
);
This should be close enough to EXCEPT for your purposes.
I should add that you might need to include other columns in the WHERE clause. However, the question only mentions one column, so I don't know what those are.
You could create a persisted view on the table with the char column, with a calculated column where the M is removed. Then you could JOIN the view to the table containing the INT column.
CREATE VIEW dbo.PersistedView
WITH SCHEMABINDING
AS
SELECT ConvertedDateCol = CONVERT(INT, REPLACE(VarcharCol, 'M', ''))
--, other columns including the PK, etc
FROM dbo.TablewithCharColumn;
GO
-- The first index on a view must be a unique clustered index.
CREATE UNIQUE CLUSTERED INDEX IX_PersistedView
ON dbo.PersistedView(<the PK column>);
GO
SELECT *
FROM dbo.PersistedView pv
INNER JOIN dbo.TableWithIntColumn ic ON pv.ConvertedDateCol = ic.IntDateCol;
If you provide the actual details of both tables, I will edit my answer to make it clearer.
A persisted view with a computed column will perform far better on the SELECT statement where you join the two columns compared with doing the CONVERT and REPLACE every time you run the SELECT statement.
However, a persisted view will slightly slow down inserts into the underlying table(s), and will prevent you from making DDL changes to the underlying tables.
If you're looking to not persist the values via a schema-bound view, you could create a non-persisted computed column on the table itself, then create a non-clustered index on that column. If you are using the computed column in WHERE or JOIN clauses, you may see some benefit.
By way of example:
CREATE TABLE dbo.PCT
(
PCT_ID INT NOT NULL
CONSTRAINT PK_PCT
PRIMARY KEY CLUSTERED
IDENTITY(1,1)
, SomeChar VARCHAR(50) NOT NULL
, SomeCharToInt AS CONVERT(INT, REPLACE(SomeChar, 'M', ''))
);
CREATE INDEX IX_PCT_SomeCharToInt
ON dbo.PCT(SomeCharToInt);
INSERT INTO dbo.PCT(SomeChar)
VALUES ('2015M08');
SELECT SomeCharToInt
FROM dbo.PCT;
Results: a single row where SomeCharToInt is 201508.

How can I search a nvarchar field to see if there are any lower case characters in the value?

I'm trying to query my SQL Server 2000 database to see if some of my columns contain lower case values.
For example, if a column contains the value THIS IS VALID, then it is valid. If the column value is THIS IS VALID I Snuck In lol: SOME VALUES, then it is not valid and I would like to return the row.
I really don't want to do this manually because it would be error prone and time consuming (50k rows and 20 columns).
SELECT *
FROM YourTable
where YourCol LIKE '%[a-z]%' COLLATE Latin1_General_BIN
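The binary collation is what makes this work: under a typical case-insensitive collation, [a-z] would also match upper case letters. Here is a small self-contained sketch using a table variable; the table and column names are made up, and the two sample values come from the question. It uses UNION ALL instead of multi-row VALUES so that it stays within what SQL Server 2000 accepts.

```sql
-- Hypothetical table variable holding the two sample values from the question.
DECLARE @t TABLE (val NVARCHAR(100));

INSERT INTO @t (val)
SELECT N'THIS IS VALID'
UNION ALL
SELECT N'THIS IS VALID I Snuck In lol: SOME VALUES';

-- The binary collation compares by code point, so [a-z] matches only lower case a to z.
-- Only the second row is returned.
SELECT val
FROM @t
WHERE val LIKE N'%[a-z]%' COLLATE Latin1_General_BIN;
```

To check 20 columns, the same predicate can be repeated per column and combined with OR.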

Indexing on DateTime and VARCHAR fields in SQL Server 2000, which one is more efficient?

We have a CallLog table in Microsoft SQL Server 2000. The table contains a CallEndTime field of type DATETIME, and it is an indexed column.
We regularly delete free-of-charge calls and generate a monthly fee statistics report and a call detail record report. All of these queries use CallEndTime as the condition in the WHERE clause. Because the CallLog table holds a lot of records, the queries are slow, so we want to optimize them, starting with indexing.
Question
Would it be more efficient to query an extra indexed VARCHAR column, CallEndDate?
Such as:
-- DATETIME based query
SELECT COUNT(*) FROM CallLog WHERE CallEndTime BETWEEN '2011-06-01 00:00:00' AND '2011-06-30 23:59:59'
-- VARCHAR based queries
SELECT COUNT(*) FROM CallLog WHERE CallEndDate BETWEEN '2011-06-01' AND '2011-06-30'
SELECT COUNT(*) FROM CallLog WHERE CallEndDate LIKE '2011-06%'
SELECT COUNT(*) FROM CallLog WHERE CallEndMonth = '2011-06'
It has to be the datetime. Dates are essentially stored as a number in the database so it is relatively quick to see if the value is between two numbers.
If I were you, I'd consider splitting the data over multiple tables (by month, year or whatever) and creating a view to combine the data from all those tables. That way, any functionality which needs the entire data set can use the view, and anything which only needs a month's worth of data can access the specific table, which will be a lot quicker as it will contain much less data.
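One detail worth noting about the DATETIME query in the question: DATETIME stores fractions of a second, so an upper bound of '2011-06-30 23:59:59' leaves out any call that ended during the final second of the month. A half-open range is a common way to avoid that while still letting the index on CallEndTime be used. This is a sketch that assumes the CallLog table and CallEndTime column from the question.

```sql
-- Half-open range: from the first instant of June up to,
-- but not including, the first instant of July.
SELECT COUNT(*)
FROM CallLog
WHERE CallEndTime >= '20110601'
  AND CallEndTime <  '20110701';
```

The 'YYYYMMDD' literal format is used because SQL Server interprets it the same way for DATETIME regardless of the session's language and date format settings.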
I think comparing DateTime values is much faster than using the LIKE operator.
I agree with DoctorMick on splitting your DateTime into persisted columns Year, Month, Day.
For your query which selects COUNT(*), check whether there is a Table LookUp node in the execution plan. If so, this might be because your CallEndTime column is nullable, since you said that you have a [nonclustered] index on the CallEndTime column. If you make the column NOT NULL and rebuild that index, counting would be an INDEX SCAN, which is not so slow, and I think you will get much faster results.
