Integrate regular and full text index in SQL Server - sql-server

I have a table in SQL Server with the following columns:
id int (primary key)
text nvarchar(max) (full text indexed)
type int
and I have queries like this:
where Contains([text], @text)
or
where Contains([text], @text) AND [type] = 3
However, the second query is slow. I think I should integrate the full-text index with the [type] field.
Is there another solution?
Thanks

I'm assuming you're not running SQL 2008, as the integrated full text engine in that version should make better decisions for a query such as yours. For earlier versions, I've had success by embedding additional keys in the text with some form of a custom tag. You'll need some triggers to keep the text up to date with the keys.
e.g., "This is my sample text.
TypeKey_3"
Then your where clause becomes something like:
where Contains([text], @text AND "TypeKey_" + @type)
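As a rough sketch of the tagging idea (in Python, only to show the string handling), the helpers below build the tagged text and the combined search condition. The function names are hypothetical, and whether the word breaker keeps a tag such as TypeKey_3 as a single token is an assumption you would need to check on your server.

```python
# Hypothetical helpers; the TypeKey_ prefix follows the example in the answer.
TAG_PREFIX = "TypeKey_"


def tag_text(text: str, type_id: int) -> str:
    """Append the type tag on its own line, as a trigger would have to do."""
    return f"{text}\n{TAG_PREFIX}{type_id}"


def contains_condition(term: str, type_id: int) -> str:
    """Build the search condition string to pass as the CONTAINS parameter."""
    # Assumption for simplicity: embedded double quotes are replaced by spaces
    # so the term stays a single quoted phrase.
    safe_term = term.replace('"', " ")
    return f'"{safe_term}" AND "{TAG_PREFIX}{type_id}"'


print(tag_text("This is my sample text.", 3))
print(contains_condition("sample", 3))
```

The resulting string would be passed to CONTAINS as a single variable rather than concatenated inside the predicate.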

Given that you cannot add an integer field to a full-text index, your best bet is to add a regular index on [type].

Related

Getting length of binary column in snowflake using information schema

As the title suggests, I want to determine the length that I have specified while creating the column of type BINARY in Snowflake. I tried to get this information from Information_Schema.COLUMNS view. But on inspecting the result I did not see any columns that had this information. I thought CHARACTER_OCTET_LENGTH of this view might contain this info but it does not.
I am aware that I can also use SHOW COLUMNS IN TABLE <tab_name> but for my requirement I only want to use the information_schema.
Is this information not stored in the information_schema?
I know you don't want this solution, but I will just put it here for "other people" looking for the same thing.
create table test.test.test_len(bin_10 binary(10), bin_200 binary(200) );
show columns in table test.test.test_len;
select
"column_name" as name
,parse_json("data_type"):length::number as len
from table(result_scan(last_query_id()));
NAME     LEN
BIN_10   10
BIN_200  200
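If the SHOW COLUMNS output is fetched by a client program instead, the same extraction can be done there. A minimal sketch in Python follows; the sample JSON string only illustrates the assumed shape of the "data_type" value and is not copied from a real session.

```python
import json

# Illustrative value only: assumed shape of "data_type" for a binary(10) column.
sample_data_type = (
    '{"type":"BINARY","length":10,"byteLength":10,"nullable":true,"fixed":true}'
)


def declared_length(data_type_json: str):
    """Return the declared length, or None if the type carries no length."""
    return json.loads(data_type_json).get("length")


print(declared_length(sample_data_type))
```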

Customize Normalization in SQL Server Full Text Search by replacing characters

I want to customize SQL Server FTS to handle language-specific features better.
In many languages, such as Persian and Arabic, there are similar characters that a proper search should treat as identical, like these groups:
['آ' , 'ا' , 'ء' , 'ا']
['ي' , 'ی' , 'ئ']
Currently my best solution is to store a duplicate of the data in a new column, replace these characters with a representative member of their group, normalize the search term the same way, and search in the duplicated column.
Is there any way to tell SQL Server to treat all members of each group as the same character?
As far as I understand, this would be used for suggestion purposes, so being perfectly accurate is not important.
In Farsi, none of the characters in the list above actually share the same meaning, but they do share a short form in some writing cases ('آ' != 'اِ', but both can be written as 'ا').
SCENARIO 1 : THE INPUT TEXT IS IN COMPLETE FORM
Imagine "محمّد" is a record in a table named 'table' with the format (id int, text nvarchar(12)).
After removing the special character, we can use the following command:
select * from [db].[dbo].[table] where REPLACE(text,' ّ ','') = REPLACE(N'محمد',' ّ ','');
the result would be
SCENARIO 2: THE INPUT IS IN SHORT FORMAT
Imagine "محمد" is a record in a table named 'table' with the format (id int, text nvarchar(12)).
In this scenario we need to do some processing on the text before querying the database.
For example, if "محمد" is the input and we have a list of these special characters, it can be searched for with a query such as:
select * from [db].[dbo].[table] where REPLACE(text,' ّ ','') = N'محمد';
Note:
This solution is not the best one, because the input should not have to be modified on the client side; it would be better if SQL Server could be configured to handle this.
For people who don't understand Farsi: he simply wants to tell SQL Server that A and the characters in the list ["B", "C"] have the same value, so:
when the word "dad" is searched, any words "dbd" or "dcd" that exist are returned too.
Addition:
Some sets of characters can have the same meaning and some cannot (['ي','أ'] are the same but ['آ','اِ'] are not), so for the first scenario we get:
select * from [db].[dbo].[table] where text like N'%هی[أي]ت' and text like N'هی[أي]ت%';
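The normalization step described in the question (map every member of a group to one representative, and apply the same mapping to the search term) can be sketched on the client side in Python. The choice of representatives, the inclusion of the shadda, and the function names are assumptions for illustration; this is not a SQL Server feature.

```python
# Assumed groups, taken from the question; the key is the representative.
CHAR_GROUPS = {
    "\u0627": ["\u0622", "\u0621"],  # alef <- alef with madda, hamza
    "\u06CC": ["\u064A", "\u0626"],  # Farsi yeh <- Arabic yeh, yeh with hamza
}
SHADDA = "\u0651"  # diacritic removed entirely, as in the REPLACE examples


def build_table() -> dict:
    """Build a translation table usable with str.translate."""
    table = {ord(SHADDA): None}
    for representative, variants in CHAR_GROUPS.items():
        for variant in variants:
            table[ord(variant)] = representative
    return table


_TABLE = build_table()


def normalize(text: str) -> str:
    """Apply the same mapping to stored text and to the search term."""
    return text.translate(_TABLE)
```

The normalized value would go into the duplicated column, and the search term would be passed through normalize before it is sent to CONTAINS.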

SQL query result truncated [duplicate]

How do you view ALL text from an NTEXT or NVARCHAR(max) in SQL Server Management Studio? By default, it only seems to return the first few hundred characters (255?) but sometimes I just want a quick way of viewing the whole field, without having to write a program to do it. Even SSMS 2012 still has this problem :(
I was able to get the full text (99,208 chars) out of a NVARCHAR(MAX) column by selecting (Results To Grid) just that column and then right-clicking on it and then saving the result as a CSV file. To view the result open the CSV file with a text editor (NOT Excel). Funny enough, when I tried to run the same query, but having Results to File enabled, the output was truncated using the Results to Text limit.
The work-around that @MartinSmith described as a comment to the (currently) accepted answer didn't work for me (got an error when trying to view the full XML result complaining about "The '[' character, hexadecimal value 0x5B, cannot be included in a name").
Quick trick-
SELECT CAST('<A><![CDATA[' + CAST(LogInfo as nvarchar(max)) + ']]></A>' AS xml)
FROM Logs
WHERE IDLog = 904862629
In newer versions of SSMS it can be configured in the (Query/Query Options/Results/Grid/Maximum Characters Retrieved) menu:
Old versions of SSMS
Options (Query Results/SQL Server/Results to Grid Page)
To change the options for the current queries, click Query Options on the Query menu, or right-click in the SQL Server Query window and select Query Options.
...
Maximum Characters Retrieved
Enter a number from 1 through 65535 to specify the maximum number of characters that will be displayed in each cell.
Maximum is, as you see, 64k. The default is much smaller.
BTW, Results to Text has an even more drastic limitation:
Maximum number of characters displayed in each column
This value defaults to 256. Increase this value to display larger result sets without truncation. The maximum value is 8,192.
I have written an add-in for SSMS and this problem is fixed there. You can use one of 2 ways:
you can use "Copy current cell 1:1" to copy original cell data to clipboard:
http://www.ssmsboost.com/Features/ssms-add-in-copy-results-grid-cell-contents-line-with-breaks
Or, alternatively, you can open cell contents in external text editor (notepad++ or notepad) using "Cell visualizers" feature: http://www.ssmsboost.com/Features/ssms-add-in-results-grid-visualizers
(The feature lets you open the contents of a field in any external application, so if you know it is text, you use a text editor to open it; if the contents are binary data with a picture, you select view as picture.)
Return data as XML
SELECT CONVERT(XML, [Data]) AS [Value]
FROM [dbo].[FormData]
WHERE [UID] LIKE '{my-uid}'
Make sure you set a reasonable limit in the SSMS options window, depending on the result you're expecting.
This will work if the text you're returning doesn't contain unencoded characters, such as & instead of &amp;, that will cause the XML conversion to fail.
Returning data using PowerShell
For this you will need the PowerShell SQL Server module installed on the machine on which you'll be running the command.
If you're all set up, configure and run the following script:
Invoke-Sqlcmd -Query "SELECT [Data] FROM [dbo].[FormData] WHERE [UID] LIKE '{my-uid}'" -ServerInstance "database-server-name" -Database "database-name" -Username "user" -Password "password" -MaxCharLength 10000000 | Out-File -filePath "C:\db_data.txt"
Make sure you set the -MaxCharLength parameter to a value that suits your needs.
I was successful with this method today. It's similar to the other answers in that it also converts the contents to XML, just using a different method. As I didn't see FOR XML PATH mentioned amongst the answers, I thought I'd add it for completeness:
SELECT [COL_NVARCHAR_MAX]
FROM [SOME_TABLE]
FOR XML PATH(''), ROOT('ROOT')
This will deliver a valid XML containing the contents of all rows, nested in an outer <ROOT></ROOT> element. The contents of the individual rows will each be contained within an element that, for this example, is called <COL_NVARCHAR_MAX>. The name of that can be changed using an alias via AS.
Special characters like &, < or > or similar will be converted to their respective entities. So you may have to convert &lt;, &gt; and &amp; back to their original characters, depending on what you need to do with the result.
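If the entities need to be turned back into plain characters outside SSMS, a small sketch in Python (standard library only; the function name and the sample string are made up for illustration) could look like this:

```python
from xml.sax.saxutils import unescape


def restore_text(xml_text: str) -> str:
    """Convert the entities &lt;, &gt; and &amp; back to <, > and &."""
    return unescape(xml_text)


print(restore_text("if (a &lt; b &amp;&amp; c &gt; d)"))
```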
EDIT
I just realized that CDATA can be specified using FOR XML too. I find it a bit cumbersome though. This would do it:
SELECT 1 as tag, 0 as parent, [COL_NVARCHAR_MAX] as [COL_NVARCHAR_MAX!1!!CDATA]
FROM [SOME_TABLE]
FOR XML EXPLICIT, ROOT('ROOT')
PowerShell Alternative
This is an old post and I read through the answers. Still, I found it a bit too painful to output multi-line large text fields unaltered from SSMS. I ended up writing a small C# program for my needs, but got to thinking it could probably be done using the command line. Turns out, it is fairly easy to do so with PowerShell.
Start by installing the SqlServer module from an administrative PowerShell.
Install-Module -Name SqlServer
Use Invoke-Sqlcmd to run your query:
$Rows = Invoke-Sqlcmd -Query "select BigColumn from SomeTable where Id = 123" `
-MaxCharLength 2147483647 -ConnectionString $ConnectionString
This will return an array of rows that you can output to the console as follows:
$Rows[0].BigColumn
Or output to a file as follows:
$Rows[0].BigColumn | Out-File -FilePath .\output.txt -Encoding UTF8
The result is a beautiful un-truncated text written to a file for viewing/editing. I am sure there is a similar command to save back the text to SQL Server, although that seems like a different question.
EDIT: It turns out that there was an answer by @dvlsc that described this approach as a secondary solution. I think I missed it in the first place because it was listed as a secondary answer. I am going to leave my answer, which focuses on the PowerShell approach, but wanted to at least give credit where it was due.
If you only have to view it, I've used this:
print cast(dbo.f_functiondeliveringbigformattedtext(seed) as text)
The end result is that I get line feeds and all the content in the messages window of SSMS.
Of course, it only allows for a single cell - if you want to do a single cell from a number of rows, you could do this:
declare @T varchar(max)=''
select @T=@T
+ isnull(dbo.f_functiondeliveringbigformattedtext(x.a),'NOTHINGFOUND!')
+ replicate(char(13),4)
from x -- table containing multiple rows and a value in column a
print @T
I use this to validate JSON strings generated by SQL code. Too hard to read otherwise!
Use Visual Studio Code with the SQL Server plugin. Super useful for JSON.
Alternative 1: Right Click to copy cell and Paste into Text Editor (hopefully with utf-8 support)
Alternative 2: Right click and export to CSV File
Alternative 3: Use SUBSTRING function to visualize parts of the column. Example:
SELECT SUBSTRING(fileXml,2200,200) FROM mytable WHERE id=123456
The easiest way to quickly view large varchar/text column:
declare @t varchar(max)
select @t = long_column from table
print @t

Microsoft word Database quick part - How to use a mergefield as a filter for the database query

I am using mail merge to input data from an Excel sheet. Everything works great and I can access my variables using «MyMergefield».
Now I need for each letter generated to look into another excel file and do a query that will take the «MyMergefield» as a query filter SELECT FROM x WHERE field1 = «MyMergefield»
The way I am proceeding is "inserting a quick part" => "Field" in my word document.
In the quickpart dialog, I choose "DataBase", then I choose my excel file.
Once the data source is chosen, there is an option to change the request parameters. I click on it and get the filter configuration popup, where I can choose the field (from the Excel sheet) and the operator ("equals" in this case). Then there's the "compare with" field. In my case it's not as simple as comparing to a string; it's comparing to a mail merge field.
I tried the following syntax:
«Myfield»
MERGEFIELD Myfield
MERGEFIELD "Myfield"
{MergeField Myfield}
{ MERGEFIELD Myfield}
None worked; it complained that it did not find any match, so it did not insert the database. (Of course it will not find any match for the syntax if I don't run the mail merge.)
I did look directly in the OpenXML file of an existing example (because I can't edit an existing quick part; correct me if I'm wrong) and the database query looked like:
FROM `Candidates$` WHERE ((`column` = '</w:instrText>
...
<w:instrText xml:space="preserve"> MERGEFIELD Myfield</w:instrText>
</w:r>
Any ideas? Thank you!

Hacked SQL Server database need regex

A database that a client of mine has was hacked. I am in the process of trying to rebuild the data. The site is running classic ASP with a SQL Server database. I believe I have found where the weak point was for the hackers and removed that entry point for now.
Every text column in the database was appended with some HTML markup and inline script/JS tags.
Here is an example of a field:
all</title><script>
document.write("<style>.aq21{position:absolute;clip:rect(436px,auto,auto,436px);}</style>");
</script>
<div class=aq21>
<a href=http://samedaypaydayloansonlineelqmt.com >same day payday loans online</a>
<a href=http://samedaypaydayloan
This example was in the Users table in the UserRights column. The initial value was all, but then you can see the links that were appended.
I need to write a regex script that will search through all fields in each column of each table in the database and remove this extra markup.
Essentially, if I try to match </title>, then that string and everything that follows it can be replaced with a blank string.
All of these appended strings are the same for each field in the same column. However, there are multiple columns in each table.
This is what I have been doing so far, replacing the hacked part, but a nice regex would probably help me out, though my regex skills.... well suck.
UPDATE [databasename].[db].[databasetable]
set
UserRights = replace(UserRights,'</title><script>document.write("<style>.aq21{position:absolute;clip:rect(436px,auto,auto,436px);}</style>");</script><div class=aq21><a href=http://samedaypaydayloansonlineelqmt.com >same day payday loans online</a><a href=http://samedaypaydayloan','');
Any regex help and/or tips are appreciated.
This is what I ended up doing (big thanks to @Bohemian):
I went through each table and checked which column was affected. Then I ran the following script on each column:
UPDATE [tablename]
set columnname = substring(columnname, 1, charindex('</', columnname)-1)
where columnname like '%</%';
If the column had any markup in it, then I ended up updating those records manually (lucky for me, there were only a couple of records).
If anyone has any better solutions, please feel free to comment.
Thanks!
Since the bad stuff starts with a <, and that is an unusual character to typically find, I would use normal text functions, something like this:
update mytable set
mycol = substring(mycol, 1, charindex('<', mycol) - 1)
where mycol like '%<%';
And methodically do this with every column of every table.
Note that I'm only guessing at the right function to use, since I'm unfamiliar with SQL Server, but you get the idea.
I welcome someone editing the SQL to improve it.
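For anyone who would rather clean exported values outside the database, the same truncate-at-the-marker idea can be sketched in Python. The function name and the default marker are assumptions for illustration; as with the SQL above, values that legitimately contain markup would still need manual review.

```python
def strip_injected(value: str, marker: str = "</title>") -> str:
    """Return the part of value before the first occurrence of marker.

    Values that do not contain the marker are returned unchanged.
    """
    index = value.find(marker)
    return value if index == -1 else value[:index]


hacked = 'all</title><script>document.write("...");</script><div class=aq21>'
print(strip_injected(hacked))
```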

Resources