I have a SQL Server table that stores data from all over the world, but I want to retrieve only the rows containing Russian characters.
I tried the query below, but it returns all non-English data:
select * from tablename where column like '%[^-A-Za-z0-9 /.+$]%'
Is there a way to get only Russian characters?
Thanks in advance.
I would suggest checking a single character of the string (the first one, for example) to see whether its code lies between the codes of the first and last letters of the alphabet.
For example:
select *
from tablename
where unicode(substring(column, 1, 1)) between unicode(N'А') and unicode(N'я')
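Of course, this approach does not give you "all Russian characters", but it does return all rows where the text is written in Russian (more precisely, where the first character is a Russian letter). I suppose that is what you're really asking for :) Note that the range А to я (U+0410 to U+044F) does not include Ё and ё (U+0401 and U+0451), so test for those separately if you need them.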
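The range check above can be tried out without a SQL Server instance. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module, where SQLite's unicode() and substr() stand in for SQL Server's UNICODE() and SUBSTRING(), and the table name t, the column name txt, and the helper rows_starting_with_cyrillic are made up for illustration.

```python
import sqlite3

def rows_starting_with_cyrillic(texts):
    """Return the texts whose first character lies in the range А..я (U+0410..U+044F)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (txt TEXT)")  # hypothetical table and column
    conn.executemany("INSERT INTO t (txt) VALUES (?)", [(s,) for s in texts])
    rows = conn.execute(
        "SELECT txt FROM t "
        "WHERE unicode(substr(txt, 1, 1)) BETWEEN unicode('А') AND unicode('я') "
        "ORDER BY rowid"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]

sample = ["Москва", "London", "яблоко", "Ёлка"]
print(rows_starting_with_cyrillic(sample))
```

With this sample, "Ёлка" is not returned, because Ё (U+0401) sits outside the А..я range.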
I want to customize SQL Server FTS to handle language-specific features better.
In many languages, such as Persian and Arabic, there are similar characters that a proper search should treat as identical, like these groups:
['آ' , 'ا' , 'ء' , 'ا']
['ي' , 'ی' , 'ئ']
Currently my best solution is to store a duplicate of the data in a new column in which these characters are replaced with a representative member, normalize the search term in the same way, and perform the search on the duplicated column.
Is there any way to tell SQL Server to treat all members of these groups as an identical character?
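The workaround described in the question (a normalized copy of the column plus a normalized search term) could be sketched on the application side as follows. This is an illustration, not a SQL Server feature: the names EQUIVALENT_GROUPS, build_table and normalize are hypothetical, the groups are the ones listed in the question, and the choice of the first member as representative is an assumption.

```python
# Groups of characters to treat as identical (from the question above).
# Assumption: the first member of each group is the representative.
EQUIVALENT_GROUPS = [
    ["\u0627", "\u0622", "\u0621"],  # ا  آ  ء
    ["\u06CC", "\u064A", "\u0626"],  # ی  ي  ئ
]

def build_table(groups):
    """Build a str.translate table mapping every group member to its representative."""
    mapping = {}
    for group in groups:
        representative = group[0]
        for ch in group[1:]:
            mapping[ord(ch)] = representative
    return mapping

_TABLE = build_table(EQUIVALENT_GROUPS)

def normalize(text):
    """Replace every member of an equivalence group with the group's representative."""
    return text.translate(_TABLE)

# Arabic yeh (U+064A) and Farsi yeh (U+06CC) normalize to the same string.
print(normalize("\u0639\u0644\u064A") == normalize("\u0639\u0644\u06CC"))
```

The same function would be applied both when filling the duplicated column and to each search term before it is sent to the database.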
As far as I understand, this would be used for search suggestions, so perfect accuracy is not important.
In Farsi, none of the characters in the lists above actually share the same meaning, but some of them share a short form in certain writing cases ('آ' != 'اِ', but both can be written as 'ا').
SCENARIO 1: THE INPUT TEXT IS IN COMPLETE FORM
Imagine "محمّد" is a record in a table named 'table' with the format (id int, text nvarchar(12)).
After removing the special character from both sides, we can use the following command:
select * from [db].[dbo].[table] where REPLACE(text, N'ّ', N'') = REPLACE(N'محمد', N'ّ', N'');
The result would be the row containing "محمّد".
SCENARIO 2: THE INPUT IS IN SHORT FORM
Imagine "محمد" is a record in a table named 'table' with the format (id int, text nvarchar(12)).
In this scenario we need to do some processing on the text before we query the database.
For example, if "محمد" is the input and we have a list of the special characters, it can easily be searched for with a query such as:
select * from [db].[dbo].[table] where REPLACE(text, N'ّ', N'') = N'محمد';
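If the processing is done before the query is sent, stripping the shadda (and other combining marks) from the input could look like the sketch below. It assumes that dropping every character in Unicode category Mn is acceptable for the search, which removes all Arabic diacritics and not only the shadda; strip_marks is a hypothetical helper name.

```python
import unicodedata

def strip_marks(text):
    """Drop nonspacing combining marks (category Mn), e.g. the shadda U+0651."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

with_shadda = "\u0645\u062D\u0645\u0651\u062F"   # محمّد
without_shadda = "\u0645\u062D\u0645\u062F"      # محمد

print(strip_marks(with_shadda) == without_shadda)
```

The stripped string can then be passed as a parameter to a query like the one above.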
Note:
This solution is not ideal, because the input should not have to be altered on the client side; it would be better if SQL Server could be configured to handle this.
For people who don't understand Farsi: the asker simply wants to tell SQL Server that A = ["B", "C"], i.e. that A has the same value as the characters in the list, so:
when the word "dad" is searched, any existing word "dbd" or "dcd" is returned too.
Addition:
Some sets of characters have the same meaning and some do not (['ي','أ'] are the same but ['آ','اِ'] are not), so for the first scenario we get:
select * from [db].[dbo].[table] where text like N'%هی[أي]ت' and text like N'هی[أي]ت%';
There is a database with a table that contains many values in Arabic. When I use LIKE, I get the wrong result.
The code I used:
Select *
From customers
Where cusname like '%جديد%'
Please note that the data does contain these values.
Use the Unicode prefix (N):
Select * from customers where cusname like N'%جديد%'
I have created a database called AllWords.db in SQLite that contains a list of all English words (count: 172820). When I issue a select-all query, it returns all 172820 words. Also, when I print the row count of the table words like this:
SELECT COUNT(*) FROM words;
the output is 172820, so the database clearly contains all the words. However, when I try to check whether a word exists (the only thing I'll want to do with this database), it doesn't print anything:
SELECT * FROM words WHERE word="stuff";
returns nothing.
The database is a single table with the only column being 'words', which has all the words as rows. Any help would be greatly appreciated, thanks.
To make sure you are searching for a word that is actually in your database, look at a few rows of the table first:
select * from words limit 10
house
stuff
tree
...
and then select using one of the words you see:
select * from words where word = 'stuff'
Edit: fixed where clause according to #MichaelEakins
Edit2: Unfortunately there's no difference between single and double quotes in this case, see SQL Fiddle
Answering my own question because I figured out what was wrong. To populate the table, I had written a Python program to parse a file called words.txt (all words, separated by newlines) into SQLite. My problem was that the query turned into:
INSERT INTO WORDS VALUES('englishWord\n')
And that messed up the database. I fixed that and it started to work, thanks to #ScoPi for the hint with using LIKE, it helped me figure out that there was a stray newline character.
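The effect of the stray newline can be reproduced with a small sketch. The original loader script is not shown, so everything here is an assumption for illustration: an in-memory SQLite database, a table words with a single column word, and the hypothetical helpers load_words and word_exists.

```python
import sqlite3

def load_words(lines, strip):
    """Create an in-memory words table from raw lines, optionally stripping whitespace."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (word TEXT)")
    values = [(line.strip() if strip else line,) for line in lines]
    conn.executemany("INSERT INTO words (word) VALUES (?)", values)
    return conn

def word_exists(conn, word):
    """Exact-match lookup, equivalent to SELECT * FROM words WHERE word = 'stuff'."""
    row = conn.execute("SELECT 1 FROM words WHERE word = ?", (word,)).fetchone()
    return row is not None

# Iterating over a text file yields lines that still end in "\n".
raw_lines = ["house\n", "stuff\n", "tree\n"]

broken = load_words(raw_lines, strip=False)
fixed = load_words(raw_lines, strip=True)

print(word_exists(broken, "stuff"))  # the stored value is 'stuff\n', so no match
print(word_exists(fixed, "stuff"))   # the stored value is 'stuff', so it matches

# repr() makes the hidden newline visible when inspecting stored values.
print([repr(w) for (w,) in broken.execute("SELECT word FROM words LIMIT 3")])
```

Calling strip() on each line before inserting, together with a parameterized INSERT, avoids the problem.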
Dear Friends,
I've run into a problem I never expected. It seems very simple, but I can't find a solution.
I have a SQL Server database column of type NVARCHAR that is filled with standard Persian characters. When I run a very simple query on it that uses the LIKE operator, the result set is empty, although I know the search term is present in the table. Here is a very simple example query that doesn't behave correctly:
SELECT * FROM T_Contacts WHERE C_ContactName LIKE '%ف%'
ف is a Persian character, and the C_ContactName column contains multiple entries with that character.
Please tell me how I should rewrite the expression or what I should change. Note that my database's collation is SQL_Latin1_General_CP1_CI_AS.
Thank you very much
Also, if those values are stored as NVARCHAR (which I hope they are!!), you should always use the N'..' prefix for any string literals to make sure you don't get any unwanted conversions back to non-Unicode VARCHAR.
So you should be searching:
SELECT * FROM T_Contacts
WHERE C_ContactName COLLATE Persian_100_CI_AS LIKE N'%ف%'
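To see why the N prefix matters, the unwanted conversion can be imitated outside the database. This is only a rough analogy and not SQL Server's actual code path: it assumes the literal is converted to Windows code page 1252 (the code page used by SQL_Latin1_General_CP1_CI_AS) and that unmappable characters become '?'; simulate_varchar_literal is a hypothetical helper.

```python
def simulate_varchar_literal(text, code_page="cp1252"):
    """Round-trip text through a single-byte code page; unmappable characters become '?'."""
    return text.encode(code_page, errors="replace").decode(code_page)

pattern = "%\u0641%"  # the LIKE pattern %ف%
print(simulate_varchar_literal(pattern))
```

Under these assumptions the pattern degrades to '%?%', which no longer contains the Persian letter, so the search cannot match it.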
Shouldn't it be:
SELECT * FROM T_Contacts WHERE C_ContactName LIKE N'%ف%'
ie, with the N in front of the comparing string, so it treats it like an nvarchar?
I am working with SQL Server 2008. My task is to investigate the issue where FTS cannot find the right result for Thai.
First, I have the table which enables the FTS on the column 'ItemName' which is nvarchar. The Catalog is created with the Thai Language. Note that the Thai language is one of the languages that doesn't separate the word by spaces, so 'หลวง' 'พ่อ' 'โสธร' are written like this in a sentence: 'หลวงพ่อโสธร'
In the table, there are many rows that include the word (โสธร); for example row#1 (ItemName: 'หลวงพ่อโสธร')
On the webpage, I try to search for 'โสธร' but SQL Server cannot find it.
So I try to investigate it by trying the following query in SQL Server:
select * from sys.dm_fts_parser(N'"หลวงพ่อโสธร"', 1054, 0, 0)
...to see how the words are broken. The first parameter is the text to be broken. The second parameter specifies that we're using Thai (word breaker and so on). Here is the result:
row#1 (display_item: 'ງลวง', source_item: 'หลวงพ่อโสธร')
row#2 (display_item: 'พຝโส', source_item: 'หลวงพ่อโสธร')
row#3 (display_item: 'ธร', source_item: 'หลวงพ่อโสธร')
Notice that the first and second rows show the wrong display_item: 'ງ' in 'ງลวง' isn't even a Thai character, and 'ຝ' in 'พຝโส' is not a Thai character either.
So the question is: where did those alien characters come from? I guess this is why I cannot search for 'โสธร': the word breaker is broken and is storing the wrong characters in the index.
Please help!
This is probably because a different Thai dialect was selected when the index was built.
Check the FTS properties to see which language/culture is selected.