SQL Server 2008 Full Text Search results - sql-server

I have a small problem with SQL Server Full-Text Search.
Let me explain:
I have a table with a BLOB column that holds a PDF file.
I've created the full-text index on that table as documented.
I've installed the PDF iFilter from Adobe.
However, when I put some files in my table and run a search like:
SELECT *
FROM MyTable
WHERE FREETEXT(*, N'thank');
It only returns the columns from my table (which, to be fair, is what I asked for).
What I actually want is the sentence in which the word 'thank' was found.
Is there any way to do this?
I've been fighting with this issue for almost two days...

Do you have any evidence that the PDF iFilter is working from within SQL Server at all?
As a test, put an MS Word 2003 document in there and see whether it gets indexed properly.
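One way to gather that evidence is sketched below. It assumes SQL Server 2008 or later, that the queries run in the database containing MyTable, and that the documents are stored with a ".pdf" type; none of that is confirmed by the question.

```sql
-- Is a filter registered for the .pdf extension in this instance?
SELECT document_type, path
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';

-- Did the crawl extract any terms from the stored documents?
-- DB_ID() with no argument means the current database.
SELECT TOP (20) display_term, document_count
FROM sys.dm_fts_index_keywords(DB_ID(), OBJECT_ID('MyTable'))
ORDER BY document_count DESC;
```

If the first query returns no row, the filter is not visible to SQL Server. If the second returns few or no recognisable words, the PDFs were probably not parsed during indexing.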

Related

Is there a way to read the blob partially?

I want to read a blob partially. For now I can read the whole blob with a select query, e.g.
select image from table where condition
But the blob is too big, so I want to read only part of it, given a start position and a length (e.g. select partial_read( image, 0, 1024 ) from table where condition).
I've tried a few approaches without success. The simple solution would be to split the blob, but I can't: it is a legacy system and I'm not allowed to change it.
Can you tell me the keyword to search for, or a way to do this?
I am using
SQL Server: 2008 R2
OS: Server 2012
Using SUBSTRING, as AlwaysLearning suggests, is perfect for my task (identifying PDF data falsely stored as TIF in blobs). It saves a ton of time.
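A minimal sketch of the SUBSTRING approach follows. The names [table] and [image] are the question's placeholders (bracketed because TABLE is a reserved word), and the id column is hypothetical, standing in for the question's "condition".

```sql
-- Read only the first 1024 bytes of the blob.
-- SUBSTRING positions are 1-based, so the first byte is position 1, not 0.
SELECT SUBSTRING([image], 1, 1024) AS first_chunk
FROM [table]
WHERE id = 42;  -- hypothetical filter

-- Find blobs that start with the PDF signature "%PDF" (bytes 25 50 44 46).
SELECT id
FROM [table]
WHERE SUBSTRING([image], 1, 4) = 0x25504446;
```

SUBSTRING accepts binary, varbinary and the legacy image type, and returns varbinary in each case, so the signature comparison works without pulling the full blob to the client.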

Migrate Excel Data to SQL

I have a large table of information (around 11,000 rows, 4 columns) in Excel that uses a macro, and I need to import it into a SQL Server database (I work with it through Microsoft SQL Server Management Studio). Another server will then use that database to get the new information.
Example:
If I type into SQL:
Insert Into ENT_LINK_OBJECTS (OBJ_NAME, ENTITY_KEY, IDENTITY_KEY)
Select 'TDS-C1487-81236', ITEM_KEY, 1
From ENT_ITEM_MASTER As M
Where M.ITEM_CODE = 'TL-123'
or M.ITEM_CODE = 'TL-456'
I can then open the program that holds all this information, called Matrix. It prompts me to enter an item key and/or code and/or type etc. (with all possible files listed below) and hit search (image 1). If I type TL-123 into the item code field (image 2), it narrows the files down to those containing TL-123 (image 3). When I double-click one, I can choose from many tabs, one of which is "Links". In that tab, under document name, is the entry TDS-C1487-81236 (image 4). How would I go about making that happen?
(1)
(2)
Then hit ENTER
(3)
(4)
The website below gives a good explanation of what I am getting at, but I do not know how to implement it. What would be the most efficient way to migrate the data from my Excel document to the SQL server?
http://sqlmag.com/business-intelligence/excel-macro-creates-insert-statements-easy-data-migration
Have you tried DTSWizard? It's a GUI-based tool for this and should ship with MS SQL Server.
Create a linked server, or use a statement like OPENROWSET, to access the Excel sheet. That would be the easiest and fastest way to access an Excel sheet from SQL.
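As a rough sketch of the OPENROWSET route, the statement below reuses the question's INSERT and joins the sheet to ENT_ITEM_MASTER. The file path, the sheet name Sheet1, and the column headers ItemCode and DocName are all hypothetical, and it assumes the Microsoft ACE OLE DB provider is installed on the server.

```sql
-- Ad hoc OPENROWSET access is disabled by default; enabling it needs sysadmin rights.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;

-- One link row per spreadsheet row whose item code exists in ENT_ITEM_MASTER.
INSERT INTO ENT_LINK_OBJECTS (OBJ_NAME, ENTITY_KEY, IDENTITY_KEY)
SELECT X.DocName, M.ITEM_KEY, 1
FROM OPENROWSET(
         'Microsoft.ACE.OLEDB.12.0',
         'Excel 12.0;Database=C:\Import\links.xlsx;HDR=YES',  -- hypothetical path
         'SELECT ItemCode, DocName FROM [Sheet1$]'            -- hypothetical sheet and headers
     ) AS X
JOIN ENT_ITEM_MASTER AS M
    ON M.ITEM_CODE = X.ItemCode;
```

The path is resolved on the machine running SQL Server, not on the client, so the workbook has to be readable by the SQL Server service account.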

SQL Server 2012 Full Text Search on RTF

My database runs on SQL Server 2012. One column of my table contains RTF text. The data type of the column is nvarchar(MAX).
I want to set up a full-text search for this column that parses the RTF and searches only the actual text, so that I don't get RTF tags as results.
As I understand it, RTF parsing should already be part of SQL Server, but I can't get it working :-(
I did the following:
Created a full-text catalog
Selected the column containing the RTF and added a full-text index
But I still get wrong results:
SELECT * FROM myTable WHERE
CONTAINS(myRtfColumn,'rtf')
--> still returns all rows, as 'rtf' is a keyword
Any ideas what I'm doing wrong? Do I have to activate RTF search for my SQL Server, or something similar?
On a character column, full-text search indexes the stored value as plain text. RTF is a document format with its own markup, but by choosing nvarchar you told SQL Server to treat all of it as text. For document formats, use varbinary(max) instead.
The problem will still remain at that point, because the indexing routines don't know how to interpret rich text: which parts are control words and which are content.
So let us talk about the interpreter (the filter).
The documentation says:
https://technet.microsoft.com/en-us/en-en/library/ms142531(v=SQL.105).aspx
varbinary(max) or varbinary data
A single varbinary(max) or varbinary column can store many types of documents. SQL Server 2008 supports any document type for which a filter is installed and available in the operating system. The document type of each document is identified by the file extension of the document. For example, for a .doc file extension, full-text search uses the filter that supports Microsoft Word documents. For a list of available document types, query the sys.fulltext_document_types catalog view.
Note that the Full-Text Engine can leverage existing filters that are installed in the operating system. Before you can use operating-system filters, word breakers, and stemmers, you must load them in the server instance, as follows:
Finally, what to do:
Check whether ".rtf" is available as a filter:
EXEC sp_help_fulltext_system_components 'filter';
Then add a computed column "Typ" to your table that always returns ".rtf":
alter table yourname add [Typ] AS (CONVERT([nvarchar](8),'.rtf',0));
This column can now be used as the type column when you define the index.
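Put together, the steps might look like the sketch below. It is an illustration under assumptions, not a tested recipe: myTable and myRtfColumn come from the question, myRtfBinary is a hypothetical new column, and it assumes the table already has the full-text index the question describes. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- 1. Is a filter registered for .rtf in this instance?
SELECT document_type, path
FROM sys.fulltext_document_types
WHERE document_type = '.rtf';
GO

-- 2. Hypothetical binary copy of the RTF. Converting through varchar first
--    assumes the RTF is plain 7-bit text, so the filter sees single-byte data.
ALTER TABLE myTable ADD myRtfBinary varbinary(max) NULL;
GO
UPDATE myTable
SET myRtfBinary = CONVERT(varbinary(max), CONVERT(varchar(max), myRtfColumn));
GO

-- 3. The type column suggested in the answer: always '.rtf'.
ALTER TABLE myTable ADD [Typ] AS (CONVERT([nvarchar](8), '.rtf', 0));
GO

-- 4. Index the binary column through the filter, then stop indexing the raw text.
ALTER FULLTEXT INDEX ON myTable ADD (myRtfBinary TYPE COLUMN [Typ]);
ALTER FULLTEXT INDEX ON myTable DROP (myRtfColumn);
```

After the index repopulates, a query such as CONTAINS(myRtfBinary, 'rtf') should match only documents whose visible text contains that word, provided the filter is loaded.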

SQL Full Text Indexing, ASCII control characters

I am using SQL Server 2008 R2 full-text indexing. I noticed that some expected rows are missing from my search results. On further investigation I found that the affected data contains ASCII control characters (http://www.theasciicode.com.ar/ascii-control-characters/escape-ascii-code-27.html). My table has a simple flat structure, and if a row contains one of those characters, it does not appear in the results.
As soon as I replace the character in the data, the row appears. I am using CONTAINS in the query.
I could not find a link that confirms this behaviour. I can remove those characters from the database, but it would be nice to have confirmation and to understand the reason. Any help would be appreciated.
I think I figured out the issue. On investigating the full-text crawl log I found that the database size limit had been reached (it is Express edition). After some clean-up, all the rows are returned properly.
The link that helped me in troubleshooting: http://technet.microsoft.com/en-us/library/ms142495(v=sql.105).aspx
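Besides reading the crawl log, a quick check along these lines can show the same kind of problem from inside the database. This is only a sketch: dbo.MyTable is a hypothetical name for the indexed table.

```sql
-- Current database size, to compare against the edition's size limit.
EXEC sp_spaceused;

-- Full-text indexing counters for one table (dbo.MyTable is hypothetical).
SELECT
    OBJECTPROPERTYEX(OBJECT_ID('dbo.MyTable'), 'TableFulltextDocsProcessed')  AS docs_processed,
    OBJECTPROPERTYEX(OBJECT_ID('dbo.MyTable'), 'TableFulltextFailCount')      AS rows_failed,
    OBJECTPROPERTYEX(OBJECT_ID('dbo.MyTable'), 'TableFulltextPendingChanges') AS pending_changes;
```

A non-zero failure count, or pending changes that never drain, points at the indexing process rather than at the content of individual rows.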

Is there any way to access inverted index on sql server full text search

I would like to get the "content" of the full-text search index, as described in http://en.wikipedia.org/wiki/Inverted_index and http://en.wikipedia.org/wiki/Microsoft_SQL_Server#Full_Text_Search_Service. By content I mean each word and its occurrences.
This question is related to my previous, unanswered question: Dynamic tags generation from sql server database using full text search
Is it possible?
In SQL Server 2008 you can query the sys.dm_fts_index_keywords and sys.dm_fts_index_keywords_by_document table-valued functions to get this information.
For previous versions I think this is much less easily accessible (if at all).
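For illustration, the two functions can be queried as below. dbo.MyTable is a hypothetical full-text indexed table, and the queries are assumed to run in the database that contains it.

```sql
-- Each indexed term and the number of documents (rows) it appears in.
SELECT display_term, column_id, document_count
FROM sys.dm_fts_index_keywords(DB_ID(), OBJECT_ID('dbo.MyTable'))
ORDER BY document_count DESC;

-- Each term per document, with how often it occurs in that document.
-- document_id is the value of the full-text key column for the row.
SELECT display_term, document_id, occurrence_count
FROM sys.dm_fts_index_keywords_by_document(DB_ID(), OBJECT_ID('dbo.MyTable'))
ORDER BY display_term, document_id;
```

The first result is close to the term list of an inverted index; the second adds the per-document postings with occurrence counts.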
