Semantic Search without using FileTable in SQL Server 2012 - sql-server

We have a file upload system and would like to use the new semantic search feature in SQL Server 2012. Is that possible without using FileTables?
This is our schema:

I think there are two questions here.
Can you use Semantic Search without using filetable?
Yes, you can. It can be used on any table with Full-Text indexing turned on.
Here is the list of prerequisites: link.
Basically, you can use it on any data that is loaded into the database.
The second question is whether your schema benefits from Semantic Search, and to what extent.
Looking at your schema, I understand that your database holds only paths to the documents and their "descriptions". Therefore, you can enable Semantic Search on the columns in your database. It will allow you to use Semantic Search on FileName and Description, but not on the documents' contents.
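For illustration, here is a minimal sketch of enabling it, assuming a hypothetical dbo.Documents table with a unique key index PK_Documents, and assuming the semantic language statistics database from the prerequisites above is installed:

    -- Minimal sketch: full-text + semantic indexing on ordinary columns of a
    -- hypothetical dbo.Documents table (unique key index PK_Documents assumed).
    CREATE FULLTEXT CATALOG DocumentsCatalog;

    CREATE FULLTEXT INDEX ON dbo.Documents
    (
        FileName    LANGUAGE 1033 STATISTICAL_SEMANTICS,
        Description LANGUAGE 1033 STATISTICAL_SEMANTICS
    )
    KEY INDEX PK_Documents
    ON DocumentsCatalog;

    -- Example: top key phrases the semantic index extracted from Description.
    SELECT TOP (10) keyphrase, score
    FROM SEMANTICKEYPHRASETABLE(dbo.Documents, Description)
    ORDER BY score DESC;

The STATISTICAL_SEMANTICS option is what adds the semantic index on top of the regular full-text index.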
In order to use Semantic Search on the contents of these documents, you'll need to store the documents themselves in the SQL database. The FileTable structure helps with this task, although you can choose another way of storing whole documents in your database.

Related

How to exclude items in Visual studio SQL schema compare based on string pattern or alternative

We have a multi-tenant system where each tenant has their own database. Tenants also have the option to create their own data structures, each of which becomes its own table in the database.
This causes an issue: when we run the Visual Studio schema compare, it always flags these tables as differences and we have to unselect them. This becomes a big issue because the schema compare has major performance problems when unselecting multiple differences.
These user-defined tables all follow a certain naming pattern, e.g. UserTable1, UserTable2, so what we really need is a way to perform the schema comparison while ignoring tables whose names contain a substring; in this example it would be UserTable. Is this possible, or is there a suitable alternative to the Visual Studio comparison tool?
For those coming here from Google looking for a solution to this.
All you have to do is right click on the section and ta-da, you can Include or Exclude all objects depending on the existing state of the objects.
In this case, section means the Delete, Change, and Add parent folders in the schema compare window.

Retrieving text from FileTable SQL Server

Is it possible to retrieve the actual text from a FileTable in SQL Server 2014?
I want to implement some hit-highlighting functionality, but in order to do so, I need to retrieve the actual text in the file I indexed, since the content is in a varbinary column.
If it's not possible, I suppose the only alternative is to forget about FileTables and implement an application-side "document reader", so that I'll have real text inside my "file_stream" column instead of the varbinary. Or maybe even define a UDF that uses iFilters behind some C# code, right?
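(For plain-text files I could of course just cast the stream directly; a rough sketch, assuming a hypothetical FileTable dbo.DocumentStore, is below. That obviously doesn't help for Word or PDF content.)

    -- Rough sketch: for plain-text (.txt) files in a hypothetical FileTable
    -- dbo.DocumentStore, the varbinary stream can simply be cast (use
    -- nvarchar(max) instead if the files are UTF-16). Binary formats still
    -- need an iFilter or an application-side reader.
    SELECT name,
           CAST(file_stream AS varchar(max)) AS file_text
    FROM dbo.DocumentStore
    WHERE name LIKE N'%.txt';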
Please, any advice would be really useful.
Before you start with your own implementation, take a look at this very similar question:
SQL Server 2012 FTS with Hit-Highlighting?
Also this blog entry from 2012 is still current:
Hit-Highlighting in Full-Text Search
I would take a look at the mentioned HitHighlight function (which is actually a commercial product, ThinkHighlight). Most likely it's not worth the effort to build your own solution. But if you do so - tell me ;)
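For what it's worth, the core idea of hit-highlighting can be sketched in plain T-SQL; the snippet below is only an illustration and is not the HitHighlight/ThinkHighlight implementation from the links above.

    -- Crude illustration only: wrap the first occurrence of the search term
    -- in <b> tags. A real solution handles multiple hits, word forms and
    -- fragment extraction, which is what the linked products do.
    DECLARE @text nvarchar(max) = N'Full-text search returns whole rows, not fragments.',
            @term nvarchar(100) = N'search';

    SELECT CASE WHEN CHARINDEX(@term, @text) > 0
                THEN STUFF(@text, CHARINDEX(@term, @text), LEN(@term),
                           N'<b>' + @term + N'</b>')
                ELSE @text
           END AS highlighted;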

Quick way to perform a fulltext-search on MS SQL Server

First of all: I don't need a full-text search engine, and I don't need full-text search in my code. I have a database with ~2000 tables, and I need to find the table and column in which certain information is stored, for development purposes. Is there any quick way (maybe an SQL Server Management Studio trick that I should know of) to do this? I think phpMyAdmin provides such a feature for MySQL databases. At the moment I'm seriously thinking of dumping the database to an .sql file and using a text editor to search for the phrases I'm looking for.
Check INFORMATION_SCHEMA. You can select from it; the COLUMNS view contains all the table and column names, so you can search on that instead of the data itself.
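A minimal sketch of that (the search term 'invoice' is just a placeholder):

    -- Search the metadata rather than the data: list columns whose names
    -- suggest they hold the information you're after.
    SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE COLUMN_NAME LIKE N'%invoice%'
    ORDER BY TABLE_SCHEMA, TABLE_NAME;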
I don't see a way to do it without dynamic SQL: get the list of all tables and their columns from sys.tables and sys.columns (don't forget to add the proper schema if you're using them), construct a query per column that checks for the values you're trying to find and stores the table and column name in a temporary table, put all the queries into a (temp) table, and finally cursor/loop over that table executing all the queries. A sketch of this is shown below.
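A minimal sketch of that approach, built as one generated batch instead of a cursor loop and limited to string columns (@SearchValue is a placeholder):

    DECLARE @SearchValue nvarchar(100) = N'phrase to find';
    DECLARE @pattern nvarchar(200) = N'%' + @SearchValue + N'%';
    DECLARE @sql nvarchar(max) = N'';

    CREATE TABLE #Hits (SchemaName sysname, TableName sysname, ColumnName sysname);

    -- Build one IF EXISTS probe per string column of every user table.
    SELECT @sql = @sql + N'
    IF EXISTS (SELECT 1 FROM ' + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name)
             + N' WHERE ' + QUOTENAME(c.name) + N' LIKE @v)
        INSERT #Hits VALUES (N''' + s.name + N''', N''' + t.name + N''', N''' + c.name + N''');'
    FROM sys.tables  AS t
    JOIN sys.schemas AS s  ON s.schema_id    = t.schema_id
    JOIN sys.columns AS c  ON c.object_id    = t.object_id
    JOIN sys.types   AS ty ON ty.user_type_id = c.user_type_id
    WHERE ty.name IN (N'char', N'nchar', N'varchar', N'nvarchar');

    EXEC sp_executesql @sql, N'@v nvarchar(200)', @v = @pattern;

    SELECT * FROM #Hits ORDER BY SchemaName, TableName, ColumnName;
    DROP TABLE #Hits;

On ~2000 tables this will be slow (it scans everything), but for a one-off developer search it is usually acceptable.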
PS. Your idea of dumping everything into an *.sql file should work as well; it depends on the volume of data.

DB technology for efficient search in tabular data?

We have a repository of around 200 tables; each table can have thousands of rows, and all tables originally come from Excel sheets.
Each table has a different scheme. All data is text or numbers.
We would like to create an application that allows free text search on all tables (we define which columns will be searched in each table) efficiently - speed is important.
The main dilemma is which DB technology we should choose.
We created a mock up by importing all tables to MS SQL Server, and creating a full text index over them. The search is done using the CONTAINS keyword. This solution works well for a small number of tables, but it doesn't scale.
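For reference, the mock-up queries look roughly like the sketch below; the table and column names are placeholders, our real schema differs.

    -- Rough sketch of the mock-up query shape: a full-text indexed column
    -- searched with CONTAINS (prefix search on a placeholder term).
    SELECT *
    FROM dbo.Sheet1
    WHERE CONTAINS([Description], N'"steel*"');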
We thought about a NoSQL solution, but we don't yet have any experience in it.
Our limitations (which unfortunately I cannot affect): Windows servers only. But we can install whatever we want on them.
Thank you.
Check out ElasticSearch! It's a search server based on Apache Lucene and has a clean REST- and JSON-based API. Although it's usually used as a search index alongside a primary database, it can also be used stand-alone. So you may want to write an export routine for a few of your tables and try it out.
http://www.elasticsearch.org/
http://en.wikipedia.org/wiki/ElasticSearch
Comparison of ElasticSearch and Apache Solr (another Lucene-based search server):
https://docs.google.com/present/view?id=dc6zhtt5_1frfxwfff&pli=1

sql server - full-text search

So let's say I have two databases, one for production purposes and another one for development purposes.
When we copied the development database, the full-text catalog did not get copied properly, so we decided to create the catalog ourselves. We matched all the tables and indexes, created the catalog, and the search feature seems to be working okay (but it hasn't been entirely tested yet).
However, the former catalog had a lot more files in its folder than the one we created manually. Is that fine? I thought they would have exactly the same number of files (though the sizes might vary).
First...when using full text search I would suggest that you don't manually try to create what the wizard does for you. I have to wonder about missing more than just some data. Why not just recreate the indexes?
Second...I suggest that you don't use the freetext feature of SQL Server unless you have no other choice. I used to be a big believer in freetext, but I was shown a comparison of creating and searching a Lucene(.net) index versus creating and searching an index in SQL Server. Creating a SQL Server index is considerably slower and harder to maintain than creating a Lucene index, and searching a SQL Server index gives considerably less accurate (poorer) results than Lucene. Lucene is like having your own personal Google for searching data.
How? Index your data (only the data you need to search) in Lucene and include the Primary Key of the data that you are indexing for use later. Then search the index using your language and the Lucene(.net) API (many articles written on this topic). In your search results make sure you return the PK. Once you have identified the records you are interested in you can then go get the rest of the data and/or any related data based on the PK that was returned.
Gotchas? Updating the index is also much quicker and easier. However, you have to roll your own code for creating the index, updating the index, and searching the index. SUPER EASY to do...but still...there are no wizards or one-handed coding here! Also, the index lives on the file system. If the file is open and being searched and you try to open it again for another search, you will obviously have some issues...so some form of infrastructure around opening and reading these indexes needs to be built.
How does this help in SQL Server? You can easily wrap your Lucene search in a CLR function or proc, install it in the database, and then use it as though it were native in your T-SQL queries.
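A hedged sketch of what that wrapping could look like, assuming a hypothetical assembly LuceneSearch.dll that exposes a static method SearchFunctions.Search yielding (Pk, Score) pairs, and a hypothetical dbo.Documents base table:

    -- Register the (hypothetical) assembly; EXTERNAL_ACCESS is needed because
    -- the Lucene index lives on the file system. CLR integration must be
    -- enabled first (sp_configure 'clr enabled', 1).
    CREATE ASSEMBLY LuceneSearch
        FROM 'C:\clr\LuceneSearch.dll'
        WITH PERMISSION_SET = EXTERNAL_ACCESS;
    GO

    -- Expose the search as a CLR table-valued function.
    CREATE FUNCTION dbo.LuceneSearch (@query nvarchar(4000))
    RETURNS TABLE (Pk int, Score float)
    AS EXTERNAL NAME LuceneSearch.SearchFunctions.Search;
    GO

    -- Use it like a native TVF: join the returned PKs back to the base table.
    SELECT d.*, r.Score
    FROM dbo.LuceneSearch(N'invoice AND 2014') AS r
    JOIN dbo.Documents AS d ON d.DocumentId = r.Pk
    ORDER BY r.Score DESC;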
