In Azure Synapse, how can I check how a table is distributed, for example whether it is round-robin distributed or hash-distributed on a key?
You can use the catalog view sys.pdw_table_distribution_properties in a dedicated SQL pool to determine whether a table is distributed via round robin, hash or replicated, e.g.:
SELECT
OBJECT_SCHEMA_NAME( object_id ) schemaName,
OBJECT_NAME( object_id ) tableName,
*
FROM sys.pdw_table_distribution_properties;
The answer is in the distribution_policy_desc column, which reads HASH, ROUND_ROBIN or REPLICATE.
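For a hash-distributed table, the distribution column is recorded in a separate catalog view. A minimal sketch, assuming a dedicated SQL pool and using sa_logs purely as an illustrative table name; it joins sys.pdw_column_distribution_properties, where distribution_ordinal = 1 marks the hash column:

```sql
-- Distribution policy plus hash column for one table (table name is illustrative).
SELECT s.name  AS schemaName,
       t.name  AS tableName,
       dp.distribution_policy_desc,
       c.name  AS distributionColumn   -- NULL for round-robin and replicated tables
FROM sys.tables AS t
JOIN sys.schemas AS s
  ON s.schema_id = t.schema_id
JOIN sys.pdw_table_distribution_properties AS dp
  ON dp.object_id = t.object_id
LEFT JOIN sys.pdw_column_distribution_properties AS cdp
  ON cdp.object_id = t.object_id
 AND cdp.distribution_ordinal = 1
LEFT JOIN sys.columns AS c
  ON c.object_id = cdp.object_id
 AND c.column_id = cdp.column_id
WHERE t.name = 'sa_logs';
```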
Don't confuse distribution and partitioning. I've updated the question.
sys.pdw_table_distribution_properties is certainly a possibility, as mentioned.
Alternatively, generate the CREATE DDL for the table with any client (Azure Data Studio, SSMS, VS Code with a plugin, ...).
E.g. in Azure Data Studio, right-click the table and click "Script as Create".
Look for DISTRIBUTION in the WITH clause.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[sa_logs]
(
[version_number] [float] NULL,
[request_start_time] [datetimeoffset](7) NULL,
...
[referrer_header] [varchar](256) NULL,
)
WITH
(
DISTRIBUTION = ROUND_ROBIN,
HEAP
)
GO
or for a table with HASH:
...
DISTRIBUTION = HASH ( [hash_column] ),
...
Related
I am migrating all my SQL Server 2012 databases from a VM to Azure SQL Database. In my current structure I use cross-database queries to fetch data from tables in different databases.
I have created an external table for my parent table using the query below:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'yourPassword';
CREATE DATABASE SCOPED CREDENTIAL yourServeradminlogin
WITH IDENTITY = 'yourServeradminlogin',
SECRET = 'yourPassword';
CREATE EXTERNAL DATA SOURCE RefmyDemoDB2
WITH
(
TYPE=RDBMS,
LOCATION='testdbdemoserver.database.windows.net',
DATABASE_NAME='myDemoDB2',
CREDENTIAL= yourServeradminlogin
);
CREATE EXTERNAL TABLE [dbo].[Department](
[DeptId] [int] NOT NULL,
[Name] [varchar](50) NULL
)
WITH
(
DATA_SOURCE = RefmyDemoDB2
);
/****** Script for SelectTopNRows command from SSMS ******/
SELECT *
FROM [dbo].[Employee] E
INNER JOIN [dbo].[Department] D
ON E.DeptId = D.DeptId
I followed this article: https://www.c-sharpcorner.com/article/cross-database-queries-in-azure-sql/
But when I create the external table, it does not show up in the External Tables folder.
In my case it shows up directly in the Tables folder.
Does anyone know why I don't see the Department table in the External Tables folder? How can I get such tables to appear there?
In Azure SQL Database, external tables exist only to support a feature called elastic query, which may solve your problem:
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-elastic-query-overview
If that is not enough and you really need full cross-database query support, you have to use an Azure SQL Managed Instance, which seems to be exactly what you need:
https://learn.microsoft.com/en-us/azure/sql-database/sql-database-managed-instance
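Whichever folder a client tool shows the object in (the grouping may depend on the tool and its version), the catalog itself records whether the table was registered as external. A small check, assuming it is run in the database where CREATE EXTERNAL TABLE was executed:

```sql
-- Lists every external table together with the data source it points to.
SELECT SCHEMA_NAME(et.schema_id) AS schemaName,
       et.name                   AS externalTable,
       ds.name                   AS dataSource,
       ds.location,
       ds.database_name
FROM sys.external_tables AS et
JOIN sys.external_data_sources AS ds
  ON ds.data_source_id = et.data_source_id;
```

If Department appears in this result, it is an external table regardless of where the object tree displays it.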
Creating a SQL Server external table means making each remote table accessible by running code of the following form on the local database:
CREATE EXTERNAL TABLE [TABLE_NAME]
(
[Id] [int] NOT NULL,
[Name] [int] NULL,
)
WITH (DATA_SOURCE = [EXTERNAL_DB_NAME])
I have quite a few external tables to work with, and am wondering whether this can be automated. Is there a way to build the text of a CREATE TABLE statement from the Information Schema views on the remote database? It could then be altered and used to populate a dynamic query to be run on the local database.
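One possible approach, as a rough sketch rather than a complete generator: run a query like the following on the remote database and execute its output on the local one. It assumes STRING_AGG is available (SQL Server 2017 or later, or Azure SQL Database), reuses the [EXTERNAL_DB_NAME] placeholder from above, and only handles length for character/binary types and precision/scale for decimal/numeric; other parameterised types (datetime2, time, datetimeoffset) and user-defined types would need extra cases.

```sql
-- Builds one CREATE EXTERNAL TABLE statement per base table from INFORMATION_SCHEMA.
SELECT N'CREATE EXTERNAL TABLE '
     + QUOTENAME(c.TABLE_SCHEMA) + N'.' + QUOTENAME(c.TABLE_NAME) + N' ('
     + STRING_AGG(
           CAST(QUOTENAME(c.COLUMN_NAME) + N' ' + QUOTENAME(c.DATA_TYPE)
              + CASE
                    WHEN c.DATA_TYPE IN (N'char', N'varchar', N'nchar', N'nvarchar',
                                         N'binary', N'varbinary')
                        THEN N'(' + CASE WHEN c.CHARACTER_MAXIMUM_LENGTH = -1
                                         THEN N'max'
                                         ELSE CAST(c.CHARACTER_MAXIMUM_LENGTH AS nvarchar(10))
                                    END + N')'
                    WHEN c.DATA_TYPE IN (N'decimal', N'numeric')
                        THEN N'(' + CAST(c.NUMERIC_PRECISION AS nvarchar(10)) + N', '
                                  + CAST(c.NUMERIC_SCALE AS nvarchar(10)) + N')'
                    ELSE N''
                END
              + CASE WHEN c.IS_NULLABLE = 'YES' THEN N' NULL' ELSE N' NOT NULL' END
              AS nvarchar(max)),           -- nvarchar(max) avoids the 8000-byte result limit
           N', ') WITHIN GROUP (ORDER BY c.ORDINAL_POSITION)
     + N') WITH (DATA_SOURCE = [EXTERNAL_DB_NAME]);' AS createStatement
FROM INFORMATION_SCHEMA.COLUMNS AS c
JOIN INFORMATION_SCHEMA.TABLES AS t
  ON t.TABLE_SCHEMA = c.TABLE_SCHEMA
 AND t.TABLE_NAME   = c.TABLE_NAME
WHERE t.TABLE_TYPE = 'BASE TABLE'          -- skip views
GROUP BY c.TABLE_SCHEMA, c.TABLE_NAME;
```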
I am using a SQL Server 2008 database.
I have two databases, db1 and db2. Both should contain a table tblCountry. I created it in the first database. How can I script it, together with its data, so that I can create it in the second database?
I use the code below:
CREATE TABLE [dbo].[tblCountry]
(
[record_Id] [int] IDENTITY(1,1) NOT NULL,
[country] [nvarchar](150) NULL,
[nationality] [nvarchar](150) NULL,
[lsdMdfdOn] [datetime] NULL,
[lstMdfdBy] [nvarchar](350) NULL,
[isDeleted] [bit] NULL,
[isEnabled] [bit] NULL,
)
What code do I use to include the data as well?
No, the CREATE statement alone does not bring the data along.
If you want to see the table's data from the second database, you can run this query on db2:
select * from [db1].[dbo].[tblCountry]
But you cannot create the table and view the data in a single statement.
It may seem like a weird solution, but you can paste the CREATE query into the query analyzer window, write the SELECT query beneath it, and execute both. (I guess this is how most programmers do it.)
If you are on the same server or have a linked server:
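-- Assumes tblCountry already exists in database2 (created with the CREATE TABLE script above).
-- IDENTITY_INSERT requires an explicit column list, so SELECT * alone is not enough.
SET IDENTITY_INSERT [database2].dbo.tblCountry ON;
INSERT INTO [database2].dbo.tblCountry
(record_Id, country, nationality, lsdMdfdOn, lstMdfdBy, isDeleted, isEnabled)
SELECT record_Id, country, nationality, lsdMdfdOn, lstMdfdBy, isDeleted, isEnabled
FROM [database1].dbo.tblCountry;
SET IDENTITY_INSERT [database2].dbo.tblCountry OFF;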
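If the target table does not exist yet, SELECT ... INTO creates it and copies the rows in one statement. A sketch for the same-server case only, reusing the database1/database2 names from the snippet above; it copies the column definitions (including the IDENTITY property for a plain single-table select) but not indexes, keys or other constraints:

```sql
-- Run once; fails if [database2].dbo.tblCountry already exists.
SELECT *
INTO [database2].dbo.tblCountry
FROM [database1].dbo.tblCountry;
```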
The easiest way to do this is:
Right-click the database in Object Explorer.
Click Generate Scripts.
On the introduction page, click Next.
Select the radio button "Script entire database and all database objects", or select specific tables or stored procedures with the other radio button.
On the Set Scripting Options page, click Advanced and select what you want to script. To get the rows as INSERT statements, set the data option there (in recent SSMS versions, "Types of data to script" set to "Schema and data").
In the generated script, change the database name after USE to db2.
Right-click the database, click Tasks, then Export Data.
You can use the Export Data option in SQL Server; it will give you the data along with the table.
I am running a few selects over AdventureWorks, e.g.:
SELECT *
FROM Production.Product
WHERE FREETEXT(*, 'screw washer spaner');
I have yet to encounter a select that uses the thesaurus and shows that synonyms were found.
How can I know whether I am using this feature?
Can anyone supply a select that demonstrates the use of the thesaurus?
Here is the setup script (thanks to Itzik Ben-Gan, author of Querying SQL Server), which lets you run the thesaurus example query at the bottom:
IF OBJECT_ID('dbo.Documents', 'U') IS NOT NULL
DROP TABLE dbo.Documents;
CREATE TABLE dbo.Documents
(
id INT NOT NULL IDENTITY,
title NVARCHAR(100) NOT NULL,
doctype NCHAR(4) NOT NULL,
docexcerpt NVARCHAR(1000) NOT NULL,
doccontent VARBINARY(MAX) NOT NULL,
CONSTRAINT PK_Documents
PRIMARY KEY CLUSTERED(id)
);
GO
INSERT INTO dbo.Documents
(title, doctype, docexcerpt, doccontent)
SELECT N'Introduction to Data Mining',
N'docx',
N'Using Data Mining is becoming more a necessity for every company
and not an advantage of some rare companies anymore. ',
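bulkcolumn
FROM OPENROWSET(BULK N'<path to the .docx file>', SINGLE_BLOB) AS doc; -- FROM clause missing in the source; the path is a placeholder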
GO
--Edit the thesaurus file by adding
<expansion>
<sub>need</sub>
<sub>necessity</sub>
</expansion>
--Run the following to reload your edited thesaurus
EXEC sys.sp_fulltext_load_thesaurus_file 1033;
GO
And in a separate batch, execute the following command:
SELECT *
FROM dbo.Documents
WHERE CONTAINS(docexcerpt, N'FORMSOF(THESAURUS, need)'); -- FORMSOF is CONTAINS syntax; FREETEXT applies the thesaurus implicitly
GO
The word "need" is not in the docexcerpt; however, because the synonym "necessity" is defined in the thesaurus, the row is returned if the thesaurus is loaded properly. If you have issues with that, there are many Stack Overflow posts and Books Online entries on how to load and configure it.
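Two things the script above does not show may help when checking whether the thesaurus is actually in play. The following is a sketch under assumptions: the catalog name DocumentsFtCatalog is made up, the full-text index on docexcerpt is not part of the original script, and sys.dm_fts_parser requires sysadmin membership. The last query asks the full-text engine how it expands the term; rows with expansion_type = 4 come from the thesaurus.

```sql
-- Hypothetical catalog name; a full-text index is required before any
-- CONTAINS/FREETEXT query on dbo.Documents can run.
CREATE FULLTEXT CATALOG DocumentsFtCatalog;
GO
CREATE FULLTEXT INDEX ON dbo.Documents (docexcerpt LANGUAGE 1033)
    KEY INDEX PK_Documents
    ON DocumentsFtCatalog
    WITH CHANGE_TRACKING AUTO;
GO
-- Inspect the expansion of the search term for LCID 1033
-- (system stoplist 0, accent-insensitive).
SELECT special_term, display_term, expansion_type, source_term
FROM sys.dm_fts_parser(N'FORMSOF(THESAURUS, need)', 1033, 0, 0);
GO
```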
I have a MS SQL 2008 database which stores data for creating a weighted, undirected graph. The data is stored in tables with the following structure:
[id1] [int] NOT NULL,
[id2] [int] NOT NULL,
[weight] [float] NOT NULL
where [id1] and [id2] represent the two connected nodes and [weight] is the weight of the edge that connects them.
There are several different algorithms that create the graph from some basic data. For each algorithm, I want to store the graph data in a separate table. Those tables all have the same structure (as shown above) and use a specified prefix (similarityALB, similaritybyArticle, similaritybyCategory, ...) so I can identify them as graph tables.
The client program can select which table (i.e. the graph created by which algorithm) to use for further operations.
Access to the data is done through stored procedures. As I have different tables, I would need to use a variable table name, e.g.:
SELECT id1, id2, weight FROM @tableName
This doesn't work because SQL doesn't support variable table names in the statement. I have searched the web, and all solutions to this problem use dynamic SQL with EXEC(), e.g.:
EXEC('SELECT id1, id2, weight FROM ' + @tableName)
As most of them mention, this makes the statement prone to SQL injection, which I'd like to avoid. A simple redesign idea would be to put all the different graphs in one table and add a column to identify the graph:
[graphId] [int] NOT NULL,
[id1] [int] NOT NULL,
[id2] [int] NOT NULL,
[weight] [float] NOT NULL
My problem with this solution is that the graphs can be very large, depending on the algorithm used (up to 500 million entries). I need to index the table on (id1, id2) and (id2, id1). Putting all graphs into one big table would make the table even larger (and requests slower). Adding a new graph would perform badly because of the active indices. Deleting a graph could no longer be done with TRUNCATE; I would need to use
DELETE FROM myTable WHERE graphId = @Id
which performs very badly on large tables and creates a very large log file (which would exceed my disk space when the graph is big enough). So I'd like to keep an independent table for each graph.
Any suggestions on how to solve this, either by finding a way to parametrize the table name or by redesigning the database structure while avoiding the aforementioned problems?
SQL injection can easily be avoided in this case by comparing @tableName to the names of the existing tables. If it isn't one of them, it's bad input. (Obligatory xkcd reference: that is, unless you have a table called "bobby'; drop table students;".)
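The check against existing tables could look roughly like this. It is a sketch, not the poster's code: the procedure name dbo.GetGraphEdges is made up, and it assumes the graph tables live in the dbo schema and carry the "similarity" prefix mentioned in the question. The name that ends up in the dynamic statement is taken from sys.tables and quoted with QUOTENAME, never from the raw input.

```sql
CREATE PROCEDURE dbo.GetGraphEdges
    @tableName sysname
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @safeName nvarchar(300);

    -- Look the name up in the catalog; anything that is not an existing
    -- graph table leaves @safeName NULL.
    SELECT @safeName = QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(t.name)
    FROM sys.tables AS t
    WHERE t.name = @tableName
      AND t.schema_id = SCHEMA_ID(N'dbo')
      AND t.name LIKE N'similarity%';

    IF @safeName IS NULL
    BEGIN
        RAISERROR(N'Unknown graph table.', 16, 1);
        RETURN;
    END;

    DECLARE @sql nvarchar(max);
    SET @sql = N'SELECT id1, id2, [weight] FROM ' + @safeName + N';';

    EXEC sys.sp_executesql @sql;
END;
```

Called as EXEC dbo.GetGraphEdges @tableName = N'similarityALB'; a name that matches no catalog row is rejected before any dynamic SQL runs.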
Anyway, regarding your performance problems: with partitioned tables (available since SQL Server 2005), you get the same advantages as with several tables, but without the need for dynamic SQL.
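A rough sketch of what that could look like, with every object name invented for illustration and boundary values that assume graphId is 1, 2, 3, and so on. Removing a graph then becomes a metadata-only partition switch followed by a TRUNCATE instead of a logged DELETE. Note that table partitioning was an Enterprise Edition feature before SQL Server 2016 SP1, and that any additional index such as (id2, id1) would have to be aligned with the partition scheme and mirrored on the switch-out table.

```sql
-- One partition per graphId (RANGE LEFT: partition 1 holds graphId <= 1,
-- partition 2 holds graphId 2, and so on).
CREATE PARTITION FUNCTION pfGraph (int)
    AS RANGE LEFT FOR VALUES (1, 2, 3);
CREATE PARTITION SCHEME psGraph
    AS PARTITION pfGraph ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.GraphEdges
(
    graphId  int   NOT NULL,
    id1      int   NOT NULL,
    id2      int   NOT NULL,
    [weight] float NOT NULL,
    CONSTRAINT PK_GraphEdges PRIMARY KEY CLUSTERED (graphId, id1, id2)
) ON psGraph (graphId);

-- Empty, non-partitioned table with identical columns and clustered key,
-- on the same filegroup as the partition that will be switched out.
CREATE TABLE dbo.GraphEdges_Out
(
    graphId  int   NOT NULL,
    id1      int   NOT NULL,
    id2      int   NOT NULL,
    [weight] float NOT NULL,
    CONSTRAINT PK_GraphEdges_Out PRIMARY KEY CLUSTERED (graphId, id1, id2)
) ON [PRIMARY];
GO
-- Remove graph 2: move its partition out, then truncate the switch-out table.
ALTER TABLE dbo.GraphEdges
    SWITCH PARTITION $PARTITION.pfGraph(2) TO dbo.GraphEdges_Out;
TRUNCATE TABLE dbo.GraphEdges_Out;
```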
Maybe I did not understand everything, but:
CREATE PROCEDURE dbo.GetMyData (
    @TableName AS varchar(50)
)
AS
BEGIN
    -- One static SELECT per known table; unknown names return nothing
    IF @TableName = 'Table_1'
    BEGIN
        SELECT id1
              ,id2
              ,[weight]
        FROM dbo.Table_1
    END
    IF @TableName = 'Table_2'
    BEGIN
        SELECT id1
              ,id2
              ,[weight]
        FROM dbo.Table_2
    END
END
and then:
EXEC dbo.GetMyData @TableName = 'Table_1'
A different technique involves using synonyms dynamically, for example:
DECLARE @TableName varchar(50)
SET @TableName = 'Table_1'
-- drop synonym if it exists
IF object_id('dbo.MyCurrentTable', 'SN') IS NOT NULL
    DROP SYNONYM dbo.MyCurrentTable ;
-- create synonym for the current table
IF @TableName = 'Table_1'
    CREATE SYNONYM dbo.MyCurrentTable FOR dbo.Table_1 ;
IF @TableName = 'Table_2'
    CREATE SYNONYM dbo.MyCurrentTable FOR dbo.Table_2 ;
-- use synonym
SELECT id1, id2, [weight]
FROM dbo.MyCurrentTable
A partitioned table may be the answer to your problem. I've got another idea that works "the other way around":
each graph has its own table (so you can still truncate a table)
define a view (with the structure you mentioned for your redesigned table) as a UNION ALL over all graph tables
I have no idea how a select on this view performs and so on, but it may give you what you are looking for. I'd be interested in the results if you try this out.
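As an illustration of the view idea, here is a sketch that borrows the Table_1 and Table_2 names from the earlier answer; the view name dbo.AllGraphs and the graphId numbering are assumptions. Each branch contributes a constant graphId, so a query filtering on graphId may let the optimizer skip the other branches, which is worth confirming in the execution plan.

```sql
-- One branch per graph table; add a branch whenever a new graph table is created.
CREATE VIEW dbo.AllGraphs
AS
SELECT 1 AS graphId, id1, id2, [weight] FROM dbo.Table_1
UNION ALL
SELECT 2 AS graphId, id1, id2, [weight] FROM dbo.Table_2;
GO
-- Reads one graph without dynamic SQL.
SELECT id1, id2, [weight]
FROM dbo.AllGraphs
WHERE graphId = 2;
```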