ContainsTable: noise/stop word giving no results - sql-server

Suppose you have the following queries (using PT language):
declare @textoPesquisa as nvarchar(200) = 'haiti and florida and à'
if(@textoPesquisa is null or @textoPesquisa='') set @textoPesquisa = '*'
select * from Noticias n
left join containstable(NoticiasParaEstatistica, (Titulo), @textoPesquisa,
LANGUAGE N'Portuguese') s on n.IdNoticia=s.[key]
where s.[key] is not null
select * from NoticiasParaEstatistica where contains(titulo,
@textoPesquisa,LANGUAGE N'Portuguese')
I'm under the impression that à is considered a stop word, so the previous queries return no results (due to the fact that I'm using AND). Now, if I turn off the stopword list, everything works fine, but that doesn't look like a good option.
After looking at the docs, I've found the transform noise words option. I've activated it in the server and I've rebuilt the catalog, but I'm still getting 0 results.
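One way to narrow this down is to ask the full-text parser how it tokenizes the search string, and to confirm that the server option is actually in use. This is a minimal diagnostic sketch, not a fix. It assumes LCID 2070 is what LANGUAGE N'Portuguese' resolves to on the instance and that the index uses the system stoplist (stoplist id 0); both are assumptions to verify first.

```sql
-- Confirm which LCID the full-text language name maps to on this instance.
SELECT lcid, name
FROM sys.fulltext_languages
WHERE name IN (N'Portuguese', N'Brazilian');

-- Show how the word breaker and stoplist treat each term of the search string.
-- special_term reads 'Noise Word' for stop words and 'Exact Match' otherwise.
-- Arguments: 2070 = assumed LCID, 0 = system stoplist, 0 = accent-insensitive.
SELECT keyword, special_term, display_term
FROM sys.dm_fts_parser(N'haiti and florida and à', 2070, 0, 0);

-- Check that 'transform noise words' is in use, not just configured.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'transform noise words';
```

If value_in_use is still 0, the setting has probably not been applied with RECONFIGURE yet. If à does not come back as a noise word, the stoplist is likely not the cause of the empty result.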
Here's a table and an insert that should reproduce the scenario:
CREATE TABLE [dbo].[NoticiasParaEstatistica](
[IdNoticia] [bigint] NOT NULL,
[Titulo] [varchar](400) NOT NULL,
CONSTRAINT [PK_NoticiasParaEstatistica] PRIMARY KEY CLUSTERED
(
[IdNoticia] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
-- IdNoticia is NOT NULL with no default, so the insert has to supply it
INSERT INTO NoticiasParaEstatistica (IdNoticia, Titulo)
values (1, 'haiti florida à')
What am I missing?
thanks!

Related

Unique constraint on multiple columns when any columns are not null

I am trying to add a unique constraint on 2 columns that allows multiple (null, null), but doesn't allow multiple ("a", null). I wrote the following SQL statement but I get an error
Incorrect syntax near the keyword 'with'
SQL statement:
CREATE UNIQUE NONCLUSTERED INDEX [UQ]
ON [dbo].[MyTable] ([Column_A], [Column_B])
WHERE [Column_A] IS NOT NULL OR [Column_B] IS NOT NULL
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY];
However in
WHERE [Column_A] IS NOT NULL OR [Column_B] IS NOT NULL
when I replace OR with AND, there is no syntax error anymore. But what I want is OR logic, not AND because using AND allows multiple ("a", null) entries.
So why is there an incorrect syntax error using OR? Thanks.
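Filtered index predicates in SQL Server only support simple comparisons combined with AND, which is why OR is rejected as a syntax error. It's a bit clunky, but if the point is to enforce uniqueness (except for NULL/NULL) and you're happy having two indexes, you can put the NOT NULL filters on separate indexes, e.g.,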
CREATE UNIQUE NONCLUSTERED INDEX
[UQ_A] ON [dbo].[MyTable]
(
[Column_A],
[Column_B]
)
WHERE [Column_A] IS NOT NULL
GO
CREATE UNIQUE NONCLUSTERED INDEX
[UQ_B] ON [dbo].[MyTable]
(
[Column_A],
[Column_B]
)
WHERE [Column_B] IS NOT NULL
GO
While it's not necessary for the uniqueness check, you may want to change the order of the fields in the second index to potentially gain additional advantages of the index (depending on how these fields are used in your application - it may be better swapping them in the first index), e.g.,
CREATE UNIQUE NONCLUSTERED INDEX
[UQ_B] ON [dbo].[MyTable]
(
[Column_B],
[Column_A]
)
WHERE [Column_B] IS NOT NULL
GO
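To see how the two-index approach behaves, here is a small sketch against a hypothetical scratch table (dbo.MyTableDemo and its column types are assumptions, not taken from the question). It relies on SQL Server treating NULLs as equal inside a unique index, and on the session using the SET options filtered indexes require (the SSMS defaults, such as QUOTED_IDENTIFIER ON).

```sql
-- Hypothetical scratch table; column types are assumed for illustration.
CREATE TABLE dbo.MyTableDemo (
    Column_A varchar(10) NULL,
    Column_B varchar(10) NULL
);
GO
CREATE UNIQUE NONCLUSTERED INDEX UQ_A ON dbo.MyTableDemo (Column_A, Column_B)
WHERE Column_A IS NOT NULL;
CREATE UNIQUE NONCLUSTERED INDEX UQ_B ON dbo.MyTableDemo (Column_B, Column_A)
WHERE Column_B IS NOT NULL;
GO
-- (NULL, NULL) matches neither filter, so repeats are accepted.
INSERT INTO dbo.MyTableDemo (Column_A, Column_B) VALUES (NULL, NULL);
INSERT INTO dbo.MyTableDemo (Column_A, Column_B) VALUES (NULL, NULL);

-- ('a', NULL) is covered by UQ_A, so the second copy should be rejected.
INSERT INTO dbo.MyTableDemo (Column_A, Column_B) VALUES ('a', NULL);
BEGIN TRY
    INSERT INTO dbo.MyTableDemo (Column_A, Column_B) VALUES ('a', NULL);
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();  -- duplicate key error raised by UQ_A
END CATCH;
GO
DROP TABLE dbo.MyTableDemo;
```

Rows with both columns set fall under both indexes, so duplicates of those are rejected as well.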
According to Martin Smith's comment, I think I can implement it by:
CREATE VIEW [dbo].[TableAView]
WITH SCHEMABINDING AS
SELECT
[ColumnA] = [ColumnA],
[ColumnB] = [ColumnB]
FROM [dbo].[TableA]
WHERE [ColumnA] IS NOT NULL OR [ColumnB] IS NOT NULL;
GO
CREATE UNIQUE CLUSTERED INDEX
[UQ] ON [dbo].[TableAView]
(
[ColumnA],
[ColumnB]
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PRIMARY];
GO

How can I add values of a column but only where the artist is the same, in SQL Server?

I have 2 tables, first table:
CREATE TABLE [dbo].[songs]
(
[ID_Song] [INT] IDENTITY(1,1) NOT NULL,
[SongTitle] [NVARCHAR](100) NOT NULL,
[ListenedCount] [INT] NOT NULL,
[Artist] [INT] NOT NULL,
CONSTRAINT [PK_songs]
PRIMARY KEY CLUSTERED ([ID_Song] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[songs] WITH CHECK
ADD CONSTRAINT [FK_songs_artists]
FOREIGN KEY([Artist]) REFERENCES [dbo].[artists] ([ID_Artist])
GO
And second table:
CREATE TABLE [dbo].[artists]
(
[ID_Artist] [INT] IDENTITY(1,1) NOT NULL,
[Name] [NVARCHAR](100) NOT NULL,
CONSTRAINT [PK_artists]
PRIMARY KEY CLUSTERED ([ID_Artist] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
As you can see column Artist in table Songs references column ID_Artist of table Artists.
I want to get all artists whose total ListenedCount, summed over all their songs, is greater than a given value.
I have trouble writing the query.
There are many ways to achieve it.
One is to compute the sum in a subquery and then use that sum as a filter in the outer query:
select art.[Name], gba.[ListenedSum]
from [dbo].[artists] art
join
(
select sg.[Artist], sum(sg.[ListenedCount]) as [ListenedSum]
from [dbo].[songs] sg
group by sg.[Artist]
) as gba on gba.[Artist] = art.[ID_Artist]
where gba.[ListenedSum] > 1000000
A more direct way is to use HAVING:
select art.[Name], sum(sg.[ListenedCount]) as [ListenedSum]
from [dbo].[artists] art
join [dbo].[songs] sg on sg.[Artist] = art.[ID_Artist]
group by art.[ID_Artist], art.[Name]
having sum(sg.[ListenedCount]) > 1000000
It's interesting to note that the engine may execute these two queries with different plans (not guaranteed), so they can end up performing differently.
There's another interesting way, using a CTE, but I think it's a bit more complicated.
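For completeness, the CTE variant could look something like the sketch below. It is the subquery version restated with a named expression; ListenedByArtist is a name chosen here for illustration, and 1000000 is the same placeholder threshold used above.

```sql
-- Sum listens per artist once, then join to pick up the name and filter.
WITH ListenedByArtist AS (
    SELECT sg.[Artist], SUM(sg.[ListenedCount]) AS [ListenedSum]
    FROM [dbo].[songs] sg
    GROUP BY sg.[Artist]
)
SELECT art.[Name], lba.[ListenedSum]
FROM [dbo].[artists] art
JOIN ListenedByArtist lba ON lba.[Artist] = art.[ID_Artist]
WHERE lba.[ListenedSum] > 1000000;  -- placeholder threshold
```

If this follows another statement in the same batch, that statement has to end with a semicolon before WITH.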

How to make entity framework respect the true nature of default values on DB columns

Consider the following table definition. It creates a table with NOT NULL columns that have default values for when a value is not supplied:
drop table [defaultTest]
CREATE TABLE [defaultTest](
[TestId] [int] IDENTITY(1,1) NOT NULL,
[TestData] [nvarchar](100) NOT NULL,
[TestKey] [int] NOT NULL,
[TestTimeStamp] [datetimeoffset](7) NOT NULL,
PRIMARY KEY CLUSTERED
(
[TestId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [defaultTest] ADD CONSTRAINT [DF_Test_TestKey] DEFAULT (NEXT VALUE FOR [SomeSequence]) FOR [TestKey]
GO
ALTER TABLE [defaultTest] ADD CONSTRAINT [DF_Test_TestTimeStamp] DEFAULT (sysdatetimeoffset()) FOR [TestTimeStamp]
GO
CREATE UNIQUE NONCLUSTERED INDEX [defaultTest_TestKey_Insert_UK] ON [defaultTest]
(
[TestKey] ASC,
[TestTimeStamp] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
declare @testkey int;
declare @id int;
insert into [defaultTest]([TestData]) values ('Original');
set @id = @@IDENTITY;
select @testkey = [TestKey] from [defaultTest] where [TestId] = @id;
insert into [defaultTest]([TestData], [TestKey] ) values ('Updated', @testkey);
select * from [defaultTest];
TestId  TestData  TestKey  TestTimeStamp
1       Original  27       2019-06-26 14:40:22.1042605 +10:00
2       Updated   27       2019-06-26 14:40:22.1062673 +10:00
In the database this works perfectly. An insert can supply a value or not and the database will ensure that a value is always there.
But when this table is referenced in Entity Framework (database first), I'm struggling to get Entity Framework to respect the true nature of the situation.
The observed behaviour is that it always passes zero values if StoreGeneratedPattern = None and the fields are left empty, and always passes null if StoreGeneratedPattern = Computed (or Identity), even if a value is supplied.
That is not how the default ever works at the database level, so why Entity Framework was programmed that way is a mystery.
Is there a way to get Entity Framework to behave properly for this scenario?
EDIT: I tried removing the NOT NULL constraint from the key column, but Entity Framework refuses to pull the generated value back from the database.

violation of primary key constraint in insert query not touching the PK column

I have a query that inserts records into a table. The primary key column of that table is an identity column that auto-increments. The SELECT part of the query returns duplicates, but I have a unique constraint with IGNORE_DUP_KEY = ON on (city_nm, prov_en_nm) that should skip them on insert. This used to work fine, but now it gives me the message below. This is the first time I have run it since the database was moved from SQL Server 2012 to SQL Server 2014, in case that has an impact.
Violation of PRIMARY KEY constraint 'Dim_city_province_country_pk'. Cannot insert duplicate key in object 'HD_DtlClm.dim_city_province_country_t'. The duplicate key value is (###). (where ### is an ID, a different one every time I run it)
Here is the query.
INSERT INTO HD_DtlClm.[dim_city_province_country_t] (
city_nm, prov_en_nm, prov_fr_nm, contry_fr_nm, contry_en_nm
)
SELECT gr_mbr_city_nm, PROV_ENG_NM, PROV_FR_NM, CONTRY_ENG_NM, CONTRY_FR_NM
FROM isu.gr_dentl_clm_v
LEFT JOIN HD_DtlClm.province_information_t
ON gr_dentl_clm_v.gr_mbr_prov_cd = HD_DtlClm.province_information_t.PROV_CLM_CD
UNION
SELECT gr_prvdr_city_nm, PROV_ENG_NM, PROV_FR_NM, CONTRY_ENG_NM, CONTRY_FR_NM
FROM isu.gr_dentl_clm_v
LEFT JOIN HD_DtlClm.province_information_t
ON gr_dentl_clm_v.gr_prvdr_prov_cd IN (HD_DtlClm.province_information_t.PROV_ENG_CD, HD_DtlClm.province_information_t.PROV_CLM_CD)
Any idea why I get this error that I didn't get in the past?
EDIT to add primary key creation script:
ALTER TABLE [HD_DtlClm].[dim_city_province_country_t] ADD CONSTRAINT [Dim_city_province_country_pk] PRIMARY KEY CLUSTERED
( [cpc_key] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
EDIT2 to add table creation script
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [HD_DtlClm].[dim_city_province_country_t](
[cpc_key] [int] IDENTITY(1,1) NOT NULL,
[city_nm] [char](50) NOT NULL,
[prov_en_nm] [char](50) NULL,
[prov_fr_nm] [char](50) NULL,
[contry_en_nm] [char](75) NULL,
[contry_fr_nm] [char](75) NULL,
[create_ts] [datetime] NOT NULL,
[update_ts] [datetime] NOT NULL,
CONSTRAINT [Dim_city_province_country_pk] PRIMARY KEY CLUSTERED
(
[cpc_key] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
CONSTRAINT [dim_city_province_country_ak1] UNIQUE NONCLUSTERED
(
[city_nm] ASC,
[prov_en_nm] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [HD_DtlClm].[dim_city_province_country_t] ADD DEFAULT (getdate()) FOR [create_ts]
GO
ALTER TABLE [HD_DtlClm].[dim_city_province_country_t] ADD DEFAULT (getdate()) FOR [update_ts]
GO
Try running DBCC CHECKIDENT ('HD_DtlClm.[dim_city_province_country_t]'); then look at the results in the Messages tab and make sure the current identity value is equal to or higher than the current column value. NB: running this may even fix the problem itself, because it resets the identity value when it is lower than the maximum value in the column.
To expand: it looks like something reseeded your identity column, so the insert was generating key values that already exist. I don't think there's any way to check historically what changed it; the most likely candidates are the DBCC CHECKIDENT command with the RESEED option, or a TRUNCATE operation (which reseeds to the original value).
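If you would rather inspect before changing anything, a sketch along these lines separates the check from the fix. The table and column names come from the question; the rest is a suggested sequence, not something confirmed against your database.

```sql
-- Report the current identity value without changing it.
DBCC CHECKIDENT ('HD_DtlClm.dim_city_province_country_t', NORESEED);

-- Compare it against the highest key actually stored in the table.
SELECT MAX(cpc_key) AS max_cpc_key,
       IDENT_CURRENT('HD_DtlClm.dim_city_province_country_t') AS current_identity
FROM HD_DtlClm.dim_city_province_country_t;

-- If current_identity is lower than max_cpc_key, this form corrects it
-- by resetting the identity to the maximum value in the column.
DBCC CHECKIDENT ('HD_DtlClm.dim_city_province_country_t', RESEED);
```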

SQL Server rows not in order of clustered index

I have a table that has a clustered index on the id
[SomeID] [bigint] IDENTITY(1,1) NOT NULL,
When I do
select top 1000 * from some where date > '20150110'
My records are not in order
When I do:
select top 1000 * from some where date > '20150110' and date < '20150111'
They are in order?
Index is :
CONSTRAINT [PK_Some] PRIMARY KEY CLUSTERED
(
[SomeID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
I have never come across this before. Does anyone have an idea of what is happening and how I can fix it?
Thanks
You can't rely on an order if you do not specify one. Add an order by clause.
Otherwise the DB will just grab the result as fast as possible and that is not always in the order of the index.
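Applied to the first query from the question, that would look roughly like this. The names [some] and [date] are the question's placeholders, bracketed here because SOME is a reserved keyword in T-SQL.

```sql
-- Ask for the order explicitly; without ORDER BY, the row order
-- (and which 1000 rows TOP returns) is not guaranteed.
SELECT TOP 1000 *
FROM [some]
WHERE [date] > '20150110'
ORDER BY [SomeID];
```

The narrower date range probably just happened to get a plan that read the rows in clustered index order, which is not something to rely on.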
