Find columns that match in two tables - sql-server

I need to query two tables of companies in the first table are the full names of companies, and the second table are also the names but are incomplete. The idea is to find the fields that are similar. I put pictures of the reference and SQL code I'm using.
The result I want is like this
The closest way I found to do so:
SELECT DISTINCT
RTRIM(a.NombreEmpresaBD_A) as NombreReal,
b.EmpresaDB_B as NombreIncompleto
FROM EmpresaDB_A a, EmpresaDB_B b
WHERE a.NombreEmpresaBD_A LIKE 'VoIP%' AND b.EmpresaDB_B LIKE 'VoIP%'
The problem with the above code is that it only returns the record specified in the WHERE and if I put this LIKE '%' it returns the Cartesian product of two tables. The RDBMS is Microsoft SQL Server. I would greatly appreciate if you help me with any proposed solution.

Use the short name plus appended '%' as argument in the LIKE expression:
Edit with info that we deal with SQL Server:
SELECT a.NombreEmpresaBD_A as NombreReal
,b.NombreEmpresaBD_B as NombreIncompleto
FROM EmpresaDB_A a, EmpresaDB_B b
WHERE a.NombreEmpresaBD_A LIKE (b.NombreEmpresaBD_B + '%');
According to your screenshot you had the column name wrong!
String concatenation in T-SQL with + operator.
Above query finds a case where
'Computex S.A' LIKE 'Computex%'
but not:
'Voip Service Mexico' LIKE 'VoipService%'
For that you would have to strip blanks first or use more powerful pattern matching functions.
I have created a demo for you on data.SE.
Look up pattern matching or the LIKE operator in the manual.

I would suggest adding a foreign key between the tables linking the data. Then you can just search for the one table and join the second to get the other results.

Related

How to match a substring exactly in a string in SQL server?

I have a column workId in my table which has values like :
W1/2009/12345, G2/2018/2345
Now a user want to get this particular id G2/2018/2345. I am using like operator in my query as below:
select * from u_table as s where s.workId like '%2345%' .
It is giving me both above mentioned workids. I tried following query:
select * from u_table as s where s.workId like '%2345%' and s.workId not like '_2345'
This query also giving me same result.
If anyone please provide me with the correct query. Thanks!
Why not use the existing delimiters to match with your criteria?
select *
from u_table
where concat('/', workId, '/') like concat('%/', '2345', '/%');
Ideally of course your 3 separate values would be 3 separate columns; delimiting multiple values in a single column goes against first-normal form and prevents the optimizer from performing an efficient index seek, forcing a scan of all rows every time, hurting performance and concurrency.

How to query multiple JSON document schemas in Snowflake?

Could anyone tell me how to change the Stored Procedure in the article below to recursively expand all the attributes of a json file (multiple JSON document schemas)?
https://support.snowflake.net/s/article/Automating-Snowflake-Semi-Structured-JSON-Data-Handling-part-2
Craig Warman's stored procedure posted in that blog is a great idea. I asked him if it was okay to refactor his code, and he agreed. I've used the refactored version in the field, so I know the SP well as well as how it works.
It may be possible to modify the SP to work on your JSON. It will depend on whether or not Snowflake types the JSON in your variant column. The way you have it structured, it may not type everything. You can check by running this SQL and seeing if the result set includes all the columns you need:
set VARIANT_TABLE = 'WEATHER';
set VARIANT_COLUMN = 'V';
with MAIN_TABLE as
(
select * from identifier($VARIANT_TABLE) sample (1000 rows)
)
select distinct REGEXP_REPLACE(REGEXP_REPLACE(f.path, '\\[(.+)\\]'),'[^a-zA-Z0-9]','_') AS path_name, -- This generates paths with levels enclosed by double quotes (ex: "path"."to"."element"). It also strips any bracket-enclosed array element references (like "[0]")
typeof(f.value) AS attribute_type, -- This generates column datatypes.
path_name AS alias_name -- This generates column aliases based on the path
from
MAIN_TABLE,
LATERAL FLATTEN(identifier($VARIANT_COLUMN), RECURSIVE=>true) f
where TYPEOF(f.value) != 'OBJECT'
AND NOT contains(f.path, '[');
Be sure to replace the variables to your table and column names. If this picks up the type information for the columns in your JSON, then it's possible to modify this SP to do what you need. If it doesn't but there's a way to modify the query to get it to pick up the columns, that would work too.
If it doesn't pick up the columns, based on Craig's idea I decided to write type inference for non variant (such as strings from CSV log files without type information). Try the SQL above and see what results first.

MS SQL Excel Query wildcards

I'm trying to introduce LIKE clause with wildcards in SQL query that runs within Excel 2007, where parameters are taken from specific Excel cells:
SELECT Elen_SalesData_View.ItemCode, Elen_SalesData_View.ItemDescription,
Elen_SalesData_View.ItemValue, Elen_SalesData_View.Quantity,
Elen_SalesData_View.CustomerId, Elen_SalesData_View.CustomerName,
Elen_SalesData_View.SalesInvoiceId, Elen_SalesData_View.EffectiveDate,
Elen_SalesData_View.CountryId
FROM SM_Live.dbo.Elen_SalesData_View Elen_SalesData_View
WHERE (Elen_SalesData_View.EffectiveDate>=? And Elen_SalesData_View.EffectiveDate<=?)
AND (Elen_SalesData_View.CustomerName<>'PROMO')
AND (Elen_SalesData_View.ItemDescription LIKE '%'+?+'%')
The EffectiveDate parameters are running fine and bringing back data as expected. But since I introduced LIKE - query runs, but returns nothing.
It doesn't return any results without wildcards either (full description entered):
(Elen_SalesData_View.ItemDescription LIKE ?)
Is there a restriction to wildcards or LIKE clause? If so, is there a way around it? (I cannot use CONTAINS, as the ItemDescription field is not FULLTEXT)
Have a look at this reference which suggests that % itself is the wildcard character, although it may depend on the dialect of SQL you are using. If this is the case then your LIKE clause will simply be LIKE '%' but untested.
I've just got this to work by using the (Elen_SalesData_View.ItemDescription LIKE ?) syntax then having the cell that contains the parameter value include the wildcard characters. If you don't/can't include the wildcards then create a formula in a separate cell to wrap the value in % characters and use this cell for the parameter value.
Rhys
My query was correct. There was something wrong with the actual spreadsheet. After redoing all from scratch - it worked!
SELECT Elen_SalesData_View.ItemCode, Elen_SalesData_View.ItemDescription,
Elen_SalesData_View.ItemValue, Elen_SalesData_View.Quantity,
Elen_SalesData_View.CustomerId, Elen_SalesData_View.CustomerName,
Elen_SalesData_View.SalesInvoiceId, Elen_SalesData_View.EffectiveDate,
Elen_SalesData_View.CountryId
FROM SM_Live.dbo.Elen_SalesData_View Elen_SalesData_View
WHERE (Elen_SalesData_View.ItemDescription Like '%'+?+'%')
AND (Elen_SalesData_View.EffectiveDate>=?) AND (Elen_SalesData_View.EffectiveDate<=?)
AND (Elen_SalesData_View.CustomerName<>'PROMO')

Can SQL Server index a text string by delimiter?

I need to store content keyed by strings, so a database table of key/value pairs, essentially. The keys, however, will be of a hierarchical format, like this:
foo.bar.baz
They'll have multiple categories, delimited by dots. The above value is in a category called "baz" which is in a parent category called "bar" which is in a parent category called "foo."
How can I index this in such a way that it's rapidly searchable for different permutations of the key/dot combo? For example, I want to be able to very quick find everything that starts
foo
Or
foo.bar
Yes, I could do a LIKE query, but I never need find anything like:
fo
So that seems like a waste to me.
Is there any way that SQL would index all permutation of a string delimited by the dots? So, in the above case we have:
foo
foo.bar
foo.bar.baz
Is there any type of index that would facilitate searching like that?
Edit
I will never need to search backwards or from the middle. My searches will always begin from the front of the string:
foo.bar
Never:
bar.baz
SQL Server can't really index substrings, no. If you only ever want to search on the first string, this will work fine, and will perform an index seek (depending on other query semantics of course):
WHERE col LIKE 'foo.%';
-- or
WHERE col LIKE 'foo.bar.%';
However when you start needing to search for bar or baz following any leading string, you will need to search on the substring:
WHERE col LIKE '%.bar.%';
-- or
WHERE PATINDEX('%.bar.%', col) > 0;
This won't work well with regular B-tree indexes, and I don't think Full-Text Search will be much help either, because of the special characters (periods) - but you should try it out if this is a requirement.
In general, storing data this way smells wrong to me. Seems to me that you should either have separate columns instead of jamming all the data into one column, or using a more relational EAV design.
Its appears to be a work for CTE!
create TableA(
id int identity,
parentid int null,
name varchar(50)
)
for a (fixed) two level its easy
select t2.name, t1.name
from tableA t1
join tableA t2 on t2.id = t1.parentid
where t2.name = 'father'
To find that kind of hierarchical values for a most general case you ill need some kind of recursion in self-join table by using a CTE.
http://msdn.microsoft.com/pt-br/library/ms175972.aspx

contains clause sql server

Assume that I had the following t-sql:
select * from customer where contains(suburb, '"mount*"').
Now it brings back both: "Mount Hills" and "Blue Mountain". How do I strict it to search only the very beginning of the word which in this case is only "Mount Hills"?
thanks
'contains' is used for exactly that. Use
select * from customer where charindex('mount', suburb)=1
or
select * from customer where suburb like 'mount%'
but that's slower.
Your query works correctly, you asked server "give me all record where ANY word in column 'suburb'" starts with 'mount'.
You need to be more specific what are you trying to accomplish. Match beginning of the entire value stored in column? LIKE is your friend then.

Resources