LIKE operator in Snowflake

Is there any way to code this in Snowflake?
Table 1
Based on the lookup table's DEPT, I want the main table to have values like those below.
That is, when DEPT is SALES INTERNAL followed by extra text, it should take the value "SALES INTERNAL" as data access.
The code below produces duplicates.
ON LOWER (LTRIM(RTRIM(nvl(MainTable."DEPT", 'NA')))) ilike LOWER (LTRIM(RTRIM(nvl(LookUpTable."DEPT", 'NA')))||'%')
Any help is appreciated.
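One possible cause of the duplicates, assuming the lookup table holds overlapping prefixes (for example both SALES and SALES INTERNAL), is that a prefix join returns one row per matching prefix. Below is a minimal sketch of keeping only the longest match. It runs against SQLite through Python purely so that it is self-contained; the table names, column names and sample rows are hypothetical. In Snowflake the same idea could probably be written with QUALIFY ROW_NUMBER() OVER (PARTITION BY ... ORDER BY LENGTH(...) DESC) = 1.

```python
import sqlite3

# Hypothetical sample data: 'SALES' and 'SALES INTERNAL' are both prefixes of row 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_table (id INTEGER PRIMARY KEY, dept TEXT);
CREATE TABLE lookup_table (dept TEXT);
INSERT INTO main_table (id, dept) VALUES
  (1, 'SALES INTERNAL - EMEA'),
  (2, 'Sales external'),
  (3, 'FINANCE');
INSERT INTO lookup_table (dept) VALUES
  ('SALES'), ('SALES INTERNAL'), ('SALES EXTERNAL');
""")


def longest_prefix_matches(connection):
    """Return one row per main_table row, matched to the longest lookup prefix."""
    return connection.execute("""
        SELECT m.id,
               m.dept,
               (SELECT l.dept
                FROM lookup_table AS l
                WHERE UPPER(TRIM(m.dept)) LIKE UPPER(TRIM(l.dept)) || '%'
                ORDER BY LENGTH(TRIM(l.dept)) DESC   -- prefer the most specific prefix
                LIMIT 1) AS data_access
        FROM main_table AS m
        ORDER BY m.id
    """).fetchall()


for row in longest_prefix_matches(conn):
    print(row)
```

Rows with no matching prefix come back with a NULL data access value instead of being dropped, which makes unmatched departments easy to spot.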

Related

Sorting the view based on frequency in SQL Server

I have a StockinHand view generated from the stock_Outward and Stock_Inward tables. It now needs to be sorted by frequency, i.e. the most frequently moving stock items should be at the top of the table.
My tables are like below:
tbl_StockInward:
ID, Stock_Code, Units, Rate, Description, Vendor, DateOfPurchase, DateOfUpdate, Purchased_By, WareHouse, Remarks
and likewise for tbl_StockOutward.
Please help me
Thanks in advance
Just like in subqueries, you can't use ORDER BY in a view definition in SQL Server unless you also use TOP.
The reason for this is that views are treated as if they were tables, and tables in SQL Server (in fact, in any relational database) are considered unordered sets.
Just as there is no meaning to the order of records stored in a table, there is also no meaning to the order of records fetched by a view.
You can use a dirty hack and write SELECT TOP 100 PERCENT ... and then use ORDER BY, but I doubt it has any effect at all.
Having said all that, you can of course use ORDER BY in any query that selects from a view.
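As a rough illustration of that last point, the sketch below keeps the view unordered and puts the ORDER BY in the query that reads from it. It uses SQLite through Python only to stay self-contained; the simplified stock_outward table, the stock_movement view and the sample rows are assumptions, not the asker's real schema.

```python
import sqlite3


def most_moving_stock(connection):
    """Build a small unordered view, then sort in the query that selects from it."""
    connection.executescript("""
    CREATE TABLE stock_outward (
        id INTEGER PRIMARY KEY,
        stock_code TEXT,
        units INTEGER
    );
    INSERT INTO stock_outward (stock_code, units) VALUES
      ('A100', 5), ('B200', 1), ('A100', 2), ('C300', 4), ('A100', 1), ('C300', 3);

    -- The view only aggregates; it carries no ORDER BY.
    CREATE VIEW stock_movement AS
      SELECT stock_code,
             COUNT(*)   AS movements,
             SUM(units) AS units_out
      FROM stock_outward
      GROUP BY stock_code;
    """)
    # The ordering lives in the outer query, where it is guaranteed to apply.
    return connection.execute("""
        SELECT stock_code, movements
        FROM stock_movement
        ORDER BY movements DESC, stock_code
    """).fetchall()


rows = most_moving_stock(sqlite3.connect(":memory:"))
print(rows)
```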

Dynamic PIVOT with varchar columns

I'm trying to pivot rows into columns. I basically have lots of lines where every N rows represent one row of a table I'd like to list as a result set. I'll give a short example:
I have a table structure like this:
Keep it in mind that I removed lots of rows to simplify this example. Every 6 rows means 1 row in the result set, which I would like to be like this:
All columns are varchar types (that's why I couldn't get it done with PIVOT).
The number of columns is dynamic, and so is the number of rows in the source table.
Logically, the number of rows in the result set is equally dynamic.
(Not really an answer, but it's what I've got.)
This is a name/value pair table, right? Your query will require something that identifies which "set" of rows is associated with one another. Without something like this, I don't see how the query can be written. The key factor is that you must never assume that data will be returned from SQL (Server, at least) in any particular order. How the data is stored internally generally, but not always, determines how it is returned when order is not specified.
Another consideration: what if (when?) a row is missing -- say, Product 4 has no Price B column? That would break a simple "every six rows" rule. "Start fresh with every new Code row" would hit problems if a Code is missed or when (not if) data is not returned in the anticipated order.
If you have some means of grouping items, let us know in an updated question, but otherwise I don't think this one is particularly solvable.
I actually did it.
I wrote a SQL WHILE loop based on the number of columns registered for the result set. This way I could write a dynamic SQL clause for N columns based on the values read. In the end I just inserted the result set into a temp table, and voilà.
Thanks anyways!
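For readers who want to see the general shape of such a dynamic pivot, here is a minimal sketch. It is not the asker's actual code: it assumes a hypothetical name_value table that already has a set_id column identifying which rows belong together (the grouping key the first answer asks for), and it runs on SQLite through Python so that it is self-contained. Conditional aggregation with MAX(CASE ...) is used instead of PIVOT; it works on varchar values as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE name_value (set_id INTEGER, name TEXT, value TEXT);
INSERT INTO name_value (set_id, name, value) VALUES
  (1, 'Code', 'P1'), (1, 'Price A', '10.00'), (1, 'Price B', '12.50'),
  (2, 'Code', 'P2'), (2, 'Price A', '8.00');   -- set 2 has no 'Price B' row
""")


def quote_ident(name):
    """Quote a value so it can be used safely as a column alias."""
    return '"' + name.replace('"', '""') + '"'


def dynamic_pivot(connection):
    """Build one output column per distinct name, then run the generated query."""
    names = [row[0] for row in connection.execute(
        "SELECT DISTINCT name FROM name_value ORDER BY name")]
    select_list = ", ".join(
        f"MAX(CASE WHEN name = ? THEN value END) AS {quote_ident(n)}"
        for n in names)
    sql = (f"SELECT set_id, {select_list} "
           "FROM name_value GROUP BY set_id ORDER BY set_id")
    cursor = connection.execute(sql, names)
    header = [col[0] for col in cursor.description]
    return header, cursor.fetchall()


header, rows = dynamic_pivot(conn)
print(header)
print(rows)
```

Because the rows are grouped by set_id rather than by position, a missing name simply yields NULL in that column instead of shifting every later value.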

How to group rows (based on CustomerID) using Pivot in SSIS?

I am practicing SSIS and am currently working on the Pivot transformation. Here's what I am working on.
I created a data source (table name: Pivot) with the following data.
Using SSIS, I created a package for pivoting the data to produce the following columns:
PersonID --- Product1 --- Product2 --- Product3.
Here's where I'm at: I was able to write the pivoted data to a text file, but the output is not grouped by PersonID.
My current output is
As we can see, the transformation does not group the data based on
SetKey(PersonID : PivotUsage =1)
The output I am hoping to get is
where the data is grouped based on PersonID.
What am I missing here?
Edit:
Going back to the example I was following, I re-ordered the input data as follows.
Does the input data need to be in this order/pattern every time? Most of the examples I came across follow a similar pattern.
Yes, the input data needs to be sorted by whatever you're pivoting on:
To pivot data efficiently, which means creating as few records in the
output dataset as possible, the input data must be sorted on the pivot
column. If the data is not sorted, the Pivot transformation might
generate multiple records for each value in the set key, which is the
column that defines set membership. For example, if the dataset is
pivoted on a Name column but the names are not sorted, the output
dataset could have more than one row for each customer, because a
pivot occurs every time that the value in Name changes.
That's a direct quote from the Pivot Transformation documentation on MSDN. (Emphasis added.)
When I first read this answer, I thought that the sorted column should be the one with PivotUsage=2 in the pivot. That's what I understood the pivot column to be. However, what finally worked for me was to sort by a column with PivotUsage=1. It's the column I would group by if writing the SQL by hand.
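The effect described in the documentation quote can be imitated with a small model. The sketch below is not how SSIS is implemented; it only assumes, as the quote says, that a new output row starts whenever the set key changes, which is also how Python's itertools.groupby behaves. The sample rows are made up.

```python
from itertools import groupby


def streaming_pivot(rows):
    """Emit one output row each time the set key (PersonID) changes in the input."""
    output = []
    for person_id, group in groupby(rows, key=lambda r: r[0]):
        output.append((person_id, {product: qty for _, product, qty in group}))
    return output


# (PersonID, Product, Quantity) rows, deliberately not sorted by PersonID.
rows = [
    (1, 'Product1', 5),
    (2, 'Product1', 3),
    (1, 'Product2', 7),
    (2, 'Product3', 1),
]

unsorted_result = streaming_pivot(rows)
sorted_result = streaming_pivot(sorted(rows, key=lambda r: r[0]))

print(len(unsorted_result), "rows when the input is unsorted")
print(len(sorted_result), "rows when the input is sorted by PersonID")
```

With unsorted input every change of PersonID starts a new output row, so each person appears more than once; sorting on the set key first gives one row per person.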

SQL Server Select Query

I have to write a query to get the following data as result.
I have four columns in my database. ID is not null, all others can have null values.
EMP_ID EMP_FIRST_NAME EMP_LAST_NAME EMP_PHONE
1 John Williams +123456789
2 Rodney +124568937
3 Jackson +124578963
4 Joyce Nancy
Now I have to write a query which returns the columns which are not null.
I do not want to specify the column name in my query.
I mean, I want to use SELECT * FROM TABLE WHERE - and add the filter, but I do not want to specify the column name after the WHERE clause.
This question may be foolish, so correct me wherever necessary. I'm new to SQL and working on a project with C# and SQL.
The reason I do not want to use the column names is that I have more than 250 columns and 1500 rows. If I select any row, at least one column will have a null value. I want to select the row, but the columns that are null for that particular row should not appear in the result.
Please advise. Thank you in advance.
Regards,
Vinay S
Every row returned from a SQL query must contain exactly the same columns as the other rows in the set. There is no way to select only those columns which do not return null unless all of the results in the set have the same null columns and you specify that in your select clause (not your where clause).
To Anders Abels's comment on your question, you could avoid a good deal of the query complexity by separating your data into tables which serve common purposes (called normalizing).
For example, you could put names in one table (Employee_ID, First_Name, Last_Name, Middle_Name, Title), places in another (Address_ID, Address_Name, Street, City, State), relationships in another, then tiny 2-4 column tables which link them all together. Structuring your data this way avoids duplication of individual facts, like, "who is John Williams's supervisor and how do I contact that person."
Your question reads:
I want to get all the columns that don't have a null value.
And at the same time:
But I don't want to specify column names in the WHERE clause.
These are conflicting goals. Your only option is to use the sys.tables and sys.columns catalog views to build a series of dynamic SQL statements. In the end, this is going to be more work than just writing one query by hand the first time.
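To give an idea of what "dynamic SQL built from the catalog" could look like, here is a minimal sketch. It uses SQLite through Python so that it is self-contained, with PRAGMA table_info standing in for sys.columns; the employees table name and the helper functions are hypothetical, and only the two fully specified sample rows from the question are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    EMP_ID INTEGER PRIMARY KEY,
    EMP_FIRST_NAME TEXT,
    EMP_LAST_NAME TEXT,
    EMP_PHONE TEXT
);
INSERT INTO employees VALUES
  (1, 'John', 'Williams', '+123456789'),
  (4, 'Joyce', 'Nancy', NULL);
""")


def quote_ident(name):
    """Quote a column name taken from the catalog before splicing it into SQL."""
    return '"' + name.replace('"', '""') + '"'


def non_null_columns(connection, emp_id):
    """Return (column names, row) limited to columns that are non-null for emp_id."""
    # Step 1: read the column list from the catalog instead of hard-coding it.
    columns = [row[1] for row in connection.execute("PRAGMA table_info(employees)")]
    # Step 2: COUNT(col) ignores NULLs, so 0 means the column is NULL for this row.
    counts_sql = ("SELECT " + ", ".join(f"COUNT({quote_ident(c)})" for c in columns)
                  + " FROM employees WHERE EMP_ID = ?")
    counts = connection.execute(counts_sql, (emp_id,)).fetchone()
    keep = [c for c, n in zip(columns, counts) if n > 0]
    if not keep:
        return [], None
    # Step 3: generate and run the final SELECT with only the surviving columns.
    select_sql = ("SELECT " + ", ".join(quote_ident(c) for c in keep)
                  + " FROM employees WHERE EMP_ID = ?")
    return keep, connection.execute(select_sql, (emp_id,)).fetchone()


print(non_null_columns(conn, 4))
```

Note that the column list differs from row to row, which is exactly why this cannot be a single static query.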
You can do this with a dynamic PIVOT / UNPIVOT approach, assuming your version of SQL Server supports it (you'll need SQL Server 2005 or better), which would be based on the concepts found in these links:
Dynamic Pivot
PIVOT / UNPIVOT
Effectively, you'll select a row, unpivot its columns into rows, filter out the NULL entries, and then pivot the remainder back into a single row. It's going to be ugly and complex code, though.

SQL Query - 20mil records - Best practice to return information

I have a SQL database that has the following table:
Table: PhoneRecords
--------------
ID(identity Seed)
FirstName
LastName
PhoneNumber
ZipCode
A very simple, straightforward table. It has over 20 million records. I am looking for the best way to run queries that pull out records based on area codes. For instance, here is an example query that I have run.
SELECT phonenumber, firstname
FROM [PhoneRecords]
WHERE (phone LIKE '2012042%') OR
(phone LIKE '2012046%') OR
(phone LIKE '2012047%') OR
(phone LIKE '2012083%') OR
(phone LIKE '2012088%') OR
(phone LIKE '2012841%')
As you can see, this is an ugly query, but it would get the job done (if I weren't running into timeout issues).
Can anyone tell me the best way, for speed/optimization, to run the above query and display the results? Currently the query above takes around 2 hours to complete on a machine with 9 GB of 1600 MHz RAM and an i7 930 quad-core overclocked to 4.01 GHz. I obviously have the computing power required for such a query, but it still takes too long.
You are probably missing an index on the phonenumber column.
CREATE INDEX IX_PHONERECORDS_PHONENUMBER_FIRSTNAME
ON dbo.PhoneRecords (PhoneNumber) INCLUDE (FirstName)
If that does not help, post the execution plan (CTRL+M).
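The reason an index can help here is that a LIKE pattern with a fixed prefix and a trailing % corresponds to a contiguous range of values. The sketch below illustrates that equivalence and shows one way to look at a query plan. It uses SQLite through Python so that it is self-contained; SQLite has no INCLUDE clause, so a two-column index approximates the covering index, and the table, index name and rows are made up. The plan text differs between engines and versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phone_records (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    phone_number TEXT
);
-- Two-column index as a stand-in for (PhoneNumber) INCLUDE (FirstName).
CREATE INDEX ix_phone_first ON phone_records (phone_number, first_name);
INSERT INTO phone_records (first_name, phone_number) VALUES
  ('Ann', '2012042111'), ('Bob', '2012043000'),
  ('Cid', '2012042999'), ('Dee', '2012841000');
""")


def prefix_upper_bound(prefix):
    """Smallest string greater than every string that starts with prefix."""
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def by_like(prefix):
    return conn.execute(
        "SELECT phone_number, first_name FROM phone_records "
        "WHERE phone_number LIKE ? ORDER BY phone_number",
        (prefix + "%",)).fetchall()


def by_range(prefix):
    return conn.execute(
        "SELECT phone_number, first_name FROM phone_records "
        "WHERE phone_number >= ? AND phone_number < ? ORDER BY phone_number",
        (prefix, prefix_upper_bound(prefix))).fetchall()


def range_plan(prefix):
    """Return the plan description lines for the range form of the query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT phone_number, first_name FROM phone_records "
        "WHERE phone_number >= ? AND phone_number < ?",
        (prefix, prefix_upper_bound(prefix))).fetchall()
    return [row[-1] for row in rows]


print(by_like("2012042"))
print(by_range("2012042"))
print(range_plan("2012042"))
```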
First you need an index on the column phone. If you don't have one, add it.
If it still runs slowly, you might try UNION ALL instead of OR, as this can be easier for the optimizer to work with. This works because the prefixes in your conditions do not overlap, which guarantees that no row is returned by more than one branch. So your query could be rewritten as:
SELECT phonenumber, firstname FROM [PhoneRecords] WHERE phone LIKE '2012042%'
UNION ALL
SELECT phonenumber, firstname FROM [PhoneRecords] WHERE phone LIKE '2012046%'
UNION ALL
SELECT phonenumber, firstname FROM [PhoneRecords] WHERE phone LIKE '2012047%'
UNION ALL
SELECT phonenumber, firstname FROM [PhoneRecords] WHERE phone LIKE '2012083%'
UNION ALL
SELECT phonenumber, firstname FROM [PhoneRecords] WHERE phone LIKE '2012088%'
UNION ALL
SELECT phonenumber, firstname FROM [PhoneRecords] WHERE phone LIKE '2012841%'
This query should be able to use the index to run efficiently.
You should look at the execution plan before running the actual query and make sure that there is no TABLE SCAN or INDEX SCAN.
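A quick way to convince yourself that the OR form and the UNION ALL form return the same rows is to run both against a small sample. The sketch below does that on SQLite through Python; the table and rows are hypothetical, and it says nothing about which form SQL Server will execute faster, which only the execution plan can tell you.

```python
import sqlite3

PREFIXES = ["2012042", "2012046", "2012047", "2012083", "2012088", "2012841"]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phone_records (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    phone_number TEXT
);
INSERT INTO phone_records (first_name, phone_number) VALUES
  ('Ann', '2012042111'), ('Bob', '2012046222'),
  ('Cid', '2012841333'), ('Dee', '3055550000');
""")


def with_or(prefixes):
    """Single SELECT with one LIKE per prefix, joined by OR."""
    where = " OR ".join("phone_number LIKE ?" for _ in prefixes)
    sql = f"SELECT phone_number, first_name FROM phone_records WHERE {where}"
    return sorted(conn.execute(sql, [p + "%" for p in prefixes]).fetchall())


def with_union_all(prefixes):
    """One SELECT per prefix, combined with UNION ALL."""
    branch = ("SELECT phone_number, first_name FROM phone_records "
              "WHERE phone_number LIKE ?")
    sql = " UNION ALL ".join(branch for _ in prefixes)
    return sorted(conn.execute(sql, [p + "%" for p in prefixes]).fetchall())


or_rows = with_or(PREFIXES)
union_rows = with_union_all(PREFIXES)
print(or_rows == union_rows, or_rows)
```

The two forms agree only because no phone number can start with two different prefixes from the list; with overlapping prefixes, UNION ALL would return the same row twice.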
Do you have any indexes? A first step is to put an index on the PhoneNumber column. If that isn't enough (I don't know the exact details of searching on part of strings in indexed columns) I would suggest adding another column named "AreaCode" which can be automatically computed from the PhoneNumber column. Then you can add an index on the AreaCode column.
The first and very obvious question is: do you have indexes? You need to create an index on at least the phone number if you are going to query against it. You should probably create a covering index that includes the fields you want as well as the fields in the WHERE clause, so the server doesn't have to waste time fetching the row after it has found the entry in the index. Obviously the flip side is that the bigger your index, the slower your query.
You could split your phone number column into two: [Area Code] and [Phone Number].
Then, if this query is the "most important" one in your application for this table and the ratio of returned rows to total rows is high, add a CLUSTERED index on [Area Code]; otherwise, add a standard index.
You may also keep the Phone Number column as is and index it directly. It depends on your app.
First I would split the phone column into "Area code" and "Phone number".
Also, I would convert these numbers to int; the indexes will perform faster.
AreaCode = 2012042
should be much faster than
PhoneNumber LIKE '2012042%'
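A minimal sketch of the precomputed prefix column, under a few assumptions: the prefix is the first seven digits, the column is called area_code, and it is filled once with an UPDATE. It runs on SQLite through Python to stay self-contained; in SQL Server a persisted computed column could play the same role. Storing the prefix as an integer drops leading zeros, so a text column is the safer choice if prefixes can start with 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phone_records (
    id INTEGER PRIMARY KEY,
    first_name TEXT,
    phone_number TEXT
);
INSERT INTO phone_records (first_name, phone_number) VALUES
  ('Ann', '2012042111'), ('Bob', '2012043000'),
  ('Cid', '2012046555'), ('Dee', '2012841000');

-- Precompute the 7-digit prefix once and index it (hypothetical column name).
ALTER TABLE phone_records ADD COLUMN area_code INTEGER;
UPDATE phone_records
SET area_code = CAST(substr(phone_number, 1, 7) AS INTEGER);
CREATE INDEX ix_area_code ON phone_records (area_code);
""")


def by_area_codes(codes):
    """Look rows up by exact prefix values instead of LIKE patterns."""
    placeholders = ", ".join("?" for _ in codes)
    sql = ("SELECT phone_number, first_name FROM phone_records "
           f"WHERE area_code IN ({placeholders}) ORDER BY phone_number")
    return conn.execute(sql, list(codes)).fetchall()


print(by_area_codes([2012042, 2012841]))
```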
Even if you are doing a table scan (which can happen even with an index, if the selectivity is low), your query should execute far faster than 2 hours. Your table is small enough to fit entirely in the SQL Server buffer pool, provided there is no competition from other tables scanned by other queries and SQL Server's max memory setting is large enough. So while you can try tricks like adding indexes or splitting the phone number into area + phone, you should also investigate the SQL Server configuration and your system configuration.
http://igoro.com/archive/precomputed-view-a-cool-and-useful-sql-pattern
Create a materialized view that includes the first n digits of the phone number as its own column. Then you can query against the area code column and include the names. Precompute the area codes so it doesn't have to be done on every select. Don't use the OR operator if you can help it. Use UNION to help the query plan use the index.
As it is, the query you're running will do 20,000,000 × x comparisons every time you run the select, where x is the number of area codes you're searching for. By querying an exactly matched indexed column, you won't need to go to the table at all, and the index can be searched efficiently (O(log n), I think).

Resources