Looker Studio (Data Studio) not filtering empty string - google-data-studio

I'm trying to filter out an empty string in Data Studio data (which comes from BigQuery).
Basically, when I run this on BigQuery:
SELECT serviceId, COUNT(DISTINCT ip) qty FROM `myTable` WHERE DATE(date) BETWEEN "2022-11-05" AND "2022-11-08"
GROUP BY serviceId
HAVING serviceId IS NOT NULL
ORDER BY qty DESC
I get the following
serviceId    qty
--------------------
             3323593
59200        309732
... (Other irrelevant data)
...
Adding AND serviceId != "" as follows correctly removes the empty-string row (the first one):
SELECT serviceId, COUNT(DISTINCT ip) qty FROM `myTable` WHERE DATE(date) BETWEEN "2022-11-05" AND "2022-11-08"
GROUP BY serviceId
HAVING serviceId IS NOT NULL AND serviceId != ""
ORDER BY qty DESC
But when I try the same approach in Data Studio, it doesn't remove the empty-string data.
This is the Data Studio calculated field I'm trying to use:
CASE
WHEN ( REGEXP_MATCH(ServiceId, R"^\d+$") AND ServiceId != "null" AND ServiceId IS NOT NULL AND ServiceId != "" AND LENGTH(ServiceId) > 3 ) THEN ServiceId
END
In the Data Studio table it appears as a null value (that's why I also tried filtering out other values).
My goal is to add a parameter (checkbox) that controls whether invalid values are shown. (That's why I'm not using a filter.)
Any tips?
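For illustration, here is a rough Python sketch of what the calculated field above evaluates to. It assumes Data Studio's CASE behaves like a standard SQL CASE and returns NULL when no WHEN branch matches. The function name `valid_service_id` is hypothetical. Under that assumption, the empty string is already rejected by the `^\d+$` check, and rejected rows come out as NULL in the field rather than being dropped from the table.

```python
import re
from typing import Optional


def valid_service_id(service_id: Optional[str]) -> Optional[str]:
    """Mirror the CASE expression: return the ID if valid, else None (NULL)."""
    if service_id is None:
        return None
    # fullmatch(r"\d+") plays the role of REGEXP_MATCH(ServiceId, R"^\d+$");
    # it needs at least one digit, so "" can never match.
    is_digits = re.fullmatch(r"\d+", service_id, flags=re.ASCII) is not None
    if is_digits and service_id != "null" and service_id != "" and len(service_id) > 3:
        return service_id
    # The original CASE has no ELSE branch, so every other input becomes NULL.
    return None


samples = ["", "null", "123", "59200", None]
results = [valid_service_id(s) for s in samples]
print(results)
```

If the assumption holds, the field only relabels invalid values as NULL. Hiding those rows would still need some condition on the resulting field.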

Related

Sqlite Where clause with or shows deleted records

I am using the following SQLite query to fetch contacts of type Farmer or Buyer, but it also returns deleted rows. The type column holds Farmer, Worker, or Buyer, and I want only the non-deleted rows of type Farmer and Buyer. Can anyone tell me what is wrong with this query?
SELECT * FROM contacts WHERE user_id = 317 AND (is_deleted IS NULL or is_deleted != 'true') AND type = 'Farmer' OR type = 'Buyer' ORDER BY name COLLATE NOCASE ASC
Try this code
SELECT * FROM contacts WHERE user_id = 317 AND type != 'Worker' AND (is_deleted IS NULL or is_deleted != 'true') ORDER BY name COLLATE NOCASE ASC
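This excludes the Worker rows instead of combining the two wanted types with OR. The original query returns deleted rows because AND binds more tightly than OR, so the trailing OR type = 'Buyer' is evaluated on its own. Wrapping the type conditions in parentheses, (type = 'Farmer' OR type = 'Buyer'), also fixes it.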
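A minimal sketch of the precedence issue, run against an in-memory SQLite database from Python. The table layout and sample rows are assumptions made up for the demo, since the real table structure is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (name TEXT, user_id INTEGER, type TEXT, is_deleted TEXT)"
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?)",
    [
        ("Ann", 317, "Farmer", None),
        ("Bob", 317, "Buyer", "true"),  # deleted buyer
        ("Cy", 317, "Buyer", None),
        ("Di", 317, "Worker", None),
        ("Ed", 999, "Buyer", None),     # belongs to another user
    ],
)

# Query from the question: AND binds tighter than OR, so "OR type = 'Buyer'"
# matches every buyer regardless of user_id or is_deleted.
unparenthesized = """
    SELECT name FROM contacts
    WHERE user_id = 317
      AND (is_deleted IS NULL OR is_deleted != 'true')
      AND type = 'Farmer' OR type = 'Buyer'
    ORDER BY name COLLATE NOCASE ASC
"""

# Same query with the two type conditions grouped explicitly.
parenthesized = """
    SELECT name FROM contacts
    WHERE user_id = 317
      AND (is_deleted IS NULL OR is_deleted != 'true')
      AND (type = 'Farmer' OR type = 'Buyer')
    ORDER BY name COLLATE NOCASE ASC
"""

leaky = [row[0] for row in conn.execute(unparenthesized)]
fixed = [row[0] for row in conn.execute(parenthesized)]
print(leaky)
print(fixed)
```

With this sample data the first query also returns the deleted buyer and the other user's buyer, while the parenthesized one returns only the two live contacts of user 317.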

Query for item with subset of related items

I've got two tables:
Part (Table)
----
PartID
SerialNumber
CreationDate
Test (Table)
----
PartID
TestName
TestDateTime
TestResult
The tables have a one to many relationship on PartID, one part may have many Test entries.
What I'm trying to do is return a list of parts with the information of only the last test performed on that part.
Part                                  Test
PartID  SerialNumber  CreationDate    PartID  TestName  TestDateTime  TestResult
--------------------------------      -------------------------------------------
1       555           12/9/2013       1       Test 1    1/1/2014      Pass
                                      1       Test 2    2/2/2014      Fail
I would like to return the last test data with the part's information:
PartID  SerialNumber  CreationDate  TestName  TestDateTime  TestResult
-----------------------------------------------------------------
1       555           12/9/2013     Test 2    2/2/2014      Fail
I can currently get the TestDateTime of the part's last test with this query, but no other information (as a subquery cannot return more than one item):
SELECT PartID, SerialNumber, CreationDate,
(SELECT TOP (1) TestDateTime
FROM Test
WHERE (PartID = Part.PartID)
ORDER BY TestDateTime DESC) AS LastDateTime
FROM Part
ORDER BY SerialNumber
Is there a different approach I can take to get the data I'm looking for?
Here is another way to do it that only hits the Test table once.
-- Number each part's tests from newest to oldest, then keep only the newest.
with SortedData as
(
SELECT p.PartID
, p.SerialNumber
, p.CreationDate
, t.TestName
, t.TestDateTime
, t.TestResult
, ROW_NUMBER() over (Partition by p.PartID ORDER BY t.TestDateTime DESC) AS RowNum
FROM Part p
join Test t on t.PartID = p.PartID
)
select PartID
, SerialNumber
, CreationDate
, TestName
, TestDateTime
, TestResult
from SortedData
where RowNum = 1
ORDER BY SerialNumber
If you are on SQL Server 2012 or later you can also use FIRST_VALUE.
Try using a subquery in your join and then filter based on that. Your subquery should select the PartID and Max(TestDateTime):
Select TestSubQ.PartID, Max(TestSubQ.TestDateTime) AS TestDateTime
From Test TestSubQ
group by TestSubQ.PartID
Then just filter your main query by joining this table
Select Part.PartID, SerialNumber, CreationDate,
TestMain.PartID, TestMain.TestName, TestMain.TestDateTime, TestMain.TestResult
From Part
Left Outer Join (Select TestSubQ.PartID, Max(TestSubQ.TestDateTime) AS TestDateTime
From Test TestSubQ
group by TestSubQ.PartID) TestPartSub
On Part.PartID = TestPartSub.PartID
Left Outer Join Test TestMain
On TestPartSub.PartID = TestMain.PartID
And TestPartSub.TestDateTime = TestMain.TestDateTime
Order By SerialNumber
Note, though, that if your data contains only dates and not times, you may still end up with two entries when two tests were done on the same date. If the time is included, it is highly unlikely that two tests for the same part will have exactly the same datetime.
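As a quick way to try the join-on-max approach, here is a sketch that runs it against an in-memory SQLite database from Python. It is an illustration, not the original SQL Server setup: the dates are stored as ISO strings so that MAX compares them correctly, and a second part with no tests (serial 777) is invented to show what the outer joins do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Part (PartID INTEGER, SerialNumber TEXT, CreationDate TEXT);
CREATE TABLE Test (PartID INTEGER, TestName TEXT, TestDateTime TEXT, TestResult TEXT);
INSERT INTO Part VALUES (1, '555', '2013-12-09'), (2, '777', '2013-12-10');
INSERT INTO Test VALUES
    (1, 'Test 1', '2014-01-01', 'Pass'),
    (1, 'Test 2', '2014-02-02', 'Fail');
""")

# Find the newest TestDateTime per part, then join back to Test to pick up
# the remaining columns of that newest row.
query = """
SELECT p.PartID, p.SerialNumber, t.TestName, t.TestDateTime, t.TestResult
FROM Part p
LEFT JOIN (SELECT PartID, MAX(TestDateTime) AS TestDateTime
           FROM Test
           GROUP BY PartID) latest
       ON latest.PartID = p.PartID
LEFT JOIN Test t
       ON t.PartID = latest.PartID
      AND t.TestDateTime = latest.TestDateTime
ORDER BY p.SerialNumber
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Part 1 comes back with its latest test (Test 2), and the part without tests is still listed, with NULLs in the test columns.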

TSQL Order By Specific Value

I need to order my results such that all items with the status column being a specific value come up first, then by date.
I tried this:
SELECT Id, Status, CreatedAt FROM Table
ORDER BY (Status=1) DESC, CreatedAt
I figured (Status=1) would give me a boolean value, so ordering by it DESC would put the true (1) rows at the top.
But I'm getting a syntax error. Is this possible and if so what is the correct syntax?
Thanks!
You can use CASE also in the ORDER BY:
SELECT Id, Status, CreatedAt
FROM Table
ORDER BY
CASE WHEN Status = 1 THEN 0 ELSE 1 END ASC,
CreatedAt ASC
Try this
SELECT Id, Status, CreatedAt FROM Table
ORDER BY (case when Status=1 then 1 else 2 end), CreatedAt
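The CASE-in-ORDER-BY idea can be checked quickly with a small sketch against an in-memory SQLite database from Python. The table name `Items` and the sample rows are assumptions for the demo (the question's `Table` is a reserved word).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (Id INTEGER, Status INTEGER, CreatedAt TEXT)")
conn.executemany(
    "INSERT INTO Items VALUES (?, ?, ?)",
    [
        (1, 2, "2020-01-01"),
        (2, 1, "2020-01-03"),
        (3, 1, "2020-01-02"),
        (4, 3, "2020-01-04"),
    ],
)

# Rows with Status = 1 get sort key 0 and everything else gets 1,
# so the Status = 1 rows come first; ties are broken by CreatedAt.
query = """
SELECT Id FROM Items
ORDER BY CASE WHEN Status = 1 THEN 0 ELSE 1 END ASC, CreatedAt ASC
"""

ordered_ids = [row[0] for row in conn.execute(query)]
print(ordered_ids)
```

The two Status = 1 rows come first in date order, followed by the remaining rows in date order.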

JOIN ON subselect returns what I want, but surrounding select is missing records when subselect returns NULL

I have a table where I am storing records with a Created_On date and a Last_Updated_On date. Each new record will be written with a Created_On, and each subsequent update writes a new row with the same Created_On, but an updated Last_Updated_On.
I am trying to design a query to return the newest row of each. What I have looks something like this:
SELECT
t1.[id] as id,
t1.[Store_Number] as storeNumber,
t1.[Date_Of_Inventory] as dateOfInventory,
t1.[Created_On] as createdOn,
t1.[Last_Updated_On] as lastUpdatedOn
FROM [UserData].[dbo].[StoreResponses] t1
JOIN (
SELECT
[Store_Number],
[Date_Of_Inventory],
MAX([Created_On]) co,
MAX([Last_Updated_On]) luo
FROM [UserData].[dbo].[StoreResponses]
GROUP BY [Store_Number],[Date_Of_Inventory]) t2
ON
t1.[Store_Number] = t2.[Store_Number]
AND t1.[Created_On] = t2.co
AND t1.[Last_Updated_On] = t2.luo
AND t1.[Date_Of_Inventory] = t2.[Date_Of_Inventory]
WHERE t1.[Store_Number] = 123
ORDER BY t1.[Created_On] ASC
The subselect works fine: I see X rows, grouped by Store_Number and Date_Of_Inventory, some of which have luo (Last_Updated_On) values of NULL. However, the rows in the subselect where luo is NULL do not appear in the overall results. In other words, where I get 6 results in the subselect, I only get 2 in the overall results, and it's only those rows where Last_Updated_On is not NULL.
So, as a test, I wrote the following:
SELECT 1 WHERE NULL = NULL
And got no results, but, when I run:
SELECT 1 WHERE 1 = 1
I get back a result of 1. It's as if SQL Server is not relating NULL to NULL.
How can I fix this? Why wouldn't two fields compare when both values are NULL?
You could use Coalesce (example assuming Store_Number is an integer)
ON
Coalesce(t1.[Store_Number],0) = Coalesce(t2.[Store_Number],0)
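With ANSI_NULLS ON (the SQL Server default), NULL = NULL evaluates to UNKNOWN rather than TRUE, so NULL never equals NULL.
You can switch this off (if your business case and your database design's usage of NULL require it) with the following setting, though note that Microsoft documents SET ANSI_NULLS OFF as deprecated: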
SET ansi_nulls off
Another basic workaround is to compare explicitly:
ON ((t1.[Store_Number] = t2.[Store_Number]) OR
(t1.[Store_Number] IS NULL AND t2.[Store_Number] IS NULL))
Executing your POC:
SET ansi_nulls off
SELECT 1 WHERE NULL = NULL
Returns:
1
This also works:
AND EXISTS (SELECT t1.Store_Number INTERSECT SELECT t2.Store_Number)
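To see the NULL comparison behaviour in isolation, here is a small sketch using an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here, and the tables `t1` and `t2` and their rows are made up. It compares a plain equality join with the explicit OR ... IS NULL variant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL is not true, so the WHERE clause rejects the row.
null_eq = conn.execute("SELECT 1 WHERE NULL = NULL").fetchall()
one_eq = conn.execute("SELECT 1 WHERE 1 = 1").fetchall()

conn.executescript("""
CREATE TABLE t1 (store INTEGER, luo TEXT);
CREATE TABLE t2 (store INTEGER, luo TEXT);
INSERT INTO t1 VALUES (123, NULL), (123, '2020-01-02');
INSERT INTO t2 VALUES (123, NULL), (123, '2020-01-02');
""")

# Plain equality: the pair of rows whose luo is NULL does not match.
plain_matches = conn.execute("""
    SELECT COUNT(*) FROM t1
    JOIN t2 ON t1.store = t2.store AND t1.luo = t2.luo
""").fetchone()[0]

# Null-safe comparison: treat two NULLs as a match explicitly.
null_safe_matches = conn.execute("""
    SELECT COUNT(*) FROM t1
    JOIN t2 ON t1.store = t2.store
           AND (t1.luo = t2.luo OR (t1.luo IS NULL AND t2.luo IS NULL))
""").fetchone()[0]

print(null_eq, one_eq, plain_matches, null_safe_matches)
```

In the original query, the same explicit comparison would apply to Last_Updated_On, since that is the column that holds the NULLs.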

Assigning NULL value to column name using CASE statement in WHERE clause in SQL Server 2008

I have a table GrowDaysLocation. In this table there can be two records with respect to DistrictId and RegionId
i.e;
RegionId=1 and DistrictId = NULL
RegionId=1 and DistrictId = 1
I have a condition that if the row doesn't exist for RegionId = 1 and DistrictId = 1 then get the row for RegionId = 1 and DistrictId = NULL.
How can I accomplish this using a single query?
Below is the query I have tried.
In this query, I use CASE in the WHERE clause with a subquery to check whether the row exists, but the problem is that when the CASE returns NULL, the query returns no rows.
==================================================
SET ANSI_NULLS OFF
Select * From GrowDaysLocations
Where DistrictId =
( CASE WHEN (Select Count(*) From GrowDaysLocations
Where RegionId = '38D95A68-4A92-4D11-9A88-464CF1492880' AND DistrictId = 'F4B67A07-1BF7-42F5-9F19-77329A215D8B' AND
GrowDaysProfileId = '79F8BDBF-67D3-44A7-A790-1C10EE8B2AD0') > 0 THEN DistrictId
ELSE
NULL
END
)
AND RegionId = '38D95A68-4A92-4D11-9A88-464CF1492880' AND GrowDaysProfileId = '79F8BDBF-67D3-44A7-A790-1C10EE8B2AD0'
===========================================
The RANK function may give you the results you are looking for:
SELECT RegionId, DistrictId, GrowDaysProfileId
FROM
(SELECT RegionId
,DistrictId
,GrowDaysProfileId
,RANK() OVER(PARTITION BY RegionId, GrowDaysProfileId
ORDER BY DistrictId DESC) AS rankVal
From GrowDaysLocation) sub
WHERE rankVal = 1
This query will give you a result set with one row for each distinct RegionId and GrowDaysProfileId. If a RegionId/GrowDaysProfileId combination has more than one row in the table, the query will select a result based on the value of DistrictId. The row with the highest DistrictId value will be used first and the row with the lowest DistrictId (NULL being the lowest) last.
This is because when evaluating the condition
WHERE ColumnName = NULL
all rows are filtered out: when checking for equality, if either side is NULL, the comparison is never true, so the row is rejected regardless of the other value (see Data Manipulation Language section). You have to use
WHERE ColumnName IS NULL
In this case you can use
WHERE ISNULL(DistrictID, 'Empty') = ...
and use the 'Empty' string instead of NULL in the CASE expression.
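One way to sketch the "specific row, else the NULL-district row" lookup without comparing to NULL is shown below, using an in-memory SQLite database from Python. This is a different technique from the answers above (an explicit NOT EXISTS fallback). The integer IDs, the `Label` column, and the `find_row` helper are hypothetical simplifications of the GUID-based table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GrowDaysLocation (RegionId INTEGER, DistrictId INTEGER, Label TEXT);
INSERT INTO GrowDaysLocation VALUES
    (1, NULL, 'region 1 default'),
    (1, 1,    'region 1 district 1'),
    (2, NULL, 'region 2 default');
""")

# "= NULL" never matches; "IS NULL" does.
eq_null = conn.execute(
    "SELECT COUNT(*) FROM GrowDaysLocation WHERE DistrictId = NULL"
).fetchone()[0]
is_null = conn.execute(
    "SELECT COUNT(*) FROM GrowDaysLocation WHERE DistrictId IS NULL"
).fetchone()[0]


def find_row(region_id, district_id):
    """Return the district row if it exists, otherwise the NULL-district row."""
    sql = """
    SELECT Label FROM GrowDaysLocation g
    WHERE g.RegionId = :region
      AND (g.DistrictId = :district
           OR (g.DistrictId IS NULL
               AND NOT EXISTS (SELECT 1 FROM GrowDaysLocation x
                               WHERE x.RegionId = :region
                                 AND x.DistrictId = :district)))
    """
    params = {"region": region_id, "district": district_id}
    return [row[0] for row in conn.execute(sql, params)]


print(eq_null, is_null)
print(find_row(1, 1), find_row(2, 1))
```

The fallback row is selected with IS NULL and only when the specific district row is missing, so no ANSI_NULLS change is needed.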
